How to install Python packages without a requirements.txt file with pipreqs

Why bother

Say you get a project that doesn't have a requirements file and has 20+ imports, meaning you'd need to track down and install 20+ modules manually. Doesn't sound fun, right?

That's when pipreqs comes into play as a "life saver". This tool scans all scripts/folders in the current working directory (or in a path you provide) and generates a requirements.txt listing the packages it found, so you can install them all in one go.

Example usage

(GIF: pipreqs usage example)
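If you prefer text over a GIF, a typical run looks roughly like this (the INFO line below is what pipreqs usually prints on success; exact output may vary by version):

pip install pipreqs
pipreqs --encoding utf-8 "./"
# INFO: Successfully saved requirements file in ./requirements.txt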

Solution

  • create a virtual env
  • activate it
  • install pipreqs
  • tell pipreqs to look for files in the current folder "./" and use --encoding utf-8
    • wait until requirements.txt is created
  • install script dependencies from the created requirements.txt
  • optionally, run pip freeze to pin the full dependency tree back into requirements.txt

Which results in these commands:

# windows (Git Bash; in cmd or PowerShell use env\Scripts\activate instead of source)
python -m venv env && \
source env/Scripts/activate && \
pip install pipreqs && \
pipreqs --encoding utf-8 "./" && \
pip install -r requirements.txt && \
pip freeze > requirements.txt
# linux
python -m venv env && \
source env/bin/activate && \
pip install pipreqs && \
pipreqs --encoding utf-8 "./" && \
pip install -r requirements.txt && \
pip freeze > requirements.txt
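
One thing to keep in mind: if a requirements.txt already exists in the target folder, pipreqs won't overwrite it by default; pass --force to allow that:

pipreqs --force --encoding utf-8 "./"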

Small number of imports example

Let's say you have a script like this:

import requests

response = requests.get('https://serpapi.com/playground')
print(response.text)
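
Running pipreqs against this script would produce a one-line requirements.txt, pinned to whatever requests version pipreqs detects on your machine, along the lines of:

requests==2.29.0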

Big number of imports example

The point here is to show how all the modules get installed automatically without an initial requirements.txt file.

This time the script has a much bigger number of imports:

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
import plotly.express as px
import scipy.stats as stats
import statsmodels.api as sm
import sklearn.linear_model  # a bare "import sklearn" wouldn't expose the submodules used below
import yellowbrick.classifier
import wordcloud
import nltk
import spacy
import transformers
import streamlit as st

# Load and clean data
data = pd.read_csv('data.csv')
data.dropna(inplace=True)

# Descriptive statistics
print('Data Summary')
print(data.describe())

# Data visualization
sns.histplot(data['age'], kde=False, bins=10)
plt.title('Age Distribution')
plt.show()

px.scatter(data, x='income', y='age', color='gender', title='Income vs. Age')

# Correlation analysis
corr_matrix = data.corr(numeric_only=True)  # skip non-numeric columns like 'gender'
sns.heatmap(corr_matrix, annot=True, cmap='coolwarm')
plt.title('Correlation Matrix')
plt.show()

# Statistical analysis
stat, p = stats.ttest_ind(data[data['gender']=='M']['income'], data[data['gender']=='F']['income'])
print(f'T-test: statistic={stat}, pvalue={p}')

# Machine learning
X = data[['age', 'income']]
y = data['gender']
model = sklearn.linear_model.LogisticRegression()
model.fit(X, y)
visualizer = yellowbrick.classifier.classification_report(model, X, y)
visualizer.show()

# Text analysis
text = 'This is a sample text for text analysis'
nltk.download('punkt', quiet=True)  # the punkt tokenizer data must be downloaded once
tokens = nltk.word_tokenize(text)
print(f'Tokenized text: {tokens}')

nlp = spacy.load('en_core_web_sm')
doc = nlp(text)
for token in doc:
    print(token.text, token.pos_)

model = transformers.pipeline('sentiment-analysis')
result = model(text)[0]
print(f'Sentiment analysis: {result["label"]}, score={result["score"]}')

# Streamlit app
st.title('Data Analysis App')
st.write('Data Summary')
st.write(data.describe())

Generated requirements.txt afterwards (note that this is the output of the final pip freeze step, so it also includes transitive dependencies that pipreqs alone would not list):

altair==4.2.2
attrs==23.1.0
blinker==1.6.2
blis==0.7.9
cachetools==5.3.0
catalogue==2.0.8
certifi==2022.12.7
charset-normalizer==3.1.0
click==8.1.3
colorama==0.4.6
confection==0.0.4
contourpy==1.0.7
cssselect==1.2.0
cycler==0.11.0
cymem==2.0.7
decorator==5.1.1
docopt==0.6.2
entrypoints==0.4
filelock==3.12.0
fonttools==4.39.3
fsspec==2023.4.0
gitdb==4.0.10
GitPython==3.1.31
huggingface-hub==0.14.1
idna==3.4
importlib-metadata==6.6.0
Jinja2==3.1.2
jmespath==1.0.1
joblib==1.2.0
jsonschema==4.17.3
kiwisolver==1.4.4
langcodes==3.3.0
lxml==4.9.2
markdown-it-py==2.2.0
MarkupSafe==2.1.2
matplotlib==3.7.1
mdurl==0.1.2
murmurhash==1.0.9
nltk==3.8.1
numpy==1.24.3
packaging==23.1
pandas==2.0.1
parsel==1.8.1
pathy==0.10.1
patsy==0.5.3
Pillow==9.5.0
pipreqs==0.4.13
plotly==5.14.1
preshed==3.0.8
protobuf==3.20.3
pyarrow==12.0.0
pydantic==1.10.7
pydeck==0.8.1b0
Pygments==2.15.1
Pympler==1.0.1
pyparsing==3.0.9
pyrsistent==0.19.3
python-dateutil==2.8.2
pytz==2023.3
pytz-deprecation-shim==0.1.0.post0
PyYAML==6.0
regex==2023.5.4
requests==2.29.0
rich==13.3.5
scikit-learn==1.2.2
scipy==1.10.1
seaborn==0.12.2
six==1.16.0
smart-open==6.3.0
smmap==5.0.0
spacy==3.5.2
spacy-legacy==3.0.12
spacy-loggers==1.0.4
srsly==2.4.6
statsmodels==0.13.5
streamlit==1.22.0
tenacity==8.2.2
thinc==8.1.10
threadpoolctl==3.1.0
tokenizers==0.13.3
toml==0.10.2
toolz==0.12.0
tornado==6.3.1
tqdm==4.65.0
transformers==4.28.1
typer==0.7.0
typing_extensions==4.5.0
tzdata==2023.3
tzlocal==4.3
urllib3==1.26.15
validators==0.20.0
w3lib==2.1.1
wasabi==1.1.1
watchdog==3.0.0
wordcloud==1.9.1.1
yarg==0.1.9
yellowbrick==1.5
zipp==3.15.0

Current limitations

The main drawback for now is that pipreqs doesn't recognize every package; there are 10+ related issues on its tracker where pipreqs failed to detect a package.
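
If pipreqs misses a package, a simple workaround is to append it to the generated file by hand before installing (the package name and version below are placeholders):

echo "some-missed-package==1.0.0" >> requirements.txt
pip install -r requirements.txt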