I want to have two src dirs at the root of my project. The reason is that one contains code I want to keep working without modifying any of its imports. The second contains new code independent of the "old code". I want both src dirs to work with pip install -e . My setup.py is:
"""
python -c "print()"
refs:
- setup tools: https://setuptools.pypa.io/en/latest/userguide/package_discovery.html#using-find-or-find-packages
- https://stackoverflow.com/questions/70295885/how-does-one-install-pytorch-and-related-tools-from-within-the-setup-py-install
"""
from setuptools import setup
from setuptools import find_packages
import os
here = os.path.abspath(os.path.dirname(__file__))
with open(os.path.join(here, 'README.md'), encoding='utf-8') as f:
    long_description = f.read()
setup(
name='massive-evaporate-4-math', # project name
version='0.0.1',
long_description=long_description,
long_description_content_type="text/markdown",
author='Me',
author_email='[email protected]',
python_requires='>=3.9',
license='Apache 2.0',
# ref: https://chat.openai.com/c/d0edae00-0eb2-4837-b492-df1d595b6cab
# The `package_dir` parameter is a dictionary that maps package names to directories.
# A key of an empty string represents the root package, and its corresponding value
# is the directory containing the root package. Here, the root package is set to the
# 'src' directory.
#
# The use of an empty string `''` as a key is significant. In the context of setuptools,
# an empty string `''` denotes the root package of the project. It means that the
# packages and modules located in the specified directory ('src' in this case) are
# considered to be in the root of the package hierarchy. This is crucial for correctly
# resolving package and module imports when the project is installed.
#
# By specifying `{'': 'src'}`, we are informing setuptools that the 'src' directory is
# the location of the root package, and it should look in this directory to find the
# Python packages and modules to be included in the distribution.
package_dir={
'': 'src_math_evaporate',
'bm_evaporate': 'src_bm_evaporate',
},
# The `packages` parameter lists all Python packages that should be included in the
# distribution. A Python package is a way of organizing related Python modules into a
# directory hierarchy. Any directory containing an __init__.py file is considered a
# Python package.
#
# `find_packages('src')` is a convenience function provided by setuptools, which
# automatically discovers and lists all packages in the specified 'src' directory.
# This means it will include all directories in 'src' that contain an __init__.py file,
# treating them as Python packages to be included in the distribution.
#
# By using `find_packages('src')`, we ensure that all valid Python packages inside the
# 'src' directory, regardless of their depth in the directory hierarchy, are included
# in the distribution, eliminating the need to manually list them. This is particularly
# useful for projects with a large number of packages and subpackages, as it reduces
# the risk of omitting packages from the distribution.
packages=find_packages('src_math_evaporate') + find_packages('src_bm_evaporate'),
# When using `pip install -e .`, the package is installed in 'editable' or 'develop' mode.
# This means that changes to the source files immediately affect the installed package
# without requiring a reinstall. This is extremely useful during development as it allows
# for testing and iteration without the constant need for reinstallation.
#
# In 'editable' mode, the correct resolution of package and module locations is crucial.
# The `package_dir` and `packages` configurations play a vital role in this. If the
# `package_dir` is incorrectly set, or if a package is omitted from the `packages` list,
# it can lead to ImportError due to Python not being able to locate the packages and
# modules correctly.
#
# Therefore, when using `pip install -e .`, it is essential to ensure that `package_dir`
# correctly maps to the root of the package hierarchy and that `packages` includes all
# the necessary packages by using `find_packages`, especially when the project has a
# complex structure with nested packages. This ensures that the Python interpreter can
# correctly resolve imports and locate the source files, allowing for a smooth and
# efficient development workflow.
# for pytorch see doc string at the top of file
install_requires=[
'fire',
'dill',
'networkx>=2.5',
'scipy',
'scikit-learn',
'lark-parser',
'tensorboard',
'pandas',
'progressbar2',
'requests',
'aiohttp',
'numpy',
'plotly',
'wandb',
'matplotlib',
# 'statsmodels'
# 'statsmodels==0.12.2'
# 'statsmodels==0.13.5'
# - later check why we are not installing it...
# 'seaborn'
# 'nltk'
'twine',
# # mercury: https://github.com/vllm-project/vllm/issues/2747
# 'dspy-ai',
# # 'torch==2.1.2+cu118', # 2.2 not supported due to vllm see: https://github.com/vllm-project/vllm/issues/2747
# 'torch==2.2.2', # 2.2 not supported due to vllm see: https://github.com/vllm-project/vllm/issues/2747
# # 'torchvision',
# # 'torchaudio',
# # 'trl',
# 'transformers',
# 'accelerate',
# # 'peft',
# # 'datasets==2.18.0',
# 'datasets',
# 'evaluate',
# 'bitsandbytes',
# # 'einops',
# # 'vllm==0.4.0.post1', # my gold-ai-olympiad project uses 0.4.0.post1 ref: https://github.com/vllm-project/vllm/issues/2747
# ampere
'dspy-ai',
# 'torch==2.1.2+cu118', # 2.2 not supported due to vllm see: https://github.com/vllm-project/vllm/issues/2747
'torch==2.1.2', # 2.2 not supported due to vllm see: https://github.com/vllm-project/vllm/issues/2747
# 'torchvision',
# 'torchaudio',
# 'trl',
'transformers==4.39.2',
'accelerate==0.29.2',
# 'peft',
# 'datasets==2.18.0',
'datasets==2.14.7',
'evaluate==0.4.1',
'bitsandbytes==0.43.0',
# 'einops',
'vllm==0.4.0.post1', # my gold-ai-olympiad project uses 0.4.0.post1 ref: https://github.com/vllm-project/vllm/issues/2747
# pip install -q -U google-generativeai
"tqdm",
"openai",
"manifest-ml",
'beautifulsoup4',
# 'pandas',
'cvxpy',
# 'sklearn',  # The 'sklearn' PyPI package is deprecated; use 'scikit-learn' rather than 'sklearn' for pip commands.
# 'scikit-learn',
'snorkel',
'snorkel-metal',
'tensorboardX',
'pyyaml',
'TexSoup',
]
)
and these are the errors I get in bash:
(math_evaporate) brando9@skampere1~/massive-evaporation-4-math $ tree src_math_evaporate/
src_math_evaporate/
└── math_evaporate_llm_direct.py
0 directories, 1 file
(math_evaporate) brando9@skampere1~/massive-evaporation-4-math $ tree src_bm_evaporate/
src_bm_evaporate/
├── configs.py
├── evaluate_profiler.py
├── evaluate_synthetic.py
├── evaluate_synthetic_utils.py
├── massive_evaporate_4_math.egg-info
│   ├── dependency_links.txt
│   ├── PKG-INFO
│   ├── requires.txt
│   ├── SOURCES.txt
│   └── top_level.txt
├── profiler.py
├── profiler_utils.py
├── prompts_math.py
├── prompts.py
├── __pycache__
│   ├── configs.cpython-39.pyc
│   ├── prompts.cpython-39.pyc
│   └── utils.cpython-39.pyc
├── run_profiler_maf.py
├── run_profiler_math_evaporate.py
├── run_profiler.py
├── run.sh
├── schema_identification.py
├── snap_cluster_setup.egg-info
│   ├── dependency_links.txt
│   ├── PKG-INFO
│   ├── requires.txt
│   ├── SOURCES.txt
│   └── top_level.txt
├── utils.py
└── weak_supervision
    ├── binary_deps.py
    ├── __init__.py
    ├── make_pgm.py
    ├── methods.py
    ├── pgm.py
    ├── run_ws.py
    └── ws_utils.py
4 directories, 34 files
(math_evaporate) brando9@skampere1~/massive-evaporation-4-math $ pip install -e .
Obtaining file:///afs/cs.stanford.edu/u/brando9/massive-evaporation-4-math
Preparing metadata (setup.py) ... error
error: subprocess-exited-with-error
× python setup.py egg_info did not run successfully.
│ exit code: 1
╰─> [8 lines of output]
running egg_info
creating /tmp/user/22003/pip-pip-egg-info-bqrbfkt8/massive_evaporate_4_math.egg-info
writing /tmp/user/22003/pip-pip-egg-info-bqrbfkt8/massive_evaporate_4_math.egg-info/PKG-INFO
writing dependency_links to /tmp/user/22003/pip-pip-egg-info-bqrbfkt8/massive_evaporate_4_math.egg-info/dependency_links.txt
writing requirements to /tmp/user/22003/pip-pip-egg-info-bqrbfkt8/massive_evaporate_4_math.egg-info/requires.txt
writing top-level names to /tmp/user/22003/pip-pip-egg-info-bqrbfkt8/massive_evaporate_4_math.egg-info/top_level.txt
writing manifest file '/tmp/user/22003/pip-pip-egg-info-bqrbfkt8/massive_evaporate_4_math.egg-info/SOURCES.txt'
error: package directory 'src_math_evaporate/weak_supervision' does not exist
[end of output]
note: This error originates from a subprocess, and is likely not a problem with pip.
error: metadata-generation-failed
× Encountered error while generating package metadata.
╰─> See above for output.
note: This is an issue with the package mentioned above, not pip.
hint: See above for details.
Everything looks right to me. Why is this error happening?
I tried changing:
package_dir={
'': 'src_math_evaporate',
'bm_evaporate': 'src_bm_evaporate',
},
to
package_dir={
'math_evaporate': 'src_math_evaporate',
'bm_evaporate': 'src_bm_evaporate',
},
but that doesn't work. I also tried mapping both as the root:
package_dir={
'': 'src_math_evaporate',
'': 'src_bm_evaporate',
},
I don't know what else to try. What should I do?
The error message is correct: there is no 'src_math_evaporate/weak_supervision' subdirectory. To fix it, use
setup(
name='massive-evaporate-4-math',
version='0.0.1',
package_dir={
'math_evaporate': 'src_math_evaporate',
'bm_evaporate': 'src_bm_evaporate',
},
...
)
Don't pass packages=... in setup.py. Recent versions of setuptools are perfectly capable of finding all the packages from package_dir without an explicit value for packages here.
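To see why the explicit packages= value failed, it helps to trace the lookup by hand. Judging from the tree output, weak_supervision is the only directory with an __init__.py, so find_packages('src_bm_evaporate') most likely returned just ['weak_supervision']. The sketch below is a simplified approximation of how a package name is mapped to a directory through package_dir (longest matching prefix first, then the '' root entry). The function name resolve_package_dir is hypothetical and this is not the actual setuptools code.

```python
import posixpath


def resolve_package_dir(package, package_dir):
    """Approximate the package-name -> directory lookup (illustrative only).

    Tries the longest dotted prefix found in package_dir, then falls back
    to the '' (root) entry, and finally to the plain relative path.
    """
    parts = package.split('.')
    tail = []
    while parts:
        prefix = '.'.join(parts)
        if prefix in package_dir:
            return posixpath.join(package_dir[prefix], *tail)
        # No mapping for this prefix: move its last component to the tail.
        tail.insert(0, parts.pop())
    root = package_dir.get('')
    return posixpath.join(root, *tail) if root else posixpath.join(*tail)


# The mapping from the question's setup.py.
package_dir = {'': 'src_math_evaporate', 'bm_evaporate': 'src_bm_evaporate'}

# 'weak_supervision' has no 'bm_evaporate.' prefix, so it falls through to ''.
print(resolve_package_dir('weak_supervision', package_dir))
# src_math_evaporate/weak_supervision

# Under the 'bm_evaporate' key it would land in the right directory.
print(resolve_package_dir('bm_evaporate.weak_supervision', package_dir))
# src_bm_evaporate/weak_supervision

# The "both as root" attempt: a dict literal keeps only the last duplicate key.
both_root = {'': 'src_math_evaporate', '': 'src_bm_evaporate'}
print(both_root)
# {'': 'src_bm_evaporate'}
```

The first call reproduces the path from the error message. The last one shows why mapping '' twice cannot point at two directories: the second entry silently replaces the first.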
Btw, although not required for the build/install, it would be better to add __init__.py files to all the package directories. With the current directory structure, submodules are hard to discover.
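A quick way to spot the directories that would need an __init__.py is to walk the source tree and report those that hold .py files but no __init__.py. This is a minimal sketch using only the standard library; the helper name dirs_missing_init is hypothetical, and the temporary layout is a cut-down stand-in for the src_bm_evaporate tree from the question.

```python
import os
import tempfile


def dirs_missing_init(root):
    """Return dirs under root (root included) with .py files but no __init__.py."""
    missing = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Skip caches and build metadata; they are never packages.
        dirnames[:] = [d for d in dirnames
                       if d != '__pycache__' and not d.endswith('.egg-info')]
        has_py = any(name.endswith('.py') for name in filenames)
        if has_py and '__init__.py' not in filenames:
            missing.append(os.path.relpath(dirpath, root))
    return sorted(missing)


# Recreate a small part of the layout in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, 'src_bm_evaporate')
    os.makedirs(os.path.join(src, 'weak_supervision'))
    for rel in ('utils.py',
                os.path.join('weak_supervision', '__init__.py'),
                os.path.join('weak_supervision', 'pgm.py')):
        open(os.path.join(src, rel), 'w').close()
    print(dirs_missing_init(src))
    # ['.']
```

Here '.' means the top of src_bm_evaporate itself lacks an __init__.py, while weak_supervision already has one.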