
How do I have multiple src directories at the root of my python project with a setup.py and pip install -e?

I want to have two src dirs at the root of my project. One holds existing code that I want to keep working without modifying any of its imports. The second holds new code that is independent of the "old code". I want both src dirs to work with pip install -e . My setup.py is:

"""
python -c "print()"

refs:
    - setup tools: https://setuptools.pypa.io/en/latest/userguide/package_discovery.html#using-find-or-find-packages
    - https://stackoverflow.com/questions/70295885/how-does-one-install-pytorch-and-related-tools-from-within-the-setup-py-install
"""
from setuptools import setup
from setuptools import find_packages
import os

here = os.path.abspath(os.path.dirname(__file__))
with open(os.path.join(here, 'README.md'), encoding='utf-8') as f:
    long_description = f.read()

setup(
    name='massive-evaporate-4-math',  # project name
    version='0.0.1',
    long_description=long_description,
    long_description_content_type="text/markdown",
    author='Me',
    author_email='[email protected]',
    python_requires='>=3.9',
    license='Apache 2.0',

    # ref: https://chat.openai.com/c/d0edae00-0eb2-4837-b492-df1d595b6cab
    # The `package_dir` parameter is a dictionary that maps package names to directories.
    # A key of an empty string represents the root package, and its corresponding value
    # is the directory containing the root package. Here, the root package is set to the
    # 'src' directory.
    #
    # The use of an empty string `''` as a key is significant. In the context of setuptools,
    # an empty string `''` denotes the root package of the project. It means that the
    # packages and modules located in the specified directory ('src' in this case) are
    # considered to be in the root of the package hierarchy. This is crucial for correctly
    # resolving package and module imports when the project is installed.
    #
    # By specifying `{'': 'src'}`, we are informing setuptools that the 'src' directory is
    # the location of the root package, and it should look in this directory to find the
    # Python packages and modules to be included in the distribution.
    package_dir={
            '': 'src_math_evaporate',
            'bm_evaporate': 'src_bm_evaporate', 
        },

    # The `packages` parameter lists all Python packages that should be included in the
    # distribution. A Python package is a way of organizing related Python modules into a
    # directory hierarchy. Any directory containing an __init__.py file is considered a
    # Python package.
    #
    # `find_packages('src')` is a convenience function provided by setuptools, which
    # automatically discovers and lists all packages in the specified 'src' directory.
    # This means it will include all directories in 'src' that contain an __init__.py file,
    # treating them as Python packages to be included in the distribution.
    #
    # By using `find_packages('src')`, we ensure that all valid Python packages inside the
    # 'src' directory, regardless of their depth in the directory hierarchy, are included
    # in the distribution, eliminating the need to manually list them. This is particularly
    # useful for projects with a large number of packages and subpackages, as it reduces
    # the risk of omitting packages from the distribution.
    packages=find_packages('src_math_evaporate') + find_packages('src_bm_evaporate'),
    # When using `pip install -e .`, the package is installed in 'editable' or 'develop' mode.
    # This means that changes to the source files immediately affect the installed package
    # without requiring a reinstall. This is extremely useful during development as it allows
    # for testing and iteration without the constant need for reinstallation.
    #
    # In 'editable' mode, the correct resolution of package and module locations is crucial.
    # The `package_dir` and `packages` configurations play a vital role in this. If the
    # `package_dir` is incorrectly set, or if a package is omitted from the `packages` list,
    # it can lead to ImportError due to Python not being able to locate the packages and
    # modules correctly.
    #
    # Therefore, when using `pip install -e .`, it is essential to ensure that `package_dir`
    # correctly maps to the root of the package hierarchy and that `packages` includes all
    # the necessary packages by using `find_packages`, especially when the project has a
    # complex structure with nested packages. This ensures that the Python interpreter can
    # correctly resolve imports and locate the source files, allowing for a smooth and
    # efficient development workflow.

    # for pytorch see doc string at the top of file
    install_requires=[
        'fire',
        'dill',
        'networkx>=2.5',
        'scipy',
        'scikit-learn',
        'lark-parser',
        'tensorboard',
        'pandas',
        'progressbar2',
        'requests',
        'aiohttp',
        'numpy',
        'plotly',
        'wandb',
        'matplotlib',
        # 'statsmodels'
        # 'statsmodels==0.12.2'
        # 'statsmodels==0.13.5'
        # - later check why we are not installing it...
        # 'seaborn'
        # 'nltk'
        'twine',

        # # mercury: https://github.com/vllm-project/vllm/issues/2747
        # 'dspy-ai',
        # # 'torch==2.1.2+cu118',  # 2.2 not supported due to vllm see: https://github.com/vllm-project/vllm/issues/2747
        # 'torch==2.2.2',  # 2.2 not supported due to vllm see: https://github.com/vllm-project/vllm/issues/2747
        # # 'torchvision',
        # # 'torchaudio',
        # # 'trl',
        # 'transformers',
        # 'accelerate',
        # # 'peft',
        # # 'datasets==2.18.0', 
        # 'datasets',  
        # 'evaluate', 
        # 'bitsandbytes',
        # # 'einops',
        # # 'vllm==0.4.0.post1', # my gold-ai-olympiad project uses 0.4.0.post1 ref: https://github.com/vllm-project/vllm/issues/2747

        # ampere
        'dspy-ai',
        # 'torch==2.1.2+cu118',  # 2.2 not supported due to vllm see: https://github.com/vllm-project/vllm/issues/2747
        'torch==2.1.2',  # 2.2 not supported due to vllm see: https://github.com/vllm-project/vllm/issues/2747
        # 'torchvision',
        # 'torchaudio',
        # 'trl',
        'transformers==4.39.2',
        'accelerate==0.29.2',
        # 'peft',
        # 'datasets==2.18.0', 
        'datasets==2.14.7',  
        'evaluate==0.4.1', 
        'bitsandbytes==0.43.0',
        # 'einops',
        'vllm==0.4.0.post1', # my gold-ai-olympiad project uses 0.4.0.post1 ref: https://github.com/vllm-project/vllm/issues/2747
        # pip install -q -U google-generativeai

        "tqdm",
        "openai",
        "manifest-ml",
        'beautifulsoup4',
        # 'pandas',
        'cvxpy',
        # 'sklearn',  # the 'sklearn' PyPI package is deprecated; use 'scikit-learn' rather than 'sklearn' for pip commands.
        # 'scikit-learn',
        'snorkel',
        'snorkel-metal', 
        'tensorboardX',
        'pyyaml',
        'TexSoup',
    ]
)

and this is the error I get in bash:

(math_evaporate) brando9@skampere1~/massive-evaporation-4-math $ tree src_math_evaporate/
src_math_evaporate/
└── math_evaporate_llm_direct.py

0 directories, 1 file
(math_evaporate) brando9@skampere1~/massive-evaporation-4-math $ tree src_bm_evaporate/
src_bm_evaporate/
├── configs.py
├── evaluate_profiler.py
├── evaluate_synthetic.py
├── evaluate_synthetic_utils.py
├── massive_evaporate_4_math.egg-info
│   ├── dependency_links.txt
│   ├── PKG-INFO
│   ├── requires.txt
│   ├── SOURCES.txt
│   └── top_level.txt
├── profiler.py
├── profiler_utils.py
├── prompts_math.py
├── prompts.py
├── __pycache__
│   ├── configs.cpython-39.pyc
│   ├── prompts.cpython-39.pyc
│   └── utils.cpython-39.pyc
├── run_profiler_maf.py
├── run_profiler_math_evaporate.py
├── run_profiler.py
├── run.sh
├── schema_identification.py
├── snap_cluster_setup.egg-info
│   ├── dependency_links.txt
│   ├── PKG-INFO
│   ├── requires.txt
│   ├── SOURCES.txt
│   └── top_level.txt
├── utils.py
└── weak_supervision
    ├── binary_deps.py
    ├── __init__.py
    ├── make_pgm.py
    ├── methods.py
    ├── pgm.py
    ├── run_ws.py
    └── ws_utils.py

4 directories, 34 files
(math_evaporate) brando9@skampere1~/massive-evaporation-4-math $ pip install -e .
Obtaining file:///afs/cs.stanford.edu/u/brando9/massive-evaporation-4-math
  Preparing metadata (setup.py) ... error
  error: subprocess-exited-with-error
  
  × python setup.py egg_info did not run successfully.
  │ exit code: 1
  ╰─> [8 lines of output]
      running egg_info
      creating /tmp/user/22003/pip-pip-egg-info-bqrbfkt8/massive_evaporate_4_math.egg-info
      writing /tmp/user/22003/pip-pip-egg-info-bqrbfkt8/massive_evaporate_4_math.egg-info/PKG-INFO
      writing dependency_links to /tmp/user/22003/pip-pip-egg-info-bqrbfkt8/massive_evaporate_4_math.egg-info/dependency_links.txt
      writing requirements to /tmp/user/22003/pip-pip-egg-info-bqrbfkt8/massive_evaporate_4_math.egg-info/requires.txt
      writing top-level names to /tmp/user/22003/pip-pip-egg-info-bqrbfkt8/massive_evaporate_4_math.egg-info/top_level.txt
      writing manifest file '/tmp/user/22003/pip-pip-egg-info-bqrbfkt8/massive_evaporate_4_math.egg-info/SOURCES.txt'
      error: package directory 'src_math_evaporate/weak_supervision' does not exist
      [end of output]
  
  note: This error originates from a subprocess, and is likely not a problem with pip.
error: metadata-generation-failed

× Encountered error while generating package metadata.
╰─> See above for output.

note: This is an issue with the package mentioned above, not pip.
hint: See above for details.

Everything looks right to me. Why is this error happening?

I tried changing:

    package_dir={
            '': 'src_math_evaporate',
            'bm_evaporate': 'src_bm_evaporate', 
        },

to

    package_dir={
            'math_evaporate': 'src_math_evaporate',
            'bm_evaporate': 'src_bm_evaporate', 
        },

which doesn't work. I also tried mapping both as root:

    package_dir={
            '': 'src_math_evaporate',
            '': 'src_bm_evaporate', 
        },

I don't know what else to try. What should I do?

Charlie Parker asked Dec 07 '25 16:12


1 Answer

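The error message is correct: there is no 'src_math_evaporate/weak_supervision' subdirectory. The only package that find_packages reports is 'weak_supervision' (found under 'src_bm_evaporate'), and because package_dir maps the root package '' to 'src_math_evaporate', setuptools looks for it there. To fix it, use: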

setup(
    name='massive-evaporate-4-math',
    version='0.0.1',
    package_dir={
        'math_evaporate': 'src_math_evaporate',
        'bm_evaporate': 'src_bm_evaporate',
    },
...
)

Don't use packages=... in setup.py. setuptools is perfectly capable of finding all the packages without an explicit value for packages here.
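To see why the original configuration points at a directory that does not exist, here is a minimal sketch of how a dotted package name is mapped to a directory through package_dir. The function resolve_package_dir is a hypothetical, simplified re-implementation written for illustration, not the setuptools API; it assumes the lookup tries the longest matching prefix first and falls back to the '' entry. It also shows that the "both as root" attempt cannot work, because a Python dict literal keeps only the last value for a repeated key.

```python
import posixpath


def resolve_package_dir(package: str, package_dir: dict) -> str:
    """Hypothetical, simplified lookup of a package's directory.

    Not the real setuptools code: an illustration of the assumed rule
    "longest matching prefix wins, otherwise use the '' (root) entry".
    """
    parts = package.split(".")
    tail = []
    # Try the full dotted name first, then progressively shorter prefixes.
    while parts:
        prefix = ".".join(parts)
        if prefix in package_dir:
            return posixpath.join(package_dir[prefix], *tail)
        tail.insert(0, parts.pop())
    # No explicit entry matched: fall back to the root mapping, if any.
    return posixpath.join(package_dir.get("", ""), *tail)


# The mapping from the question's setup.py.
question_mapping = {"": "src_math_evaporate", "bm_evaporate": "src_bm_evaporate"}

# 'weak_supervision' has no entry of its own, so the root entry is used.
print(resolve_package_dir("weak_supervision", question_mapping))
# src_math_evaporate/weak_supervision

# Under the 'bm_evaporate' prefix the same subpackage resolves correctly.
print(resolve_package_dir("bm_evaporate.weak_supervision", question_mapping))
# src_bm_evaporate/weak_supervision

# A repeated key in a dict literal silently keeps only the last value.
both_as_root = {"": "src_math_evaporate", "": "src_bm_evaporate"}
print(both_as_root)
# {'': 'src_bm_evaporate'}
```

The first result matches the path in the error message, which is why the top-level name 'weak_supervision' fails while the prefixed name would not.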

By the way, although it is not required for the build or install, it would be better to add __init__.py files to all the package and module directories. With the current directory structure, submodules are hard to discover.
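If you want to add the missing __init__.py files in one pass, a small helper along these lines could do it. This is only a sketch: the function name add_missing_init_files and the choice to skip __pycache__ and *.egg-info directories are assumptions made for this example, not part of setuptools.

```python
from pathlib import Path


def add_missing_init_files(package_root: Path) -> list:
    """Create an empty __init__.py in every directory under package_root
    (including package_root itself) that holds .py files but lacks one.

    Hypothetical helper for illustration; returns the files it created.
    """
    created = []
    subdirs = sorted(p for p in package_root.rglob("*") if p.is_dir())
    for directory in [package_root] + subdirs:
        relative_parts = directory.relative_to(package_root).parts
        # Skip bytecode caches and build metadata, which are not packages.
        if any(part == "__pycache__" or part.endswith(".egg-info")
               for part in relative_parts):
            continue
        init_file = directory / "__init__.py"
        has_python_files = any(directory.glob("*.py"))
        if has_python_files and not init_file.exists():
            init_file.touch()
            created.append(init_file)
    return created
```

For the layout in the question, calling add_missing_init_files(Path("src_bm_evaporate")) and add_missing_init_files(Path("src_math_evaporate")) from the project root would add an __init__.py to each source directory and leave weak_supervision, which already has one, untouched.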

mudskipper answered Dec 09 '25 21:12


