Below is a very rough-and-ready summary of Python dependency management tools I have used.
An (incomplete) list of important factors of dependency management tooling:
- Reproducibility - will committed files allow us to exactly reproduce an environment across dev machines, CI and production environments?
- Speed of resolution - how fast can a complete environment be resolved given a set of primary dependencies
- Ease of management - how easy is it to upgrade one/all packages? how easy is it to add a new package? how easy is it to inspect the dependency graph?
- Availability of packages - can I get the packages I need?
- Environment size - how big is the resulting environment? If it's smaller, it's faster to distribute as container layers.
- Plain `pip`
- `pipenv`
- `poetry`
- `conda`
- `bazel` (or similar, e.g. Pants, Please, Buck)
- `pybi` and `posy` (hopefully soon!)
TODO: Add examples of common workflow operations for each tool.
## pip

`pip` is fairly limited, and generally dependencies are maintained in a `requirements.txt` file. Development and production dependencies are separated via multiple `requirements-*.txt` files.
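A typical `pip` workflow might look like the following (the package name is illustrative):

```shell
# Create and activate a virtual environment (pip itself doesn't manage these)
python -m venv .venv
. .venv/bin/activate

# Install primary dependencies, then dev-only dependencies
pip install -r requirements.txt
pip install -r requirements-dev.txt

# Add a new package: install it, then record it in requirements.txt by hand
pip install requests
```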
Dependency resolution is not too fast, but not too slow either. Since v20 it's guaranteed correct across multiple dependencies (backtracking resolver).
No built-in workflow for virtual environment management or locking dependencies (though both can be approximated with simple scripts). Reproducible environments can be achieved by generating a lock file, but there is no built-in workflow for this in `pip`.
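One common approximation of a lock-file workflow uses `pip freeze` (the file names here are illustrative, not a `pip` convention):

```shell
# Resolve and install primary dependencies
pip install -r requirements.txt

# "Lock": record the exact version of everything now installed
pip freeze > requirements.lock

# Elsewhere (CI, production): restore the exact same environment
pip install -r requirements.lock
```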
Only supports Python packages, but these can include arbitrary binaries, with negotiation for different platforms or building from source (see `psycopg2-binary`).
It is not possible to manage multiple Python versions with `pip`.
## pipenv

A wrapper layer around `pip` which adds a pre-defined workflow for:

- managing virtual environments
- generating and restoring from lockfiles
- upgrading transitive dependencies
- a project-specific command runner (a la `make lint`)
- dev and main dependency groups
Dependency resolution is basically the same as `pip` in terms of speed, environment size and package availability, but it supports reproducible environments and dependency management is more straightforward.
Integrates with `pyenv` to create environments with the correct Python version.
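The workflow above might look like this day-to-day (package names are illustrative):

```shell
# Create the virtualenv and install main dependencies from the Pipfile
pipenv install
pipenv install --dev          # include the dev dependency group

# Add a new package (updates Pipfile and Pipfile.lock)
pipenv install requests
pipenv install --dev black    # add to the dev group

# Upgrade transitive dependencies and rewrite the lockfile
pipenv update

# Restore an exact environment from Pipfile.lock (e.g. in CI)
pipenv sync

# Run a command defined in the Pipfile's [scripts] section
pipenv run lint
```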
## poetry

Its primary goal is to replace `setuptools`, and it is generally used not just for developing, but also for building and publishing Python packages.
- Uses PubGrub dependency resolution, which is particularly fast.
- Utilises a lockfile automatically, which allows completely reproducible environments.
- Manages virtual environments and provides tools for upgrades etc. (just like `pipenv`).
- Supports arbitrary dependency groups, e.g. for specific tests vs production.
- Supports only Python packages (but wheels can include arbitrary compiled binaries with platform negotiation).
Integrates with `pyenv` to create environments with the correct Python version.
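A sketch of common `poetry` operations (package names are illustrative):

```shell
# Create the project environment and install everything from the lockfile
poetry install

# Add a new main dependency (updates pyproject.toml and poetry.lock)
poetry add requests

# Add a dependency to a named group, e.g. dev
poetry add --group dev black

# Upgrade dependencies within the constraints in pyproject.toml
poetry update

# Inspect the dependency graph
poetry show --tree

# Build and publish the package
poetry build
poetry publish
```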
## conda

Originally built to solve the following two problems not solved by `pip`:
- Ensures all previous constraints are met (i.e. dependency 2 can't override dependency 1)
- Supports installing platform-specific non-Python dependencies
The former is now the default behaviour of `pip` (since 20.0.0), and the latter has been partially, but not completely, mitigated by support for packaging binaries or custom build systems in Python packages.
It does not create lockfiles for environment reproduction (though this can be achieved by also using `conda-lock`), and does not support grouping dependencies for tests vs production.
Dependency resolution is quite slow (although perhaps faster with `mamba`?). Conflict resolution is not great (perhaps also better with `mamba`?), and it doesn't provide any tools for inspecting dependencies.
It manages Python itself natively, and can install non-Python system packages (e.g. `make`).
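A sketch of a typical `conda` workflow (the environment and package names are illustrative; activation assumes conda is initialised in your shell):

```shell
# Create an environment with a specific Python version plus packages,
# including non-Python system packages like make
conda create --name myproject python=3.11 numpy make

# Activate it and add another package
conda activate myproject
conda install psycopg2

# Export the environment (not a true lockfile; see conda-lock for that)
conda env export > environment.yml

# Recreate the environment from the exported file
conda env create --file environment.yml
```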
## bazel

Bazel, and its many variations (Please, Pants, Buck etc.), are full monorepo build systems. They work quite differently from any of the above, but are extremely powerful and fast. I can't really explain the full workings here, but the key takeaway is that a Bazel monorepo probably requires dedicated resource on an ongoing basis, so is likely not appropriate for smaller organisations.
See https://github.com/jacksmith15/bazel-python-demo for an example Python mono-repo set-up with Bazel.
Key advantages are:
- Lightning-fast builds based on full caching of all inputs
- Selectively test and build based on changes (propagating through any affected transitive dependencies)
- Share dependencies between build targets (whilst still sandboxing each target) - reduces maintenance overhead as you only need to manage one dependency set
- ...much more
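As a very rough sketch, day-to-day Bazel usage revolves around build targets (the target names below are illustrative, not from the linked demo):

```shell
# Build everything in the repo; unchanged inputs are served from cache
bazel build //...

# Run all tests; previously-passed tests with unchanged inputs are skipped
bazel test //...

# Build or run a single target
bazel build //services/api:image
bazel run //tools:format
```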