Poetry lock package is a simple tool that creates a Python package from a Poetry project, pinning every version in the poetry.lock file as a dependency. This lock package allows you to reproduce your Poetry environment with a single pip install.
Installation
The best way to install poetry-lock-package is to add it to your project using poetry add --dev poetry-lock-package. You could also use pip to install it, but since you are probably using Poetry already, adding it as a development dependency makes the most sense.
Usage
Here is an example of using the lock package if you have absolutely nothing:
poetry new example-package
cd example-package
poetry add cleo
poetry add --dev poetry-lock-package
poetry build
poetry run poetry-lock-package --build
The last two steps built the project wheel and the lock package, both in the dist folder. Running ls dist will show:
example_package-0.1.0-py3-none-any.whl
example-package-0.1.0.tar.gz
example_package_lock-0.1.0-py3-none-any.whl
The example_package_lock package will depend on the exact versions from the poetry.lock file and also depend on the exact version of example_package it was built for.
In any environment you can now pip install both wheels and pip will install all the required dependencies at the right version.
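If you want to check what a lock package actually pins before installing it, you can read the metadata stored inside the wheel. The following is a minimal sketch using only the standard library; the function name wheel_requirements is made up for this example and is not part of poetry-lock-package.

```python
import zipfile
from email.parser import Parser


def wheel_requirements(wheel_path):
    """Return the Requires-Dist entries declared in a wheel's METADATA file."""
    with zipfile.ZipFile(wheel_path) as wheel:
        # A wheel stores its metadata in <name>-<version>.dist-info/METADATA.
        metadata_name = next(
            name for name in wheel.namelist()
            if name.endswith(".dist-info/METADATA")
        )
        metadata_text = wheel.read(metadata_name).decode("utf-8")
    # METADATA uses RFC 822 style headers, which email.parser can read.
    message = Parser().parsestr(metadata_text)
    return message.get_all("Requires-Dist") or []
```

Calling wheel_requirements("dist/example_package_lock-0.1.0-py3-none-any.whl") after the build steps above should list one pinned requirement per locked dependency, plus the pin on example_package itself.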
Using a lock package in docker
You can use this to build a Docker image with all the correct versions installed, using a Dockerfile like:
FROM python:3-slim
WORKDIR /project
COPY dist/*.whl /
RUN pip install --no-cache-dir /*.whl \
&& rm -rf /*.whl
CMD ["python", "-c", "import example_package; print('Hello', example_package.__version__)"]
You can docker build the above Dockerfile, then docker run the resulting image, and you will have a container with the correct versions installed.
Using a lock package on Azure Databricks
If you want to deploy on Azure Databricks, build and publish both the lock package and the original package to an Azure DevOps artifact repository. Then you can install the package and all dependencies using the %pip magic command:
devops_pat = "your secret PAT, use dbutils.secrets.get in production"
package = "example-package-lock"
%pip install $package --index-url=https://build:$devops_pat@pkgs.dev.azure.com/your_org.../simple/
Now you might think that this would work, but there is an issue. The pyspark installation in Databricks is not registered as a pip package but instead just sits somewhere on your Python import path. This means that pip will happily install its own version of pyspark, which is not what you want.
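To see which libraries are affected, you can compare what Python can import with what has installed distribution metadata. This is a rough sketch, not part of poetry-lock-package; the helper name importable_but_unregistered is hypothetical, and it assumes Python 3.8 or newer for importlib.metadata.

```python
import importlib.util
from importlib import metadata


def importable_but_unregistered(module_name, distribution_name=None):
    """True when a module is importable but has no installed distribution metadata."""
    # Assumption: the distribution is named like the module unless told otherwise.
    distribution_name = distribution_name or module_name
    importable = importlib.util.find_spec(module_name) is not None
    try:
        metadata.distribution(distribution_name)
        registered = True
    except metadata.PackageNotFoundError:
        registered = False
    return importable and not registered
```

In a notebook, a True result for importable_but_unregistered("pyspark") would suggest that pip cannot see the environment's pyspark and may install another copy.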
The solution is to leave out the libraries that the environment already provides when you create the lock package. You can do this using the --ignore flag:
poetry run poetry-lock-package --build --ignore pyspark --ignore mlflow
This will ignore pyspark and mlflow. Both arguments are seen as regular expressions, allowing for --ignore arguments like tensorflow.* to ignore all types of tensorflow libraries.
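To get a feel for how regular expression patterns select packages, here is a small sketch. It is an illustration only: the is_ignored helper is hypothetical, and it assumes a pattern must match the whole package name (re.fullmatch), which may differ from what poetry-lock-package does internally.

```python
import re


def is_ignored(package_name, ignore_patterns):
    """Return True when any pattern matches the whole package name."""
    # Assumption: full-name matching; the real tool may match differently.
    return any(re.fullmatch(pattern, package_name) for pattern in ignore_patterns)


patterns = ["pyspark", "mlflow", "tensorflow.*"]
locked = ["pyspark", "mlflow", "tensorflow", "tensorflow-estimator", "cleo"]
kept = [name for name in locked if not is_ignored(name, patterns)]
print(kept)  # ['cleo']
```

Under this assumption a pattern like pyspark would not also drop a package named pyspark-stubs; you would need pyspark.* for that.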
Only the dependencies, not the parent package
If you want to mimic the requirements.txt approach of only installing the requirements and not the package itself, add the --no-root flag to the poetry-lock-package command. The lock package will then not depend on the original package, so installing it gives you only the environment requirements.
This can be useful if you want to bootstrap a notebook environment to contain the proper dependencies, but not install the code from the original package for some reason.
Why not use requirements.txt?
A requirements.txt is a text spec of the requirements which you can export from poetry using poetry export --format requirements.txt. Pip supports requirements files, which you can install using pip install --requirement requirements.txt.
This approach offers no way to ignore some of the dependencies. You could try to use grep to filter out some of the requirements, but this won't work as expected because the requirements file represents a flattened tree of dependencies. For example, filtering out pyspark will not filter out the dependencies pyspark has. This means that you would still pin a specific py4j version by mistake.
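The effect of filtering a flattened file can be sketched in a few lines. The requirements text below is a simplified, made-up export (real exports also carry markers and possibly hashes), and drop_lines_mentioning is a hypothetical stand-in for grep -v.

```python
# Illustrative export; the pins are example values, not taken from a real lock file.
requirements = """\
cleo==0.8.1
pyspark==3.3.0
py4j==0.10.9.5
"""


def drop_lines_mentioning(text, needle):
    """Mimic `grep -v needle`: drop every line that contains the needle."""
    return [line for line in text.splitlines() if needle not in line]


filtered = drop_lines_mentioning(requirements, "pyspark")
print(filtered)  # ['cleo==0.8.1', 'py4j==0.10.9.5']
```

The pyspark line is gone, but the py4j pin that only existed because of pyspark is still there.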
Another, minor, issue with requirements.txt files is that they require a separate distribution channel. Getting packages to your environment is simple: use a private repository. Getting requirements.txt files there requires you to host them in another accessible place, for example by exposing your git repository to your deployment cluster.
Q&A
Some questions and answers:
- Should I share a lock package on PyPI: NO. A lock package is meant for internal use, because it restricts the environment too much to be usable in an environment not controlled by the creator of the lock package.
- What about security upgrades?: pip will never automatically roll out security updates/downgrades, which means that it really depends on your project. If you need to allow for on-site upgrades you will have to set up matrix CI builds to check the different versions, and keep your version constraints in check. If you don't need them, do a single build, run poetry run safety check on a schedule, and re-lock and deploy if needed.
See also
Combining dependencies in a single file:
- zipapp: Python support for dealing with zip embedded Python projects.
- pex: Twitter tool to combine dependencies in a single file.
- shiv: LinkedIn tool to combine dependencies in a single file.
- pyinstaller: Create executable installer for everything you need.
- briefcase: Create executable from dependencies.
- Appimages: Relocatable embedded Python dependencies.
- flatpak: Containerized desktop application installs.
License
For more information on licensing and to see the code, see the github page.