Poetry lock package

Poetry lock package is a simple tool that creates a Python package from a Poetry project. The generated lock package depends on every package in the poetry.lock file, pinned to its exact locked version, so you can reproduce your Poetry environment with a single pip install.

Installation

The best way to install poetry-lock-package is to add it to your project with poetry add --dev poetry-lock-package. You could also install it with pip, but since you are probably using Poetry already, adding it as a development dependency makes the most sense.

Usage

Here is an example of using the lock package, starting from scratch:

poetry new example-package
cd example-package
poetry add cleo
poetry add --dev poetry-lock-package
poetry build
poetry run poetry-lock-package --build

The last two steps build the project wheel and the lock package, both in the dist folder. ls dist will show:

example_package-0.1.0-py3-none-any.whl
example-package-0.1.0.tar.gz
example_package_lock-0.1.0-py3-none-any.whl

The example_package_lock package will depend on the exact versions from the poetry.lock file and also depend on the exact version of example_package it was built for.

In any environment you can now pip install both wheels and pip will install all the required dependencies at the right version.

Using a lock package in docker

You can use this to build a Docker image with all the correct versions installed, using a Dockerfile like:

FROM python:3-slim

WORKDIR /project

COPY dist/*.whl /

RUN pip install --no-cache-dir /*.whl \
    && rm -rf /*.whl

CMD ["python", "-c", "import example_package; print('Hello', example_package.__version__)"]

You can docker build and then docker run the above Dockerfile, and you will have a Docker image with the correct versions installed.

Using a lock package on Azure Databricks

If you want to deploy on Azure Databricks, build and publish both the lock package and the original package to an Azure DevOps artifact repository. Then you can install the package and all dependencies using the %pip magic command:

devops_pat = "your secret PAT, use dbutils.secrets.get in production"
package = "example-package-lock"
%pip install $package --index=https://build:$devops_pat@pkgs.dev.azure.com/your_org.../simple/

Now you might think that this would work, but there is an issue. The pyspark installation in Databricks is not registered as a pip package; it just sits somewhere on your Python import path. This means that pip will happily install its own version of pyspark, which is not what you want.

The solution is to exclude the libraries that the environment already provides when you create the lock package. You can do this using the --ignore flag:

poetry run poetry-lock-package --build --ignore pyspark --ignore mlflow

This will ignore pyspark and mlflow. Both arguments are seen as regular expressions, allowing for --ignore arguments like tensorflow.* to ignore all types of tensorflow libraries.
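As a rough illustration of regular-expression matching on package names, the sketch below filters a made-up package list with the patterns from the text. The helper name is hypothetical, and the choice of re.fullmatch is an assumption: the source only says the arguments are treated as regular expressions, not how they are anchored.

```python
import re


def is_ignored(name: str, ignore_patterns: list[str]) -> bool:
    """True if any pattern matches the whole package name.

    Assumption: patterns are anchored to the full name (re.fullmatch).
    The real tool may anchor its patterns differently.
    """
    return any(re.fullmatch(pattern, name) for pattern in ignore_patterns)


# Hypothetical package names, for illustration only.
packages = ["pyspark", "mlflow", "tensorflow", "tensorflow-estimator", "cleo"]
patterns = ["pyspark", "mlflow", "tensorflow.*"]

kept = [name for name in packages if not is_ignored(name, patterns)]
print(kept)  # ['cleo']
```

Under this anchoring, a short pattern such as py would not accidentally exclude pyspark or py4j; with unanchored matching it would, so it is worth checking the generated lock package before relying on broad patterns.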

Only the dependencies, not the parent package

If you want to mimic the requirements.txt approach of only installing the requirements and not the package itself, add the --no-root flag to the poetry-lock-package command. This will have the lock package not depend on the original package and allow you to install only the environment requirements and not the package itself.

This can be useful if you want to bootstrap a notebook environment to contain the proper dependencies, but not install the code from the original package for some reason.

Why not use requirements.txt?

A requirements.txt is a plain-text list of requirements, which you can export from Poetry using poetry export --format requirements.txt. Pip supports requirements files, which you can install using pip install --requirement requirements.txt.

This approach offers no way to ignore some of the dependencies. You could try to use grep to filter out some of the requirements, but this won't work as expected because the requirements file represents a flattened tree of dependencies. For example, filtering out pyspark will not filter out the dependencies pyspark has. This means that you end up pinning a specific py4j version by mistake.
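The difference between filtering a flattened list and pruning a dependency tree can be sketched as follows. The graph and function names are hypothetical, and the tree walk is only one way a tree-aware tool could avoid the problem, not a description of how poetry-lock-package is implemented.

```python
# Hypothetical dependency graph: package name -> direct dependencies.
GRAPH = {
    "example-package": ["cleo", "pyspark"],
    "cleo": ["clikit"],
    "clikit": [],
    "pyspark": ["py4j"],
    "py4j": [],
}


def flat_filter(graph: dict[str, list[str]], root: str, ignored: set[str]) -> list[str]:
    """Mimic grep on a flattened requirements list: drop only the matching names."""
    return sorted(name for name in graph if name != root and name not in ignored)


def tree_filter(graph: dict[str, list[str]], root: str, ignored: set[str]) -> list[str]:
    """Walk the tree from the root and never descend into ignored packages."""
    seen: set[str] = set()
    stack = list(graph[root])
    while stack:
        name = stack.pop()
        if name in ignored or name in seen:
            continue
        seen.add(name)
        stack.extend(graph[name])
    return sorted(seen)


print(flat_filter(GRAPH, "example-package", {"pyspark"}))  # ['cleo', 'clikit', 'py4j']
print(tree_filter(GRAPH, "example-package", {"pyspark"}))  # ['cleo', 'clikit']
```

In the flattened result py4j survives even though nothing except pyspark needs it; in the tree walk it disappears together with pyspark, while a package that another kept dependency also requires would still be reached and kept.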

Another, minor, issue with requirements.txt files is that they require a separate distribution channel. Getting packages to your environment is simple: use a private repository. Getting requirements.txt files there requires you to host them in another accessible place, for example by exposing your git repository to your deployment cluster.

Q&A

Some questions and answers:

  • Should I share a lock package on PyPI?: NO. A lock package is meant for internal use because it restricts the environment too much to be usable in an environment not controlled by the creator of the lock package.
  • What about security upgrades?: pip will never automatically roll out security updates/downgrades, which means that it really depends on your project. If you need to allow for on-site upgrades, you will have to set up matrix CI builds to check the different versions and keep your version constraints in check. If you don't need them, do a single build, run poetry run safety check on a schedule, and re-lock and deploy if needed.

See also

Combining dependencies in a single file:

  • zipapp: Python support for dealing with zip embedded Python projects.
  • pex: Twitter tool to combine dependencies in a single file.
  • shiv: LinkedIn tool to combine dependencies in a single file.
  • pyinstaller: Create executable installer for everything you need.
  • briefcase: Create executable from dependencies.
  • Appimages: Relocatable embedded Python dependencies.
  • flatpak: Containerized desktop application installs.

License

For more information on licensing and to see the code, see the GitHub page.

Github project