A tutorial on packaging up your Python code for PyPI

On Wednesday, October 25th, I gave a workshop at PyLadies Vancouver on how to package up your code and put it up on PyPI. Since this workshop came about by request, I figured others out there who couldn't attend might be interested in the same information. (The official documentation for packaging in Python can always be found at packaging.python.org.)

Building versus installing

To start, I want to make clear exactly what I mean when I say I'm going to talk about "packaging up" your code. Unfortunately "package" is an overloaded term in Python: it is both a noun for a namespace that you put modules into and a verb for bundling up some code for easy distribution. In this post I'm going to be talking about the latter. I will focus on the use-case of putting code up on PyPI, but the tools and general guidelines laid out below also apply to private distribution of library code.

I also want to be clear that I'm saying "library code" on purpose, as this isn't about distributing an application. For instance, this post is not about how to deploy to the cloud using a requirements.txt file or how to ship a self-contained zip file using zipapp for a CLI app. Put another way, I'm talking about the things pip will install for you when you are creating your app.

Build artifacts

There are two file formats that library code can come in for distribution: wheels and sdists. A wheel is a specially formatted zip file whose contents are pre-built, including extension modules. What this means is that if pip can find the appropriate wheel for your Python interpreter and platform, then all it really has to do is unzip the wheel file and copy the files to the right location; no compilation required (which makes installation really fast)! It also means that pip can just cache the wheel for later use since everything is ready to go for installation on your machine. And this support for pre-built wheels on PyPI extends even to Linux thanks to the manylinux project, so the three major operating systems are covered.
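To make the "it's just a zip file" point concrete, here is a minimal sketch that builds a toy wheel-like archive in memory and reads it back with the standard zipfile module. The package name spam and the file contents are made up for illustration, and a real wheel carries more metadata than this.

```python
import io
import zipfile

# Build a toy wheel-like archive in memory. The names below are hypothetical;
# a real wheel also contains a RECORD file and fuller metadata.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("spam/__init__.py", '"""A toy package."""\n')
    archive.writestr("spam-1.0.dist-info/METADATA", "Name: spam\nVersion: 1.0\n")
    archive.writestr("spam-1.0.dist-info/WHEEL", "Wheel-Version: 1.0\n")

# Reading it back needs nothing more than the standard zipfile module.
buffer.seek(0)
with zipfile.ZipFile(buffer) as archive:
    names = archive.namelist()

for name in names:
    print(name)
```

The same zipfile calls should work on a real .whl file if you pass its path instead of the in-memory buffer.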

But there is a drawback to wheels when extension modules are involved and you're the one providing the wheels: you can end up needing to make a lot of wheels to cover the majority of platforms and Python versions people use. For instance, if you look at the download files for numpy you will notice they were nice enough to provide 22 different wheels. These cover different combinations of Python version, ABI support, OS, and CPU architecture. But there is always a possibility you will be on a platform that doesn't have a wheel available, or as a package maintainer you may not have a way to provide a wheel for a certain platform (e.g. Alpine Linux is not compatible with manylinux wheels).
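Those combinations are visible in the wheel's file name, which the wheel specification (PEP 427) lays out as distribution-version(-build)-python-abi-platform.whl. The sketch below splits a name into those fields. The helper parse_wheel_filename and the example file name are mine, chosen for illustration, and this is not how pip itself does the matching.

```python
from typing import NamedTuple, Optional


class WheelName(NamedTuple):
    distribution: str
    version: str
    build: Optional[str]
    python_tag: str
    abi_tag: str
    platform_tag: str


def parse_wheel_filename(filename: str) -> WheelName:
    """Split a wheel file name into the fields described by PEP 427."""
    suffix = ".whl"
    if not filename.endswith(suffix):
        raise ValueError(f"not a wheel file name: {filename!r}")
    parts = filename[: -len(suffix)].split("-")
    if len(parts) == 5:
        # No optional build tag present.
        distribution, version, python_tag, abi_tag, platform_tag = parts
        build = None
    elif len(parts) == 6:
        distribution, version, build, python_tag, abi_tag, platform_tag = parts
    else:
        raise ValueError(f"unexpected number of fields in {filename!r}")
    return WheelName(distribution, version, build, python_tag, abi_tag, platform_tag)


# Illustrative file name for a CPython 3.6 build targeting 64-bit Linux.
example = parse_wheel_filename("numpy-1.13.3-cp36-cp36m-manylinux1_x86_64.whl")
print(example)
```

Each distinct combination of the last three fields is a separate file the maintainer has to build and upload, which is where the large number of wheels comes from.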

For those situations where a wheel isn't available, you can turn to a source distribution (a.k.a. sdist). An sdist has no real specification, but basically it's a .tar.gz file which contains all the files necessary to build a wheel (note to Windows users: I know it's a slight pain that the format is usually .tar.gz and not .zip, but don't forget you can always use python3 -m tarfile to work with a .tar.gz file). What this means is that if pip can't find a wheel that works for you, it can grab the sdist instead and build a custom wheel just for you.
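If you would rather poke at an sdist from Python than from the command line, the same tarfile module works as a library. This is a minimal sketch that builds a toy .tar.gz in memory and lists its contents. The spam-1.0 layout and the add_text_file helper are made up for illustration and are not a statement of what a real sdist must contain.

```python
import io
import tarfile


def add_text_file(archive: tarfile.TarFile, name: str, text: str) -> None:
    """Add a small text file to an open tar archive (hypothetical helper)."""
    data = text.encode("utf-8")
    info = tarfile.TarInfo(name)
    info.size = len(data)
    archive.addfile(info, io.BytesIO(data))


# Build a toy sdist-like archive in memory.
buffer = io.BytesIO()
with tarfile.open(fileobj=buffer, mode="w:gz") as archive:
    add_text_file(archive, "spam-1.0/setup.py", "# build instructions go here\n")
    add_text_file(archive, "spam-1.0/spam/__init__.py", '"""A toy package."""\n')

# List the members, much like `python3 -m tarfile -l` does for a file on disk.
buffer.seek(0)
with tarfile.open(fileobj=buffer, mode="r:gz") as archive:
    names = archive.getnames()

for name in names:
    print(name)
```

For a real sdist on disk, pass the file's path to tarfile.open() instead of the in-memory buffer.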

How to package up/build your code today

When it comes to packaging up your library code into wheels and an sdist, whether there are extension modules in your code determines what the best build tool is for you to use.

Pure Python code

If your library only contains pure Python code, then you should use Flit. While people have traditionally used setuptools for this, it's typically overkill when all you're doing is packaging up a bunch of .py files. Flit does a good job of leaning on practices you are probably already doing (e.g. keeping your source code in version control, providing a docstring for your package), thus minimizing how much you need to provide in a Flit-specific manner.
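As an illustration of what "leaning on practices you are probably already doing" can look like, the sketch below pulls a one-line summary and a version out of a module's source, assuming the module has a docstring and a __version__ attribute. This is my own sketch using the standard ast module, not Flit's actual implementation, and the module contents are made up.

```python
import ast

# Hypothetical source of a tiny package's __init__.py.
MODULE_SOURCE = '''\
"""A toy library that makes perfect spam."""

__version__ = "0.1"
'''

tree = ast.parse(MODULE_SOURCE)

# The first line of the module docstring can serve as the summary.
summary = ast.get_docstring(tree).splitlines()[0]

# Look for a top-level `__version__ = "..."` assignment.
version = None
for node in tree.body:
    if isinstance(node, ast.Assign):
        for target in node.targets:
            if isinstance(target, ast.Name) and target.id == "__version__":
                version = ast.literal_eval(node.value)

print(summary)
print(version)
```

The point is only that the metadata already lives in your code, so a build tool does not need you to repeat it in a separate file.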

Because Flit is so straightforward to use, the only real advice I have is to read the documentation on the flit.ini file, as it has more fields than the simple example on the documentation home page shows.

Extension modules

If your library happens to contain at least one extension module, then Flit won't work for you as a build tool. In this instance you will want to use setuptools. You can follow the current tutorial on packaging.python.org. The one piece of advice I have is to remember that you will probably end up wanting three files for your packaging configuration:

  1. setup.py
  2. setup.cfg
  3. MANIFEST.in

If you want to be a bit more adventurous, enscons is another build tool for extension modules that relies on SCons. This has a perk of using a more general build tool to drive building extension modules, which leads to benefits like only rebuilding when source files have changed.

What the (near) future holds

At this point I should admit that my earlier definition of an sdist was woefully under-specified. When I said an sdist "contains all the files necessary to build a wheel", what I should have mentioned is that this implies there's a setup.py file included in the sdist. This is because setuptools has long been the de-facto build tool for the Python community, leading pip and other tools to run python setup.py bdist_wheel to build a wheel from an sdist (by injecting setuptools and the wheel package). This isn't much of a specification, though, as it implicitly means you're expecting things to operate however setuptools decided things should operate.

To decouple the Python packaging ecosystem from setuptools, two PEPs have been written and accepted. PEP 518 defines a pyproject.toml file which lets package maintainers specify what build tool(s) they depend on (it also allows build tools to store configuration information in the pyproject.toml file). PEP 517 provides a way to specify how to execute the build tool. Combined, these PEPs let a package specify how it wants to be built, so that setuptools is no longer the de-facto/only build tool people use and alternative build tools like Flit are put on equal footing.

At this point both PEPs are waiting to be implemented by various tools. The work has begun, though, so hopefully it won't be too long into the future when being able to choose the right build tool for your project will be possible.