The challenges in designing a library for PEP 425 (aka wheel tags)

If you have ever looked at a project that has a lot of wheels (like numpy), you may have wondered what the part that comes after the project name and version means. Well, that part is made up of platform compatibility tags, which are primarily defined in PEP 425. As someone whose personal projects are all written in pure Python, I never really gave much thought to what those tags meant since the wheel tags for my projects are all py3-none-any (you will find out what that means later in this post). So what led me from not caring to learning as much as I could about wheel tags and what did I learn along the way?

I get sucked into packaging projects too easily

For some reason I want to help Python end up with a good packaging story. Now I'm not deep into packaging because my day job doesn't require it and my work on Python itself is enough to fill my open source time. But I still try to keep up with what is happening with the PyPA, and that inevitably leads to me getting sucked into some new project on a semi-regular basis. A perfect example of this is PEP 518 and pyproject.toml: this didn't actually solve a problem I was having in general, I just knew it would be a good thing to have for packaging in the Python community and would help solve a problem that had existed for well over a decade.

One day I got it into my head that one should be able to script the installation of something from PyPI entirely via open source packages and their programmatic APIs which implemented some specification (in other words running pip through subprocess didn't count, nor did something that followed some old practice that wasn't covered by a PEP). And so I outlined the steps to go from package and version name to installed package and figured out where the gaps were. One of the gaps that I found in the outline was that there was no way to easily tell which wheel to download from PyPI for a project, as there wasn't a package to help with wheel tags. And so I dove in.

What was the state of things when I started?

Digging into this and talking with PyPA folks I learned that there was a module named pep425tags which various projects used:

  1. zc.buildout
  2. wheel
  3. pip
  4. setuptools

Unfortunately every one of those projects had a vendored copy that had deviated from the original code. When looking at the APIs of all four versions it became apparent rather quickly that the module's useful API was rather small and most of the exposed API was actually for the module's own benefit. So that at least meant it wouldn't be difficult to come up with an API which should work for all projects.

As for semantics, it was recommended to me to look at pip's code since it had recently been updated to deal with some bugs. And so I used pip's output from its copy of pep425tags as my starting point. But before I dive into what I found out I should probably explain how wheel compatibility tags work.

Compatibility tags

What a wheel filename is trying to communicate to you is what platform(s) it is compatible with without requiring that you open the file in any way. This is because reading a filename is much easier, faster, and something a human being can easily do from their OS compared to having to pull out some metadata contained within a wheel. It also prevents filename collisions in cases where a project has 22 wheel files like numpy.
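To make that concrete, here is a minimal sketch of pulling the tags out of a wheel filename without opening the file. It assumes the filename layout from the wheel specification (project, version, an optional build tag, then the three compatibility tags, all separated by dashes). The names WheelName and split_wheel_filename are made up for this example, and the filenames are illustrative rather than real files on PyPI.

```python
from typing import NamedTuple, Optional


class WheelName(NamedTuple):
    """The dash-separated pieces of a wheel filename."""
    project: str
    version: str
    build: Optional[str]  # optional build tag; most wheels don't have one
    interpreter: str
    abi: str
    platform: str


def split_wheel_filename(filename: str) -> WheelName:
    """Split a wheel filename into its components (hypothetical helper)."""
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel filename: {filename!r}")
    parts = filename[: -len(".whl")].split("-")
    if len(parts) == 5:
        project, version, interpreter, abi, platform = parts
        build = None
    elif len(parts) == 6:
        project, version, build, interpreter, abi, platform = parts
    else:
        raise ValueError(f"unexpected number of components in {filename!r}")
    return WheelName(project, version, build, interpreter, abi, platform)


wheel = split_wheel_filename("example_pkg-1.0-py3-none-any.whl")
print(wheel.interpreter, wheel.abi, wheel.platform)  # py3 none any
```

The last three components are the tag triple that the rest of this post is about.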

As a consumer of wheels, what you're after is the most specific wheel which will work for your Python installation. You want that because the assumption is if a wheel is more specific it will perform better, provide more options, etc. than the most general wheel that your platform supports. It should also be noted that wheel tags represent what Python installation the wheel should be compatible with. There's always a possibility that a wheel won't work on a platform even if the tags match simply because there's a limit in how much information can be provided in a filename (and compatibility with compiled C code is surprisingly hard). You can also look at this from the inverse perspective: the wheel tags say which systems a wheel won't work on.

And to communicate what wheels are compatible with what Python installation setups, a tag triple is used.

Interpreter

The first tag is for the interpreter. This represents the Python interpreter and the version of it that the wheel is compatible with. Well-known interpreters like CPython or PyPy have predefined abbreviations like cp and pp, respectively. That means that if you see a tag for cp37 then it's considered compatible with CPython 3.7. For pp360 it means it's compatible with PyPy3 6.0. The version number is meant to represent the major and minor version of the interpreter version, hence why PyPy3 has a version number that doesn't necessarily match the version of Python it implements.

Now obviously it's possible to write Python code that is interpreter-agnostic and these example tags don't reflect that. This is why there's a py abbreviation to represent the Python language; you can think of it like an abstract interpreter for a version of the language. So if a wheel has a tag of py37 then it's compatible with any interpreter that supports Python 3.7.

The py "interpreter" is also often used to represent just the major version of Python, like py3 for wheels that are not strictly tied to a major/minor version of Python. Remember earlier when I said that wheels represent what should work for your Python installation? The py3 tag is a good example of that: it should work on Python 3, but if you're running a really old version like Python 3.0 the wheel is not guaranteed to work for you.
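As a rough sketch of how the interpreter tags above are put together: take the abbreviation for the interpreter and append the version digits. The function name and the ABBREVIATIONS table are assumptions for illustration (including the "python" key, which is just my stand-in name for the abstract py interpreter), and the caller decides which version digits to pass, which is how the pp360 example gets its three digits.

```python
import sys

# Abbreviations for well-known interpreters; "python" is a made-up key
# standing in for the interpreter-agnostic "py" tag.
ABBREVIATIONS = {
    "cpython": "cp",
    "pypy": "pp",
    "ironpython": "ip",
    "jython": "jy",
    "python": "py",
}


def interpreter_tag(implementation, version):
    """Build an interpreter tag from a name and a tuple of version digits."""
    name = implementation.lower()
    # Fall back to the implementation name itself when there's no abbreviation.
    abbreviation = ABBREVIATIONS.get(name, name)
    return abbreviation + "".join(str(part) for part in version)


print(interpreter_tag("cpython", (3, 7)))  # cp37
print(interpreter_tag("pypy", (3, 6, 0)))  # pp360 (Python 3, PyPy 6.0)
print(interpreter_tag("python", (3,)))     # py3

# For the interpreter running this code (output depends on your setup):
print(interpreter_tag(sys.implementation.name, sys.version_info[:2]))
```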

ABI

The next tag is the ABI required of the Python interpreter. This is traditionally a CPython-specific thing, but there's technically nothing stopping PyPy3 from declaring it supports CPython's ABI or even defining its own ABI.

Specifically for CPython 3.7, you may see the following ABI tags:

  1. cp37*
    • cp37m
    • cp37dm
    • cp37d
    • cp37
  2. abi3
  3. none

The cp37* tags are the most common ones for CPython-specific wheels, and specifically cp37m is what nearly all wheels on PyPI will be for CPython. The cp37 bit is just like in the interpreter tag: it represents the interpreter version that the wheel expects. The various letter suffixes mean the following:

  • m: compiled with pymalloc
  • d: compiled for a debug build of CPython (we're currently discussing on python-dev how to make this less necessary, if at all)

There's also a u letter for Python 2.7, but since that won't be much of an issue in less than 7 months as I write this, I won't bother going into it. Suffice it to say that Python 3 is better for a reason. 😊

The abi3 tag is for the stable ABI. Unfortunately that ABI isn't as widely used as it could be, because we originally made it an opt-out action to keep new APIs from going into the stable ABI rather than opt-in like it should have been. That has led to us accidentally expanding it on numerous occasions.

Finally, there's the none ABI which represents the case of not caring. This is typically seen with py interpreter tags since you shouldn't care about what ABI an interpreter supports if you're targeting just the Python language and not a specific interpreter.
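The three kinds of ABI tags described above can be told apart mechanically. Here is a small sketch that does so; AbiTag and parse_abi_tag are names I made up for this example, and the regular expression only covers the CPython-style tags discussed in this post (including the u suffix, which it accepts but does not interpret).

```python
import re
from typing import FrozenSet, NamedTuple


class AbiTag(NamedTuple):
    kind: str              # "cpython", "stable", or "none"
    version: str           # e.g. "37"; empty for the none ABI
    flags: FrozenSet[str]  # suffix letters such as "d" and "m"


# "cp", the version digits, then any mix of the d/m/u suffix letters.
_CPYTHON_ABI = re.compile(r"cp(\d+)([dmu]*)")


def parse_abi_tag(tag):
    """Classify an ABI tag (hypothetical helper for illustration)."""
    if tag == "none":
        return AbiTag("none", "", frozenset())
    if tag == "abi3":
        return AbiTag("stable", "3", frozenset())
    match = _CPYTHON_ABI.fullmatch(tag)
    if match is None:
        raise ValueError(f"unrecognized ABI tag: {tag!r}")
    version, flags = match.groups()
    return AbiTag("cpython", version, frozenset(flags))


abi = parse_abi_tag("cp37dm")
print(abi.version, sorted(abi.flags))  # 37 ['d', 'm']
```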

Platform

The platform tag represents what operating system a wheel supports. Now OSs being OSs, which labels exist, and why, varies between Windows, macOS, and Linux.
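PEP 425 describes the basic platform tag as the platform string reported by Python with hyphens and periods replaced by underscores. A minimal sketch of that rule follows; platform_tag is a name made up for this example, and I'm using sysconfig.get_platform() as the source of the platform string when none is passed in. Real tools layer more rules on top of this (manylinux being the obvious case).

```python
import sysconfig


def platform_tag(platform_string=None):
    """Turn a platform string into a platform tag (illustrative helper)."""
    if platform_string is None:
        # Ask the running interpreter; the result depends on your machine.
        platform_string = sysconfig.get_platform()
    return platform_string.replace("-", "_").replace(".", "_")


print(platform_tag("win-amd64"))            # win_amd64
print(platform_tag("macosx-10.13-x86_64"))  # macosx_10_13_x86_64
print(platform_tag())                       # whatever your platform is
```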

Windows

For Windows, there are just two:

  1. win_amd64
  2. win32

In other words the backwards-compatibility obsession that Windows has means you only have to care about whether you're running a 32- or 64-bit CPU. Simple.

macOS

For macOS, it's rather extensive due to the history of the OS. There's x86 versus PowerPC CPU architectures. There's 32- versus 64-bit. And there's whether the wheel contains a fat binary that covers more than one of those bitness and architecture combinations.

And then there's the OS version. Certain versions support only certain bitness for CPUs. Same goes for architecture. This means that the list of macOS versions that a Python installation might support can be rather long. And this doesn't even cover shifts in macOS details which can change what a wheel supports.

Luckily it isn't a big deal to support multiple macOS versions with a single wheel thanks to compressed tag sets (which will be discussed later in this post).
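To give a feel for why the list of supported macOS platform tags gets long, here is a sketch that walks from the running OS version back to some oldest version for a single architecture. The function name is hypothetical, the oldest_minor cutoff is something the caller has to supply (I'm deliberately not hard-coding which OS versions support which architecture), and the sketch only handles the 10.x style of version numbering used in this post.

```python
def macos_platform_tags(version, arch, oldest_minor):
    """List platform tags from `version` down to 10.<oldest_minor>.

    Newest first, since a wheel built for the running OS version is
    the most specific match. `oldest_minor` is a caller-supplied
    assumption, not something taken from a spec.
    """
    major, minor = version
    return [
        f"macosx_{major}_{older}_{arch}"
        for older in range(minor, oldest_minor - 1, -1)
    ]


for tag in macos_platform_tags((10, 13), "x86_64", oldest_minor=11):
    print(tag)
# macosx_10_13_x86_64
# macosx_10_12_x86_64
# macosx_10_11_x86_64
```

Repeat that for every architecture and fat binary format that applies and the count climbs quickly.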

Linux

The hardest OS to support is definitely Linux. Unlike Windows which has a hardcore backwards-compatibility guarantee or macOS which updates as a single unit, Linux distributions all do their own thing and each is rather unique. This ranges from updating packages aggressively and independently of other packages to simply supporting different packages, period. There's also the simple fact that Linux as an operating system is more of a concept than a concrete thing that one can easily specify as a baseline build target for a wheel.

And so some very smart people came up with the manylinux concept. What this defines is a baseline Linux OS built on top of the oldest supported CentOS version available along with what libraries every Linux distribution would be reasonably expected to have. The idea is that if you build against the old glibc version that the oldest supported CentOS has, then any newer glibc will also work, and thus so will your Linux distribution, because there are very few Linux distributions with support lifetimes as long as CentOS. (The spec supports specifying 32-bit or 64-bit x86 support.)

Currently there are two versions of manylinux. The manylinux1 spec is in PEP 513 and the manylinux2010 spec is in PEP 571. Each targets a different version of CentOS. There is an active discussion going on about how to handle the future of the manylinux spec in terms of whether there should be a perennial manylinux definition or if tagging by year should continue (this stems from the fact it took literally years to get the manylinux2010 spec finished and supported by the appropriate tools). And if the year-style tagging stays there's also a discussion of how to make it more useful to more people as some wheels on PyPI actually don't follow the spec appropriately on purpose due to various issues their code has with the spec. In other words this is all being actively discussed in order to make it work better for everyone involved going forward.
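The glibc baseline idea behind manylinux can be sketched as a simple version comparison. The minimum versions below are the ones I understand PEP 513 (glibc 2.5) and PEP 571 (glibc 2.12) to be built around; the function name is made up, listing the newer spec first is a choice of this sketch, and the real PEPs specify more checks than just glibc.

```python
# Minimum glibc version for each manylinux spec, newest spec first.
MANYLINUX_GLIBC = {
    "manylinux2010": (2, 12),
    "manylinux1": (2, 5),
}


def manylinux_platform_tags(glibc_version, arch):
    """List the manylinux platform tags a given glibc can satisfy.

    A system with a glibc at least as new as the baseline is assumed
    to be able to run wheels built against that baseline.
    """
    return [
        f"{name}_{arch}"
        for name, minimum in MANYLINUX_GLIBC.items()
        if glibc_version >= minimum
    ]


print(manylinux_platform_tags((2, 17), "x86_64"))
# ['manylinux2010_x86_64', 'manylinux1_x86_64']
print(manylinux_platform_tags((2, 5), "i686"))
# ['manylinux1_i686']
```

On a glibc-based Linux system, platform.libc_ver() from the standard library is one way to find the version to feed in, though what it reports depends on the system.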

Compressed tag sets

As I mentioned when discussing platform tags for macOS, there's a way to specify support for multiple tags in a single wheel file. Probably one of the most common compressed tag sets you will have seen is the py2.py3 interpreter tag. That dot signifies that the tag has multiple values that are all accurate. You will also see compressed tag sets used a lot for macOS platform tags as it's not hard to support multiple versions at once, e.g. macosx_10_6_intel.macosx_10_9_intel.macosx_10_9_x86_64.macosx_10_10_intel.macosx_10_10_x86_64. The key point is that you can have a single wheel represent compatibility with multiple tags without issue.
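Expanding a compressed tag set back into individual tag triples is just a cross product of the dot-separated values in each position. A minimal sketch, with a function name made up for this example:

```python
from itertools import product


def expand_compressed_tags(tag):
    """Expand e.g. 'py2.py3-none-any' into every tag triple it stands for."""
    interpreters, abis, platforms = tag.split("-")
    return list(
        product(interpreters.split("."), abis.split("."), platforms.split("."))
    )


print(expand_compressed_tags("py2.py3-none-any"))
# [('py2', 'none', 'any'), ('py3', 'none', 'any')]

triples = expand_compressed_tags(
    "cp37-cp37m-macosx_10_6_intel.macosx_10_9_intel.macosx_10_9_x86_64"
)
print(len(triples))  # 3
```

A wheel is compatible with an installation if any one of its expanded triples is supported.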

Tag triple priority quirks

With all of this knowledge, hopefully you can roughly follow what the following tag triples represent (this is an abbreviated list of what pip gave me for a release build of CPython 3.7 on macOS 10.13; I've left out all the other versions of macOS that are also supported as that makes this list 515 entries long):

  • ('cp37', 'cp37m', 'macosx_10_13_x86_64')
  • ('cp37', 'abi3', 'macosx_10_13_x86_64')
  • ('cp37', 'none', 'macosx_10_13_x86_64')
  • ('cp36', 'abi3', 'macosx_10_13_x86_64')
  • ('cp35', 'abi3', 'macosx_10_13_x86_64')
  • ('cp34', 'abi3', 'macosx_10_13_x86_64')
  • ('cp33', 'abi3', 'macosx_10_13_x86_64')
  • ('cp32', 'abi3', 'macosx_10_13_x86_64')
  • ('py37', 'none', 'macosx_10_13_x86_64')
  • ('py3', 'none', 'macosx_10_13_x86_64')
  • ('py36', 'none', 'macosx_10_13_x86_64')
  • ('py35', 'none', 'macosx_10_13_x86_64')
  • ('py34', 'none', 'macosx_10_13_x86_64')
  • ('py33', 'none', 'macosx_10_13_x86_64')
  • ('py32', 'none', 'macosx_10_13_x86_64')
  • ('py31', 'none', 'macosx_10_13_x86_64')
  • ('py30', 'none', 'macosx_10_13_x86_64')
  • ('cp37', 'none', 'any')
  • ('cp3', 'none', 'any')
  • ('py37', 'none', 'any')
  • ('py3', 'none', 'any')
  • ('py36', 'none', 'any')
  • ('py35', 'none', 'any')
  • ('py34', 'none', 'any')
  • ('py33', 'none', 'any')
  • ('py32', 'none', 'any')
  • ('py31', 'none', 'any')
  • ('py30', 'none', 'any')

If you look at this list closely you will notice some interesting quirks. The first is that the ABI varies first, but not in a consistent way. You will notice that the ABI varies while the cp37 interpreter tag stays constant. But then after that, the interpreter tag goes from cp36 down to cp32 while keeping the abi3 tag. And then after that it shifts to the py interpreter tag with the none ABI tag.

The next quirk is that the list goes from py37 to py3 to py36 and on down to py30. I believe the expectation here is that a wheel which claims py3 compatibility will be more compatible than e.g. py30 as chances are a py3 wheel has been developed to be compatible with newer versions of Python than py30.
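That ordering of the py interpreter tags is easy to reproduce. This is a sketch that mirrors the order in the list above, not pip's actual code, and the function name is my own:

```python
def py_interpreter_tags(version):
    """Order generic interpreter tags the way the list above does.

    The exact version comes first, then the bare major version, then
    older minor versions from newest to oldest.
    """
    major, minor = version
    tags = [f"py{major}{minor}", f"py{major}"]
    tags.extend(f"py{major}{older}" for older in range(minor - 1, -1, -1))
    return tags


print(py_interpreter_tags((3, 7)))
# ['py37', 'py3', 'py36', 'py35', 'py34', 'py33', 'py32', 'py31', 'py30']
```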

The last quirk is ('cp3', 'none', 'any'). That honestly makes very little sense since writing C code that somehow works with any version of CPython and has no declared ABI would probably be a rather rare thing.
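The reason the order of the list matters at all is that it doubles as a priority ranking: the earlier a tag triple appears, the more specific it is considered to be. Here is a sketch of how a tool might use such a list to choose between candidate wheels. The function name and the filenames are hypothetical, and the supported list is a shortened version of the one above.

```python
def pick_best_wheel(candidates, supported_tags):
    """Return the candidate whose best tag ranks earliest, or None.

    `candidates` maps a filename to the set of tag triples it declares;
    `supported_tags` is the installation's list in priority order.
    """
    priority = {tag: rank for rank, tag in enumerate(supported_tags)}
    best_name, best_rank = None, None
    for filename, tags in candidates.items():
        ranks = [priority[tag] for tag in tags if tag in priority]
        if not ranks:
            continue  # nothing this wheel declares is supported here
        rank = min(ranks)
        if best_rank is None or rank < best_rank:
            best_name, best_rank = filename, rank
    return best_name


supported = [
    ("cp37", "cp37m", "macosx_10_13_x86_64"),
    ("cp37", "abi3", "macosx_10_13_x86_64"),
    ("py3", "none", "any"),
]
candidates = {
    "example_pkg-1.0-py3-none-any.whl": {("py3", "none", "any")},
    "example_pkg-1.0-cp37-cp37m-macosx_10_13_x86_64.whl": {
        ("cp37", "cp37m", "macosx_10_13_x86_64"),
    },
    "example_pkg-1.0-cp37-cp37m-win_amd64.whl": {("cp37", "cp37m", "win_amd64")},
}
print(pick_best_wheel(candidates, supported))
# example_pkg-1.0-cp37-cp37m-macosx_10_13_x86_64.whl
```

The pure Python wheel would still be picked if the macOS-specific one weren't available, and the Windows wheel is never considered.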

The end result (so far)

After diving into all of this I decided to try and get support for compatibility tags into the packaging package so that there was a central place to support these tags to minimize duplication of work and make it easier to spread proper tag support everywhere. It took a while, but at PyCon US 2019 my PR got merged. The reason I'm excited about this work is I think it will help with compatibility for everyone. I tweeted with excitement as I was working on this and seeing how many gaps I could fill in the tag triples:

That's a 66% increase for CPython 3.7 on my Mac and a 166% increase for PyPy3 on the same computer. I don't know when packaging will do a new release with the new packaging.tags submodule being included and when pip will pick up on it, but hopefully when that all happens users will be able to use wheels that they originally didn't know they could.