Why pylock.toml includes digital attestations

A Python project got hacked, with malicious releases uploaded directly to PyPI. I said on Mastodon that had the project used trusted publishing with digital attestations, people using a pylock.toml file would have noticed something odd was going on, thanks to the lock file including attestation data. That led to someone asking for a link explaining what I meant. I didn't have one handy since it's buried in 4 years and over 1,800 comments of discussion, so I figured I would write a blog post. 😁

Since trusted publishing is a prerequisite for digital attestations, I'll cover that quickly. Basically, you can set a project up on PyPI such that a continuous deployment (CD) system can upload a release to PyPI on your behalf. Because PyPI has to trust the CD system to get security right when it lets another site upload on your behalf, not every CD system out there is supported, but the big ones are and others get added as appropriate. And because this automates releases without you having to store long-lived PyPI credentials that someone might try to steal, it makes your life as a maintainer easier while also being safer; win-win!

Digital attestations are a way for a CD system to attest that a file came from that CD system. That's handy as once you know where a file should come from you can verify that fact to make sure nothing nefarious is going on. And since this is just a thing to flip on, it's extremely simple to do. If you use the official PyPA publish action for GitHub Actions, you get it automatically. For other CD systems it should be a copy-and-paste thing into your CD configuration.

Now, the thing that pylock.toml records is who the publisher is for a file. Taking packaging as an example, you can look at the provenance for packaging-26.0-py3-none-any.whl that comes from the digital attestation and you will notice it tells you the file came from GitHub via the pypa/packaging repo, using the publish.yml workflow run in the "pypi" environment (which you can also see via the file's details on PyPI):

"publisher": {
  "environment": "pypi",
  "kind": "GitHub",
  "repository": "pypa/packaging",
  "workflow": "publish.yml"
}

So what can you do with this information once it's recorded in your pylock.toml? Well, the publisher details are stored for each package in the lock file. That lets code check that any files listed in the lock file for that package version were published from the same publisher that PyPI or whatever index you're using says the file came from. So if the lock file and index differ on where they say a file came from, something bad may have happened.

What can you do as a person if you don't have code to check that things line up? (It isn't a lot of code: the lock file should have the index server for the package, so you follow the index server API to get the digital attestation for each file and compare.) There are two things you can do manually. One, if you know that a project uses trusted publishing, then the digital attestation details should be in the lock file (you can check by looking at the file details on PyPI); if they're missing or changed to something suspicious, then something bad may have happened. Two, when looking at a PR to update your lock file (and pylock.toml was designed to be human-readable), if digital attestation details suddenly disappear, then something bad probably happened.

So to summarize:

  1. Use trusted publishing if you're a maintainer
  2. Upload digital attestations if you're a maintainer
  3. Use lock files where appropriate (and I'm partial to pylock.toml 😁)
  4. If you're using pylock.toml, have code check that the recorded attestations are consistent
  5. When reviewing lock file diffs (which you should do!), make sure the digital attestations don't look weird and weren't suddenly deleted

A special thanks to William Woodruff, Facundo Tuesca, Dustin Ingram, and Donald Stufft for helping to make trusted publishers and digital attestations happen.