30 May 2023 4 min read packaging

In response to the Changelog #526

In episode 526 of the Changelog podcast, entitled "Git with your friends", the hosts discussed various tools involving git (disclaimer: I have been on the podcast multiple times and had dinner with the hosts the last time they were in Vancouver). Two of the projects they discussed happened to be written in Python. That led Jerod to say:

The Python one gives me pause as well, just because I don’t know if it’s gonna go right.

🤨

Jerod and Adam know I tend to run behind in listening to my podcasts, so they actually told me to listen to the podcast and let them know what I thought, hence this blog post (they also told me to write a blog post when I asked where they wanted "my rant/reply", so they literally asked for this 😉).

To start, Jerod said:

If it’s pip install for me, I just have anxiety… Even though it works most of the time. It’s the same way – and hey, old school Rubyist, but if I see your tool and I see it’s written in Ruby, I’m kind of like “Uhm, do I want to mess with this?” And that’s how I am with Python as well. Their stories are just fraught.

To me, that's a red flag that Jerod is installing stuff globally and not realizing that he should be using a virtual environment to isolate his tools and their dependencies. Now, I consider asking non-Python developers to create virtual environments to be a lot, so instead I would recommend using pipx. That lets you install a Python-based tool in a very straightforward manner by running pipx install, which puts the tool into your .local directory. I also expect pipx to be available from all the major OS package managers, so installing it (and implicitly Python) shouldn't be too difficult.
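To make "isolate his tools and their dependencies" concrete, here is a minimal sketch of the kind of work pipx saves you from doing by hand: one virtual environment per tool, with the tool installed into that environment rather than globally. This is not how pipx itself is implemented; the function names and the package name in the demo are hypothetical, and only the standard library's venv module and pip's command line are assumed.

```python
import os
import subprocess
import tempfile
import venv
from pathlib import Path


def create_tool_env(env_dir: Path, with_pip: bool = True) -> Path:
    """Create an isolated virtual environment and return its interpreter path."""
    venv.EnvBuilder(with_pip=with_pip, clear=True).create(str(env_dir))
    # Virtual environments use "Scripts" on Windows and "bin" elsewhere.
    if os.name == "nt":
        return env_dir / "Scripts" / "python.exe"
    return env_dir / "bin" / "python"


def install_tool(env_python: Path, package: str) -> None:
    """Install `package` into the environment owning `env_python` (needs network)."""
    subprocess.run(
        [str(env_python), "-m", "pip", "install", package],
        check=True,
    )


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        python = create_tool_env(Path(tmp) / "some-tool-env")
        print(f"Isolated interpreter: {python}")
        # install_tool(python, "some-tool")  # hypothetical package name
```

Anything installed through that interpreter lives only in its own directory, so deleting the directory removes the tool without touching any other Python code on the machine.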

If you don't want the tool you are running to be installed "permanently", pipx run will create a virtual environment in a temp directory so your system can reclaim the space when it wants to. This also has a side-effect of grabbing a newer version of the tool on occasion as the files will occasionally be deleted.

Another option is for projects to provide a .pyz file. When projects do that, they are giving users a zip file that is self-contained in terms of Python code, such that you can just pass it to your Python interpreter to run it (e.g. python tool.pyz). That avoids any installation overhead or concerns over what Python interpreter you have installed at any point, since you can point any Python interpreter at the .pyz file (compatibility permitting).
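As a rough sketch of what producing a .pyz can look like, the standard library's zipapp module can bundle a directory containing a `__main__.py` into a single runnable archive. The tool name, file contents, and helper functions below are made up for illustration; a real project would point zipapp at its own source tree.

```python
import subprocess
import sys
import tempfile
import zipapp
from pathlib import Path


def build_pyz(source_dir: Path, target: Path) -> Path:
    """Bundle a directory containing __main__.py into one .pyz archive."""
    zipapp.create_archive(
        source_dir,
        target=target,
        interpreter="/usr/bin/env python3",  # shebang line used on Unix-like systems
    )
    return target


def demo() -> str:
    """Build a toy .pyz in a temp directory, run it, and return its output."""
    with tempfile.TemporaryDirectory() as tmp:
        src = Path(tmp) / "hello_tool"  # hypothetical tool
        src.mkdir()
        (src / "__main__.py").write_text(
            'print("hello from a pyz")\n', encoding="utf-8"
        )
        pyz = build_pyz(src, Path(tmp) / "hello_tool.pyz")
        # Equivalent to running `python hello_tool.pyz` by hand.
        result = subprocess.run(
            [sys.executable, str(pyz)],
            capture_output=True,
            text=True,
            check=True,
        )
        return result.stdout.strip()


if __name__ == "__main__":
    print(demo())
```

If the tool has pure-Python dependencies, they can be placed in the source directory first (for instance with pip install --target) so they end up inside the archive; dependencies with compiled extension modules generally can't be loaded from a zip file this way.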

For the pipx scenario we probably need projects to take the lead and document it, as non-Python developers quite possibly don't know about either pipx install or pipx run. The .pyz solution requires the project itself to build something as part of its release process, which is a bigger ask.

Jerod did provide a little bit of clarification later about what his concerns were:

Yeah. I have no problem with Ruby-based things. But if you say gem-install this tool, I’m like “You know what? I don’t really trust my Ruby environment over the course of years on my Mac”, and I’m the same way with Python. Whereas with Go, and with Rust, it seems - and JavaScript had the same bad story for me, but Deno with TypeScript is showing some new opportunities to have universal binaries, which is cool… I’m just way more likely to say “If you can just grab a binary, drop it in your path and execute it, I will do that 100 times a day.” But if your tool says PIP install, or it says gem install, or says npm install, I’m kind of like “Do I want to mess with this?” That’s just my sense.

So that does tie into the above guess that Jerod isn't using virtual environments. But you could stretch this out and say Jerod is even concerned that his Python interpreter will change or disappear, breaking any code he installed for it. In that instance, pipx run is rather handy: its environments are temporary and get recreated as needed, and if you got pipx from your OS package manager, Python was implicitly installed along with it. You can also install Python explicitly instead of relying on it coming installed in the OS (which is now a Unix thing since macOS stopped shipping Python by default).

There is also the option of turning your Python tool into a standalone application. You can use things like Briefcase for GUI apps and PyApp for CLI apps. But this does make releasing harder, as the project is now being asked to maintain builds for various operating systems instead of simply delegating that job to Python itself.

Now, Adam wanted even less work to do in order to get a tool:

if it’s on Linux, it should be in Apt, or whatever your [unintelligible 00:45:04.05] Yum, or pick your – it should be a package. Or you should have to update your registry with whatever package directory you want to use, and apt update, and get that, and install. That’s my feelings. I don’t like to PIP install anything if I don’t have to.

The trick with this is that you, the tool developer, do not have direct control as to whether an OS package manager picks up your tool to be packaged. Since you don't control that distribution mechanism there is no guarantee that you can get your tool into the package manager you want (e.g., Homebrew can choose to reject your project for inclusion).

The other problematic part is there are multiple OS package managers to get into, and that's even if you restrict yourself to the "major" Linux distributions:

And that's not covering the BSDs:

Windows:

or macOS:

Homebrew (which is also available on Linux)
MacPorts

And so supporting that installation flow of just installing from an OS package manager takes work as you're now coordinating your tool with multiple OSs which all have their own metadata format, requirements for inclusion, ways to be told about updates, etc. It's not a small lift to make happen if you want a large swath of coverage for your project.

Hopefully this has alleviated some of Jerod's concerns and shown Adam that his ask isn't small. 😉 But what's the best approach for a Python tool project to take?

Unfortunately I don't know if there's a simple answer here. For instance, if people were to use PyApp to build a self-contained binary, would people download it and actually use it, or would they be like Adam and just look for what's available from their OS package manager? For each of the options suggested above, where is the cost:benefit ratio good enough to warrant complicating your release process? I think documenting pipx and making a .pyz available, if possible, both make sense. But as to whether standalone binaries make sense, or whether it's a better use of time to try to get into the package managers, I honestly don't know.
