Thoughts on where tools fit into a workflow

I am going to admit upfront that this is a thought piece, a brain dump, me thinking out loud. Do not assume there is a lesson here, nor some goal I have in mind. No, this blog post is simply a place for me to write out which tools I use, and when, in my ideal development workflow (and yes, this will have a bias towards the Python extension for VS Code 😁).

While actively coding

The code-test-fix loop

Typically when I am coding I think about what problem I'm trying to solve, what the API should look like, and then what it would take to test it. I then start to code up that solution, writing tests as I go. That means I have a virtual environment set up with the latest version of Python and all of the required and testing-related dependencies installed into it. I am also regularly running the test I am currently working on, or the related tests I already have, to catch any regressions. But the key point is a tight development loop where I'm focusing on the code I'm actively working on.

The tools I'm using the most during this time are:

  1. pytest
  2. venv (although since virtualenv 20 now uses venv, I should look into switching to virtualenv)
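
To make that loop concrete, here is a toy sketch of the kind of small, focused test I keep re-running with pytest while the code takes shape. The file and function names are made up, and the function under test lives in the same file purely to keep the example self-contained:

```python
# test_slugify.py: a toy example of the focused test I re-run constantly
# during the code-test-fix loop.

def slugify(title: str) -> str:
    """Collapse whitespace and lower-case a title into a URL slug."""
    return "-".join(title.lower().split())


def test_slugify_collapses_whitespace():
    assert slugify("  Thoughts on  tools ") == "thoughts-on-tools"
```

The surrounding setup is just `python -m venv .venv`, a `pip install` of pytest plus the project's dependencies into that environment, and then `python -m pytest` (often with `-k` to narrow the run to whatever I'm actively working on).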

Making sure I didn't mess up

Once code starts to reach a steady state and the design seems "done", that's when I start to run linters and to expand the testing to other versions of Python. I also start to care about test coverage. I put this off until the code is "stable" to minimize churn and the overhead of running a wider set of tools and waiting on their results, which slows down the development process.

Now, I should clarify that for me, linters are tools you run to check your code for something and that do not need to run under a different version of Python. If you have to run something under every version of Python that you support, then that's a test to me, not a lint. This distinction allows me to group linters together and run them only once instead of under every version of Python alongside the tests, cutting the execution time down.

The tools that I am using during this time are:

  1. coverage.py
  2. Black
  3. mypy
  4. I should probably start using Pyflakes (or flake8 --ignore=C,E,W)

Running these three tools all the time can be a bit time-consuming. I have to remember to do it and they don't necessarily run quickly. Luckily I can amortize the cost of running linters thanks to support in the Python extension for VS Code. If I set up the linters to run when I save, they run regularly in the background and I can deal with what they flag as I go rather than in a batch later. Since the results show up as I work, without waiting on a manual run, running linters becomes much cheaper. The same goes for setting up formatters (which also act as linters when you're enforcing style).

The problem is not everyone uses VS Code. To handle the issue of not remembering what to run, people often set up tox or nox, which also have the benefit of making it easier to run tests against other versions of Python. Another option is to set up pre-commit, which both makes sure you don't forget and adds linting for other things like trailing whitespace, well-formed JSON, etc. So there's overlap between tox/nox and pre-commit, but also differentiators. This leads some people to set up tox/nox to execute pre-commit for linting, getting the most they can out of all the tools.

So tools people use to run linters:

  1. tox or nox
  2. pre-commit
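
As a sketch of that tox/nox-plus-pre-commit combination (and of the earlier point about running linters once while running tests under every supported version of Python), here is roughly what a noxfile.py could look like; the version list and the tests/ path are placeholders, not a recommendation:

```python
# noxfile.py: a minimal sketch where tests run under every supported Python
# version, while linting runs once by delegating to pre-commit.
import nox


@nox.session(python=["3.7", "3.8", "3.9"])  # placeholder version list
def test(session):
    session.install(".", "pytest", "coverage")
    session.run("python", "-m", "pytest", "tests/")


@nox.session
def lint(session):
    session.install("pre-commit")
    session.run("pre-commit", "run", "--all-files")
```

With this in place, `nox -s lint` run locally or in CI does the exact same checks a contributor's pre-commit hook would.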

But then there is also the situation where people have their own editors that they want to set up to use these linters. This is where the concept of development dependencies in build tools like poetry and flit comes in. That way everyone working on the project gets the same tools installed, and they can set them up however they want to fit their workflow.

Proposing a change

When getting ready to create a pull request, I want the tests and linters to run against all supported versions of Python and OSs via continuous integration. To make things easier to debug when CI flags a problem, I want my CI to be nothing more than running something I could run locally if I had the appropriate setup. I am also of the opinion that people proposing PRs should do as much testing locally as possible, which requires being able to replicate CI runs locally (I hold this view because people very often don't pay attention to whether CI for their PR goes green, and making the maintainer message them to say their PR is failing CI adds delays and takes up time).

There is one decision to make about tooling updates. Obviously tools like the linters that you rely on will make new releases, and chances are you will want to pick them up (improved error detection, bugfixes, etc.). There are two ways of handling this.

One is to leave the development dependencies unpinned. Unfortunately that can lead to an unsuspecting contributor having CI fail on their PR simply because a development dependency changed. To help avoid that I can run a CI cron job at some frequency to try to pick up those sorts of failures early on.

The other option is to pin my development dependencies (and I truly mean pin; I have had micro releases break CI because a project added a warning while a flag was set to treat warnings as errors). This has the side effect that in order to get those bugfixes and improvements from the tools I will need to occasionally check for updates. It's possible to use tools like Dependabot to update pinned dependencies in an automated fashion to alleviate the burden.

Tools for CI:

  1. GitHub Actions
  2. Dependabot

Preparing for a release

I want to make sure CI tests against the wheel that would be uploaded to PyPI (setuptools users will know why this is important thanks to MANIFEST.in). I want the same OS coverage as when testing a PR. For Python versions, I will test against all supported versions plus the in-development version of Python, where I allow for failures (see my blog post on why this is helpful and how to do it on Travis).
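
One way to express "test the wheel, not the source tree" so that it can run both locally and in CI is another nox session. The build project, the dist/ directory, and the tests/ path below are my assumptions for the sketch, not a universal recipe:

```python
# noxfile.py (continued): a sketch of testing against the built wheel so that
# packaging mistakes (e.g. files missing thanks to MANIFEST.in) surface early.
import pathlib

import nox


@nox.session(python=["3.7", "3.8", "3.9"])  # plus the in-development version in CI
def test_wheel(session):
    session.install("build", "pytest")
    session.run("python", "-m", "build", "--wheel", "--outdir", "dist/")
    # Install the most recently built wheel instead of the source tree.
    wheel = max(pathlib.Path("dist").glob("*.whl"), key=lambda p: p.stat().st_mtime)
    session.install(str(wheel))
    session.run("python", "-m", "pytest", "tests/")
```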

With testing and linting clean, that leaves the release-only prep work. I have to update the version if I haven't been doing that continuously. The changelog will also need updating if I haven't been doing it after every commit. With all of this in place I should be ready to build the sdist and wheel(s) and upload them to PyPI. Finally, the release needs to be tagged in git.
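
Those release-only steps are mechanical enough to script. Here is a rough sketch using tools commonly used for this (the build and twine projects, plus git); the version handling is a placeholder since it depends on where a project stores its version:

```python
# release.py: a rough sketch of the release steps (build, upload, tag).
import glob
import subprocess


def run(*args: str) -> None:
    subprocess.run(args, check=True)


version = "1.2.3"  # placeholder; this should really come from the project metadata

run("python", "-m", "build")  # produces the sdist and wheel(s) in dist/
run("python", "-m", "twine", "upload", *glob.glob("dist/*"))
run("git", "tag", "-a", f"v{version}", "-m", f"Version {version}")
run("git", "push", "--tags")
```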

Conclusion (?)

Let's take setting up Black for formatting. That would mean:

  1. List Black as a development dependency
  2. Set up VS Code to run Black
  3. Set up pre-commit to enforce Black
  4. Set up tox or nox to run pre-commit
  5. Set up GitHub Actions to lint using tox or nox

What about mypy?

  1. List mypy as a development dependency
  2. Set up VS Code to run mypy
  3. Set up pre-commit to enforce mypy

Repeat as necessary for other linters. There's a bit of repetition, especially considering that how I set up Black will probably be the same across all of my projects and very similar to how other people do it. And what if there is an update to a linter?

  1. Update pre-commit
  2. Potentially update development dependency pin

There's also another form of repetition when you add support for a new version of Python:

  1. Update your Python requirement for the build back-end
  2. Update your trove classifiers
  3. Update tox or nox
  4. Update GitHub Actions

Once again, how I do this is very likely the same across all of my projects and for lots of other people.

So if I'm doing the same basic thing with the same tools, how can I cut down on this repetition? I could use Cookiecutter to stamp out new repositories with all of this already set up. That does have the drawback of not being able to update existing projects later. It feels like I want a Dependabot for linters and new Python versions.

I also need to automate my release workflow. I've taken a stab at it, but it's not working quite yet. If I ditched SemVer for all my projects it would greatly simplify everything. 🤔