My experience with type hints and mypy

The CLA bot for the PSF is designed defensively because if the bot accidentally lets a pull request through from someone who has not signed the CLA, that could lead to legal troubles. To alleviate any worries I may have about bugs lurking in the code I have made sure that the CLA bot's code is thoroughly tested. I use Travis to make sure that continuous integration is passing, I use Codecov to make sure that there's 100% branch coverage, and the bot does not deploy to Heroku unless CI is passing (aside: thanks to Heroku for donating free hosting to the PSF which I'm taking advantage of for the bot).

But one thing I had not taken advantage of until today to help code defensively is type hints and mypy. I didn't do this from the outset because mypy didn't support async functions when the CLA bot was initially written. But with the advent of variable annotations in Python 3.6 and mypy's support for async functions I thought I would see what it was like to add type hints to pre-existing code.

What worked

Following the general approach outlined by the Dropbox team during their Dec 2016 BayPiggies talk, which mirrors what Zulip outlined in October 2016, worked out well. Basically you run mypy with no types to make sure it won't trip over anything, and then you slowly add types, one object at a time. Since mypy only checks things that have been given types, you don't have to worry about mypy over-reaching and reporting false positives. This gives you a nice iterative process where you don't have to convert all of your code at once.
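To make the iterative approach concrete, here is a minimal sketch. The function names are made up for illustration and are not from the CLA bot. With mypy's default settings, the body of a function without annotations is not type-checked, so you can annotate one function at a time and only the annotated ones come under scrutiny.

```python
from typing import List


def untyped_total(values):
    # No annotations: with default settings mypy skips checking this body.
    return sum(values)


def typed_total(values: List[int]) -> int:
    # Annotated: mypy checks this body, and checks calls to it made from
    # other checked code (e.g. typed_total("abc") would be flagged).
    return sum(values)
```

Both functions behave identically at runtime; the annotations only change what mypy is willing to verify.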

What didn't work

Unfortunately mypy isn't ready to take full advantage of Python 3.6. Now for most people this won't be a problem, but if you're not aware of this it can trip you up. For instance, even though typing.Collection exists, typeshed doesn't support the class. And because of how mypy is structured, if typeshed doesn't have something from the typing module it will claim it doesn't exist. In the end I was able to work around this by using typing.AbstractSet, but it was a bit frustrating to not get to fully use all the types available in Python 3.6.
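The workaround looks roughly like the sketch below. The function and parameter names are hypothetical, not the bot's actual code. typing.AbstractSet is narrower than typing.Collection since it only accepts set-like objects, but it has been in the typing module from the start, and it still accepts both set and frozenset.

```python
from typing import AbstractSet


def unsigned_contributors(
    usernames: AbstractSet[str], signed: AbstractSet[str]
) -> AbstractSet[str]:
    # Hypothetical helper: the usernames that have not signed the CLA.
    # AbstractSet supports the usual set operators such as "-".
    return usernames - signed
```

Calling it with a set for one argument and a frozenset for the other works at runtime and satisfies the annotations.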

You also can't use f-strings in mypy yet. Since mypy has to mirror so much of the Python internals spanning Python 2 & 3, it hasn't had its parser updated yet to handle f-strings. I've been told support is coming, but it would have been nice if it was available when Python 3.6 was released (which is not a criticism since the mypy team has only so much time and their own priorities).

I did end up skipping the type hinting of the test suite to avoid the work. When you're faking things out and using types that you know should not normally be passed in, it leads to a lot of type errors (all of which were legitimate, but I simply did not care). I could have updated the test code to pass the appropriate types, but I was lazy. I also could have loosened the type hints to be more permissive, but I did not think weakening the hints just to accommodate my laziness was the best solution. (It's now an open issue to resolve this.)

Did I get anything out of this?

You can look at the pull request which added type hints. While mypy did find a couple bugs, all of them would have been found by most linters anyway.

What mypy really got me was better documentation. While I was adding the type hints there were a couple of times where I had to examine the code to realize what the appropriate type was. Now that I have the functions and methods all hinted I don't have to guess anymore. That should make long-term maintenance a bit easier. And I don't think the code reads poorly because of the type hints so I don't think there's a penalty there. This is also useful for the CLA bot as it's entirely abstracted out into a few abstract base classes to make swapping out any server that it communicates with easy; having type hints means mypy verifies the type hinting contract between ABCs and their subclasses.
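As a rough illustration of that ABC contract, here is a sketch with made-up class and method names; it does not reproduce the bot's real ABCs. The abstract method's annotations spell out what every implementation must accept and return, and mypy reports an incompatible override if a subclass deviates.

```python
import abc
import asyncio
from typing import AbstractSet


class ContribHost(abc.ABC):
    """Hypothetical ABC for the server that hosts contributions."""

    @abc.abstractmethod
    async def usernames(self) -> AbstractSet[str]:
        """Return the usernames involved in a contribution."""


class FakeHost(ContribHost):
    # The signature matches the ABC. If this returned List[str] instead,
    # mypy would flag the override as incompatible with the supertype.
    async def usernames(self) -> AbstractSet[str]:
        return frozenset({"alice"})


def fetch_usernames(host: ContribHost) -> AbstractSet[str]:
    # Any ContribHost subclass can be swapped in here.
    return asyncio.run(host.usernames())
```

Code written against ContribHost, like fetch_usernames(), keeps working no matter which concrete host is plugged in, and mypy checks that each subclass honours the declared types.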

After having gone through the experience, would I bother typing new Python 3 code? My answer is yes, once mypy supports f-strings. When I design an API I already have to think about what type of objects would be acceptable, so quickly writing down my assumptions doesn't hurt anything, it's relatively quick, and it benefits anyone having to work with my code. But I also wouldn't contort my code to fit within the confines of type hints (i.e. if type hints force me to write cleaner code then that's great, but if something is so dynamic that it can't have type hints then that's fine and I'll happily use typing.Any as an escape hatch).
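For what the typing.Any escape hatch can look like in practice, here is a small sketch with hypothetical function names and a hypothetical payload shape. Decoded JSON can be a dict, a list, a string, or a number, so rather than contort the code the dynamic part is declared as Any and the typed boundary is placed right after it.

```python
import json
from typing import Any


def parse_payload(body: str) -> Any:
    # The decoded value is too dynamic to describe precisely, so it is
    # typed as Any; mypy will not complain about how it is used.
    return json.loads(body)


def pull_request_number(body: str) -> int:
    # Assumes a top-level "number" key (hypothetical payload shape).
    payload = parse_payload(body)
    # From here on the result is a plain int that mypy can check again.
    return int(payload["number"])
```

The Any stays confined to one small function while everything that calls pull_request_number() is still fully typed.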

In the end I view type hints as enhanced documentation that has tooling to help verify that the documentation about types is accurate. And for that use-case I see type hints as worth doing and not at all a burden.