Looking at Python 3 usage in the astronomy community

Thomas Robitaille ran a survey on Python usage in the astronomy community and wrote up a great blog post on the results (a big thanks from me to Thomas for taking the time to do this!). The highlights of the survey results are:

  • 17% - 20% of respondents are using Python 3 depending on how you want to count
  • No one really uses any version other than Python 2.7 and 3.4
  • Windows users are better about using Python 3 than Linux and OS X users
  • Seasoned Python programmers are using Python 3 more than new users
  • The biggest reason people give for not using Python 3 is lack of incentives

Now this data is very much skewed towards astronomers using Python, but I still think it's worth looking at the results as a microcosm of the general community.

The amount of Python 3 adoption

I have seen PyPI numbers that provide a lower bound of 5% adoption across the community. I have also seen numbers like the ones Thomas found: in parts of the Python community whose popular libraries and frameworks have been ported, the uptake is closer to 20% (I have primarily seen these kinds of adoption numbers for the scientific and web communities). And while I would like to see the overall number be higher than 5% (which might be a little low due to how the PyPI numbers are calculated), I'm quite happy with the 20% penetration in communities that have decided to embrace Python 3.

You should probably drop support for Python 2.6 and 3.3

If you want an idea of how much skew there might be in the PyPI numbers due to caching and what version of Python the initial project fetch was done for, look at the fact that the PyPI numbers suggest that just under 30% of the community overall is on Python 2.6. Anecdotal evidence from people I have talked to at PyCon suggests that number is a bit high, and Thomas' survey shows that at least in astronomy circles it's very high. This year I have begun to advocate that people drop Python 2.6 support, as it was released on October 1, 2008, which is over 6 years ago and thus past the 5 year support period for security fixes from the Python development team. Python 2.7 also has nearly 5 years of bugfixes on Python 2.6 since 2.6.6's release on August 24, 2010. That means Python 2.7 has significantly fewer bugs, making it not only nicer to use but also easier to port to Python 3.

As for Python 3.3, the percentage of users from an absolute perspective is just too small. Both the PyPI numbers and Thomas' numbers suggest only about 1% of people use Python 3.3 directly. And with Python 3.5 due out in a few months you will want to only be supporting Python 3.4 and 3.5 anyway.
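If a project does adopt this support policy, one way it might enforce it is a small version gate that runs at import time. This is only a sketch: the helper name `is_supported` and the exact cutoffs (Python 2.7, or Python 3.4 and newer) are my assumptions based on the recommendation above, not something taken from the survey or from any particular project.

```python
import sys


def is_supported(version_info=None):
    """Return True for Python 2.7 or Python 3.4+ (hypothetical policy)."""
    if version_info is None:
        version_info = sys.version_info
    major, minor = version_info[0], version_info[1]
    if major == 2:
        # On the Python 2 line, only 2.7 is still supported.
        return minor == 7
    # On the Python 3 line, require 3.4 or newer.
    return (major, minor) >= (3, 4)


if not is_supported():
    raise RuntimeError("this project supports only Python 2.7 and 3.4+")
```

Passing an explicit tuple, e.g. `is_supported((3, 3, 6))`, makes the policy easy to check without needing every interpreter installed.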

Windows users are the most up-to-date

Thomas' survey showed that Windows users are using Python 3 at about 2x the rate of Linux users and about 3x the rate of OS X users. I bet this is a side-effect of Windows users having to install Python themselves, and thus grabbing the newest release. The implication of this is that people do typically use the version of Python that comes installed on their system.

OS X 10.10 users get Python 2.4, 2.6, and 2.7 installed by default. I have no clue when Apple will include Python 3 by default with OS X, so who knows how long OS X will go on shipping without it.

But for Linux users, this will start changing in 2016. Ubuntu plans to only ship Python 3 in 16.04 which also happens to be an LTS release. Fedora plans to only ship Python 3 in Fedora 23. That means the two most popular Linux distributions are going to make Python 3 what you get automatically and force you to install Python 2 manually if you want it in 2016.

Getting new users into Python 3

Thomas' data suggests that the people using Python 3 are not the people new to Python, but more seasoned users who know what Python 3 gets you. I suspect part of this ties into the system installation of Python not being Python 3 yet, and for some they just use whatever is already installed.

But for others it's possible they are being taught Python 2 over 3. In that instance I think it's a mistake. In my experience it's easier to teach Python 3 since it's a more consistent language and then teach exceptions to the rules for Python 2 use. Going the other way from Python 2 to learn what rough edges have been softened in Python 3 typically isn't as smooth for the new user.

Incentives for switching to Python 3

Communicating the benefits of Python 3 has always been difficult if you don't use Unicode. I have always said that once you have used Python 3 and had the sharp corners of Python 2 removed you simply won't want to use Python 2 unless you absolutely have to. Literally everyone I have ever talked to who gets to write code in pure Python 3 agrees with me; once you have fully switched to Python 3 you simply don't want to go back to Python 2.

Unfortunately because this is an instance of a lot of little things adding up to an overall improvement it's hard to point to any one single feature that makes people go, "ooh, I really want that" which causes them to make the switch. It's hard to say that keyword-only arguments, __pycache__ directories, or enhanced exceptions are good enough on their own to cause people to make the switch compared to Python 2. This is especially true for things from the stdlib which people have backported to Python 2, thus minimizing what is exclusive to Python 3 to only that which is based on syntax.
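For readers who have not seen two of these smaller features, here is a minimal sketch of keyword-only arguments and exception chaining (one of the exception enhancements in Python 3). The function names `resize` and `parse_port` are made up purely for illustration.

```python
def resize(image, *, width, height):
    """Everything after the bare * must be passed by name."""
    return (image, width, height)


def parse_port(text):
    """Convert text to a port number, chaining the original error."""
    try:
        return int(text)
    except ValueError as exc:
        # "raise ... from ..." stores the original exception as __cause__,
        # so the traceback shows both errors.
        raise RuntimeError("invalid port: {!r}".format(text)) from exc


print(resize("photo", width=640, height=480))
```

Calling `resize("photo", 640, 480)` raises a `TypeError`, which is the point: callers cannot silently swap width and height.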

But features unique to Python 3 which can't be backported and can be motivation on their own started in Python 3.4 with asyncio combined with yield from. That was the first instance where there was a very clear benefit to a group of people -- those writing asynchronous code -- where switching to Python 3 would be truly useful, no questions asked (yes, trollius does exist but yield from makes asyncio nicer to work with).
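To give a feel for what yield from buys you, here is a small sketch using plain generators rather than asyncio itself (the generator-based coroutine style that asyncio originally used has since been removed from newer Python releases, so I am deliberately not showing it). The generator names are hypothetical.

```python
def read_chunks():
    """Inner generator: yields values, then returns a summary."""
    yield "chunk-1"
    yield "chunk-2"
    return 2  # becomes the value of the `yield from` expression


def read_all():
    """Outer generator: delegates to read_chunks() with `yield from`."""
    count = yield from read_chunks()
    yield "read {} chunks".format(count)


print(list(read_all()))  # ['chunk-1', 'chunk-2', 'read 2 chunks']
```

The delegation and the returned value both depend on syntax, which is why this cannot simply be backported to Python 2 as a library.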

But Python 3.5 is going to bring two big changes which simply won't work in Python 2. The first is a dedicated matrix multiplication operator thanks to PEP 465. The numpy users will now have a syntax available to them for when they matrix multiply two arrays. This should make a large number of scientists very happy.
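As a rough illustration of the new operator, the sketch below defines a toy `Matrix` class (my own invention for this example, not numpy's implementation) that hooks into `@` through the `__matmul__` method that PEP 465 specifies. It needs Python 3.5 or newer; real scientific code would use numpy arrays instead.

```python
class Matrix:
    """Tiny illustrative matrix type; real code would use numpy arrays."""

    def __init__(self, rows):
        self.rows = [list(row) for row in rows]

    def __matmul__(self, other):
        # Called for `self @ other` on Python 3.5+ (PEP 465).
        if len(self.rows[0]) != len(other.rows):
            raise ValueError("shape mismatch")
        columns = list(zip(*other.rows))
        return Matrix(
            [[sum(a * b for a, b in zip(row, col)) for col in columns]
             for row in self.rows]
        )


a = Matrix([[1, 2], [3, 4]])
b = Matrix([[5, 6], [7, 8]])
print((a @ b).rows)  # [[19, 22], [43, 50]]
```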

The second feature landing in Python 3.5 is async/await syntax thanks to PEP 492. This will essentially do away with the need to use yield from with asyncio or any other event loop for asynchronous programming, and instead lets you use syntax similar to that which is planned for/used in C#, Dart, ES7, etc. This should lead to a bunch more asynchronous programming in Python since the syntax makes asynchronous code very explicit and easy to understand.
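Here is a minimal sketch of what the new syntax looks like with asyncio. The coroutine names `fetch` and `main` are hypothetical, and the sleeps stand in for real I/O. One caveat: I use `asyncio.run()` to start the event loop for brevity, but that helper was only added in later releases (Python 3.7); on Python 3.5 itself you would drive the coroutine with an event loop's `run_until_complete()` instead.

```python
import asyncio


async def fetch(name, delay):
    """Pretend to do I/O by sleeping, then return a label."""
    await asyncio.sleep(delay)
    return "{} done".format(name)


async def main():
    # Run both coroutines concurrently; results keep argument order.
    return await asyncio.gather(fetch("a", 0.02), fetch("b", 0.01))


results = asyncio.run(main())
print(results)  # ['a done', 'b done']
```

Every point where the code can be suspended is marked with `await`, which is what makes the control flow so much easier to follow than callback-style code.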

I think 2016 is going to be a milestone in Python 3 adoption

I think the confluence of Linux distributions switching over to Python 3, syntactic features in Python 3.5 that won't be available in Python 2, all of the tooling now available to port to Python 3 (HOWTO and PyCon presentation), the end-of-life for Python 2.7 being less than 5 years away, and just simply time passing is going to lead to a nice bump in Python 3 adoption during 2016 that will hopefully be sustained going forward.