Selecting a programming language can be a form of premature optimization
Have you ever been told that Python couldn't be used for a project because it wouldn't be fast enough? I have, and I find it a bit frustrating as big banks, YouTube, Instagram, and plenty of other places that are performance-sensitive still manage to select Python and be happy.
And that's when it dawned on me that the problem is that people are not treating language selection as a potential form of premature optimization: if you select a programming language based on preconceived notions of how it performs, you will never know whether a language that is a better, more productive fit for your developers would have actually worked out.
And so this blog post is going to argue that Python makes sense to select even for projects with performance concerns, and show how to work towards better performance in an iterative fashion if your first attempt isn't fast enough. The general steps are listed below; you can stop at any point once your needs are met, and you should profile between steps:
- Prototype in Python.
- Optimize your data structures and algorithms.
- Try another Python implementation (that doesn't require many code changes).
- Use Python's language bindings to optimize using another language.
While this might seem like a lot of work, do remember that Python is oriented towards productivity. This is what leads to numerous anecdotes where someone implemented something in Python in 1/3 the time it took a competing team to create the same thing in, e.g., C++ or Java. By the time the competing team gets their v1 to a beta level, the Python implementation is often on v3 in production with extensive testing, and has been optimized enough to match the initial performance of the implementation that chose a "faster" programming language.
Prototype in Python
You have to start somewhere. 😉 Thanks to Python being designed to make you productive, you should hopefully be able to get something working pretty quickly. It's actually possible this will be fast enough and you're already done. 😁
Optimize your data structures and algorithms
If your performance isn't where you want it to be, you can always make your code more efficient (after profiling). This is beneficial because any gains carry over into the following steps (if they are needed). Using a profiler to figure out where to optimize, and then fixing just those hot spots, can be enough to get the performance gains you're after.
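As a minimal sketch of what this step can look like, the example below profiles a function with the standard library's `cProfile` and then swaps a list for a set to speed up membership tests. The function names and the word-counting task are hypothetical, chosen only to illustrate the idea; your own hot spots will look different.

```python
import cProfile
import io
import pstats


def count_known_slow(words, vocabulary):
    """Count words found in vocabulary; a list lookup scans linearly each time."""
    return sum(1 for word in words if word in vocabulary)


def count_known_fast(words, vocabulary):
    """Same result, but build a set once so each lookup is O(1) on average."""
    known = set(vocabulary)
    return sum(1 for word in words if word in known)


def profile(func, *args):
    """Run func under cProfile; return (result, report sorted by cumulative time)."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args)
    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    stats.sort_stats("cumulative").print_stats(5)
    return result, buffer.getvalue()


if __name__ == "__main__":
    # Hypothetical data, sized so the difference is visible but the run stays short.
    vocabulary = [f"word{i}" for i in range(2_000)]
    words = [f"word{i}" for i in range(0, 4_000, 2)]
    for func in (count_known_slow, count_known_fast):
        result, report = profile(func, words, vocabulary)
        print(func.__name__, result)
        print(report)
```

Comparing the two reports should show where the time goes in the list-based version, which is the kind of evidence you want before changing anything.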
Try another Python implementation
Now when I say "implementation", I'm using the term very loosely. What I mean here is something which doesn't require rewriting code in order to get your performance improvement but does go beyond the Python implementation you initially chose; something that's extremely cheap and simple to try out to see if it gets you the performance improvement you're after. This includes things like:
As before, you will want to profile your code to see which options may work best (e.g., if your hot spots are not numeric then Numba is probably not a good fit).
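Since trying another implementation should not require code changes, one cheap way to compare options is to run the same small benchmark script under each interpreter. The sketch below uses only the standard library; `hot_loop` is a hypothetical stand-in for whatever your profiler flagged as the hot spot.

```python
import platform
import timeit


def hot_loop(n=100_000):
    """Hypothetical stand-in for a profiled hot spot: sum of squares in pure Python."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def benchmark(func, repeat=5, number=10):
    """Return the best per-call time in seconds across several timing runs."""
    timings = timeit.repeat(func, repeat=repeat, number=number)
    return min(timings) / number


if __name__ == "__main__":
    # Reports which interpreter ran the script, e.g. "CPython" or "PyPy".
    implementation = platform.python_implementation()
    version = platform.python_version()
    print(f"{implementation} {version}: {benchmark(hot_loop):.6f} s per call")
```

Running this file unchanged under each interpreter you are considering gives you a like-for-like number, though a real workload is a better benchmark than a toy loop.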
Use language bindings
This blog post came about because of a tweet about using Rust to speed up some Python code. Thanks to Python's long history of being great "glue" code, it has ended up with a myriad of ways to call out to other languages that may be able to operate faster than Python itself:
... and the list goes on for various languages and tools. The key difference compared to the previous option is that you will have to write a bit of code. But if your performance needs are that critical, this should get you what you need (after you profile; you want to minimize how much code you feel the need to rewrite). Heck, you can start down this road and eventually replace all of your Python code, but thanks to your initial Python version you will have validated the algorithms and the design, and you will potentially already have a test suite written in Python that you can use to validate your work indefinitely.
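To make the idea concrete, here is a minimal sketch that uses the standard library's `ctypes` module to call `sqrt()` from the system's C math library, while keeping a pure-Python version as both a fallback and a reference for testing. The helper names (`py_sqrt`, `load_c_sqrt`, `fast_sqrt`) are hypothetical, and whether the C library can be located depends on your platform, which is why the code falls back when it can't. Calling out per scalar like this carries overhead, so in practice it only pays off when the foreign function does a substantial amount of work per call.

```python
import ctypes
import ctypes.util


def py_sqrt(x, iterations=50):
    """Reference version: Newton's method in pure Python (fine for moderate inputs)."""
    if x < 0:
        raise ValueError("x must be non-negative")
    if x == 0:
        return 0.0
    guess = x if x >= 1 else 1.0
    for _ in range(iterations):
        guess = 0.5 * (guess + x / guess)
    return guess


def load_c_sqrt():
    """Try to bind sqrt() from the C math library; return None if unavailable."""
    path = ctypes.util.find_library("m")
    if path is None:
        return None
    try:
        c_sqrt = ctypes.CDLL(path).sqrt
    except (OSError, AttributeError):
        return None
    # Declare the C signature: double sqrt(double).
    c_sqrt.argtypes = [ctypes.c_double]
    c_sqrt.restype = ctypes.c_double
    return c_sqrt


# Use the C function when it could be loaded, otherwise the Python reference.
fast_sqrt = load_c_sqrt() or py_sqrt


if __name__ == "__main__":
    for value in (0.25, 2.0, 144.0):
        print(value, py_sqrt(value), fast_sqrt(value))
```

Keeping `py_sqrt` around means the original Python code doubles as a check on whatever replaces it, whichever binding tool you end up choosing.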
Consider optimizing for developer time, not computation costs
I think the jump to selecting a programming language based on potential performance needs often comes from a place where people think their computation costs are more important to optimize for than their developer time. I don't think that always holds, though, as software developers are expensive. If you look at your cloud hosting costs, for instance, and then look at how many developers that money could have paid for, my guess is that selecting a programming language that required fewer staff would save you more through lower payroll than you would save by squeezing every last bit out of your hosting costs.
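If you want to run that comparison for your own situation, the arithmetic is simple enough to sketch. Every number below is a made-up placeholder, not data from any real company, and the `Option` class is a hypothetical helper; substitute your own staffing and hosting figures.

```python
from dataclasses import dataclass


@dataclass
class Option:
    name: str
    developers: int            # staff needed to build and maintain the project
    cost_per_developer: float  # fully loaded annual cost per developer
    hosting_cost: float        # annual hosting bill

    def annual_cost(self):
        """Total yearly spend: payroll plus hosting."""
        return self.developers * self.cost_per_developer + self.hosting_cost


def cheaper(first, second):
    """Return whichever option has the lower total annual cost."""
    return min(first, second, key=Option.annual_cost)


if __name__ == "__main__":
    # Placeholder figures purely for illustration.
    productive = Option("productive language", 3, 150_000, 60_000)
    faster = Option('"faster" language', 4, 150_000, 30_000)
    for option in (productive, faster):
        print(f"{option.name}: {option.annual_cost():,.0f} per year")
    print("cheaper:", cheaper(productive, faster).name)
```

With these placeholder inputs, halving the hosting bill does not make up for one extra developer; your own numbers may of course come out differently.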
Another way to look at it is computation cost is a race to zero while developer salaries are going in the opposite direction. Cloud hosting firms want your business, and so they have incentives to make their services as cheap as possible while providing you the services you want. But developers want as much money as you're willing to pay, and unfortunately for employers there is massive demand for developers; salaries are not going to be dropping any time soon.
You also have to realize that Python and all the libraries I mentioned above are continuously improving. For instance, CPython 3.11 is already faster than 3.10 by a good amount, so in October 2022 you will very likely get a performance increase automatically once you upgrade (and if you would like to see that trend continue, please consider donating to the PSF). The same can't be said about what you pay developers: paying more does not automatically get you more, better code out of them.
All of this is to say that while some companies do extract massive value from squeezing every CPU cycle out of their code, those companies also typically build data centres. So if you don't need a dedicated building to host your machines, please consider doing the math to see if it's truly worth making your developers less productive in the name of computational efficiency when you don't know whether that perceived efficiency is even necessary.