Unravelling augmented arithmetic assignment
Prologue
This post is part of a series on Python's syntactic sugar. The latest source code can be found as part of the desugar project.
Introduction
Python has something called augmented arithmetic assignment. If you're not familiar with that phrase, it's basically when you do some math while at the same time doing an assignment, e.g. a -= b is augmented arithmetic assignment for subtraction. Augmented assignment was added to the language in Python 2.0.
Dissecting -=
Because Python does not allow for overriding assignment, how Python implements augmented assignment might not be quite what you're expecting compared to other operations that have a special/magic method.
First, know that a -= b is semantically the same as a = a - b. But also realize that if you know upfront that the result is going to be assigned back to the same variable name, you might be able to do something more efficient than a blind a - b operation. Probably the simplest application of this potential benefit is avoiding the creation of a new object: if you can mutate an object in-place, then returning self is a lot cheaper than constructing a new object from scratch.
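To make that benefit concrete, here is a small sketch using the built-in set type, whose -= mutates the existing object while a plain - builds a new one. The variable names are purely illustrative.

```python
# In-place subtraction: the set is mutated, so the name still refers to the same object.
a = {1, 2, 3}
original = a
a -= {1}
same_object_after_isub = a is original

# Plain subtraction followed by assignment: a brand-new set gets bound to the name.
c = {1, 2, 3}
before = c
c = c - {1}
same_object_after_sub = c is before

print(same_object_after_isub, same_object_after_sub)
```

Because the first form mutates in place, every other reference to that set (here, original) sees the change as well, which is worth keeping in mind when choosing between the two spellings.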
As such, Python supports a __isub__() method. If it's defined on the left-hand side of the assignment (often called the lvalue), then it's called with the right-hand side of the assignment (often called the rvalue). So for a -= b, an attempt will be made to call a.__isub__(b).
Now, if that call returns NotImplemented, or the method simply doesn't exist, then Python falls back to a normal binary arithmetic operation: a - b.
And regardless of which approach is used, the returned value gets assigned back to a. As simplistic pseudocode, a -= b breaks down to:
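A rough, runnable sketch of that breakdown follows. The function name isub and the Counter class are hypothetical, and checking for the method with hasattr() on the type is a simplification of how Python really looks up special methods.

```python
def isub(a, b):
    """Return the value that `a -= b` would assign back to `a`."""
    a_type = type(a)
    # Special methods are looked up on the type, not the instance.
    if hasattr(a_type, "__isub__"):
        value = a_type.__isub__(a, b)
        if value is not NotImplemented:
            return value
    # Either __isub__() is missing or it declined: fall back to `a - b`.
    return a - b


class Counter:
    """Hypothetical mutable type that supports in-place subtraction."""

    def __init__(self, count):
        self.count = count

    def __isub__(self, other):
        if not isinstance(other, int):
            return NotImplemented
        self.count -= other
        return self  # cheap: no new object is constructed

    def __sub__(self, other):
        if isinstance(other, Counter):
            return Counter(self.count - other.count)
        return NotImplemented


counter = Counter(10)
result = isub(counter, 3)              # handled by __isub__()
fallback = isub(counter, Counter(2))   # __isub__() declines, so __sub__() runs
plain = isub(5, 3)                     # int defines no __isub__() at all
```

The caller is still responsible for binding the returned value back to the name, which is the one step a plain function cannot do on the caller's behalf.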
Generalizing the approach
Thanks to already having implemented binary arithmetic operations, generalizing augmented arithmetic operations isn't too complicated. By passing in the binary arithmetic operation function and doing some introspection on it (and any potentially raised TypeError), it can be generalized rather nicely.
This reduces defining support for -= to _create_binary_inplace_op(__sub__), with everything else inferred: the function name, which __i*__ function to call, and the callable to use when falling back on the binary arithmetic operator.
How I discovered hardly anyone uses **=
While I was writing the code for this blog post I ended up getting odd test failures for **=. Every test that made sure __pow__ was called appropriately as a fallback failed when I ran it against the operator module included in Python's standard library. My code passed fine, but usually when there's a discrepancy between the code I wrote and what's coming from CPython it means I messed up somehow. But no matter how much I scrutinized my code to see what I was doing wrong, I couldn't see why the test would pass for me but fail in the reference case.
I decided to dig a bit deeper to see what was going on in CPython itself. I started by disassembling the bytecode:
>>> def test(): a **= b
...
>>> import dis
>>> dis.dis(test)
  1           0 LOAD_FAST                0 (a)
              2 LOAD_GLOBAL              0 (b)
              4 INPLACE_POWER
              6 STORE_FAST               0 (a)
              8 LOAD_CONST               0 (None)
             10 RETURN_VALUE
That led me to INPLACE_POWER in the eval loop:
That then led to PyNumber_InPlacePower():
Huh. So the code calls __ipow__ if it is defined, but it only calls __pow__ if __ipow__ is missing. What should have happened is that if calling __ipow__ didn't work out, either because NotImplemented was returned or because the method simply doesn't exist, then __pow__ and __rpow__ should be called as appropriate. In other words, the code was accidentally skipping the a ** b fallback semantics whenever __ipow__ existed!
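A small sketch of how the problem can be observed from Python code follows. The Base class and try_ipow() helper are hypothetical names for illustration: __ipow__() always declines, so an interpreter with the intended semantics should fall back to __pow__(), while one with the bug described here would be expected to raise TypeError instead.

```python
class Base:
    """Hypothetical type whose __ipow__() always declines."""

    def __init__(self, value):
        self.value = value

    def __ipow__(self, other, modulo=None):
        return NotImplemented  # ask Python to fall back to __pow__()

    def __pow__(self, other, modulo=None):
        return Base(self.value ** other)


def try_ipow(a, b):
    """Return the result of `a **= b`, or None if it raised TypeError."""
    try:
        a **= b
    except TypeError:
        return None
    return a


result = try_ipow(Base(2), 3)
# True when the fallback to __pow__() happened; False when **= gave up early.
fallback_worked = result is not None and result.value == 8
print(fallback_worked)
```

Swapping **= for -= (with matching __isub__() and __sub__() methods) in the same experiment exercises the fallback path that the other augmented operators take.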
This was actually partially noticed and filed as a bug almost 11 months ago. I revived the issue and started a conversation on python-dev about it. As of right now it looks like this will get fixed in Python 3.10, and we will need to add a notice in the documentation for 3.8 and 3.9 about the buggy semantics for **= (the issue probably goes further back, but older Python versions are in security-only maintenance mode so they won't get the documentation change). The fix very likely won't get backported, as it is a change in semantics and could be rather hard to diagnose if someone is accidentally relying on the buggy semantics. But the fact that it took this long to notice suggests that **= isn't used too extensively, as taking the shortcut of implementing just __pow__ rather than __ipow__ would otherwise have caused someone to notice this sooner.