Unravelling rich comparison operators
For the next part of my blog series on pulling apart Python's syntactic sugar, I'm going to be tackling rich comparison operators: ==, !=, >, <, >=, <=.
For this post I am going to be picking apart the example of a > b.
Looking at the bytecode
Using the dis module, we can look at the bytecode that CPython generates:
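A minimal way to reproduce this, assuming a recent CPython (the exact listing differs between versions), is to pass the expression to dis.dis() as a string. The opnames list is just an illustrative name for picking the opcodes out of the listing.

```python
import dis

# Compile the expression on its own and print its disassembly.
# The exact instructions and arguments vary between CPython versions.
dis.dis("a > b")

# Collect just the opcode names so the comparison opcode is easy to spot.
opnames = [instruction.opname for instruction in dis.get_instructions("a > b")]
print(opnames)
```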
That points us at the COMPARE_OP opcode. Its implementation sends us to the cmp_outcome() function, which delegates all the heavy lifting to PyObject_RichCompare().
How rich comparisons work
With PyObject_RichCompare() delegating to do_richcompare(), the code matches up to the explanation in the data model. Each comparison operator has a matching special/magic method:
- ==: __eq__
- !=: __ne__
- <: __lt__
- >: __gt__
- <=: __le__
- >=: __ge__
So for our a > b example we care about __gt__. That leads us to writing the following Python code to implement the equivalent of operator.gt() (note that debuiltins._mro_getattr() is just a helper to look up attributes on types, as Python always does for special/magic methods; it's a perf thing):
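A minimal sketch of that first version, under stated assumptions: _mro_getattr() below is a hypothetical stand-in for debuiltins._mro_getattr() (it only walks the type's MRO and ignores descriptors), and gt() is an illustrative name for the operator.gt() equivalent. Only the left-hand side is consulted at this point.

```python
from typing import Any


def _mro_getattr(type_: type, attr: str) -> Any:
    """Hypothetical stand-in for debuiltins._mro_getattr().

    Looks the attribute up on the type's MRO only, never on the instance.
    """
    for klass in type_.__mro__:
        if attr in klass.__dict__:
            return klass.__dict__[attr]
    raise AttributeError(f"{type_.__name__!r} object has no attribute {attr!r}")


def gt(lhs: Any, rhs: Any, /) -> Any:
    """Sketch of `lhs > rhs` that only asks the left-hand side."""
    lhs_type = type(lhs)
    try:
        gt_method = _mro_getattr(lhs_type, "__gt__")
    except AttributeError:
        result = NotImplemented
    else:
        # The method came from the type, so pass the instance explicitly.
        result = gt_method(lhs, rhs)
    if result is NotImplemented:
        raise TypeError(
            f"'>' not supported between instances of "
            f"{lhs_type.__name__!r} and {type(rhs).__name__!r}"
        )
    return result
```

With this sketch, gt(3, 2) goes through int.__gt__, while gt(object(), object()) raises TypeError because object.__gt__ returns NotImplemented.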
Now each comparison has a reflection so that if the left-hand side of the comparison expression doesn't implement the appropriate special method, you at least have a chance at using the right-hand side to get what you want. The pairings are:
- __lt__ and __gt__
- __le__ and __ge__
- __eq__ and itself
- __ne__ and itself
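To see a reflection in action, here is a small sketch. The REFLECTED mapping and the Version class are hypothetical names used only for illustration. Version defines nothing but __lt__, yet > still works because Python falls back to the reflected method on the right-hand side.

```python
# The pairings above as a lookup table (illustrative name).
REFLECTED = {
    "__lt__": "__gt__",
    "__gt__": "__lt__",
    "__le__": "__ge__",
    "__ge__": "__le__",
    "__eq__": "__eq__",
    "__ne__": "__ne__",
}


class Version:
    """Hypothetical class that only defines __lt__."""

    def __init__(self, number):
        self.number = number

    def __lt__(self, other):
        if not isinstance(other, Version):
            return NotImplemented
        return self.number < other.number


old, new = Version(1), Version(2)

# Version has no __gt__ of its own, so `new > old` ends up as old.__lt__(new).
reflected_result = new > old
print(reflected_result)
```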
What this means (roughly) is that if a > b doesn't work then we can try b < a. Now the data model, much like with binary arithmetic operators, has some fanciness to it when it comes to the right-hand side of the expression. If:
- The right-hand side is not the same type as the left-hand side
- But the right-hand side's type is a subclass of the left-hand side's type
then we try the right-hand side's way of doing things first (e.g. b < a). The reason for this rule is just like with binary arithmetic operators: if subclasses on the right-hand side want to do something special they get a chance to. For example, if b wanted to make sure to return an instance of itself it would only get that chance if a did not go first, else a > b could return an instance of a instead of b.
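The subclass rule can be observed directly. In this sketch Base and Derived are hypothetical classes whose methods return strings, purely so it is visible which method answered.

```python
class Base:
    def __gt__(self, other):
        return "Base.__gt__"


class Derived(Base):
    def __lt__(self, other):
        return "Derived.__lt__"


# The right-hand side is an instance of a proper subclass,
# so its reflected method (__lt__) is tried first.
subclass_first = Base() > Derived()

# Here the right-hand side is not a subclass of the left-hand side's type,
# so the left-hand side's (inherited) __gt__ goes first.
normal_order = Derived() > Base()

print(subclass_first, normal_order)
```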
Putting this all together gets us:
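A minimal sketch of that combined logic, under stated assumptions: _lookup() is a hypothetical helper standing in for debuiltins._mro_getattr() (it returns a sentinel instead of raising and ignores descriptors), and gt() is an illustrative name rather than the code from the desugar project.

```python
from typing import Any

_MISSING = object()  # sentinel for "no such special method"


def _lookup(obj: Any, name: str) -> Any:
    """Hypothetical helper: find a special method on type(obj), not the instance."""
    for klass in type(obj).__mro__:
        if name in klass.__dict__:
            return klass.__dict__[name]
    return _MISSING


def gt(lhs: Any, rhs: Any, /) -> Any:
    """Sketch of `lhs > rhs` with reflection and the subclass rule."""
    lhs_type, rhs_type = type(lhs), type(rhs)
    call_lhs = (lhs, _lookup(lhs, "__gt__"), rhs)
    call_rhs = (rhs, _lookup(rhs, "__lt__"), lhs)

    # A proper subclass on the right-hand side gets to go first.
    if rhs_type is not lhs_type and issubclass(rhs_type, lhs_type):
        calls = (call_rhs, call_lhs)
    else:
        calls = (call_lhs, call_rhs)

    for first, method, second in calls:
        if method is _MISSING:
            continue
        result = method(first, second)
        if result is not NotImplemented:
            return result

    raise TypeError(
        f"'>' not supported between instances of "
        f"{lhs_type.__name__!r} and {rhs_type.__name__!r}"
    )
```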
If you generalize this out to the other comparisons and their reflection you have the operations work appropriately for either argument!
== and != can never fail
So we have a solution for > which can be generalized, but there's one more thing we need to contend with. In case you weren't aware, both == and != will not raise TypeError when the special/magic methods return NotImplemented or aren't defined at all. Instead, Python will fall back on comparing the values of id() for each object as appropriate.
Back in the Python 2 days, you could compare any objects using any comparison operator and you would get a result. But those semantics led to odd cases where bad data in a list, for instance, would still be sortable. By making only == and != always succeed (unless their special methods raise an exception), you prevent such unexpected interactions between objects and having silent errors pass (although some people wish even this special case for == and != didn't exist).
And with that, we get a complete implementation for rich comparisons!
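As a rough sketch of how that last piece might look for == and !=, the code below reuses the same approach but ends with an identity check instead of raising TypeError. The names _lookup(), _rich_compare(), eq(), and ne() are hypothetical and are not taken from the desugar project.

```python
from typing import Any

_MISSING = object()  # sentinel for "no such special method"


def _lookup(obj: Any, name: str) -> Any:
    """Hypothetical helper: find a special method on type(obj), not the instance."""
    for klass in type(obj).__mro__:
        if name in klass.__dict__:
            return klass.__dict__[name]
    return _MISSING


def _rich_compare(lhs: Any, rhs: Any, name: str, default: bool) -> Any:
    """Try both sides with `name` (== and != are their own reflections)."""
    lhs_type, rhs_type = type(lhs), type(rhs)
    call_lhs = (lhs, _lookup(lhs, name), rhs)
    call_rhs = (rhs, _lookup(rhs, name), lhs)
    if rhs_type is not lhs_type and issubclass(rhs_type, lhs_type):
        calls = (call_rhs, call_lhs)
    else:
        calls = (call_lhs, call_rhs)
    for first, method, second in calls:
        if method is _MISSING:
            continue
        result = method(first, second)
        if result is not NotImplemented:
            return result
    # Neither side produced an answer: use the identity result, never TypeError.
    return default


def eq(lhs: Any, rhs: Any, /) -> Any:
    return _rich_compare(lhs, rhs, "__eq__", lhs is rhs)


def ne(lhs: Any, rhs: Any, /) -> Any:
    return _rich_compare(lhs, rhs, "__ne__", lhs is not rhs)
```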
__eq__ and __ne__ on object
If you look at how object implements rich comparison, you will see it implements __eq__ and __ne__ (the other special methods for rich comparison on object are just a side-effect of using a single C function to implement all rich comparison special methods). For __eq__, the code does an id() check much like the default semantics for == and when it succeeds it returns True, but if the IDs differ then NotImplemented is returned. The reason for this interesting false result is to allow the other object's __eq__ to participate in the operation, otherwise it falls through to the default semantics for == which eventually return False.
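This is easy to check in an interactive session. The variable names below are only for illustration.

```python
a = object()
b = object()

# Same object: object.__eq__ answers True directly.
same = a.__eq__(a)

# Different objects: the method declines to answer rather than saying False...
different = a.__eq__(b)

# ...and the == operator then falls back to the default identity comparison.
operator_result = a == b

print(same, different, operator_result)
```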
For __ne__, the data model explicitly states that "__ne__() delegates to __eq__() and inverts the result unless it is NotImplemented". That lets you just override __eq__ and get changed semantics for __ne__ automatically. The NotImplemented result has the same effect as with ==: it lets the default semantics take over and check that the IDs of the objects don't match.
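A small sketch of that delegation, using a hypothetical Point class that only overrides __eq__:

```python
class Point:
    """Hypothetical class that overrides __eq__ and nothing else."""

    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __eq__(self, other):
        if not isinstance(other, Point):
            return NotImplemented
        return (self.x, self.y) == (other.x, other.y)


# The inherited object.__ne__ calls Point.__eq__ and inverts the result.
equal_points = Point(1, 2) != Point(1, 2)
different_points = Point(1, 2) != Point(3, 4)

# When __eq__ returns NotImplemented, so does __ne__ ...
declined = Point(1, 2).__ne__("text")

# ... and != falls back to the identity check.
fallback = Point(1, 2) != "text"

print(equal_points, different_points, declined, fallback)
```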
As for why bother having these methods predefined on object when the default semantics for == and != do the same thing, it still lets you call the methods directly. Something I think a lot of people don't think about is the fact that you can not only call these methods directly to skip over the syntax, but pass them around like any other object. That's handy if you're passing methods around as callbacks or something.
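As one illustration of the callback idea, a bound __eq__ can be handed straight to filter(). This sketch sticks to ints on purpose, since the bound method can return NotImplemented when given other types. The names target and numbers are made up for the example.

```python
target = 3
numbers = [1, 3, 2, 3]

# target.__eq__ is a bound method, so it can be passed around like any callable.
is_target = target.__eq__
matches = list(filter(is_target, numbers))

print(matches)
```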
Conclusion
And that's it! As with the other posts in the series, you can find the source code in my desugar project.