Unravelling assignment expressions

As part of my series on Python's syntactic sugar, I initially skipped tackling assignment expressions because I made them more complicated than I needed to in my head. 😅 But there is some key subtlety to unravelling them which may not be obvious.

Let's start with a simple example of an assignment expression inside of a function call:

a(b := c(), b+1)
Example Python code of an assignment expression inside of a function call

Now naively you might think you can just lift that assignment expression out to be an assignment statement and things would just work, like so:

b = c()
a(b, b+1)
Naively moving the assignment expression out of the function call's parameter list

Unfortunately this doesn't work for more complicated examples. For instance, if you changed a to a.d, the naive approach would call c() before looking up a.d, while the original expression looks up a.d first. Remember that attribute access can execute arbitrary code, so you can't guarantee that a.d doesn't have a side-effect that would influence the outcome of c(). So to do this properly you need to break down the expression into each constituent step, all executed in the proper order. Essentially you have to follow the order in which the CPython interpreter evaluates each sub-expression and pushes the result onto its stack. For the original example, that means looking up a before calling c():

_a = a
b = c()
_a(b, b+1)
Unravelling the assignment expression properly
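
To see why the order matters, here is a small sketch that records the evaluation order for the a.d variant. The class `A`, its property `d`, the `log` list, and the `_a_d` temporary are hypothetical names made up for this illustration; only the shape of the expression comes from the example above.

```python
log = []

class A:
    @property
    def d(self):
        # Attribute access runs code, so it can have side effects.
        log.append("a.d")
        return lambda *args: args

def c():
    log.append("c()")
    return 1

a = A()

# Original expression: a.d is looked up before c() is called.
original = a.d(b := c(), b + 1)
original_order = log.copy()
log.clear()

# Naive unravelling: c() now runs before a.d is looked up.
b = c()
naive = a.d(b, b + 1)
naive_order = log.copy()
log.clear()

# Careful unravelling: capture the callable first, then evaluate the arguments.
_a_d = a.d
b = c()
careful = _a_d(b, b + 1)
careful_order = log.copy()
```

All three versions return the same value here, but only the careful unravelling reproduces the order of side effects of the original expression.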

Luckily this solution partially handles the scoping for assignment expressions in the face of generator expressions. Specifically, names bound by assignment expressions inside of generator expressions (and thus comprehensions) are expected to leak out of the generator expression's scope into the enclosing scope. That means any((comment := line).startswith('#') for line in lines) actually leaves comment accessible outside of the generator expression.

comment = _UNASSIGNED
def _gen_exp(lines):
    nonlocal comment  # Assumes an enclosing function; at module level this would be `global`.
    for line in lines:
        comment = line
        yield comment.startswith('#')

any(_gen_exp(iter(lines)))
Unravelling of a generator expression with an assignment expression
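
As a quick check of the leaking behaviour, the sketch below runs the original generator expression and a hand-unravelled version side by side, both wrapped in functions so that `nonlocal` is valid. The function names `with_walrus` and `unravelled` and the sample `lines` are invented for the illustration.

```python
_UNASSIGNED = object()  # hypothetical sentinel meaning "not bound yet"

def with_walrus(lines):
    # `comment` is bound in this function's scope, not the generator's.
    found = any((comment := line).startswith("#") for line in lines)
    return found, comment

def unravelled(lines):
    comment = _UNASSIGNED

    def _gen_exp(iterator):
        nonlocal comment
        for line in iterator:
            comment = line
            yield comment.startswith("#")

    found = any(_gen_exp(iter(lines)))
    return found, comment

lines = ["import os", "# a comment", "print(os.name)"]
walrus_result = with_walrus(lines)
unravelled_result = unravelled(lines)
```

Because any() stops at the first true value, both versions leave comment bound to the first line that starts with '#'.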

Unfortunately there is a complication here when the assignment expression is never executed, e.g. when lines is empty. In that case, a NameError is supposed to be raised when trying to use the variable. To do that right you would need to check whether the variable still holds its initial sentinel value before it is accessed, raising NameError if the sentinel is found and moving on otherwise. It's a pain and leads to more unravelling code, but it should still lead to the appropriate exception being raised. For instance, if you were to do print(comment) for the example, you would need to unravel it to:

if comment is _UNASSIGNED:
    raise NameError("name 'comment' is not defined")
print(comment)
Handling NameError appropriately
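
Putting the sentinel check together with the unravelled generator expression might look like the sketch below. The `_load` helper and the `last_line_seen` function are hypothetical names introduced only for this example; an actual unravelling could just as well inline the check at each use.

```python
_UNASSIGNED = object()  # hypothetical sentinel meaning "not bound yet"

def _load(value, name):
    """Return value, or raise NameError if it is still the sentinel."""
    if value is _UNASSIGNED:
        raise NameError(f"name {name!r} is not defined")
    return value

def last_line_seen(lines):
    comment = _UNASSIGNED

    def _gen_exp(iterator):
        nonlocal comment
        for line in iterator:
            comment = line
            yield comment.startswith("#")

    any(_gen_exp(iter(lines)))
    # Reads of `comment` go through the sentinel check.
    return _load(comment, "comment")

try:
    last_line_seen([])  # The loop body never runs, so `comment` stays unset.
except NameError as exc:
    message = str(exc)
else:
    message = None
```

With an empty lines the assignment never runs, so the read raises NameError rather than silently handing back the sentinel.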

What this all means is that to make the unravelling as safe and thorough as possible, you essentially need to translate Python code to SSA form, which is not the most readable. 😅 But it's at least doable and gets the appropriate result.