Unravelling `break` and `continue`
I have previously unravelled `for` loops, and so the concept of looping has already come up in this blog post series of removing the syntactic sugar from Python. But one aspect of looping that I didn't touch upon is that of `break` and `continue`. Both are statements used to control the flow within a loop, whether it's to leave or jump back to the top of the loop, respectively.
How the bytecode does it
CPython's interpreter has the ability to jump around to various opcodes. That ability is what allows for `break` and `continue` to work. Take the following example (whose `print` call is there just to have a marker for the end of the loop body):
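A loop of that shape could look like the following minimal sketch. The function name `example` and the particular conditions are assumptions made for illustration; what matters is a `break`, then a `continue`, then a `print` call closing out the loop body.

```python
def example():
    """Loop over a few numbers, using both `break` and `continue`."""
    for x in range(10):
        if x == 5:
            break  # Leave the loop entirely.
        elif x % 2:
            continue  # Skip odd numbers by jumping back to the top of the loop.
        print(x)  # Marker for the end of the loop body.


example()
```

Running it prints 0, 2, and 4: odd numbers are skipped by `continue`, and the loop stops for good once `x` reaches 5.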
If you disassemble that `for` loop you end up with:
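If you want to look at the bytecode yourself, the standard library's `dis` module will do it. The sketch below assumes a hypothetical `example` function wrapping a loop with a `break` and a `continue`. Opcode names and offsets depend on the CPython version you run, and newer releases use different jump opcodes than `JUMP_ABSOLUTE`, so your output may not line up with the offsets discussed in this post.

```python
import dis


def example():
    for x in range(10):
        if x == 5:
            break
        elif x % 2:
            continue
        print(x)


# Print the human-readable disassembly of the function.
dis.dis(example)

# Or pick out just the jump instructions programmatically.
jumps = [
    (instr.offset, instr.opname)
    for instr in dis.get_instructions(example)
    if "JUMP" in instr.opname
]
print(jumps)
```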
The bytecode at offset 14 is for `break` and offset 20 is for `continue`. As you can see they are `JUMP_ABSOLUTE` instructions, which means that when the interpreter runs them it immediately goes to the bytecode at the offset each one names. In this instance `break` jumps to the end of the function and `continue` jumps to the top of the `for` loop. So the bytecode has a way to skip over chunks of code.
How we are going to do it
So how do we do something similar without using those two statements? Exceptions to the rescue! In both instances we need some form of control flow that lets us jump to either the beginning of a loop or right after it. We can do that based on whether we put the loop inside or outside of a `try` block.
For `break`, since we want to jump just past the end of the loop, we want to put the loop inside of a `try` block and raise an exception where the `break` statement was. We can then catch that exception and let execution carry us outside of the loop.
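As a minimal sketch of that idea, assuming a hypothetical `_BreakStatement` exception class and a made-up loop that collects values until it reaches 5:

```python
class _BreakStatement(Exception):
    """Hypothetical exception standing in for `break`."""


def example_without_break():
    seen = []
    try:
        for x in range(10):
            if x == 5:
                raise _BreakStatement  # Was: break
            seen.append(x)
    except _BreakStatement:
        pass  # Execution carries on just past the loop.
    return seen
```

Calling `example_without_break()` returns `[0, 1, 2, 3, 4]`, the same values the loop would collect with a real `break`.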
Handling `continue` is similar, although the `try` block is inside the loop this time.
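A comparable sketch for `continue`, again with a hypothetical exception class (`_ContinueStatement`) and an invented loop that keeps only even numbers:

```python
class _ContinueStatement(Exception):
    """Hypothetical exception standing in for `continue`."""


def evens_without_continue():
    seen = []
    for x in range(10):
        try:
            if x % 2:
                raise _ContinueStatement  # Was: continue
            seen.append(x)
        except _ContinueStatement:
            pass  # Nothing is left in the body, so the next iteration starts.
    return seen
```

Here `evens_without_continue()` returns `[0, 2, 4, 6, 8]`.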
Thanks to the end of the `try` block for `continue` extending to the bottom of the loop, control flow will just naturally flow back to the top of the loop as expected.
And a nice thing about this solution is it nests appropriately. Since Python has no way to break out of multiple loops via a single `break` statement (some languages allow this by letting you label the loop and having the break specify which loop you're breaking out of), you will always hit the tightest `try` block that you're in. And since you only need one `try` block per loop for each of the two statements, no matter how many `break` and `continue` statements the loop contains, there's no concern of getting it wrong. And this trick is also the idiomatic way to break out of nested loops in Python, so there's already precedent in using it for this sort of control flow.
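To see the nesting at work, here is a sketch with two loops. The exception classes and the `pairs` function are hypothetical names, and the loop logic is invented purely to exercise both statements. Each raised exception is caught by the innermost matching `except` clause, which belongs to the loop the original statement would have affected.

```python
class _BreakStatement(Exception):
    """Hypothetical exception standing in for `break`."""


class _ContinueStatement(Exception):
    """Hypothetical exception standing in for `continue`."""


def pairs():
    found = []
    try:  # Outer loop's `break` handler.
        for x in range(4):
            if x == 3:
                raise _BreakStatement  # Was: break (outer loop)
            try:  # Inner loop's `break` handler.
                for y in range(4):
                    try:  # Inner loop's `continue` handler.
                        if y == 0:
                            raise _ContinueStatement  # Was: continue
                        if y > x:
                            raise _BreakStatement  # Was: break (inner loop)
                        found.append((x, y))
                    except _ContinueStatement:
                        pass
            except _BreakStatement:
                pass  # Only the inner loop ends; the outer loop keeps going.
    except _BreakStatement:
        pass
    return found
```

The inner `raise _BreakStatement` passes through the `except _ContinueStatement` clause untouched and lands in the inner loop's handler, so `pairs()` returns `[(1, 1), (2, 1), (2, 2)]`.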
Bringing `else` clauses into the mix
This also works nicely for `else` clauses on `for` and `while` loops as they simply become `else` clauses on the `try` block! So this:
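As an illustrative sketch of a loop with an `else` clause (the function `find_first_negative` and its logic are assumptions made for the example):

```python
def find_first_negative(numbers):
    for n in numbers:
        if n < 0:
            result = n
            break
    else:
        # Runs only when the loop finished without hitting `break`.
        result = None
    return result
```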
becomes:
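A sketch of the unravelled form, in which the `else` clause of a `for` loop sits on the surrounding `try` block instead. The exception class `_BreakStatement` and the function name are hypothetical:

```python
class _BreakStatement(Exception):
    """Hypothetical exception standing in for `break`."""


def find_first_negative_unravelled(numbers):
    try:
        for n in numbers:
            if n < 0:
                result = n
                raise _BreakStatement  # Was: break
    except _BreakStatement:
        pass
    else:
        # Same clause, now on the `try`: runs only if nothing was raised.
        result = None
    return result
```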
It's literally just a move of the entire clause from one statement to another!