Unravelling the import statement
As part of my series on Python's syntactic sugar, I am going to cover import statements. This will include delving into the quirky interface of __import__() (although in actual code you should use importlib.import_module()).
What this post will not cover, though, is how imports work beyond the syntactic sugar (e.g. there will be no discussion of how sys.path plays into things). If you want to know how the import system works, I have given that talk at least twice (with slides), and you can always dig into the importlib docs example on how import works and dive into importlib's source code since it is the implementation of the import system itself (which is written in Python).
import ...
Let's start simple: import a. What this statement does is import the module a and assign it to the variable name a. Unravelling the syntactic sugar, the code becomes:
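As a minimal sketch of that desugaring, with the standard library's json module standing in for the hypothetical module a so that the snippet can actually be run:

```python
# Sugared form:   import json
# `json` is only a stand-in for the article's module `a`.
json = __import__('json', globals(), locals())

# The name now refers to the module object, just as after `import json`.
print(json.__name__)
```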
One thing in this call that might seem odd is the pair of calls to globals() and locals(). In this specific case the information isn't needed, but the actual bytecode used to implement import always passes it in, so we will do so as well (the reason this is done at all will be made clearer later on).
But if we toss in a submodule, like with import a.b, the quirks of __import__() start to show up:
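A sketch of what that looks like, assuming the standard library's urllib.parse as a runnable stand-in for a.b:

```python
# Sugared form:   import urllib.parse
# `urllib.parse` stands in for the article's `a.b`.
urllib = __import__('urllib.parse', globals(), locals())

# The call returned the top-level package, not the submodule ...
print(urllib.__name__)
# ... but the submodule was imported, so attribute access works.
print(urllib.parse.quote('a b'))
```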
You will notice that while we specify that we are importing "a.b", we are still assigning to just a. That's because we need to make the attribute access of .b off of a still work, so while __import__() makes sure that a.b exists, it only returns a.
from ... import ...
Let's consider from a.b import c. In this case the syntax devolves to:
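Sketched with stand-ins that exist everywhere (urllib.parse playing the part of a.b, and its quote function playing the part of c; both are illustrative choices, not names from the original example):

```python
# Sugared form:   from urllib.parse import quote
quote = __import__('urllib.parse', globals(), locals(), ['quote']).quote

print(quote('a b'))

# With a non-empty list, the call hands back the submodule itself.
returned = __import__('urllib.parse', globals(), locals(), ['quote'])
print(returned.__name__)
```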
The first thing to notice, compared to our import a example, is the ['c'] argument. Providing that list does two things: it makes sure that a.b has a c attribute (if a.b is a package without one, __import__() tries importing a.b.c as a submodule), and it makes __import__() return a.b (not a like you might have expected based on our previous example). The reason for the shift in what is returned is that it makes things simpler at the bytecode level: the bytecode doesn't have to go from a to a.b in order to get c; it can just work directly off of the returned object to get c.
As such, the second thing to notice is the attribute access of c tacked on at the end of the __import__() call. Since we got back a.b, we still need to access a.b.c to assign it to c locally.
Now you may be wondering how relative imports are handled. Well, it essentially involves counting and that globals() call you keep seeing. When using from ..a import b, it becomes:
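A relative import only works inside a package, so the sketch below shows the general shape as a comment and then fakes the package context. It assumes the standard library's email package as a stand-in and pretends the code lives in email.mime.text, so that from ..utils import formatdate resolves to email.utils. The fake_globals dictionary is purely illustrative; real code would simply pass globals().

```python
import importlib.util

# General shape inside a package:
#     b = __import__('a', globals(), locals(), ['b'], 2).b

# Build just enough of a module namespace to look like `email.mime.text`.
spec = importlib.util.find_spec('email.mime.text')
fake_globals = {
    '__name__': spec.name,
    '__package__': spec.parent,  # the containing package, 'email.mime'
    '__spec__': spec,
}

# Sugared form (if written inside email/mime/text.py):
#     from ..utils import formatdate
formatdate = __import__('utils', fake_globals, {}, ['formatdate'], 2).formatdate

print(formatdate.__module__)
```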
The leading dots of the relative import get counted and passed in as the last argument to __import__() (in this case, 2). This is also when globals() comes into play as it is checked for __spec__.parent to resolve the relative module name (see importlib.util.resolve_name() and its implementation for details). (Aside: the rest of globals() isn't used and none of locals() is used; I think there were thoughts of flexibility when the API was created, passing in the entire global namespace as well as tossing in the local namespace in case either was useful to someone somehow in the future.)
... as ...
The last variant of the import statement is when there's an as clause. In general, all it really does for us is change what variable name gets assigned. So for import a as b, it's just like the above but with an assignment to b instead of a:
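Again as a sketch, with json as an assumed stand-in for a and js as the new name:

```python
# Sugared form:   import json as js
js = __import__('json', globals(), locals())

print(js.__name__)
```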
Same goes for from a.b import c as d:
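For instance, assuming urllib.parse in place of a.b, quote in place of c, and the illustrative name url_quote in place of d:

```python
# Sugared form:   from urllib.parse import quote as url_quote
url_quote = __import__('urllib.parse', globals(), locals(), ['quote']).quote

print(url_quote('a b'))
```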
But where things get interesting is when you import a submodule without using a from clause, e.g. import a.b as c. You can't just use __import__('a.b', globals(), locals()) since we only get back a. Now we could tack on an attribute access of .b after the call, but another approach is to realize the end result of import a.b as c is equivalent to from a import b as c, which we already know how to do as shown above.
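A sketch of that second approach, assuming urllib.parse as the stand-in for a.b and parse as the name being bound:

```python
# Sugared form:   import urllib.parse as parse
# Treated as:     from urllib import parse as parse
parse = __import__('urllib', globals(), locals(), ['parse']).parse

# The name is bound to the submodule, not the top-level package.
print(parse.__name__)
```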
Conclusion
We covered how to pull back the syntactic sugar of import ..., from ... import ..., from ... import ... as ..., and import ... as .... As usual, the code for all of this is in my desugar project.