How virtual environments work
After needing to do a deep dive on the
venv module (I will explain why later in this post), I thought I would explain how virtual environments work to help demystify them.
Why do virtual environments exist?
Back in the day, there was no concept of environments in Python: all you had was your Python installation and the current directory. That meant when you installed something you either installed it globally into your Python interpreter or you just dumped it into the current directory. Both of these approaches had their drawbacks.
Installing globally meant you didn't have any isolation between your projects. This led to issues like version conflicts between what one of your projects might need compared to another one. It also meant you had no idea what requirements your project actually had since you had no way of actually testing your assumptions of what you needed. This was an issue if you needed to share your code with someone else as you didn't have a way to test that you weren't accidentally wrong about what your dependencies were.
Installing into your local directory didn't isolate your installs based on Python version or interpreter version (or even interpreter build type, back when you had to compile your extension modules differently for debug and release builds of Python). So while you could install everything into the same directory as your own code (which you did, and thus didn't use
src directory layouts for simplicity), there wasn't a way to install different wheels for each Python interpreter you had on your machine so you could have multiple environments per project (I'm glossing over the fact that back in the day you also didn't have wheels or editable installs).
Enter virtual environments. Suddenly you had a way to install projects as a group that was tied to a specific Python interpreter. That got us the isolation/separation of only installing things you depend on (and being able to verify that through your testing), as well as having as many environments as you want to go with your projects (e.g. an environment for each version of Python that you support). So all sorts of wins! It's an important feature to have while doing development (which is why it can be rather frustrating for users when Python distributors leave venv out).
How do virtual environments work?
conda run). This is why you are always expected to activate a conda environment, as some conda packages require those shell scripts to be run. I won't be covering conda environments in this post.
There are two parts to virtual environments: their directories and their configuration file. As a running example, I'm going to assume you ran the command
py -m venv --without-pip .venv in some directory on a Unix-based OS (you can substitute
py with whatever Python interpreter you want, including the Python Launcher for Unix).
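A virtual environment has 3 directories and potentially a symlink in the virtual environment directory (i.e. within .venv):
- bin (Scripts on Windows)
- include (Include on Windows)
- lib/pythonX.Y/site-packages where X.Y is the Python version (Lib/site-packages on Windows)
- lib64 symlinked to lib if you're using a 64-bit build of Python that's on a POSIX-based OS that's not macOS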
The Python executable for the virtual environment ends up in
bin as various symlinks back to the original interpreter (e.g.
.venv/bin/python is a symlink; Windows has a different story). The
site-packages directory is where projects get installed into the virtual environment (including
pip if you choose to have it installed into the virtual environment). The
include directory is for any header files that might get installed for some reason from a project. The
lib64 symlink is for consistency on those Unix OSs where they have such directories.
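As a rough illustration of that layout, the sketch below uses the stdlib venv.create() function to build a pip-less environment in a temporary directory and list what lands at its top level. The create_and_list name is made up for this example, and the exact entries will vary with your platform and Python version.

```python
import os
import tempfile
import venv
from pathlib import Path


def create_and_list(parent: Path) -> list[str]:
    """Create a pip-less virtual environment and return its top-level entries."""
    env_dir = parent / ".venv"
    # with_pip=False mirrors `--without-pip`. Symlinks are requested everywhere
    # except Windows, which matches what the `venv` CLI does by default.
    venv.create(env_dir, with_pip=False, symlinks=(os.name != "nt"))
    return sorted(entry.name for entry in env_dir.iterdir())


with tempfile.TemporaryDirectory() as tmp:
    entries = create_and_list(Path(tmp))
    print(entries)
```

On a 64-bit Linux machine you would expect bin, include, lib, lib64, and pyvenv.cfg to be among the names printed.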
The configuration file is
pyvenv.cfg and it lives at the top of your virtual environment directory (e.g.
.venv/pyvenv.cfg). As of Python 3.11, it contains a few entries:
- home (the directory where the executable used to create the virtual environment lives)
- include-system-site-packages (should the global site-packages be included, effectively turning off isolation?)
- version (the Python version down to the micro version, but not with the release level, e.g. 3.12.0 with no alpha, beta, or release-candidate suffix)
- executable (the executable used to create the virtual environment)
- command (the CLI command that could have recreated the virtual environment)
On my machine, the
pyvenv.cfg contents are:
One interesting thing to note is
pyvenv.cfg is not a valid INI file according to the
configparser module due to lacking any sections. To read fields in the file you are expected to use
line.partition("=") and to strip the resulting key and value.
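Here is a minimal sketch of that parsing approach. The parse_pyvenv_cfg name and the sample contents are made up for illustration (the paths and version are not from any real machine), and this is not the code the site module itself uses.

```python
def parse_pyvenv_cfg(text: str) -> dict[str, str]:
    """Parse pyvenv.cfg-style text into a dict of stripped keys and values."""
    config = {}
    for line in text.splitlines():
        key, sep, value = line.partition("=")
        if not sep:
            continue  # No "=" on this line, so there is nothing to record.
        config[key.strip()] = value.strip()
    return config


# Illustrative contents only; the path and version here are invented.
SAMPLE = """\
home = /usr/local/bin
include-system-site-packages = false
version = 3.12.0
"""

config = parse_pyvenv_cfg(SAMPLE)
print(config["home"])
```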
And that's all there is to a virtual environment! When you don't install
pip they are extremely fast to create: 3 directories, a symlink, and a single file. And they are simple enough you can probably create one manually.
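To make "create one manually" concrete, here is a sketch that builds the layout by hand on a POSIX system. The create_minimal_venv name is hypothetical, the code only writes three of the pyvenv.cfg fields, and it is an approximation of what venv does rather than a copy of it.

```python
import os
import sys
import tempfile
from pathlib import Path


def create_minimal_venv(env_dir: Path) -> None:
    """Hand-build a pip-less virtual environment layout (POSIX only)."""
    # Resolve symlinks so we point at the real interpreter, even when this
    # code is itself being run from inside another virtual environment.
    base_exe = Path(os.path.realpath(sys.executable))
    version = sys.version_info

    # The three directories.
    bin_dir = env_dir / "bin"
    bin_dir.mkdir(parents=True)
    (env_dir / "include").mkdir()
    site_packages = (
        env_dir / "lib" / f"python{version.major}.{version.minor}" / "site-packages"
    )
    site_packages.mkdir(parents=True)

    # The lib64 symlink, only for 64-bit builds that are not on macOS.
    if sys.maxsize > 2**32 and sys.platform != "darwin":
        (env_dir / "lib64").symlink_to("lib", target_is_directory=True)

    # The interpreter symlink.
    (bin_dir / "python").symlink_to(base_exe)

    # The configuration file.
    (env_dir / "pyvenv.cfg").write_text(
        f"home = {base_exe.parent}\n"
        "include-system-site-packages = false\n"
        f"version = {version.major}.{version.minor}.{version.micro}\n",
        encoding="utf-8",
    )


with tempfile.TemporaryDirectory() as tmp:
    env_dir = Path(tmp) / ".venv"
    create_minimal_venv(env_dir)
    created = sorted(entry.name for entry in env_dir.iterdir())
    python_is_symlink = (env_dir / "bin" / "python").is_symlink()
    print(created)
```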
One point I would like to make is how virtual environments are designed to be disposable and not relocatable. Because of their simplicity, virtual environments are viewed as something you can throw away and recreate quickly (if it takes your OS a long time to create 3 directories, a symlink, and a file consisting of 292 bytes like on my machine, you have bigger problems to worry about than virtual environment relocation 😉). Unfortunately, people tend to conflate environment creation with package installation, when they are in fact two separate things. What projects you choose to install with which installer is actually separate from environment creation and probably influences your "getting started" time the most.
How Python uses a virtual environment
During start-up, Python automatically calls the
site.main() function (unless you specify the
-S flag). That function calls
site.venv() which handles setting up your Python executable to use the virtual environment appropriately. Specifically, the site module:
- Looks for pyvenv.cfg in either the same or parent directory as the running executable (which is not resolved, so the location of the symlink is used)
- Looks for include-system-site-packages in pyvenv.cfg to decide whether the system site-packages ends up on sys.path
- Sets sys._home if home is found in pyvenv.cfg (sys._home is used by sysconfig)
That's it! It's a surprisingly simple mechanism for what it accomplishes.
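The first of those steps can be sketched as follows. The find_pyvenv_cfg helper is hypothetical and only imitates the lookup described above. The last line uses the documented sys.prefix and sys.base_prefix attributes, which differ when Python is running from a virtual environment.

```python
import os
import sys
import tempfile
from pathlib import Path
from typing import Optional


def find_pyvenv_cfg(executable: str) -> Optional[Path]:
    """Return the pyvenv.cfg beside, or one level above, the executable.

    The path is made absolute, but symlinks are deliberately left
    unresolved so the location of the symlink itself is what counts.
    """
    exe_dir = Path(os.path.abspath(executable)).parent
    for candidate_dir in (exe_dir, exe_dir.parent):
        candidate = candidate_dir / "pyvenv.cfg"
        if candidate.is_file():
            return candidate
    return None


with tempfile.TemporaryDirectory() as tmp:
    env_dir = Path(tmp) / ".venv"
    (env_dir / "bin").mkdir(parents=True)
    (env_dir / "pyvenv.cfg").write_text("home = /usr/bin\n", encoding="utf-8")
    # The executable does not need to exist for the lookup to work.
    found = find_pyvenv_cfg(str(env_dir / "bin" / "python"))
    found_in_env = found == env_dir / "pyvenv.cfg"
    missing = find_pyvenv_cfg(str(Path(tmp) / "elsewhere" / "python"))

# True when the running interpreter was started from a virtual environment.
in_venv = sys.prefix != sys.base_prefix
print(found_in_env, missing, in_venv)
```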
One thing to notice here about how all of this works is virtual environment activation is optional. Because the
site module works off of the symlink to the executable in the virtual environment to resolve everything, activation is just a convenience. Honestly, all the activation scripts do is:
- Put the bin/ (or Scripts/) directory at the front of your PATH environment variable
- Set VIRTUAL_ENV to the directory containing your virtual environment
- Tweak your shell prompt to let you know your PATH has been changed
- Register a deactivate shell function which undoes the other steps
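Since the environment variable changes are the only part of activation that matters to other programs, they can be approximated without a shell. The sketch below covers just the PATH and VIRTUAL_ENV steps. The activated_environ name is made up, and the prompt and deactivate steps are left out because they only make sense inside an interactive shell.

```python
import os
from pathlib import Path
from typing import Mapping


def activated_environ(venv_dir: Path, environ: Mapping[str, str]) -> dict[str, str]:
    """Return a copy of environ with the variables that activation changes."""
    bin_dir = venv_dir / ("Scripts" if os.name == "nt" else "bin")
    new_env = dict(environ)
    old_path = environ.get("PATH", "")
    # Put the environment's executable directory at the front of PATH.
    new_env["PATH"] = os.pathsep.join(part for part in (str(bin_dir), old_path) if part)
    new_env["VIRTUAL_ENV"] = str(venv_dir)
    return new_env


env = activated_environ(Path("/proj/.venv"), {"PATH": "/usr/bin"})
print(env["PATH"])
print(env["VIRTUAL_ENV"])
```

A dict like this can be passed as the env argument to subprocess.run() so a child process sees the virtual environment first on PATH.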
In the end, whether you type
python after activation or
.venv/bin/python makes no difference to Python. Some tooling like the Python extension for VS Code or the Python Launcher for Unix may check for
VIRTUAL_ENV to pick up on your intent to use a virtual environment, but it doesn't influence Python itself.
In the Python extension for VS Code, we have an issue where Python beginners end up on Debian or a Debian-based distro like Ubuntu and want to create a virtual environment. Due to Debian removing
venv from the default Python install and beginners not realizing there was more to install than
python3, they often end up failing at creating a virtual environment (at least initially as you can install
python3-venv separately; in the next version of Debian there will be a
python3-full package you can install which will include
pip, but it will probably take a while for all the instructions online to be updated to suggest that over
python3). We believe the lack of
venv is a problem as beginners should be using environments, but asking them to install yet more software can be a barrier to getting started (I'm also ignoring the fact pip isn't installed by default on Debian either which also complicates the getting started experience for beginners).
venv is not shipped as a separate part of Python's stdlib, so we can't simply install it from PyPI somehow or easily ship it as part of the Python extension to work around this. Since
venv is in the stdlib, it's developed along with the version of Python it ships with, so there's no single copy which is fully compatible with all maintained versions of Python (e.g. Python 3.11 added support to use
sysconfig to get the directories to create for a virtual environment, various fields in
pyvenv.cfg have been added over time, new language features may be used, etc.). While we could ship a copy of
venv for every maintained version of Python, we potentially would have to ship for every micro release to guarantee we always had a working copy, and that's a lot of upstream tracking to do. And even if we only shipped copies from each minor release of Python, we would still have to track every micro release in case a bug in
venv was fixed.
Hence I have created microvenv. It is a project which provides a single
.py file which you use to create a minimal virtual environment. You can either execute it as a script or call its
create() function that is analogous to
venv.create(). It's also compatible with all maintained versions of Python. As I (hopefully) showed above, creating a virtual environment is actually straightforward, so I was able to replicate the necessary bits in fewer than 100 lines of Python code (specifically 87 lines in the 2023.1.1 release). That actually makes it small enough to pass in via
python -c, which means it could be embedded in a binary as a string constant and passed as an argument when executing a Python executable as a subprocess if you wanted to (directly executing
microvenv.py works). Hopefully that means a tool could guarantee it can always construct a virtual environment somehow.
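To show the mechanics of passing source code through -c, here is a small sketch. EMBEDDED_SOURCE is a made-up stand-in for a single-file script stored as a string constant, not the actual contents of microvenv. Arguments placed after the source string show up in sys.argv[1:] of the child process.

```python
import subprocess
import sys

# Hypothetical stand-in for a single-file script embedded as a string constant.
EMBEDDED_SOURCE = """\
import sys
print("creating", sys.argv[1])
"""

# Anything after the source string is passed to the script as its arguments.
result = subprocess.run(
    [sys.executable, "-c", EMBEDDED_SOURCE, ".venv"],
    capture_output=True,
    text=True,
    check=True,
)
output = result.stdout.strip()
print(output)
```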
To keep microvenv simple, small, and maintainable, it does not contain any activation scripts. I personally don't want to be a shell script expert for multiple shells, nor do I want to track the upstream activation scripts (and they do change in case you were thinking "it shouldn't be that hard to track"). Also, in VS Code we are actually working towards implicitly activating virtual environments by updating your environment variables directly instead of executing any activation shell scripts, so the shell scripts aren't needed for our use case (we are actively moving away from using any activation scripts where we can as we have run into race condition problems with them when sending the command to the shell; thank goodness for
conda run, but we also know people still want an activated terminal).
I'm also skipping Windows support because we have found the lack of
venv to be a unique problem for Linux in general, and Debian-based distros specifically.
I honestly don't expect anyone except tool providers to use
microvenv, but since it could be useful to others beyond VS Code, I decided it was worth releasing on its own. I also expect anyone using the project to only use it as a fallback when
venv is not available (which you can deduce by running
py -c "from importlib.util import find_spec; print(find_spec('venv') is not None)"). And before anyone asks why we don't just use
virtualenv, its wheel is 8.7MB compared to
microvenv at 3.9KB; 0.05% the size, or 2175x smaller. Granted, a good chunk of what makes up
virtualenv's wheel is probably from shipping pip and
setuptools in the wheel for fast installation of those projects after virtual environment creation, but we also acknowledge our need for a small, portable, single-file virtual environment creator is rather niche and something
virtualenv currently doesn't support (for good reason).
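The availability check mentioned above can also be written as a small function for a tool to call before deciding which creator to use. The venv_available name and the creator variable are illustrative only, and this checks nothing beyond whether the venv module can be found.

```python
from importlib.util import find_spec


def venv_available() -> bool:
    """Report whether the stdlib venv module can be found by this interpreter."""
    return find_spec("venv") is not None


# Prefer the stdlib module and only fall back when it is missing.
creator = "venv" if venv_available() else "microvenv"
print(creator)
```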
Our plan for the Python extension for VS Code is to use
microvenv as a fallback mechanism for our Python: Create Environment command (FYI we also plan to bootstrap
pip via its
pip.pyz file from bootstrap.pypa.io by downloading it on-demand, which is luckily less than 2MB). That way we can start suggesting to users in various UX flows to create and use an environment when one isn't already being used (as appropriate, of course). We want beginners to learn about environments if they don't already know about them and also remind experienced users when they may have accidentally forgotten to create an environment for their workspace. That way people get the benefit of (virtual) environments with as little friction as possible.