Use TOML for `.env` files?

We have some support for .env files in the Python extension for VS Code, but we have noticed some shortcomings with .env files based on user feedback. After getting some inspiration from CircuitPython and thinking about it a bit, I think I might have a nice solution to the problems we have seen with .env files.

What are .env files?

From what I can tell, .env files – sometimes called environment variable definitions files – come from the twelve-factor app design, and specifically the config portion. The idea is to keep any varying configuration details for your app outside of your code so you can easily control them at launch time. By using environment variables, you can lean on what your operating system already provides to set those configuration settings when you launch your application.

Now launching your app with a long command where you specify every environment variable upfront isn't always fun. And sometimes you want a way to specify default values for your configuration which you can override with environment variables. This is where .env files come in; by providing a file which acts as a place to write down environment variable defaults for your application, you can make environment variables just change the default settings.
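As a minimal sketch of that "defaults plus overrides" behaviour: the defaults below stand in for what a .env file would provide, and any environment variable with the same name wins. The function name `resolve_config` and the `APP_PORT`/`APP_DEBUG` settings are made up for illustration.

```python
import os


def resolve_config(defaults, environ=None):
    """Return the defaults, overridden by any environment variable of the same name."""
    if environ is None:
        environ = os.environ
    # A value found in the environment replaces the default from the file.
    return {key: environ.get(key, value) for key, value in defaults.items()}


# Hypothetical defaults, as they might be read from a .env file.
defaults = {"APP_PORT": "8000", "APP_DEBUG": "0"}

# Simulate launching the app with APP_DEBUG=1 set in the environment.
config = resolve_config(defaults, environ={"APP_DEBUG": "1"})
print(config)
```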

The problems

The idea of .env files seems nice, and in simple cases it is. But as we all know, software developers have a hard time keeping things simple when they get tempted by the flexibility of something. 😉 And so, there are some inherent flaws in how .env files are used today.

There is no standard

If you happen to be familiar with .env files, can you tell me what the official file format is?

That's actually a trick question because there is no standard, and thus nothing "official" about .env files. That makes these files rather difficult to read if you want to support them the way some other tool does (e.g. the Python extension for VS Code has to read a .env file written for whatever library your code uses, and that library may parse the file differently than the extension does). For instance, python-dotenv, which is probably the most popular library for reading .env files in Python code, says:

The format is not formally specified and still improves over time. That being said, .env files should mostly look like Bash files.

In that sentence, "improves" can mean "changes without any guarantee that it won't break other tooling that reads your .env file but isn't this library". Now I'm not blaming python-dotenv for doing this; it's a side effect of having no standard file format. I mean, what "mostly look[s] like Bash files" means is going to vary from one library to the next.

I think the reason this situation isn't worse than it is (at least in the Python community) is that tools like pydantic and Pipenv use python-dotenv, so it's all the same library doing the parsing. But once again that's not necessarily going to work when someone else needs to read those files and can't use that specific Python library because their tool is implemented in something else, e.g. JavaScript (👋).

Not cross-platform

What is the ASCII character used to separate search path components in an environment variable?

If you're a Unix user you probably said :, and if you use Windows you probably said ; (in Python this is stored in os.pathsep). Unfortunately, that means if you wanted to set something like PYTHONPATH in your .env file, you can only write it one way for one operating system because the search path separators are different.

So that means you have to write separate .env files for each OS. While this is doable as long as you set up your own code to read the right .env file for the OS you're running on, it does suck since it means you have to repeat everything unless the library you're using happens to also support some mechanism for saying, "use this .env file no matter what, but use this one for Windows and this other one for Unix". It's also annoying when the only difference for an environment variable happens to be the search path separator (e.g. Python can handle / path component separators on Windows, but Python can't make your OS be platform-agnostic when it comes to search path separators).
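To make the separator problem concrete, here is a small sketch that joins the same two directories with each platform's separator. The helper name `join_search_path` and the directory names are illustrative only.

```python
import os


def join_search_path(entries, pathsep=os.pathsep):
    """Join directory entries using a search path separator (defaults to the OS's)."""
    return pathsep.join(entries)


entries = ["src", "vendor/lib"]

# The same logical value needs a different spelling per operating system.
unix_value = join_search_path(entries, ":")
windows_value = join_search_path(entries, ";")

print(unix_value)
print(windows_value)
```

A .env file can only hold one of those two strings, which is why it ends up tied to a single platform.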

The solution?

I was reading the Adafruit Daily, Python for Microcontrollers edition and noticed it mentioned the upcoming version of CircuitPython supported defining environment variables via a settings.toml file. In that moment I realized CircuitPython had a very nice solution to the lack of a standard file format for .env files by simply choosing TOML! That way you have a standard to back how you should parse the file, making the files way more portable. Plus the format already looks like a .env file anyway thanks to the ability to specify keys and values at the top of a file without them being inside a table!

Then I started to think about the other issues we have with .env files for the Python extension for VS Code and realized that TOML could help with those issues as well. For instance, let's say you want to set PYTHONPATH in a way that is platform-agnostic; how could you do that? Well, you could treat every array in TOML as a value which should be joined using the operating system's native search path separator! But that's only if you need to construct the value yourself (e.g. the Python extension for VS Code); otherwise, if you load your .env file in your own code, it would already be split appropriately since it was stored as an array; no pythonpath.split(os.pathsep) necessary!

What about platform-specific values? You could use TOML's table support and have a [platform] table which has sub-tables for each OS, e.g. [platform.win32] or something (I'm just thinking in terms of sys.platform, but we could come up with a more standardized way to name the various platforms). That would let you define the OS-agnostic environment variable values at the top of your TOML file outside of any specific table and then separate out the platform-specific ones into [platform] sub-tables!

And then you could take the table idea further and have a table for a specific [purpose] 😉. You could have a [purpose.test] or [purpose.production], all without having to use separate files where you may accidentally leave out a common setting that every .env file needs to define for your application. (I'm also a fan of fewer configuration files, not more.)

And then if you continue down this route of improvement, you can also standardize on the syntax for environment variable substitution. As a Python developer I'm partial to {}, but ${} would also work if you wanted to be consistent. But this would be an opportunity to make sure that everyone reads these files consistently and handles things such as variable substitution, regardless of the library they are using to do the processing.
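To illustrate what a standardized substitution rule might look like, here is a sketch using the ${} spelling. The choice to leave unknown names untouched is an assumption made for this example, as are the `substitute` helper and the sample values.

```python
import re

# Matches ${NAME} where NAME looks like a typical environment variable name.
_PLACEHOLDER = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}")


def substitute(settings, environ):
    """Expand ${NAME} placeholders in string values using the given mapping."""

    def expand(value):
        # Unknown names are left as written rather than replaced with "".
        return _PLACEHOLDER.sub(
            lambda match: environ.get(match.group(1), match.group(0)), value
        )

    return {
        key: expand(value) if isinstance(value, str) else value
        for key, value in settings.items()
    }


settings = {"DATA_DIR": "${HOME}/data", "CACHE_DIR": "${MISSING}/cache", "NAME": "app"}
result = substitute(settings, {"HOME": "/home/user"})
print(result)
```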

Is this worth pursuing?

Is this a good idea or a bad idea? Do people like the idea of e.g. a .env.toml file (or env.toml, or .env with an assumption of TOML syntax)? Is this worth trying to get various tools to support?