Unravelling data structure displays

The title of this next post in my series on Python's syntactic sugar may seem odd: what's a "display" when it comes to data structures? It turns out that's the technical term for what most folks would consider the literal form of lists, sets, and dictionaries. Comprehensions are part of a data structure's display, but that was covered in a previous post. For this post I will be covering what you put between the square brackets and curly braces to make lists, sets, and dictionaries directly.

List displays

The language reference for list displays says what you think: stuff between [ and ], separated by commas, becomes a list. But how can we construct a list without using list syntax when each member of that list has been written out? Tuples and list() to the rescue!

The docs on list state that it can take an iterable and return a new list populated with the items from that iterable; with no argument it returns an empty list. That suggests we can translate [1, 2, 3] to list((1, 2, 3)) and [] to list().
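A quick check of that translation, as a minimal sketch that relies only on the built-in list and tuple types (the variable names are just for illustration):

```python
# A list display and its translation produce equal lists.
display = [1, 2, 3]
translated = list((1, 2, 3))

assert translated == display
assert translated is not display  # list() builds a new list each time

# With no argument, list() matches the empty display.
empty = list()
assert empty == []
```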

We can re-implement list.__init__() to see what this looks like as Python code:

import builtins
_NOTHING = object()  # sentinel: no argument was passed

class list(builtins.list):

    """An implementation of list()."""

    def __init__(self, iterable=_NOTHING, /) -> None:
        if iterable is not _NOTHING:
            for item in iterable:
                self.append(item)

An implementation of list.__init__()

Leaning on tuples as a way to write out the literal and then translating the tuple to the appropriate type via its constructor is a trick we are going to use throughout this post.

Set displays

The language reference entry on set displays is very similar to the one for lists. That also means that the translation is simple: {1, 2, 3} becomes set((1, 2, 3)) (set() is already the way to make an empty set so that's taken care of for us).
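The same kind of check works for sets. This sketch also shows why there is no display to translate for the empty set: {} is already taken by dictionaries.

```python
# A set display and its translation produce equal sets.
display = {1, 2, 3}
translated = set((1, 2, 3))
assert translated == display

# Duplicates collapse the same way in both spellings.
deduplicated = set((1, 1, 2))
assert deduplicated == {1, 1, 2} == {1, 2}

# {} is an empty dict, so set() is the only spelling of the empty set.
assert type({}) is dict
assert type(set()) is set
```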

Implementing set.__init__() is also very similar to list.__init__():

class set(builtins.set):

    """An implementation of set()."""

    def __init__(self, iterable=_NOTHING, /) -> None:
        if iterable is not _NOTHING:
            for item in iterable:
                self.add(item)

Implementation of set.__init__()

Dict displays

Dict displays from the language reference don't hold any surprises. Everything is evaluated left-to-right, which matters because dictionaries have preserved insertion order since Python 3.7.

The next question is what dict.__init__() does. There are three ways to populate a dictionary via dict.__init__(), two of which are mutually exclusive. If a positional argument is provided, it is checked for a keys() method. If it has one, the object is treated as a mapping: the iterable from keys() is used to look up each value in the mapping and add it to the dictionary. Without a keys() method, the positional argument is treated as an iterable of key/value pairs. Regardless of the positional argument, any keyword arguments are then used to update the dictionary.

class dict(builtins.dict):

    """An implementation of dict()."""

    def __init__(self, iterable_or_mapping=_NOTHING, /, **kwargs) -> None:
        if iterable_or_mapping is not _NOTHING:
            if hasattr(iterable_or_mapping, "keys"):
                mapping = iterable_or_mapping
                for key in mapping.keys():
                    self[key] = mapping[key]
            else:
                iterable = iterable_or_mapping
                for key, val in iterable:
                    self[key] = val

        for key, val in kwargs.items():
            self[key] = val

Implementation of dict.__init__()
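To see the three ways of populating a dictionary side by side, here is a small sketch against the built-in dict. TinyMapping is a hypothetical class invented for this example; it offers only keys() and __getitem__(), which is all the mapping branch needs.

```python
class TinyMapping:
    """Hypothetical minimal mapping: just keys() and __getitem__()."""

    def keys(self):
        return ["a", "b"]

    def __getitem__(self, key):
        return {"a": 1, "b": 2}[key]


# Has keys(): treated as a mapping.
from_mapping = dict(TinyMapping())

# No keys(): treated as an iterable of key/value pairs.
from_pairs = dict((("a", 1), ("b", 2)))

# Keyword arguments only.
from_kwargs = dict(a=1, b=2)

# Keyword arguments are applied last, so they overwrite earlier values.
combined = dict((("a", 1),), b=2, a=3)

assert from_mapping == from_pairs == from_kwargs == {"a": 1, "b": 2}
assert combined == {"a": 3, "b": 2}
```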

With that, we can translate dictionaries into key/value tuples: {'a': 1, 'b': 2} becomes dict((('a', 1), ('b', 2))). We can't use the keyword arguments since there's no guarantee the keys to the dictionary will be strings.
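A short sketch of that translation, including a case with non-string keys where the keyword-argument route fails (the example keys and values are made up for illustration):

```python
display = {"a": 1, "b": 2}
translated = dict((("a", 1), ("b", 2)))
assert translated == display
assert list(translated) == list(display)  # same insertion order

# Keys don't have to be strings, and pairs handle that without trouble.
mixed = dict(((1, "int key"), ((2, 3), "tuple key")))
assert mixed == {1: "int key", (2, 3): "tuple key"}

# Keyword arguments, on the other hand, must be strings.
try:
    dict(**{1: "int key"})
except TypeError:
    keyword_route_failed = True
else:
    keyword_route_failed = False
assert keyword_route_failed
```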

Tuple displays

One thing that is common about all of the unravellings I did above is they rely on tuples as the bottom data structure that everything devolves into before getting passed into the requisite type. But is there a way to unravel even tuples?

Tuples are written using parentheses and at least one comma, the empty tuple () aside (and since the parentheses are used for more than one thing I suspect that's why the language reference doesn't have a concept of "tuple displays"). We just gave up using the syntax for other data structures, so we can't fall back on lists to unravel tuples (e.g. (1, 2, 3) can't be written as tuple([1, 2, 3]) since that list becomes a tuple itself using the technique above). But what if we used a different bit of Python to get ourselves a tuple?

Lambdas to the rescue! Did you know that *args is defined by the language reference for functions to be a tuple? Thanks to that and lambdas being an expression, we can exploit that to get us a tuple without using parentheses to explicitly denote a tuple! So (1, 2, 3) becomes (lambda *args: args)(1, 2, 3)!
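Here is a minimal sketch of the trick. The name make_tuple is only for readability in this example; the post inlines the lambda instead.

```python
make_tuple = lambda *args: args  # hypothetical name for the inline lambda

three = make_tuple(1, 2, 3)
assert three == (1, 2, 3)
assert type(three) is tuple

# The empty and one-item cases need no special syntax either.
assert make_tuple() == ()
assert make_tuple(1) == (1,)
```

A side benefit is that the one-item case no longer needs the easy-to-forget trailing comma.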

This then plays into our other data structures covered in this post as well:

  • [1, 2, 3] becomes list((lambda *args: args)(1, 2, 3))
  • {1, 2, 3} becomes set((lambda *args: args)(1, 2, 3))
  • {'a': 1, 'b': 2} becomes dict((lambda *pairs: pairs)((lambda *first_pair: first_pair)('a', 1), (lambda *second_pair: second_pair)('b', 2)))
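The translations above can be checked directly. In this sketch the parameter names of the lambdas are arbitrary, so the dict example reuses *pair for both pairs:

```python
as_list = list((lambda *args: args)(1, 2, 3))
as_set = set((lambda *args: args)(1, 2, 3))
as_dict = dict(
    (lambda *pairs: pairs)(
        (lambda *pair: pair)("a", 1),
        (lambda *pair: pair)("b", 2),
    )
)

assert as_list == [1, 2, 3]
assert as_set == {1, 2, 3}
assert as_dict == {"a": 1, "b": 2}
```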

Iterable and dictionary unpacking

One last detail when it comes to displays is iterable and dict unpacking. This is when you unpack something in-place as part of a display:

  • [1, 2, *[3, 4]]
  • {1, 2, *{3, 4}}
  • (1, 2, *(3, 4))
  • {'a': 1, 'b': 2, **{'c': 3, 'd': 4}}

Luckily all 4 possibilities lead to a new instance of the appropriate data structure, so there's no in-place updating that needs to be supported. That would have made things tricky as the intermediate data structure would have needed to be stored somehow. The one issue we do have is that what is being unpacked can be an arbitrary expression, so you can't simply make a longer display, e.g. making [1, 2, *[3, 4]] into list((1, 2, 3, 4)) isn't always possible.
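Both points can be seen in a short sketch. The numbers() generator is a hypothetical stand-in for any expression whose items are only known at run time:

```python
def numbers():
    """Hypothetical helper whose items are only known at run time."""
    yield 3
    yield 4


source = [3, 4]
from_list = [1, 2, *source]
from_call = [1, 2, *numbers()]

# Either way the result is the same list of four items...
assert from_list == from_call == [1, 2, 3, 4]

# ...and it is a new object; the unpacked list is left untouched.
assert from_list is not source
assert source == [3, 4]
```

Since numbers() could just as well read from a file or a socket, nothing about its items can be written into a longer display ahead of time.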

But two details do make handling unpacking feasible. One is that every constructor takes an iterable as we have shown. And two, all of the data structures have some operator support for combining two instances together (as of Python 3.9 in the case of dictionaries). That turns out to be enough to unravel unpacking!

  • [1, 2, *[3, 4]] becomes list((1, 2)) + list((3, 4))
  • {1, 2, *{3, 4}} becomes set((1, 2)) | set((3, 4))
  • (1, 2, *(3, 4)) becomes (lambda *args: args)(1, 2) + (lambda *args: args)(3, 4)
  • {'a': 1, 'b': 2, **{'c': 3, 'd': 4}} becomes dict((('a', 1), ('b', 2))) | dict((('c', 3), ('d', 4)))
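As a sketch of why this generalizes, the checks below unpack a range rather than a literal, so the constructor call on the right-hand side does real work. Using tuple(tail) for the tuple case and splitting a display around an unpacking in the middle are my own extrapolations from the list above, not translations given in the post. The | operator on dictionaries assumes Python 3.9 or later.

```python
tail = range(3, 5)  # an arbitrary iterable, not a display

list_result = list((1, 2)) + list(tail)
set_result = set((1, 2)) | set(tail)
tuple_result = (lambda *args: args)(1, 2) + tuple(tail)

assert list_result == [1, 2, *tail]
assert set_result == {1, 2, *tail}
assert tuple_result == (1, 2, *tail)

# Unpacking in the middle splits the display into three pieces.
middle_result = list((1,)) + list(tail) + list((5,))
assert middle_result == [1, *tail, 5]

# With dictionaries, later keys win in both the display and with |.
extra = {"a": 2, "c": 3}
dict_result = dict((("a", 1),)) | dict(extra)
assert dict_result == {"a": 1, **extra}
```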