Tagged unions are overrated
Among engineers who have strong opinions about programming languages, one particularly widely held take is belief in the value of tagged unions, as well as in language support for pattern-matching over them. When we were deciding to write Sorbet in C++, for instance, we often heard shock and surprise that we would even consider using a language without native support for tagged unions and pattern-matching. Conversely, OCaml’s excellent support for both is often credited as one of the reasons it is great for writing compilers.
At this point I’ve worked on a number of compilers, both toy and production, in a wide range of languages, including Go, C++, Java, Rust, and Haskell. With that experience, my considered opinion is this: Tagged unions and pattern-matching on them are vastly overrated for implementing language ASTs and IRs. They are nice to have, but they rarely make a substantial difference in the development of a compiler, typechecker, or similar.
Emulation
Tagged unions are, in fact, a very natural way to organize your IR. They map cleanly onto the structure of most IRs, which have a finite universe of disjoint node types and tend to involve a lot of code that iterates over nodes and behaves differently according to their concrete types.
However, the crux of my argument is that it’s easy enough to emulate this construction in other languages. In object-oriented languages like Java or C++, you can implement your AST as an abstract base class and a set of subclasses, and do matching using straightforward type tests: `instanceof` in Java, or `dynamic_cast` in C++. In Go, you can use the private interface method trick, or even just use `interface{}` in some cases, and use type switches for dispatch.
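To make that concrete, here’s a rough sketch of what the Go encoding might look like; the node types and fields are made up for illustration, but the shape is the important part:

```go
package ast

// Node is the "tag" side of the emulated union: an interface that only
// types in this package can implement, because of the unexported method.
type Node interface {
	isNode()
}

// A few hypothetical expression nodes.
type IntLit struct {
	Value int64
}

type Ident struct {
	Name string
}

type BinOp struct {
	Op          string
	Left, Right Node
}

// Implementing the unexported marker method "seals" the set of variants.
func (*IntLit) isNode() {}
func (*Ident) isNode()  {}
func (*BinOp) isNode()  {}
```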
I don’t claim that these approaches are as ergonomic as native pattern-matching, but in my experience they’re good enough for the distinction to be a non-issue. Writing out the `if`s can sometimes take a bit more code, but it doesn’t fundamentally alter the shape of your system, and you rarely lose all that much meaningful safety.
I do think it’s very valuable to have been exposed to the idea of tagged unions and pattern-matching when designing these systems. To pick a concrete example, the visitor pattern is hot garbage in 95% of cases, and would be better replaced with straightforward recursive descent using `instanceof`; and I believe that replacement is much more natural if you think of your code as emulating pattern-matching over tagged unions, rather than in some object-oriented mindset where you view `if` statements and type tests as inherently sinful somehow.
But you don’t actually need pattern-matching and tagged unions to get there; seeing the pattern and then emulating it works Just Fine™.
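As a sketch of what that replacement can look like, here’s a recursive-descent pretty-printer over the hypothetical `Node` type from the earlier sketch; a type switch stands in for the `match`, and the only import it needs is the standard `fmt`:

```go
// Render pretty-prints an expression by plain recursive descent, switching
// on the concrete node type instead of using a visitor. It lives in the
// same hypothetical ast package as the Node sketch above.
func Render(n Node) string {
	switch n := n.(type) {
	case *IntLit:
		return fmt.Sprintf("%d", n.Value)
	case *Ident:
		return n.Name
	case *BinOp:
		return "(" + Render(n.Left) + " " + n.Op + " " + Render(n.Right) + ")"
	default:
		// The compiler won't tell us if a variant is missing here; see the
		// discussion of exhaustivity below.
		panic(fmt.Sprintf("Render: unhandled node type %T", n))
	}
}
```

Calling `Render(&BinOp{Op: "+", Left: &IntLit{Value: 1}, Right: &Ident{Name: "x"}})` yields `(1 + x)`: no visitor classes, no double dispatch, just a function that recurses.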
Limits of pattern-matching
My next comment is on the flip side — the limits of pattern-matching in languages that do support it. In my experience, it often ends up being less powerful and convenient than you’d like. To pick some concrete examples:
In Rust, pattern matching breaks down as soon as your IR contains pointers to its children, instead of containing them inline or in a `Box` (and pattern matching on `Box` is still unstable). It’s very common in my experience for IRs to use `Rc` or `Arc` in order to support transformations that share IR subtrees between nodes, and in Rust, once you do that, you suddenly lose all ability to do nested pattern-matches, anyways! This quibble is specific to Rust, but it’s emblematic of a broader class of issues: pattern-matching isn’t (in most languages) extensible or customizable in any powerful way.
Another way this manifests is that it’s often frustrating that patterns aren’t first-class; I can’t pass around a pattern, or build up a pattern on the fly. LLVM has a home-built pattern-matching framework that uses A Lot Of C++ in order to provide some pattern-matching ergonomics in C++. One neat feature it has is that patterns are first-class objects, which can be built up from sub-patterns, passed around, and manipulated. It also supports matchers, like [m_OneUse](https://llvm.org/doxygen/namespacellvm_1_1PatternMatch.html#aca6428e7fec51ba0b55594f469eb9691), which match on properties other than type, in a first-class way.
To me, this is a compelling argument that, even if C++ had language-level pattern-matching, we would still have needed to invent our own pattern-matching system to support these complex, non-type-specific matchers, anyways. So what would we really be gaining from that language support?
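To give a flavor of what “first-class” means here, the sketch below translates the idea into Go terms, with matchers as plain predicate functions over the hypothetical `Node` type from earlier. The helpers are made up, loosely inspired by LLVM’s PatternMatch rather than copied from it:

```go
// A Matcher is a first-class pattern: just a predicate over the hypothetical
// Node type from earlier, so patterns can be stored, passed around, and
// built up at runtime. These helpers are illustrative, not LLVM's API.
type Matcher func(Node) bool

// Any matches every node.
func Any() Matcher {
	return func(Node) bool { return true }
}

// MatchIntLit matches integer literals, optionally capturing the value.
func MatchIntLit(out *int64) Matcher {
	return func(n Node) bool {
		lit, ok := n.(*IntLit)
		if ok && out != nil {
			*out = lit.Value
		}
		return ok
	}
}

// MatchBinOp matches a BinOp with the given operator whose operands match
// the given sub-patterns.
func MatchBinOp(op string, left, right Matcher) Matcher {
	return func(n Node) bool {
		b, ok := n.(*BinOp)
		return ok && b.Op == op && left(b.Left) && right(b.Right)
	}
}
```

With this shape, a pattern like `MatchBinOp("+", Any(), MatchIntLit(&c))` (where `c` is an `int64`) is an ordinary value: it can be built at runtime, handed to a generic rewrite pass, or wrapped in a matcher that also checks a non-type property, which is the kind of thing language-level pattern-matching generally can’t express.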
Tagged unions in other contexts
I want to pause again to emphasize that this opinion is scoped: I think tagged unions are overrated for implementing IRs. I’m a huge fan of tagged unions for other uses in languages, and I do wish that they were more ubiquitous. I just don’t think they are killer features for compilers and similar programs which operate over source code.
I want to call out a few places in particular where I think tagged unions really punch above their weight class, in terms of supporting ergonomics and/or safety:
Result types
I love Rust’s `Result<T, E>` type. I think it’s absolutely my favorite model of error handling. Go’s `(*SomeType, error)` two-value returns can be seen as an attempt to emulate it, and I think that being able to fold those two into a single value and pass them around is great. I often find myself, in Go, typing `type fooResult struct { foo *Foo; err error }` so that I can pass a result-and-error tuple through a channel, and I would love better support for that case, as well as for pattern-matching on errors and for not accidentally using one arm of the union in the wrong place.
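For concreteness, the hand-rolled version I keep writing looks roughly like the sketch below; `Foo` and `fetchFoo` are hypothetical stand-ins for whatever fallible, concurrent work is being done:

```go
type Foo struct{ Data string }

// fetchFoo is a hypothetical stand-in for some fallible operation.
func fetchFoo(url string) (*Foo, error) {
	// ... real work elided ...
	return &Foo{Data: url}, nil
}

// fooResult folds a value and an error into one thing that fits through a
// channel: a hand-rolled, unchecked stand-in for Result<*Foo, error>.
type fooResult struct {
	foo *Foo
	err error
}

func fetchAll(urls []string) ([]*Foo, error) {
	results := make(chan fooResult, len(urls))
	for _, url := range urls {
		go func(url string) {
			foo, err := fetchFoo(url)
			results <- fooResult{foo: foo, err: err}
		}(url)
	}

	var foos []*Foo
	for range urls {
		r := <-results
		if r.err != nil {
			return nil, r.err
		}
		// Nothing stops code from reading r.foo even when r.err != nil;
		// that's exactly the safety a real tagged union would provide.
		foos = append(foos, r.foo)
	}
	return foos, nil
}
```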
Exhaustivity
The second place where I think first-class tagged unions and pattern matching really shine is in one of the unique advantages a compiler can bring to the table, and one which is really hard to emulate: exhaustivity checking. It’s very powerful to be able to statically assert that you’ve considered all possibilities, especially in large codebases where the team that owns the definition of a data type may not be the same as the team(s) who interact with it. If you need to add a new case, it’s very powerful for the compiler to automatically tell you all of the sites where you might need to update the code to handle the new arm.
I do occasionally miss this feature in the compiler context, too; if you do have to add a new AST node, and your codebase has emulated pattern-matching using `if` statements, it can be frustrating to have to resort to `grep` and/or test suites to find all the places you might need to update the code. However, once again, I just haven’t found it to be all that onerous in practice, and compilers tend to be very amenable to testing (including sophisticated strategies like fuzzing and property-based testing), which can find mistakes before they go to production.
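One cheap mitigation, continuing the hypothetical Go AST from the earlier sketches (and using the standard `testing` package): keep a single table with one sample node per variant, and run every switch-heavy pass over it in a test, so a forgotten case fails in CI rather than in production. It’s a sketch, not a replacement for real exhaustivity checking:

```go
// allVariants holds one sample node per variant of the hypothetical Node
// type from earlier. It is the single place that needs updating when a new
// node type is added.
var allVariants = []Node{
	&IntLit{Value: 1},
	&Ident{Name: "x"},
	&BinOp{Op: "+", Left: &IntLit{Value: 1}, Right: &Ident{Name: "x"}},
}

// TestRenderHandlesAllVariants is a crude stand-in for exhaustivity checking:
// Render panics on unhandled node types, so forgetting to update it for a
// new variant fails this test instead of failing in production.
func TestRenderHandlesAllVariants(t *testing.T) {
	for _, n := range allVariants {
		_ = Render(n)
	}
}
```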