Computer Things

Subscribe
Archives
  Back to the email
YawarRaza7349
Sep. 26, 2025, afternoon

Notably this is the first place we see “tagged” algebraic types, where both sums and products have string tags.

This is a misunderstanding of what the word "tagged" means that permeates throughout the essay. An untagged union is one where values do not "remember" the variant used to construct them. The most well-known example is C unions:

union { int i; char* s; } v{ .s = "0" };

I cannot query v to see if it was created with .i or .s. I instead would need to keep track of that separately. If you bundle that "separately" with the data itself, you get a tagged union:

struct { enum { INT, STR } tag; union { int i; char* s; } data; } v{ .tag = STR, .data = { .s = "0" } };

Now I can query .tag to see which variant it is, then access the correct member of .data based on that. Higher-level languages do this same bundling for you "under the hood" for their tagged unions.

(This is similar to how C arrays do not store the length alongside the array, but Java arrays do.)

You used "tag" to refer to the names given to each variant: i and s. But tag actually refers to the field tag. Note that it also doesn't refer to the enum labels INT and STR themselves either; tag could have type int with values 0 and 1 instead. "Tag" really refers to the extra storage needed to store the tag value in run-time memory, compared to the lack thereof in the untagged union.

The word you're looking for seems to be "labelled", used at least in the first three documents here: https://smlfamily.github.io/history/. These refer to the syntactic names that make things easier for programmers reading the code. A type like Str + Str lacks labels but still has an integer tag to distinguish Left ":(" from Right ":(" at run-time ((0, ":(") vs (1, ":(")).

Pascal did not have sum types because Wirth thought they were less flexible than untagged unions.

Pascal, in your linked paper, has both tagged and untagged unions: "discriminated" and "free" unions respectively. While Wirth thought having only tagged unions and not untagged unions was inflexible, he recommends using tagged unions when possible in the following sentence (in parentheses).


ML had the abstype keyword for naming constructed types. You could define EmailOrAddress to be Str + Str but couldn’t call the left Str “email”.

There was this though (from Milner 1978, bottom-right of page 123/PDF page 5):

absrectype α list = . + (α × α list) with nil = abslist(inl()) and $.(x,ℓ) = abslist(inr(x,ℓ)) and null(ℓ) = isl(replist(ℓ)) and hd(ℓ) = fst(outr(replist(ℓ))) and tl(ℓ) = snd(outr(replist(ℓ)))

So you could do:

absrectype contact = str + str with inemail(e) = abscontact(inl(e)) and inaddress(a) = abscontact(inr(a)) and isemail(c) = isl(repcontact(c)) and isaddress(c) = isr(repcontact(c)) and outemail(c) = outl(repcontact(c)) and outaddress(c) = outr(repcontact(c))

The built-in sum type only supported using functions like isl and outr rather than pattern matching, so custom sum types aren't any worse than the built-in one in this case.


He does not name them “sum and product types” but I think it’s believable that they are named after the corresponding disjoint union and Cartesian product set operations. I’m not going to dig into the history of those names but I’d guess they were named in analogy to arithmetic sum and product.

While this post appropriately focused on research about "real" programming languages, it'd be interesting for someone to also look into how this was discussed in fields "in between" set theory and real PLs, possibly either type theory or the theory of type systems. When I saw your BlueSky tweet, I decided to check out Martin-Löf's Pi and Sigma types, since those are obviously named after mathematical products and sums, and found out the paper came out in 1972. But interestingly, while he does use the + symbol for a binary sum, he doesn't use the word sum, but instead "disjoint union". So did theorists start calling them "sum types" before the papers you mentioned about real PLs did? (Or maybe PL theory didn't even exist in the same way it does today? IDK.)

Reply Report
Powered by Buttondown, the easiest way to start and grow your newsletter.