Why all([]) is true, prod([]) is 1, etc
It's all monoids! Monoids for everyone!
New Blogpost
The World and the Machine. It's about how to model systems better, and while it's written in TLA+ the advice should apply regardless of your speclang. Patreon is here.
Speaking of TLA+, I'm running a TLA+ workshop on Feb 12th! I'm real excited about this one because I wrote a new lesson on the TLA+ collection types which should make the whole process a bit smoother. Use the code NEWSLETTERDISCOUNT
for $50 off.
Why all([]) is true, prod([]) is 1, etc1
This is inspired by this article. Some Python list-ops do unusual things with empty lists:
>>> all([])
True
>>> any([])
False
>>> math.prod([])
1
>>> sum([])
0
You can find in-depth explanations for each of these by itself, like the empty product. But all of these can be explained as consequences of the same mathematical principle. Let's say we have two lists xs
and ys
, and let +
be list contatenation. We expect the following to always hold:
prod(xs + ys) = prod(xs) * prod(ys)
IE prod([2,4,6,8]) = prod([2, 4]) * prod([6, 8])
.2 You can see why we'd want this to be true, as it means that prod
behaves in a reasonable way regardless of how big our list is. But we also have an edge case:
prod(xs)
= prod(xs + [])
= prod(xs) * prod([])
So for any xs
we must have prod(xs) = prod(xs) * prod([])
, which is only possible if prod([]) = 1
. We can apply similar reasoning for other list reductions:
sum(xs) = sum(xs) + sum([]) = sum(xs) + 0
all(xs) = all(xs) and all([]) = all(xs) and True
any(xs) = any(xs) or any([]) = any(xs) or False
Simple, right?
But what about max
?
Of course there's a problem. max(xs + ys)
is obviously max(max(xs), max(ys))
. So by the same reasoning we should have
max(xs) = max(max(xs), max([]))
But max([])
can't be any number. Let's say we decide it's 0, then we have
max([-1]) = max(max([-1]), 0) = 0
We can't define the maximum of an empty list! Then what's special about prod
that makes an empty product possible when an empty max is not?
What's "special" is that prod
has an identity: some value a
such that x * a = a * x = x
(1, for prod
). The mathematical term for "a set of elements with an associative binary operation that has an identity" is a monoid. For any monoid, we can take the operation and create a version that applies it to a list, and then applying that to the empty list should return the identity:
listop(xs)
= listop(xs) op listop([])
= listop(xs) op iden
= listop(xs)
The set of booleans form a monoid under or
, which has the identity element False
, so any([]) = False
. max
does not have an identity, so is not a monoid, so we can't meaningfully define it for an empty list.3
"A version that applies it to a list"...
...is just reduce
. prod(xs)
is reduce(*, xs, 1)
, any
is reduce(or, xs, False)
, etc. Most languages require you to provide an initial value for the reduce (ie reduce(max, xs, xs[0])
, but if you're reducing over a monoid then you shouldn't actually need it, as you can provide the identity as the default initial value.
And in fact this is (sort of) what Haskell does. In Haskell, foldl/foldr
over lists require a default value, but if you know you're working over a monoid, you can use mconcat
without one, as it can use the identity as its default.
>>> mconcat [1,2,3] :: Product Int
Product {getProduct = 6}
>>> mconcat [] :: Product Int
Product {getProduct = 1}
Some other examples of monoids:
- Functions form a monoid under function composition, with identity
f(x) = x
. - NxN matrices under matrix multiplication, with identity "the NxN identity matrix".
- Strings under string concatenation, with identity
""
. - Dictionaries under dict-merge (left or right), with identity
{}
.
For all of these you shouldn't need to define a default value when reducing in principle, though getting a non-Haskell language to automatically determine that you're reducing over a monoid is more trouble than it's worth.
I think it's interesting that most of those examples are collection types and not primitives. Probably just because there are more collections than primitives.
-
The original title was "why
any([])
is true", which was a dumb typo on my part. As the body of the essay discusses,any([])
is false. I meant "all([])
is true" and fixed the title of the post. ↩ -
The technical term for this is that product should be homomorphic. An example of a non-homomorphic function would be
mean
:mean(xs + ys) != mean(mean(xs), mean(ys))
↩ -
At least not over a normal set of numbers. There's an identity if you're doing modular arithmetic, in which case you're fine defining the empty max. Also no less than five readers pointed out that you can use -infinity as the identity for max. ♥ u guyz ↩
If you're reading this on the web, you can subscribe here. Updates are once a week. My main website is here.