Function Sets and Specification
May TLA+ Workshop
Still three slots left! May 24-26. Learn how to find bugs that would slip right past types, test, and code review.
No Newsletter Next Week
Running the April workshop. Might still be a bonus newsletter, I dunno
Oh hey, now you can give me money
Buttondown just added a pay-what-you-want feature! I think you can activate it by going to the newsletter page and reentering your email.
I don't have any plans to separate premium and regular content, or switch from pay-what-you-want to a fixed fee, but that might change if I can keep up the pace of a bonus email per week. I've hit the point where I have a lot more writing ideas than I can afford to spend time on, and getting paid to write changes the calculus a bunch.
Function Sets
Because formal specifications aren't programs, they can do some cool things that wouldn't make sense in a programming language. One example I like from TLA+ is function sets.
Note, because this about the idea in general, I'm simplifying the TLA+ syntax. If you know TLA+, you know how to translate these into correct syntax.
In TLA+, functions are values. For each type of TLA+ value, there's a way to generate a set of those values. For functions, it's with ->
: [A -> B]
is the set of all functions with domain A and a range in B. For example, [(Int, Int) -> Int]
is the set of all functions that take two integers and return an integer. Addition, multiplication, and subtraction are all in this set. It would also include things like constant functions, min and max, and noise functions.
Because it's a set, I can filter and map over it. Here's a filter:
{f \in [(Int, Int) -> Int]:
forall x, y \in Int: f[x, y] = f[y, x]
}
That's the set of all the commutative 2-arity functions over integers. That would include addition and multiplication but not subtraction. I can also pick out functions with specific properties from the set:
CHOOSE f \in [(Int, Int) -> Int]:
f[2, 3] = 0
We can't actually model check this, as there are an infinite number of functions in the set.1 Even if we restrict it to finite sets, the size of a function set blows up quickly. [1..10 -> {a, b, c, d}]
has 4¹⁰ members. When specifying, we normally restrict ourselves to small function sets for the sake of tractability.
So how do we use this, anyway? Lots of ways! Almost all complex data is constructed out of functions, so function sets give us ways to represent sets of complex data. Here are some examples:
Multiple Possibilities
Imagine we have a set of servers and a set of regions, where each server is associated with a region. I want to run the spec over all possible assignments of servers to regions. I can represent the config as a function from Servers to Regions, meaning that all possible combinations is a function set:
variable
config \in [Servers -> Regions];
The set of starting configurations might not be the space of all configurations, though. Maybe I can move servers between regions, but want to start with all regions having at least one server. This is another place [Servers -> Regions]
being a function set comes in handy. As before, I can filter it:
variable
config \in
{sr \in [Servers -> Region]:
forall r \in Region:
exists s \in Servers: sr[s] = r}
If we also wanted to store the state of the server, say "online" or "offline", we could do that too:
online \in [Servers -> BOOLEAN];
This is also good for pulling arbitrary data structures. Like you can define the set of all directed graphs as [(Node, Node) -> BOOLEAN]
, or the set of all queues with four messages in them as [1..4 -> Message]
. We can even use function sets to define other function sets. Like if each server also started with a queue of four messages, we could write that as
queue \in [Servers -> [1..4 -> Message]];
Type Invariants
Lots of "types" are equivalent to set membership: x
has type Integer if it has a value in the set of all Integers. Then we can make function sets into "types", too: f
has type [A -> B]
if it's a member of that function set. TLC is smart about checking membership of infinite function sets. While we can't enumerate [Thread -> Int]
, we can use it in a membership check:
INVARIANT thread_priority \in [Thread -> Int]
Of course, there's nothing wrong with defining a function set over a narrower range to make stronger invariants. If the thread priority can't be greater than the number of threads:
INVARIANT thread_priority \in [Thread -> 0..#Thread]
Where #Thread
is the number of elements in the set. We can further restrict the type by filtering the function set:
INVARIANT thread_priority \in
{tp \in [Thread -> 0..#Thread]:
#Domain(tp)) = #Range(tp))
}
If two threads are ever assigned the same priority, the model checker immediately raises the error.
Helper operators
It's easy to write an IsSorted
operator. Writing a Sort
operator is tricky, though: what does it mean to sort a list? One common way is to say it means selecting the permutation of the list which is sorted:2
def Sort(list):
CHOOSE x \in PermutationsOf(list):
IsSorted(x)
But then how do we generate the permutations? Function sets to the rescue!
def PermutationsOf(list):
map(LAMBDA p: ApplyPermutation(p, list),
PermutationVectors(#list))
def PermutationVectors(n):
{p \in [1..n -> 1..n]: Range(p) = 1..n}
[1..n -> 1..n]
is the set of all "index maps": vectors we apply to list
to give us a same-length list using only the elements of list
. By filtering the range to be 1..n
, we select only the index maps that don't map two numbers to the same index. That leaves just the permutation vectors.
50 words to explain a 40-character operator. That's how powerfully expressive function sets are. To be clear, I'm writing idiomatic TLA+ (after transforming from my "simplified syntax"). That's the normal, expected way to get the set of all permutation vectors. Everybody knows what it means. I needed to explain it not because it's "advanced", but because it's a language construct that doesn't exist outside of specification languages.
Programming vs Specification
Then why don't function sets exist in mainstream programming languages? Recall our three uses:
- Multiple configurations: This is useful in specs because we want to represent the possible initial configurations of the system to see if any of them evolve into an error. In a program, we have a single actual initial configuration we receive, and a single behavior that evolves from that. We're not fully in control of which possible configuration we receive or which behavior evolves over the course of our program, but it's still only one of each. Function sets don't help us.
- Invariants: Properties aren't part of execution of the program, they are statements about the execution. In programming languages, this is pushed to types, tests, maybe contracts if you're lucky. Function sets could be useful for typechecking, except that tractable typechecking is done constructively, not predicatively.
- Utilities: Mathematical definitions of operations don't include things like "efficiency" or "memory-usage". That's why we have algorithms. Don't sort a list by generating all of its permutations; use a damn quicksort.
Function sets are incredibly useful for specification, but not for programming. Specifications and software have different requirements, leading to different language constructs. Goes both ways, of course: most specification languages don't have I/O or string manipulation. This gets interesting when we talk about programs written as specifications: for example, test suites. One of the most fundamental logical operators is implication: A => B ≡ !A || B
. This is used everywhere in specifications, but is nigh-useless in programming. This, in turn, makes writing property tests and contracts a lot more frustrating than it needs to be.
-
More than that, there are an uncountable number of such functions. Quick proof: for every positive real number, pick any function where
f[0, n]
is the value of the nth digit andf[0, 0]
is the number of digits before the decimal point. Then[(Int, Int) -> Int]
is at least as big as the reals, which are uncountable. ↩ -
You shouldn't do this in a real spec, since it has
O(n!)
complexity. TLC comes with aSort
function that shells out to Java. I'm just showing how to formally define sorting. ↩
If you're reading this on the web, you can subscribe here. Updates are once a week. My main website is here.
My new book, Logic for Programmers, is now in early access! Get it here.