Optimizing State Spaces with Combinatorics
I'm preparing my talk for the TLA+ Conference this week. I asked around about intermediate stuff people wanted me to cover and there was a lot of interest in model optimization. The TLA+ model checker, TLC, brute-forces every possible reachable state. If there are lots of possible states then it'll take a very long time to run. The fundamental skill of optimization is being able to estimate how many states there are and how many states a safe change will eliminate.¹
There's a connection to another field of verification: property-based testing! In PBT you take input generators and run them through tests. For example, you might generate random lists of up to five integers, where each integer is between -10 and 9. If the sample space is too big, most inputs will be uninteresting. Part of getting better PBT coverage is estimating how many inputs there are and how many a safe change will eliminate. There are similar connections to simulations and constraint solving, where things are "slow" for related reasons.
The relevant branch of math is combinatorics: very roughly, the discipline of how to count finite sets. Combinatorics isn't too well known outside of math, and it's not all that fashionable in mathematics either. Which is a shame, because it's applicable to a lot of problems.
Combinatorics
Three simple combinatorial statements:
- There are 2^N possible binary strings of length N. There are 1^N unary strings and 3^N ternary strings. For finite sets A and B, there are |B|^|A| functions from A to B.
- If we have N distinct elements, we can order them in N! ways. You can see this by starting with the ways to order one element, then two, and proceeding inductively.
- If we have N distinct elements, there are N!/(m!(N-m)!) ways to select m of them. This is the "combination" function, often written as "N choose m" (or C(N, m)).
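If you don't want to take these on faith, all three are easy to spot-check by brute force. Here's a quick sketch in Python, using small concrete sizes:

```python
# Brute-force sanity check of the three counting facts above.
from itertools import product, permutations, combinations
from math import comb, factorial

# |B|^|A| functions from A to B: a function is one choice of output per input.
A, B = range(3), range(4)
functions = list(product(B, repeat=len(A)))  # tuple i maps A[i] to tuple[i]
assert len(functions) == len(B) ** len(A) == 64

# N! orderings of N distinct elements.
N = 5
assert len(list(permutations(range(N)))) == factorial(N) == 120

# C(N, m) ways to select m elements out of N.
m = 2
assert len(list(combinations(range(N), m))) == comb(N, m) == 10
print("all counts check out")
```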
The combination function pops up everywhere. You might know that (a+b)² = a² + 2ab + b². The coefficients are 1, 2, 1, or C(2, 0), C(2, 1), C(2, 2). Similarly, for (a+b)³ = 1a³b⁰ + 3a²b¹ + 3a¹b² + 1a⁰b³, the coefficients are C(3, 0), C(3, 1), C(3, 2), C(3, 3). This is also, uncoincidentally, row 3 of Pascal's triangle (starting from 0).
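A two-line numeric check, evaluating the C(n, k)-weighted expansion against (a+b)ⁿ directly at an arbitrary (made-up) point:

```python
# Spot-check that C(n, k) gives the coefficients of (a+b)^n.
from math import comb

a, b, n = 3.7, -1.2, 3
expansion = sum(comb(n, k) * a**(n - k) * b**k for k in range(n + 1))
assert abs((a + b)**n - expansion) < 1e-9
print([comb(n, k) for k in range(n + 1)])  # [1, 3, 3, 1]: row 3 of Pascal's triangle
```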
Let's use this to estimate some state spaces!
There are 4 servers, where each one may or may not know about any of the others. How many possible network configurations are there?
Each server may be connected to the 3 others, so there are 4*3 = 12 possible connections. Each connection may or may not exist, meaning a total of 2^12 = 4096 configurations. Assuming connections are symmetric (if A knows about B then B knows about A), we halve the number of possible connections to 6, leading to a somewhat more manageable 2^6 = 64 configurations.
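Since each possible connection is independently present or absent, a configuration is just a subset of the connection set, and we can count mechanically rather than trusting the arithmetic:

```python
# Brute-force the connection counts for 4 servers.
from itertools import combinations, permutations

servers = ["a", "b", "c", "d"]

directed = list(permutations(servers, 2))    # "A knows B" independent of "B knows A"
undirected = list(combinations(servers, 2))  # symmetric: A knows B iff B knows A

assert len(directed) == 4 * 3 == 12
assert len(undirected) == 6
print(2 ** len(directed), 2 ** len(undirected))  # 4096 64
```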
We have two threads, one doing 3 sequential atomic actions and the other doing 4. There is no interaction between them. How many possible states in the state space are there?
First let's assume that the actions can come in any order: that gives us (3 + 4)! orderings. Now if we pick an arbitrary ordering, there's a 1/3! chance that the first thread has all of its events in sequence and a 1/4! chance that the second thread has all of its events in order. So of all the orderings, only 1/(3!4!) are valid, meaning the total number of valid orderings is (3 + 4)!/(3!4!) = 35. Notice that for the case of two threads, this is the same as C(7, 4).
The state space will be less than 7*35 = 245 elements. Less than that because distinct orderings share intermediate states; the orderings only "reach" all 35 possibilities at the end. There's some shenanigans you can do with Pascal's triangle to get a more exact bound, but it'll be within an order of magnitude of this, and state space explosion is all about orders of magnitude.
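Both numbers are small enough to check by brute force. In the sketch below, a state is just the pair (actions completed by thread 1, actions completed by thread 2), which is all that matters when the threads don't interact:

```python
# Enumerate the interleavings of 3 "A" actions with 4 "B" actions,
# then count the distinct states the orderings pass through.
from itertools import permutations

orderings = set(permutations("AAABBBB"))  # set dedupes the repeated letters
assert len(orderings) == 35

states = set()
for order in orderings:
    a_done = b_done = 0
    states.add((a_done, b_done))
    for step in order:
        if step == "A":
            a_done += 1
        else:
            b_done += 1
        states.add((a_done, b_done))

print(len(states))  # 20 distinct states, well under the 245 upper bound
```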
Optimizations
A property test runs through 1000 examples. The input strategy is a list of five elements, each between -10 and 9. What percentage of the state space is checked each run?
With 5 elements, that's 20⁵ possibilities, or 3.2 million, meaning the property test hits about 0.031% of the elements. If the size of the list is up to five elements, we add in 1 + 20 + 400 + 8000 + 20⁴ more elements, but that only reduces the coverage to 0.030%. We can see the 20⁵ term dominates, so it makes sense to focus on optimizing that.²
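The arithmetic, if you want to play with the numbers yourself:

```python
# Coverage of 1000 examples against lists of integers in -10..9.
examples = 1000
n_values = 20  # integers -10 through 9

exactly_5 = n_values ** 5
up_to_5 = sum(n_values ** k for k in range(6))  # lengths 0 through 5

print(f"{examples / exactly_5:.3%}")  # ~0.031%
print(f"{examples / up_to_5:.3%}")    # ~0.030% -- the 20^5 term dominates
```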
What if we instead do -5 to 4?
Now there are only roughly 10⁵ = 100,000 possibilities, meaning our state space coverage jumps to roughly 1%. Much better, but still pretty low.
What if we only do four element lists?
This cuts an exponent off the top, so we go from n⁵ dominating to n⁴ dominating. With the old range, we'd have 0.6% coverage, while with the new range, we'd have 10% coverage.
What if we only allowed unique elements?
This is equivalent to pulling a set out and counting each permutation as a separate possibility. For 5 elements and -10 to 9, that's 5!·C(20, 5) = 20!/15! ≈ 1.9 million, so not a significant optimization.
What if we only allowed unique elements and sorted the list?
Now it's just C(20, 5) = 15504, meaning we're now hitting about 6% state coverage. Using sets instead of lists gives roughly the same order of magnitude state space reduction as halving the range and reducing the maximum list size by 1 combined.
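Here's a sketch that tallies every variant discussed above, so the trade-offs are visible side by side (math.perm and math.comb need Python 3.8+):

```python
# Estimated state-space sizes, with the coverage a 1000-example run gets.
from math import comb, perm

examples = 1000
variants = {
    "baseline: 5 elements in -10..9": 20 ** 5,
    "halve the range (-5..4)": 10 ** 5,
    "4-element lists, old range": 20 ** 4,
    "4-element lists, new range": 10 ** 4,
    "unique elements": perm(20, 5),  # 20!/15!: order still matters
    "unique + sorted": comb(20, 5),  # order no longer matters
}
for name, size in variants.items():
    print(f"{name}: {size:,} states, {examples / size:.2%} coverage")
```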
Which is a better optimization? It depends on the problem: if you're sure the order (and uniqueness) doesn't matter, then using sets means you can try a larger range of values for the same state space coverage. If it does matter, then this is an unsafe optimization.
The point is you can roughly gauge the impact of these optimizations without having to test them first. You can see that sorting has a bigger impact than uniqueness, that halving the range has a bigger impact than going from five elements to four, and that the state space is dominated by the largest possible list.
This all can help you find the best optimizations for your problem. And just as importantly, not waste time on poor optimizations. I've definitely spent hours debugging a "really clever" state space reduction that five minutes of estimating would've shown was a waste of time.
In general I've found "weeding out bad ideas" is more accurate than "revealing good ideas." Optimization is unintuitive, and many good ideas end up worse than the combinatorics estimate would suggest. But it's very, very rare for an optimization to outperform the combinatorics estimate.³ If the estimate says an optimization won't help much, it probably won't.
Symmetry
Since someone will mention it if I don't: lots of good optimizations are exploiting symmetries in the problem. Like if you have two servers with connections, the configuration {(A, B)} might be symmetric to the configuration {(B, A)}. Instead of four possible configurations, you have three: {}, {(A, B)}, and {(A, B), (B, A)}. For larger sets of servers, the state space reduction factor approaches N!.
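You can watch the reduction factor creep toward N! by brute force: canonicalize each configuration by taking its smallest relabeling under every permutation of server names, then count distinct canonical forms. A sketch, feasible only for tiny N since it enumerates all 2^(N(N-1)) configurations:

```python
# Count connection configurations up to renaming servers.
from itertools import permutations
from math import factorial

def configs_up_to_symmetry(n):
    nodes = range(n)
    edges = [(a, b) for a in nodes for b in nodes if a != b]
    canonical = set()
    for bits in range(2 ** len(edges)):
        config = [e for i, e in enumerate(edges) if bits >> i & 1]
        # Canonical form: the lexicographically smallest relabeling.
        canonical.add(min(
            tuple(sorted((p[a], p[b]) for a, b in config))
            for p in permutations(nodes)
        ))
    return len(canonical)

for n in [2, 3, 4]:
    total = 2 ** (n * (n - 1))
    reduced = configs_up_to_symmetry(n)
    print(f"N={n}: {total} -> {reduced} (factor {total / reduced:.1f}, N! = {factorial(n)})")
```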
Strange Loop
I'll be at Strange Loop with a sack of homemade chocolates. Come say hi!
1. By "safe" I mean a simplification that doesn't change a failing model to a passing one, i.e. by eliminating an invariant-violating state.
2. Assuming you're picking uniformly. If you first pick the list size, then the elements, you'll get biased coverage: you'd have a 16% chance of picking a list of length 0, and there's only one such list. If your PBT engine isn't Sufficiently Smart you'll need to fix it to exactly 5-element lists.
3. It's easy to make the mistake of only testing your positive hypotheses and never checking your negative ones. So sometimes I'll do an estimate, find out an idea is bad, and then try the idea anyway to make sure the combinatorics check isn't filtering out good optimizations. So far it hasn't.