Nondeterminism in Formal Specification
How to use nondeterminism to write better specs.
Just an unordered collections of thoughts on this.
In programming languages, nondeterminism tends to come from randomness, concurrency, or external forces (like user input or other systems). In specification languages, we also have nondeterminism as a means of abstraction. Say we're checking if a user submitted the correct password. A software implementation might look something like this:
# pythonish
def attempt_login(user, pw):
submitted_hash = hash_password(pw)
actual_hash = db.get_password(uid=user.id)
if submitted_hash == actual_hash:
login()
else:
reject()
There's lots of implementation details in confirming the correct password: how we hash it, how we determine the right hashing mechanism, where we retrieve the actual password from, what query we use to retrieve it, etc. At a higher level of design, though, we might not care about any of that: we only want to model what happens if the password is correct, not how we determine correctness. So we can use nondeterminism to handwave that all away:
\* TLA+
AttemptLogin ==
\/ Login \* \/ means `either`
\/ RejectLogin
When the model checker tries to evaluate AttemptLogin
, it will branch and create separate checking executions for picking Login
or RejectLogin
. This gives us the consequences of "user submits the wrong password" without having to write a bunch of spec figuring out what the wrong password is.
Mastering abstraction-nondeterminism is key to writing good specifications. It's why you can model a complex, detailed system with a much smaller specification and find bugs that map back to actual possibilities.
Two challenges of nondeterminism
- Nondeterminism is incredibly capable, which makes it not very tractable (ibid). It's hard to express properties on subsets of the model-space, like "If a user never gets rejected, then property XYZ always holds."
- Abstraction-nondeterminism pushes the spec further from the real-world implementation, so it's harder to see that your implementation matches it.
These tend to be bigger problems if the nondeterminism is deep in your design, kinda like functional core, imperative shell. We want to push it closer to the edges:
AttemptLogin(pwd) ==
IF pwd = GOOD THEN
Login
ELSE
RejectLogin
Next ==
\E pwd \in {GOOD, BAD}:
AttemptLogin(pwd)
It's still nondeterministic, but it happens earlier, so is easier to control. We could for example swap {GOOD, BAD}
with {GOOD}
to guarantee that login is always successful.
In the extreme case, each component of the machine is locally deterministic, even if the components are concurrent. This is closest to something implementable. The world (ibid) can continue to be nondeterministic.
Initial states and prophecy variables
Specification languages can also have multiple initial states. We can use this to push nondeterminism into the spec setup by adding variables that encode what events will happen. For example, we could initialize the variable attempts_needed
as an integer in 0..3
, then write AttemptLogin
like this:
AttemptLogin(pwd) ==
IF pwd = GOOD THEN
Login
ELSE
RejectLogin
Next ==
IF attempts_needed > 0 THEN
/\ AttemptLogin(BAD)
/\ attempts_needed' = attempts_needed - 1
ELSE
/\ AttemptLogin(GOOD)
Now, for each initial state, the spec is wholly deterministic, and thus tractable. We can easily distinguish properties that hold if the user never sends the wrong password (attempts_needed = 0
) vs if they get it wrong a few times.1 We can also use this to check only a subset of the specification at a time, which can help with checking very large state spaces.
Leslie Lamport (the inventor of TLA+) calls these kinds of helpers prophecy variables, because they "predict" what nondeterministic action the system will take. You can't always use prophecy variables to completely remove nondeterminism from your specification, but it can help!
Refinement and nondeterminism
One of the main use cases of refinement— proving one specification implements another— is to remove abstraction nondeterminism from a system. I really need to add something about this to learntla.
An Exercise
I think this might be a good exercise to practice writing nondeterminism:
In the game Get Fifteen, players take turns picking numbers
1..9
. Players can pick each number only once and cannot pick numbers their opponent picked. Winner is the first player to have three picked numbers that add to 15. This is identical to playing TicTacToe, but is easier to formally specify (no grid).2
Model a player tictactoe against an AI. Either can start first. The player needs to follow a deterministic strategy you implement. The AI just nondeterministically picks any available number. The property to check is "the player never loses".
(I haven't actually modeled this, but if people give it an honest shot and run into problems I'd be happy to give the specs a look!)
Plan for June: Logic for Programmers
I've decided to make a serious attempt to write an acceptable first draft of Logic for Programmers by July 15. If I don't have something I can readily turn into a Leanpub early-access by then, I will give $1k to a random newsletter subscriber (or donate it to a charity of their choice). I reserve the right to push the deadline if I get a new consulting contract.
-
one thing we've lost is the user can no longer fail infinite times; they're always guaranteed to log in eventually. This is fixable in a few ways. ↩
-
If you're not seeing it, try overlaying a tictactoe board on top of a magic square. ↩
If you're reading this on the web, you can subscribe here. Updates are once a week. My main website is here.