The best model checker is your head
Perceptual learning through model checking helps build intuition to find bugs in software designs.
Extra late because I just got back from Australia and slept 16 hours
Thanks to everyone who reached out last week! I'm still wrapped up with travel and Strange Loop but should start getting to them next week. Also, I'm teaching a TLA+ workshop on October 16! Register here, use the code C0MPUT3RTHINGS for 15% off.
There's this quote I like, "the best model checker is your brain." People who do formal methods find that simply writing the spec is enough to find errors, even without actually running it. It's kind of like if writing a unit test on a piece of paper told you where the bug in your code was.
I was in Australia to attend YOW! Perth, and one of the closing keynotes gave me a good way of describing it. Katrina Owen gave a talk on how to quickly build "expert intuition": perceptual learning. Take a large category of situations and classify them. The canonical example would be chicken sexing. Imagine you're given a day-old chick and asked to guess the sex, then told if you're right or not. If you do this thousands of times, you'll eventually be able to consistently get it right, even if you aren't actively studying specific rules.
(I think during the talk she said that you start to see results after about 3 hours of practice. Maybe I need to start subitization practice again.)
So now imagine you're given a program like this:
\* PlusCal! (the full algorithm would also declare a global `variable x = 0;`)
process p \in Workers
variable tmp;
begin
  Get:
    tmp := x;
  Set:
    x := tmp + 1;
end process;
You're asked if it has a race condition. Then you're told if it does or not, and what the race condition looks like. If you repeat this enough, eventually you'll get an intuitive "sense" for race conditions. Programs with concurrency bugs will feel different than programs without them.
Of course, this only works if you get feedback on your answer! This doesn't happen often with actual programs, because determining whether a race condition exists is really hard. If it doesn't appear after 10,000 runs, does that mean the code is correct, or did you need 100,000 runs?
With a spec, though, you can just run the model checker! It explores the entire state space, so it doesn't matter if the race condition happens 0.1% of the time or 0.00001% of the time, it'll still find it for you.1 This means you get the perceptual learning practice, which builds your intuition, which means you find more bugs in your head.
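To make "explores the entire state space" concrete, here's a toy sketch in Python of what exhaustive checking buys you. It is not how TLC actually works; it just brute-forces every interleaving of the two-label program above for two workers, so the lost-update race is guaranteed to show up rather than appearing with some small probability:

```python
from itertools import permutations

def interleavings(n_workers):
    """All orderings of (worker, step) pairs that keep each worker's
    own steps in order. Step 0 is Get, step 1 is Set."""
    labels = [(w, s) for w in range(n_workers) for s in (0, 1)]
    valid = set()
    for perm in permutations(labels):
        # keep only schedules where each worker's Get precedes its Set
        if all(perm.index((w, 0)) < perm.index((w, 1)) for w in range(n_workers)):
            valid.add(perm)
    return valid

def run(schedule, n_workers):
    """Execute one interleaving of the PlusCal program and return the final x."""
    x = 0
    tmp = [0] * n_workers
    for worker, step in schedule:
        if step == 0:          # Get: tmp := x
            tmp[worker] = x
        else:                  # Set: x := tmp + 1
            x = tmp[worker] + 1
    return x

# Check every interleaving for 2 workers: does x always end at 2?
results = {run(s, 2) for s in interleavings(2)}
print(results)  # a final value of 1 is the lost update
```

With two workers there are only six interleavings, and two of them lose an increment. A random tester might never hit those schedules; exhaustive enumeration can't miss them.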
Yes, this actually happens
I don't have hard data and wish I did, so all I can do is speak from personal experience. I teach a lot of companies how to apply formal methods to their business. I come in knowing how to specify, they come in knowing their domain.
What I find is that I notice design flaws much faster in the spec than they do. It's not simple things, either; I mean complex interactions among multiple components. A recent one involved workers which switched configs at scheduled times, where configs told the worker which server to ping for data. So the actions looked kinda like this:
Worker:
1. SwitchConfig
2. PingServer

Server:
1. UpdateData
If the worker got a response from the server, then the server updated its data, and then the worker switched configs before sending the next ping, we entered a state that could lead to a rare bug. It might look simple here, but it wasn't when buried in 500 lines of Alloy code. I could only see it because I have a lot of experience manipulating the high-level abstract chunks of a spec. This kind of stuff just reaches out to me more.
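As a hypothetical, much-simplified reconstruction of that flaw (the real spec was 500 lines of Alloy; all names and the invariant here are my invention), you can breadth-first search the bounded state space and look for a state where the worker has switched away while still holding stale data:

```python
from collections import deque

# State: (config, worker_data, server_data). config 1 means the worker
# has switched and will never ping the old server again.
INIT = (0, 0, 0)
MAX_DATA = 2  # bound the model, like picking a small scope in Alloy

def next_states(state):
    config, wd, sd = state
    out = []
    if config == 0:
        out.append((1, wd, sd))           # SwitchConfig: stop polling old server
        out.append((0, sd, sd))           # PingServer: refresh from old server
    if sd < MAX_DATA:
        out.append((config, wd, sd + 1))  # UpdateData: server gets new data
    return out

def check():
    """BFS over the whole bounded state space; return the first state
    violating 'after switching, the worker isn't stuck with stale data'."""
    seen, frontier = {INIT}, deque([INIT])
    while frontier:
        s = frontier.popleft()
        config, wd, sd = s
        if config == 1 and wd < sd:
            return s  # counterexample: switched away while stale
        for t in next_states(s):
            if t not in seen:
                seen.add(t)
                frontier.append(t)
    return None

print(check())  # a non-None result is the bad state
```

The point isn't this particular toy model; it's that exhaustive search surfaces the UpdateData-then-SwitchConfig ordering mechanically, while a human reading the spec needs trained intuition to spot it.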
I wish I could teach that spec sense in a single class, but it takes time and you need to understand the fundamentals first. Not everyone goes further than the class; that's just the nature of workshops. I regularly get messages from students who later crossed the gap and got their "modeling intuition". That always feels really good.
(I think there are ways of applying perceptual learning to learning TLA+ itself, as opposed to learning concurrent systems. All my ideas would take a lot more time and effort than I can afford to spend, though.)
Okay that's all from me, back to Strange Loop prep. We should be back on normal schedule next week! I'm looking forward to having writing time again.
-
Of course it's more complicated than that. It could be you're model checking with 3 workers when you really needed 5, or the bug only appears in unbounded models. But even if 40% of the bugs hide this way, you're still comparing "60% feedback on finding design bugs" to ~0% feedback!
If you're reading this on the web, you can subscribe here. Updates are once a week. My main website is here.
My new book, Logic for Programmers, is now in early access! Get it here.