10 Misconceptions about Formal Methods

mostly

                July 5, 2021

            10 Misconceptions about Formal Methods

            Whole lotta talk about formal methods last week. Moshe Vardi dropped an ACM piece, Jean Yang wrote an essay, and Nicole Tietz-Sokolskaya frontpaged Hacker News. I missed most of these as they happened because I was fucking around with ML art working on important consulting stuff. Looking back at the discussions, though, I saw a lot of the usual misconceptions people have about formal methods. It's a very interesting and important field, but not one that programmers are familiar with, and that leads to a lot of misunderstandings. This is an attempt to address some of these. Disclaimer, I specialize in a couple of these tools, but believe I have enough familiarity with the rest to present them accurately.
(To be fair, a lot of these myths are actually mostly accurate, but they don't paint the full picture, where the full picture is different enough to make FM considerably more appealing.)
Formal Methods is about proving code correct
"Correct" is a dirty word in FM. In FM, we verify conformance to a specification: a rigorous, unambiguous description of the properties of our code. "Correctness" is something the client decides. We generally write the spec under the assumption that it represents "correctness", so that conformance to spec means "correct", but that's not always true. If the spec is wrong or incomplete, then conformance doesn't mean much.
This sounds like a quibble, but it's very important for two reasons. First, it's more honest about the limitations of FM. You might hear FM people say "the environment is outside the scope of the proof." That's us being very careful about what we can and can't guarantee: If our spec doesn't explicitly include environmental effects, we can't say anything about how our program behaves under environmental effects. Second, knowing how to write good specs is just as fundamental to FM as knowing how to verify them. In fact, I'd argue that "writing good specs" is the most useful part of the FM discipline for general programmers. It's something that's very hard to outsiders, but very easy once you have some practice, and very useful in general. Being able to see a function and go "ah, here's the spec" is useful for writing tests, assertions, types, everything. I also find that my workshops students need a lot of practice with it, regardless of the specific formal method they're using.
Formal Methods is about proving code
It is true that code is the most common target of FM. But anything that can be "computationally encodable" can be verified. This includes broad-scale system designs (TLA+), requirements (Alloy), language semantics (K), business processes (BPMN), hardware, ontologies... Most of the focus has been on code, because there's a whole lot of it, and people are less likely to realize they can encode other stuff in a verifiable way. Which is a shame: we have lots of tooling besides FM to help us write correct code but almost nothing to help us write correct other stuff. Making that other stuff verifiable gives us tools we don't otherwise have, and I think that's where formal methods can do the most good the most quickly.
Formal Methods is about proving
Again, mostly true, but proving implies you're writing a full mathematical proof yourself. That only happens when you're working with a proof assistant like Coq or Isabelle. Most of the time people do formal verification via tooling abstractions: model checkers, SMT solvers, simulations, typecheckers, etc.
This might seem like a minor terminology quibble, but it has big impact in how we think about the field. It implies that you need a strong background in pure math to get anything done. While knowing math certainly helps, there are lots of tools you can use without having to write full proofs yourself. Take a gander at Let's Prove Leftpad, a collection of almost 20 proofs of leftpad in different formal verification languages. Some of them are full step-by-step proofs. Some of them look like this:

That's the entire verification of leftpad in Idris. At that point, you're not really writing "proofs" anymore: the computer is doing most of the proof for you. This makes the field a lot more open to newcomers who'd otherwise be scared off by writing mathematical proofs.
FM requires a special language
Here's a formally verified LeftPad in C. Frama-C retrofits a specification language onto C, so you can use formal methods on C code. Most enterprise languages have an FM retrofit available. These languages aren't designed to make FM easy, though, which is why people do a lot of work in specialized FM languages. But you can still use FM on legacy projects.
(Generally developer-oriented affordances make verifying code harder. Things like dynamic dispatch, free aliasing, user input, etc. There are also "hybrid" languages like Ada, where a subset of the language is designed around verification and the rest around general usability.)
FM requires a PhD
You need a lot of expertise to use a proof assistant, where you're writing most of the proof by hand. The further you get from proof assistants, the less academic background you need. I'm not saying easy, I'm just saying it's something everyday devs can learn to use well (with time investment). 
Proof assistants dominate popular imagination for two reasons:

They've been around the longest. It's only in the last 15ish years that you could verify lots of code without using a proof assistant.¹
They're the most "powerful". Automated tooling isn't smart enough to verify a compiler or an entire operating system by themselves, so if you want to do anything that ambitious you need to use a proof assistant.

FM must be end-to-end
People think that FM is only useful if you can verify all properties of your entire program: a full verification of an end-to-end spec. This is true if you want absolute assurance. You can also do partial verification, where you verify less in a way that's cheaper. You could:

Verify some properties of your code. This is the already pretty common in mainstream languages, properties like "this is type safe" or "this does not mutate an aliased pointer." Using FM just extends that to arbitrary properties.
Verify properties of some code. You have a "core" that's formally verified and surrounding code that's informally verified. This is what a lot of industry companies do. You can also do this with individual libraries, which are then included in larger projects.
Verify properties of a model of code. This is what you do when you verify "design specifications" a la TLA+. You still have to write the code yourself, though.

I've seen all of these work well in practice. Really depends on the problem domain and what you consider important.
FM is too expensive
The public perception is that FM is only worthwhile on incredibly expensive mission-critical systems: "nukes 'n NASA", as I think of it. This is because most people consider the tradeoff as

time learning FM + time applying FM vs financial cost of a bug in production

Since most bugs don't cause that much financial damage, the time spent applying FM has to be really small to make this all worthwhile. Which means you can't do much with it.
I think a more accurate tradeoff is

time learning FM + time applying FM - time saved applying FM vs financial cost of a bug in production + time spent debugging

There are lots of bugs that are only mildly financially damaging but are brutal to find and fix. Eliminating them beforehand doesn't save much raw money but does save a lot of developer time. More important is the "time saved applying FM", which takes a lot of people by surprise. Why would spending more time on verification save time on the project overall? Lots of ways!

If you find an ambiguous requirement early, you can check with the client instead of making a guess you'll have to undo later.
Verifying the high-level system in advance can quickly eliminate dead-ends and design landmines before you start coding. It can even lead to simpler implementations that take less time to write.
Writing specifications of code, even if you don't formally verify them, is a great way to surface pathological edge cases in advance. Then you can use the spec to guide informal verification, like property tests or runtime checks.

It's rare that FM will save you net development time over not using it (though I've seen it happen). But the design speedup and the time saved on debugging (which can be significant) tilt the economic cost of applying FM, making it economical in a wider range of settings.
(We can push the balance further by making FM tooling easier to learn and use. I'm really proud of the work I've done on FM pedagogy, which opened the discipline up to a lot of people who otherwise wouldn't have bothered.)
FM is only for large companies
As my readers are no doubt sick to death of hearing, I first started writing formal specifications at a company with ten engineers. In the course of my teaching, I've heard from people who've successfully applied it at even smaller companies, even in cases with just one or two engineers. I think you're more likely to see FM at larger companies because they have more projects that "obviously" benefit from it, and with more engineers comes a bigger chance of an individual championing the discipline.
I do believe that verifying systems is beneficial to a wider range of companies than verifying code is, which is why I specialize in teaching TLA+ and Alloy and not Coq or Dafny. But I'm always on the lookout for new languages or technique improvements that would help code verification hit mainstream, and I'm confident we'll start seeing big advances in affordability over the next few years. 
FM doesn't work if the requirements change
"FM tells you if you built the thing right, but not if you built the right thing." This is just an extreme example of a basic fact about client requirements: if they change, you're going to have to throw work away, whether that's just some code, a full test suite, or a mathematical proof. It's just that unit tests are cheaper to write than formal proofs, so you're out more money with FM.
The problem is this is usually trotted out as a universal fact of software, when it's only true for a subset. Requirement changes happen most often when you're writing software for an outside client, like you'd see in consulting. If you're making internal software, or selling a software product, then you have more control over requirement changes. You're also more likely to be maintaining legacy software and writing infrastructure code. If you need to move a nightly batch job to a realtime task queue for performance reasons, it's incredibly unlikely your boss is going to say "nevermind put it on a blockchain". I mean they could, but it happens less often. Project timelines are longer and so FM is more valuable.
FM is necessary for correctness
Let's close with a misconception on the other side of things: the common idea that without formal methods, software will never be "good". The poster child for this misconception is Edgar Dijkstra's quote:

program testing can be used very effectively to show the presence of bugs but never to show their absence.

Funnily enough in the exact same lecture he says

Instead of trying to devise methods to establish the correctness of arbitrary, given programs, we are now looking for the subclass of what I would like to call "intellectually manageable programs", which can be understood and for which we can justify our belief in their proper operation under all circumstances without excessive amounts of reasoning. 

The whole piece is actually about structured programming! Funny that. 
Regardless, there's a big difference between code being correct and being verified correct. You can determine code conforms to a spec to a very high degree of confidence without needing a proof: sort of being "99% sure it's 100% correct." You can do this with really thorough testing, or really thorough code review, or following strict processes, etc. Oftentimes these will be cheaper than full end-to-end code verification. Even with "mainstream" techniques like testing, CI/CD, and regular code review can get you really high confidence in your correctness. You don't need formal methods for any of this.

Hopefully this cleared up some basic misconceptions on formal methods! Probably not all of them, but there's only so much I can do in a single newsletter. If you're interested in learning more, I've got an old piece called Why Don't People Use Formal Methods, which is both a general intro and a discussion of the practical challenges. Hoping to drop a spiritual sequel later this year! 
Also I consult in this stuff, yada yada

Consulting Updates

I'm doing in-person corporate trainings again. Reach out if you want to discuss one.
Since a couple people asked: I'm available to review and provide feedback on existing specs. I can consult on TLA+, Alloy, and PRISM, though I highly doubt there's much interest in that last one.

TLA+ Conf Deadline Extended
Now this Friday to accommodate 4th of July vacays. We're also accepting remote submissions!

Proof assistants have also gotten better in the past 15 years, with things like Sledgehammer removing a lot of the tedium. ↩

            If you're reading this on the web, you can subscribe here. Updates are once a week. My main website is here.
My new book, Logic for Programmers, is now in early access! Get it here.

Don't miss what's next. Subscribe to Computer Things:

Start the conversation: