What is a "Specification"?
The most popular testing library in Ruby is called RSpec, and uses the terminology of Behavior-Driven Development, where tests are called "specifications". My first-ever talk about TLA+, "your tests are not your specs", is about how this terminology is all wrong and we should reserve "specification" for formal methods. People seemed to enjoy the talk, which gave me the confidence to apply to Strange Loop, which I parlayed into a book deal, and now I do this for a living.
In hindsight, though, I was completely wrong. Specifications are just collections of predicates (statements) that we use to determine whether a system is correct or not. If we can show one of these statements is false, then our system doesn't match our requirements.1 Under this definition, there's nothing wrong with saying tests are specs. If I write
spec1 = expect(Max([1, 3, 2])).to eq 3
Then that's a statement about the correctness of `Max`: it is required to return 3 when passed in `[1, 3, 2]`. Key here is that I can give someone this assertion and say "Check and make sure your code satisfies this", and they can say "yes it does" or "no it doesn't." Presumably, they will not say "my `Max` shouldn't satisfy this, your requirements are wrong", at least not without a long stakeholder meeting.
(Not all tests are specifications: if you have to change tests when doing internal refactoring, then they don't correspond to requirements. Tests only become specifications when they change iff the requirements do. And in writing that, I finally get the supposed difference between BDD and TDD, though in practice they're mostly similar.)
The problem isn't that the test isn't a spec, it's that it's an insufficient spec. There's a large space of programs that satisfy `spec1`. Here's one:
```ruby
def Max(l)
  # max linear distance from origin
  l.map(&:abs).max
end
```
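Run against `spec1`, this implementation passes, because the absolute values of `[1, 3, 2]` are `[1, 3, 2]` and their max is 3. But it's clearly not the `Max` we meant (the second call below is my own illustrative input, not part of `spec1`):

```ruby
Max([1, 3, 2])   # => 3, so spec1 passes
Max([-5, 1, 2])  # => 5, even though the largest element is 2
```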
We could of course add more tests to further "pin down" the value of `Max`, but there's always stuff you didn't rule out. The deeper problem is that you're now admitting that `spec1` doesn't actually comport with your requirements, since plenty of things pass it while still being incorrect. There's some specification in your head, and `spec1` is only a partial fragment of the actual specification.

So we already implicitly decouple specifications from example tests. It's sensible to make that decoupling explicit and say that `spec1` is a verification of the specification. Here's the specification of max (over finite sets of numbers), written as a logical expression:
∀x, s: (Max(s) = x) ⇒ x ∈ s ∧ ∀y ∈ s: y ≤ x
In English: for any set, the maximum value of the set is a number in the set such that all numbers in the set are less than or equal to it.2 This is something that holds true regardless of what set of numbers we pass into `Max`. Once we have a specification, we can verify it in many different ways:
- We can write example tests, as we've done so far, to check specific inputs.
- We can write property-based tests. Instead of checking a specific input, we sweep an area of the input space (there's a small hand-rolled sketch of this after the list).
- We can add static types. This totally verifies some part of the specification, such as "`Max` returns a number", but quickly gets messy when you try to push past that. Even something like "`Max` returns a number that's in the set" is tough in most type systems.
- We can use a theorem prover or an SMT solver to prove it matches for all inputs. This is the most comprehensive, but also the hardest to do.
That's specification of individual code snippets. "Formal Specification" a la TLA+ is something a little more involved. The trick is that once we have a distinction between "the spec" and "the verification of the spec", we're now talking about "specifications" as distinct artifacts. That means we can choose 1) the format we write specifications in and 2) the thing we decide to verify.
High-level "design specifications" target verifying "abstract machines" of systems. This is an essential behavioral model of something that fulfills our requirements, stripped of all the fiddly bits that are necessary to implement a consistent system but don't bear on our requirements. For example, saving something to a database in code might look something like:
1. Log
2. Initial event hooks
3. Validate
4. Find database adapter
5. Generate SQL
6. Get DB connection
7. More event hooks
8. Write to database
9. Check success
10. More logging
11. More event hooks
While in the abstract machine, it could be:
1. Update db variable
The other ten steps are important in a real world system, but they're orthogonal to understanding if we satisfy the user requirements. So it makes sense to write a specification of an abstract machine, verify the machine matches our requirements, then turn around and use that abstract machine as the specification of our code.
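To make that concrete, here's a minimal sketch of what such an abstract machine could look like, written as plain Ruby rather than TLA+ so it fits the examples above; the class and method names are mine, purely illustrative:

```ruby
# Illustrative abstract machine: the only state is the db, the only action is save.
class AbstractStore
  attr_reader :db

  def initialize
    @db = {}
  end

  # Step 8 of the real system, and nothing else.
  def save(key, value)
    @db[key] = value
  end
end
```

Logging, hooks, adapters, and connections all disappear; the only question the machine can answer is "what ends up in the database?", which is exactly the question the requirements care about.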
(Early method wars people called writing the abstract machine "analysis", which I don't like, because I don't expect "analysis" to produce an artifact. I'm used to using "analysis" to mean "discovering what we should worry about in this problem domain", a la binder full of questions.)
The tricky bit is verifying our code against the spec-machine. Doing full verification is, to use the technical term, balls-hard, as full verification always is. Partial verification, on the other hand, has accidental complexity. Before, we were verifying code against tests in the same language, while now we have two separate languages, meaning we have to write a translation layer. I've done this before and know several companies who've done a lot more, but for now this has to be individually written per project and codelang/speclang pair. There's no general solution… yet.
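To give a flavor of what that per-project translation layer looks like, here's a hypothetical sketch reusing the `AbstractStore` above: an abstraction function maps the real system's state into the spec-machine's state, and a test drives both with the same inputs and compares. `ProductionRepo` and its `all_rows`/`save` interface are invented for the example:

```ruby
# Hypothetical translation layer: map the real state to the abstract state.
def abstract_state_of(repo)
  repo.all_rows.to_h { |row| [row.key, row.value] }
end

spec_machine = AbstractStore.new
repo = ProductionRepo.new  # imagined real implementation: logging, hooks, SQL, ...

[["a", 1], ["b", 2], ["a", 3]].each do |key, value|
  spec_machine.save(key, value)
  repo.save(key, value)
  unless abstract_state_of(repo) == spec_machine.db
    raise "implementation diverged from the spec-machine"
  end
end
```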
Even so, adding that extra specification step helps you build safer systems more quickly, as you can verify you have the right abstract machine before you spend weeks and weeks implementing it. So here's where I stand now:
- Tests absolutely can be specs, and it's absolutely valid to think of tests that way.
- It's better, though, to think of tests as being verifications of specs, which exist as distinct artifacts.
- There are many other ways to verify a spec, including types, assertions, and theorem proving.
- Once we have distinct specifications, we can adapt them for different purposes, including the specification of abstract machines.
- We can verify the abstract machine matches our specification, and then use the verified machine as a reference point for the implementation.
- For simplicity, we call the abstract machine "the spec", too.
- This gives us formal specification languages like TLA+ and Z.
Unfortunately, there's no elegant, easily-understood word for "the abstract machine", because all the usual words you'd use are conflatable with more popular concepts. "Design" is often taken to mean "code layout and module structure", "modeling" brings to mind UML, and you've already seen the ambiguity in "specification".
Logic for Programmers Update
Well behind where I want to be, mostly due to my complete inability to focus on one thing at a time.
Consulting
Now that things are opening up again, I'm able to travel to client sites. I'm available for formal specification workshops, coaching, and review. See my page here.
New Blog Post
Cross-Branch Testing. It's mostly the same as the newsletter from back in 2020. I'm working on adapting some other pieces, but they're ground-up rewrites. I just wanted to make this one more accessible!
- "Correct" as in "conforms to requirements". There are many things we can conform to, such as performance. For the purposes of this newsletter I'm exclusively talking about requirement specifications.
- This spec is admittedly incomplete: what if the set is empty? You can imagine many possible `Max` functions that do different things on empty sets. (Also, more esoterically, it's possible for an infinite set to not have a maximum. But I left it out for simplicity.)
My new book, Logic for Programmers, is now in early access! Get it here.