A sufficiently comprehensive spec is not (necessarily) code
If you're a PEDANT
Sorry for missing last week! Was sick and then busy.
This week I want to cover a pet peeve of mine, best seen in this comic:
A "comprehensive and precise spec" is not necessarily code. A specification corresponds to a set of possible implementations, and code is a single implementation in that set. As long as the set has more than one element, there is a separation between the spec and the code.
Consider a business person (bp) who asks:
I want a tool to convert miles to kilometers.
Is this a comprehensive spec? Maybe: you can give it to Claude Code, tell it to make all the design decisions, and it will give you a program that converts miles to km. At the same time, a huge number of details are left out. What language? What's the UX? Should it be a command line script or a mobile app or an enterprise SaaS? For this reason, if we gave Claude's output to the bp, they'd probably be unsatisfied. The set of possible implementations includes the programs they want, but also lots of programs they don't want.
So now they say:
It should be a textbox on a website.
Okay, this rules out a lot more stuff, but there's still a lot to decide. React or vanillajs or htmx? Should the output be a separate textbox or a popup? Should we use a conversion of 1.6, 1.61, or 1.609? So you could argue that this is still not a "comprehensive and precise spec". But what if the bp is happy with whatever Claude makes? Then their spec was sufficiently comprehensive and precise, since they got a program that solved their problem!
Now the comic above makes the more specific claim that a spec "comprehensive and precise enough to generate a program" is code. That wasn't even true before LLMs. Program synthesis, the automatic generation of conformant programs from specifications, is an active field of research! Last I checked in 2019 they were only generating local functions from type specifications; I don't know how things have changed with LLMs. But still, it shows that code and comprehensive specs are distinct things.
Specs are abstractions
What I'm getting at here is that a specification is an abstraction of code.1 For every spec, there is a set of possible programs that satisfy that spec. The more comprehensive and precise the spec, the fewer programs in this set. If spec1 corresponds to a superset of spec2, we further say that spec2 refines spec1. A specification is sufficient if it does not need to be refined further: no matter what implementation (within reason2) is provided, the specifier would be satisfied. A spec does not need to be fully comprehensive to be sufficient.
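To make the set picture concrete, here is a minimal Python sketch. Everything in it is an assumption for illustration: the candidate pool, the names `within`, `satisfying`, and `refines`, and the choice to model a spec as a tolerance check against the exact international mile (1.609344 km). Refinement is only checked against this small, finite pool, not against all possible programs.

```python
from typing import Callable, Dict, Set

Program = Callable[[float], float]   # a miles -> km converter
Spec = Callable[[Program], bool]     # a predicate over programs

# Hypothetical pool of candidate implementations (illustrative only).
candidates: Dict[str, Program] = {
    "factor_1.6": lambda miles: miles * 1.6,
    "factor_1.61": lambda miles: miles * 1.61,
    "factor_1.609": lambda miles: miles * 1.609,
    "broken": lambda miles: miles * 2.0,
}

EXACT_KM_PER_100_MILES = 160.9344  # 100 * 1.609344


def within(tolerance_km: float) -> Spec:
    """Spec: 100 miles converts to within `tolerance_km` of the exact value."""
    def spec(program: Program) -> bool:
        return abs(program(100.0) - EXACT_KM_PER_100_MILES) <= tolerance_km
    return spec


def satisfying(spec: Spec) -> Set[str]:
    """Names of the candidates that satisfy the spec."""
    return {name for name, program in candidates.items() if spec(program)}


def refines(spec2: Spec, spec1: Spec) -> bool:
    """spec2 refines spec1 if its satisfying set is a subset of spec1's."""
    return satisfying(spec2) <= satisfying(spec1)


loose = within(1.0)   # accepts 1.6, 1.61, and 1.609
tight = within(0.1)   # rules out 1.6

print(sorted(satisfying(loose)))
print(sorted(satisfying(tight)))
print(refines(tight, loose))
```

If the bp is happy with any of the programs `loose` admits, then `loose` is sufficient for them, even though `tight` is more precise.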
Programmers are still needed to write specs
The comic makes a further claim: "a sufficiently detailed spec is code" is a reason why programmers won't be out of a job, even if we could automatically generate code from specs. And this is still true.
It is often the case that we express the abstraction spec via a formal language. Normally this makes me think of TLA+ or UML or even Planguage, but the most common example of this would be test suites. Tests are specifications, too! And as a rule, it seems impossible to get nonprogrammers to successfully encode things in formal languages. Cucumber was a failed attempt to make business people write formal specs.
But does this make a comprehensive spec "code"? I'd argue no. It's possible to encode a specification in a programming language (again, test suites), but it is just that, an encoding. The spec still corresponds to a set of possible implementation programs, and the spec is still useful even if we don't encode it. Keeping "code" and "spec" distinct concepts is useful.
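As a rough illustration of "tests are an encoding of a spec", here is a small Python sketch. The function names and the specific checks are hypothetical, chosen only for this example. The point is that one test suite admits several structurally different implementations and still rejects a wrong one.

```python
import math
from typing import Callable


def passes_spec(convert: Callable[[float], float]) -> bool:
    """A tiny test suite acting as a spec for a miles-to-km converter."""
    checks = [
        convert(0.0) == 0.0,                              # zero miles is zero km
        math.isclose(convert(1.0), 1.609, abs_tol=0.01),  # roughly right for one mile
        convert(2.0) > convert(1.0),                      # more miles means more km
    ]
    return all(checks)


def convert_simple(miles: float) -> float:
    """Multiply by a rounded factor."""
    return miles * 1.61


def convert_via_feet(miles: float) -> float:
    """Go through feet: 5280 ft per mile, 0.0003048 km per foot."""
    return miles * 5280 * 0.0003048


def convert_backwards(miles: float) -> float:
    """Wrong direction: treats the input as km and returns miles."""
    return miles / 1.609


print(passes_spec(convert_simple))
print(passes_spec(convert_via_feet))
print(passes_spec(convert_backwards))
```

The suite is written in a programming language, but it is not either of the converters: both passing implementations sit inside the set it describes.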
-
As in, the implementation is a good faith attempt at a reasonable implementation. I.e., "this converts miles to kilometers and also mines crypto" is not a good faith interpretation.
-
I appreciate the specificity of this point, and I love the clarity of your point that a specification is "sufficient" if every program it generates meets your needs.
I will say, though, that I think the practical wisdom of the original statement / cartoon here is that when you're talking about program generation with LLMs, there's a big difference between "every program so far" and "every conceivable program" (the phrase "comprehensive and precise enough to generate a program" is sort of masking that difference because you could read it either way). In your example, if my spec is vague on the miles to km transition (i.e. it doesn't specify whether it's 1.6, 1.61, or 1.609) then it could just so happen that every program the spec has generated SO FAR has used the precise value of 1.609, and so I assume my spec is precise enough. But a future generation / rewrite / refactor of the program could materially change that and just use 1.6, reasoning that it's still correct as far as the spec cares. But now maybe there's a half-inch gap in my boat hull, or whatever. You could protest that the spec obviously wasn't precise enough in hindsight, but that's kind of the point; language is nebulous and unless it actually generates the same code every time, there's always room for you to later learn that you were imprecise in some way that usually, but not always, gets the right result.
Point being, I think when people trot out the kind of statements the first character makes ("Some day we won't even need coders any more. We'll be able to just write the specification and the program will write itself."), or related statements (like "The spec is the source of truth, not the code"), they are overlooking this combination of LLM non-determinism and language vagueness, and thus the right response really is "If that's what you want, it's code you're looking for."
-
For every spec, there is a set of possible programs that satisfy that spec.
Then what's not a spec? ;)
Maybe a C program? But gcc -O0, gcc -O1, gcc -O2, and gcc -O3 will all produce different machine code for the same input "program". And not all of those resulting programs might satisfy the author of the original program (or should I say "spec"?), in particular, due to performance, but maybe other characteristics. So, one C program corresponds to a set of "actual programs", not all equally desirable. Thus, it's a spec.
Then you might say that the machine code is surely a program. Only the same x86(_64 or not) or ARM machine code gets translated by a processor to some microoperations, and different processors implementing the same machine code have different microarchitectures. And we have software emulation and binary translation (AOT and JIT), which produce different "real programs" with different properties for the same list of machine instructions. Thus, a "machine program" is actually a spec, which describes a set of possible "real programs" that can implement it.
And the same goes for transistors implementing a particular microarchitecture. I have no idea at what point this descent stops. If ever. :)
