Defects are not the fault of programmers
Sorry this one is late! I've been having a rough time lately, which I think is pretty normal right now for everyone. But I'm doing a bit better now and hope to stay that way for the next month or so. Hope.
Anyway, yesterday a bunch of us were trying to explain to someone on Twitter why we don't like Robert Martin, and I brought up his claim "Defects are the fault of programmers. It is programmers who create defects – not languages". I said this was a major flaw in his philosophy. The person asked:
Regarding defects, who else is to blame if not the person who introduced them?
Which is way too complicated a question to answer in a Tweetstorm. It's also not something I really have capacity right now to write a fully researched essay on, so you will get my thoughts as a newsletter. Hooray!
So on a surface level, that sounds obvious. Defects come from code, programmers write code, therefore defects come from programmers. As a statement of fact, this "coincidentally" doesn't hold. A lot of the nastiest bugs come from components that are all correct in isolation but interact in a dangerous way. This is called a Feature Interaction Bug. You can make a case that it's both people's fault, or neither person's fault, or the manager's fault, or who knows whose fault. Another set of dangerous bugs comes from ambiguous requirements. This is where there are multiple ways of interpreting the requirements that are all completely reasonable, and it's possible even the client doesn't know which one they intended.
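To make the feature interaction case concrete, here is a minimal sketch of a toy phone system. Everything in it (the `BLOCKED` and `FORWARDS` tables, the `screen`, `forward`, and `connect` functions, the names) is made up for illustration and not taken from any real system. Call screening and call forwarding each do exactly what their own description says, and the combination still lets a blocked caller through.

```python
from typing import Optional

# Hypothetical toy phone system; all names are illustrative only.
BLOCKED = {"bob": {"alice"}}   # bob screens calls coming from alice
FORWARDS = {"carol": "bob"}    # carol forwards her calls to bob


def screen(origin: str, callee: str) -> bool:
    """Call screening: True if callee accepts a call arriving from origin."""
    return origin not in BLOCKED.get(callee, set())


def forward(callee: str) -> str:
    """Call forwarding: return where a call to callee should ring."""
    return FORWARDS.get(callee, callee)


def connect(caller: str, callee: str) -> Optional[str]:
    """Route a call; return who ends up ringing, or None if rejected."""
    if not screen(caller, callee):
        return None
    target = forward(callee)
    if target != callee:
        # The forwarded leg is treated as originating from the forwarding
        # party, which is a reasonable reading of "carol forwards to bob".
        return connect(callee, target)
    return target


print(connect("alice", "bob"))    # screening alone: alice is rejected
print(connect("carol", "bob"))    # carol is allowed to call bob
print(connect("alice", "carol"))  # forwarded leg looks like carol calling
```

The last call rings Bob's phone even though Bob blocked Alice. Whoever wrote `screen` did their job, and so did whoever wrote `forward`. The defect only exists in the combination.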
I call these "coincidental" counterexamples because they don't actually refute the core idea, only the specific claim. Both feature interaction and ambiguous requirement bugs are amenable to upfront design methods like formal specifications. So you can argue that this is the programmer's fault for not writing a spec. It shows that the situation is more nuanced than the simple statement of fact assumes, but it doesn't get to why I'm so against it.
No, the bigger issue with "programmers create defects" is an essential issue with the worldview. Its basis in facts is less important than its value as a "lens" we can use to understand defects. If we know where defects come from, we know what we need to do to manage them. To emphasize this: Our worldview on defects shapes our means of reducing defects. We measure the value of a lens by how useful it is.
So what does the "defects are the fault of programmers" worldview tell us about our means? It says that the only way to reduce defects is to improve programmer discipline. Either the programmer is sufficiently disciplined and nothing can cause defects, or the programmer is insufficiently disciplined and nothing can prevent defects. Martin himself describes his worldview as:
The cause:
- Too many programmers take sloppy short-cuts under schedule pressure.
- Too many other programmers think it’s fine, and provide cover.
The obvious solution:
- Raise the level of software discipline and professionalism.
- Never make excuses for sloppy work.
This is why Martin only recommends solutions that the programmer does:
- Unit testing (test driven development specifically)
- Code review
- "Clean Code"
- Pair programming
This is why Martin calls language features like nullable types "the dark path" and formal specifications a "shiny". None of his solutions are about the structure of the project, the tools the programmer uses, the environment they're in, etc.
Notice how little that gives us! Because all defects come from programmers, we can only talk about a very narrow range of things to eliminate defects. Anything else is just a sloppy shortcut, regardless of how well it actually works. Maybe the problem is everyone is overworked and we need to hire another programmer? No, that's just an excuse for sloppy work. What if our microservice architecture makes certain kinds of bugs a lot more likely? It's still you not having discipline.
In the safety science world this is derisively referred to as "human error". Human error is where people stop their investigations. Just scapegoat the human and let the broader system get off scot-free. Don't ask if the equipment was getting out of date, or if the training program is underfunded, or if the official processes were disconnected from the ground-level reality. When all defects are the fault of the human, then we don't have any tools to address defects besides blaming humans. And that's completely inadequate for our purposes.
I want to point to one example of the gulf between "defects are the fault of programmers" and "defects are complicated properties of systems". Back in 2018 there was a big kerfuffle in the JavaScript world, because one widely-used package was taken over by a malicious actor. Everybody blamed a human in the postmortems. Many blamed the original package maintainer for giving it to a malicious actor, many blamed the companies who upgraded the dependency without auditing the code first. I'm not part of the JavaScript world but wanted to practice my accident analysis skills, so I dove in. The final analysis took two months to write and identified 15 properties of the broader system that made the attack viable and likely to happen again in the future. Some of them are easily fixable, others involved fundamental trade-offs between usability and security, or even between security and different security. And all of them were missed by everyone who just wanted to blame a programmer.
"Defects are the fault of programmers" may sound good, but it does nothing to help us fix defects.