Defense in Depth is actually a good thing
So my TLAConf talk is out and I was gonna talk about technique research but then I saw this tweet and knew I had to rant about it:
If after-release testing uncovers bugs, I ask: why doesn’t a culture of quality infuse the entire product-creation process (including development)? IMO, after-release testing should be unnecessary, because it should never find anything.
— Allen Holub (@allenholub), October 12, 2021
So, standard disclaimer: don't brigade. I'm using his tweet as a launch point to develop some ideas, I'm not trying to pick a fight or start an argument. Also I'm gonna make up a lot of terms because I like making stuff up and am in a ranty mood.
Okay, so Allen is an "Agile Extremist": he espouses a set of core Agile-affiliated techniques (TDD, mobbing, team autonomy) as a universal solution to software problems. If the techniques aren't sufficient, then it's a problem with the team, not the techniques. In this case, post-release QA is bad because a team properly doing Agile should have caught the bugs in development, so there must be something wrong with your process if you have QA.
This is something I find ridiculous, and that's because I believe in the antithetical concept of Defense in Depth. DiD is based on three propositions:
- Murphy's Law: software defects are "continuous and uniform" across all system dimensions. If our system talks to an asynchronous calendar service, defects can arise from the communication protocol, the asynchronicity itself, the service's operational properties, the service's own bugs, and the calendar domain itself (like leap years). You can think of bugs as arising from a nebulous "entropy" exerting constant pressure on our system.
- Verification Asymmetry: any kind of verification or defect-elimination solution is discrete, finite, and nonuniform. Different techniques are better at different things, but none are good at everything. Solutions are implemented by time-bound individuals. This creates an asymmetry between continuous, uniform bugs and discrete, nonuniform solutions.
- Diminishing Returns: even inside a technique's specialty, it takes far more investment to go from being 80% effective to 90% effective than it takes to go from 20% effective to 80% effective. This is because defects in a category share a lot of similarities that can all be addressed at once, while the more complex bugs have particulars that need to be handled on a case-by-case basis.
Those three principles mean that no single solution can eliminate all defects in software. Further, pouring everything into one solution is inefficient: past a point, further investment still has some benefit, but less benefit than also investing in a completely unrelated solution.
So the pragmatic approach is Defense in Depth: Use a variety of different complementary solutions. Ideally, ones that don't overlap in their specialties, so that you can catch a wider set of things.
You probably already do DiD implicitly. If you're writing tests for code in a statically-typed language, which someone else will later code review, that's three layers of defense. You won't be as good at writing tests as a testing-only extremist, or as good at wielding types as a types-only extremist, but with all three layers you'll probably catch a greater volume of bugs than either of them.
DiD is antithetical to Agile Extremism, and in fact is antithetical to any form of extremism:
- Extremism advocates a few "blessed" solutions and rejects everything else. DiD is all about using everything, even things you don't like, and admitting the limits of the stuff you do like. If I were a Formal Methods Extremist, I'd say testing is a waste of time because you have the specification. But I'm not, so I don't.
- Extremists brush off all the flaws of their paradigm. The $THING can't introduce new problems, it's your fault for doing $THING wrong! Whereas DiD has some really obvious issues you need to grapple with: cohesion and coupling.
Cohesion is Bad
Ever hear the canard "TDD is not about testing, it's about design"? Saying "TDD is not about testing" is silly and ahistorical, but there's a legit argument for it being about design: using TDD changes the shape of your production code.
So do all other verification techniques. Each technique puts pressure on the code to be more amenable to that technique. Think of it like ergonomics: if your chair is uncomfortable, you'll unconsciously change your posture to be more comfortable. Similarly, if the code makes using the technique "uncomfortable", over time you will adjust the code.
"TDD is design" people would argue that TDD's changes are overall a net positive. I don't want to get into that argument here. But DiD means using a plethora of solutions, and every solution will exert code pressure. Often, in incompatible ways. Maybe adjusting your code to one influence will be ok, but adjusting your code to sixteen influences is going to lead to more complicated code.
This complexity then makes each individual solution harder to apply, creating a feeling that the other techniques "just get in the way". If you just did TDD and skipped the whole "code review" thing, your code would be so much simpler and easier! If you just did code review and skipped the whole "TDD" thing, your code would be so much simpler and easier!
Coupling is Bad
Defense in Depth works best when the layers are complementary and independent: whether approach A catches a particular bug has no bearing on whether approach B will catch it. This is sometimes referred to as the "swiss cheese" model. If you have a bunch of slices of swiss cheese in a line, a dart could pass through some of the holes but would eventually hit solid cheese.
In practice, this independence is very hard to maintain. Some of that is because various techniques naturally overlap, but a lot of it is social, too. Every layer of the defense takes investment to build and maintain, including investment of people's time: if I'm not comfortable doing TDD, I'm not going to do a very good job with my TDD, and if you force me to because "we need Defense in Depth", I'll be resentful that you're wasting my time.
This leads to an insidious, emergent problem: layer coupling. Without proper maintenance, the layers start to "slip" and converge on catching the same kinds of defect. This "couples" them, severely reducing your scope of coverage. It's a known problem in safety engineering and one reason why systems-oriented approaches like STAMP are gradually replacing swiss cheese models.
But DiD is good
Now, here's the difference between advocating DiD and advocating extremism. Extremism believes everything is flawed except $THING; DiD believes everything is flawed, including DiD. We do defense in depth because the benefits outweigh the drawbacks. Coupling and cohesion are things we accept we'll have to deal with, not things we pretend don't exist.
And we know DiD works because everybody successful already does it, even if they don't realize it, even if they haven't built out a theory of what it looks like. If we did it more consciously, we could see even bigger benefits.
One such benefit: each layer gives feedback to the other layers. Imagine we have two layers, tests and code review. Bug X is caught by code review but not tests. Fair enough, bug X would have slipped into prod without DiD, all good so far.
But wait! Now that we caught X, we can ask "is this normal?" Should we have expected X to slip by our tests? Maybe it's something too subtle, such that changing our practices to catch it would be a poor use of resources. Maybe it's an off-by-one error we really should have caught, and we should figure out how to write more comprehensive tests. If we had more than two layers, we could see how many layers would have caught the bug, giving us a better sense of our coupling and independence.
And that benefit answers Allen's followup question:
I’m asking why the teams are not finding the problems that an after-the-fact QA org finds. What’s wrong with the team’s process that bugs leak out?
Because you need to know what slips through to know what's wrong with the process. Without that information you don't have anything to feed back into the other layers! That alone makes after-the-fact QA useful to our DiD efforts.
(The other answer is "who the hell thinks you can consistently deliver bug-free software?" Even formal methods people don't think that!)
Okay I promised a bonus tech newsletter this week and that was it, hope you enjoyed a++ goodbye now