How do we get to Good Software? (Across the Sundering Seas 2020 #05)
Hello, fellow sojourners!
I’m Chris Krycho, and this is Across the Sundering Seas—a weekly newsletter digging into the things I’ve been reading and thinking about. Call it an attempt to keep the work of study and reflection growing in one tiny corner of our culture. If you have too many emails in your inbox and this one just isn’t doing it, you can always unsubscribe. On the flip side, if you like what I’m up to here, would you consider sending it along to a friend who might like it? Either way, thanks for reading along this far!
Now, let’s talk software quality, prompted by two posts I’ve read over the last few weeks, on the theme of the reliability of software and the practice of ethical software development.
(A slightly painful admission: I’m breaking the normal routine for this newsletter very slightly. Instead of picking one item to link, I’m picking two, because they’re both swirling around in my notes and thinking at the moment. It’s a painful admission because, well, I maintained the precise definition of the format I switched to this year for exactly three writing weeks. I trust you’ll forgive me when all is said and done!)
The first prompt was a guest entry by software engineer William Wechtenhiser in Matt Stoller’s Big newsletter, Does Microsoft Have a Boeing 737 Max Style Crash Every Week? The article made the rounds in a few of my circles a week ago, and I’ve been mulling over my points of disagreement with it ever since. The key claim:
We each carry a small supercomputer with unimaginably powerful sensors in our pockets. With the right policies, we could have a flourishing paradise of liberty and improve our lives in remarkable ways.
But that isn’t what we have. Instead, the policy choices of the last thirty years seem to have led to Boeing 737 Max-style crises, everywhere, but out of sight.
The second was Kyle E. Mitchell’s post Open Source Should Come With Warranties. (Mitchell is a great read in general: a lawyer focused on open-source software who actually builds open-source software. I often disagree with him… at first. And then I chew on the things he says for a while, and then a lot of times I change my mind.) He writes:
Reality Check: Software should be good, and good software should come with warranties. Especially software that many people rely upon. Open source is not a house of cards, and it shouldn’t be papered like one.
The common theme between these two threads is an emphasis on the (lack of) quality in software.
There are a lot of problems with Wechtenhiser’s analysis—not least his conflation of all sorts of different systems under the same heading. The kinds of security problems he opens with are wildly different from Facebook and YouTube’s algorithm problems, and both are wildly different from the kinds of problems that the Boeing 737 Max software suffered. Conflating these actually serves to undercut Wechtenhiser’s other points, at least to those “in the know.”
He points to these problems as justification for the need for sweeping increases in regulation. Anyone who has actually followed how regulation has played out in the software industry for the last few decades knows, though, that sweeping regulations that attempt to address the entire software industry at a go don’t work out well. They end up toothless, serving mainly to annoy users everywhere (cookie-acceptance banners, anyone?).
The very examples he offers in the piece serve to highlight why: of course regulations for plane software are stricter than the regulations for a social network. This is as it should be. Yes, social networks can have serious consequences in the world—but a bug of the same class as the Boeing software in one of Facebook’s algorithms doesn’t kill people. Even the kind of serious operating-system bug that Wechtenhiser offers as his primary example becomes dangerous only when actively exploited—quite unlike a bug that will actively cause harm to hundreds of people even when operators are doing the right thing!
Moreover, and rather ironically, the engineers who take security most seriously are not the ones he points at. Big companies care far more about these things than the real risk vectors do: smaller startups with less to lose and stronger incentives to take big risks. And at the same time, he fails to acknowledge that different risk levels and different degrees of investment in quality and security are appropriate for companies of different sizes, with different customers, solving different problems. The tools that underpin how we transmit data securely—protocols like TLS—should be subject to the most rigorous security audits possible. The software I use to power my blog? Not quite as big of a deal. Not because it doesn’t matter if things get hacked—it definitely does!—but because factors of scale, centrality, and importance mean that some things deserve deep investment and others… just don’t.
We can see this by way of analogy to construction. (I don’t think construction metaphors are as bad, or as good, as many other engineers tend to suggest. My acquaintance Glenn Vanderburg has a great talk on the relationship of software to other engineering fields, and I commend it to you.) We take great care in how we build houses. But we don’t build them the same way we build skyscrapers. And we don’t build them the same way we put up a tent when we’re camping. This is as it should be! We don’t want any of these structures to fall over, and all of them should be designed with an eye to making sure that the people who use them are appropriately sheltered from the elements and that there is no undue risk that the structure itself will harm the people using it. But we don’t build tents or homes with I-beams, we don’t think a tarp is the right material for house walls or skyscraper structures, and we don’t consider a couple of pegs or even a concrete slab an appropriate foundation for a 40-story building. Different kinds of buildings have different requirements. Exactly the same is true of software, just as you would expect.
The net of this is that Wechtenhiser’s argument is quite wrong in the details of both the problems he identifies and the solutions he proposes—but he’s not wrong on the merits of one central claim: much of our software does need to be better, and software developers do need to take the quality of that software more seriously. To return to the other piece I’ve been reflecting on this week, I’ll just quote Mitchell again:
Reality Check: Software should be good, and good software should come with warranties. Especially software that many people rely upon. Open source is not a house of cards, and it shouldn’t be papered like one.
Here he is specifically addressing open-source software, in a post/essay talking about the dynamics and costs in play around open-source software (and again: I commend it to you, even if I have points of disagreement throughout). But this line in particular applies to all software. Software should be good. In open source, the idea that good software should come with warranties runs right up against some of the most central tenets of the culture: warranties are explicitly disclaimed by nearly every license available.
There are good reasons for that rejection of responsibility. Imagine someone shares a small library she built in her spare time, and then Google starts using it in some critical piece of infrastructure (without doing due diligence on it), and a bug causes catastrophic data loss or a security vulnerability. The aim of open-source licenses is that, in this hypothetical scenario (and many not-so-hypothetical scenarios), this developer’s responsibility for that bug would not be a question of legal liability. We all rightly recognize that it would be grossly unfair for her to be held legally liable for those damages—because the fault here, in both a legal and a moral sense, would be Google’s. Everyone makes mistakes (in every industry, including fields of engineering which require licensure). But the context of those mistakes matters.
The problem is that legal licenses (like laws, and like regulations) do not just define legal norms. They also, over time, create and reinforce cultural norms. When most open-source licenses disclaim all responsibility for—that is, refuse to warrant—the software to which they apply, how could it not affect the culture of professional software development? Especially when, in many parts of the industry, so much professional career development happens through public, open-source work?
It is hard to write good software while simultaneously rejecting the idea that you are responsible in any way for its quality. Something has to give. I suspect the answer is closer to where Mitchell lands than where Wechtenhiser ends up, though I do think that some very careful regulation of the software industry (not like the DMCA or other similarly awfully written laws) is probably in order. Insofar as software is increasingly central to every part of our economy, it is essential that we take it seriously. But taking it seriously means, in part, doing the hard work of treating the protocols for training machine-learning algorithms differently than blog software, Windows security holes differently than avionics software, and little side projects differently than TLS implementations. It’s time indeed for the law to catch up, but time also for the rhetoric around these things to be smarter and more careful, lest the laws we end up with only make things worse.