Some notes on code review
A brief personal note
I missed a newsletter last week, due to a combination of procrastination and MIT Mystery Hunt, but also due to adopting this little guy into our home!!!!
His name is Ranger, he turns 9 weeks old tomorrow, and he is so precious and perfect. He has an Instagram if you want more photos.
On code review
Fully a decade ago, I wrote a brief post about code review, in which I shared my enthusiasm for code review in principle, but my frustrations with code review requirements in practice. Since then, I’ve spent a lot more time as a professional software engineer and worked on a lot of different teams and sizes and scales of projects. I thought it was worth writing an updated version of my current thoughts and opinions on code review, what it’s for, and what it takes to make it work well.
On the whole, I would say that these days I love doing code reviews and reviewing code, and find it one of the most fulfilling parts of my job. It’s one of the best ways of really engaging with other engineers to improve our skills and build the best systems we can. That said, I think that my experience with code review goes in many ways contrary to a lot of popular wisdom or perceptions; in particular, I find good code review processes to center the developers and their users much more so than it centers the particular lines of code being reviewed.
What is code review NOT for?
I find it easiest to open with a list of things that I don’t regard as the main goals of code review. I will note that in practice the details of a code review process vary widely with project, and occasionally it is necessary to assign responsibility for one of these considerations to the code review role. But these represent my opinions for the end state for a healthy project with mature tooling and development processes.
Anything that can be automated
In a development process, anything we can make a computer do instead of a human, we should. At a most basic level, this encompasses things like CI: if a proposed change fails to compile or breaks the tests, a computer should determine this automatically and notify the submitter. It shouldn’t be the responsibility of the code reviewer to build and test the code and complain to a submitter when something goes wrong.
Beyond CI, though, this also encompasses a broad range of stylistic and other issues. If your project wants to maintain a consistent formatting style, it should use an autoformatter, which can check for mis-formatted code on every push. If there are common code patterns you want to enforce the use or avoidance of, you should find a linter that can be programmed to detect them. If you require all source code files to have a license comment, CI should check that.
For sufficiently large projects and teams, this philosophy can mean a significant amount of work invested in automated tooling and linting. At Stripe, we had an extensive suite of internal Rubocop lint rules we used to try to push the code base towards uniformity and avoid common pitfalls while taking a burden off of code reviewers.
Investing in automated tooling results in a huge quality-of-life improvement for both authors and reviewers. Authors get faster feedback, since computers never sleep or get distracted, and reviewers get to feel less eagle-eyed about catching every small mistake, and also never have to feel obstructionist for blocking a merge due to a small stylistic complaint. Instead, the conversation can move to other, higher-level topics, like those I’ll discuss in the next section.
Catching every bug
Somewhat relatedly, I don’t think — in general — that the best use of code review is catching bugs in the author’s code. In a healthy development process, reviewers needn’t read each patch painstakingly searching for subtle off-by-one bugs or edge cases.
This isn’t to say that code reviewers won’t find bugs, or that they shouldn’t flag ones they find. A good code reviewer almost inevitably will find bugs, just by virtue of reading the code with fresh eyes and making sure they understand it.
However, while that is a real benefit of code review, I don’t see it as the reviewer’s main purpose. Their greatest contribution is in the ways I outline later, not as a human substitute for adequate testing. Perhaps most critically, I don’t believe the reviewer is responsible for ensuring correctness. If buggy code does make it through to production, we should not put the onus on the reviewer for missing it, but treat them as one aspect of the larger system, and adopt a very wide lens for what we can do differently next time.
What is code review for?
What, then, do I think code review is for in a healthy system? As I mentioned earlier, I think it’s actually less about the code itself, and more about the humans building and using the system under development!
Ensure the code is understandable and maintainable
On nearly any project of any size these days, it is inevitable that multiple developers will need to collaborate on and share responsibility for any system. Often, even, developers who did not write the code themselves will need to be on-call or otherwise bear operational responsibility for its behavior in production.
While a good engineer will try to write code with these goals in minds, we can’t avoid the fact that the author of a system or piece of code will always approach it, to some extent, with an author’s eyes. They will have some natural bias to focus on the work they had to do to write it, and to discount the experience of approaching the code as a reader trying to understand it in order to debug or extend it.
A code reviewer ensures that someone in the development cycle approaches the code as a reader, and can evaluate it for that perspective and advocate for the needs of all future engineers who much approach this code as a non-author of it. If they find something confusing or unclear or tricky, the author should be inclined to trust their judgment and consider suggestions for improvement. There’s a good chance future developers tasked with understanding the system will share the concern.
Creating a shared understanding of the system
The readability of the code is a property of the artifact, of the code and documentation committed by the change. In addition to the code itself, though, code review is also about the mental states and notions of the software system and its environment shared by the developers working on it.
One of my favorite essays about software engineering is Naur’s “Programming as Theory Building.” In it, Peter Naur argues that the act of designing and implementing software systems (“programming”) is best thought of as being an act less of creation of a software artifact, and more of creation of a “theory” — a view of the problem, and of the components of the created system and their organization and function and relation — inside the minds of the programmers building the system. I really like this perspective and find it informative about the process of building software systems, especially of doing so in teams and over extended periods of time.
From this perspective, I view code review as one of the foundational tools we have for aligning theories of the system between all developers on the team. By reviewing and engaging with and discussing each other’s code, developers learn about and influence each other’s theories of the system. By doing so in an iterated fashion, they increase the degree to which every member of the team shares a theory of the system. This in turn helps them understand concrete details and behaviors better, communicate more effectively about future development work and potential changes to the system, and generally work much more effectively as a coherent team.
Reviewing process and purpose
Code review, despite the name, is not just about the code! Perhaps the most important jobs of a reviewer are to review the work that happens around and before actually writing code, and understand and evaluate the author’s decisions in those parts of the change. Here are two major areas I think about under this rubrik.
What is this change trying to accomplish, and how and why?
What bug or feature is this code change addressing? Is it addressing it in a way that makes sense — e.g. is the benefit commensurate with the complexity or other costs added? Is this code change actually addressing the problem it set out to solve?
Some of the most valuable code reviews are the ones which result in a change never shipping, or getting completely rewritten. A reviewer who can point out an alternate solution or a simpler workaround for the problem may leave the system much better-off than any number of smaller fixes to the implementation would.
This role needs to be taken on with care, of course, and can veer into stop energy if used carelessly or maliciously. But in a healthy project with a shared vision and sense of values and high-level priorities, a reviewer who can tactfully and appropriately ask “Is this really what we should be working on?” or “Is this the right approach at all?” can be invaluable.
Of course, this kind of feedback is usually best given earlier in the process, suggesting the need for and value of review early in the process, not just at the 11th hour. But sometimes we can’t appreciate the full complexity of a feature or its unexpected interactions until we’ve done out the actual implementation details. Even if a change was already approved in principle, reviewers should be prepared — but never eager — to push back on it in its entirety.
Why is the author confident in the change?
What is the test plan? Did the author test this change manually, with existing automated tests, by writing new tests, or by inspection and thinking hard? If this change can’t be tested at all until production, what metrics or indicators will we look at during a deploy? What is the worst case if the change goes wrong? Are we prepared to roll back or revert?
No matter what answers the author has, the code reviewer should evaluate them and ask themselves how they would want to test this change, and whether the test plan seems adequate to the nature of the change and the system. If the author has written unit tests, the reviewer should review them and decide whether they are enough tests or appropriate tests. If the reviewer has tested by hand, the reviewer should understand what tests they ran, and agree that testing by hand is appropriate.
As a concrete example, suppose I’m reviewing some piece of code and suspect it of having an off-by-one bug. Instead of sitting down and puzzling out the indices to decide whether it has a bug or not, I will check to see if there is a test case demonstrating that it works in the edge case of concern. If there isn’t, I’ll suggest adding one. What I’m doing here, then, is not merely about spotting a potential bug, but instead making sure that the change and the code itself present a convincing argument that it is adequately correct, and refining or improving that argument.
Here, also, we find the exception to the “code review is not about finding every bug” point from earlier. Sometimes it really is the case that — due to a lack of testing infrastructure, challenging dependencies, or just out of expediency — we can’t adequately test code, and have to resort to “being very careful.” In those cases, it’s the reviewer’s job to go through the code at least as carefully as if they were the author themselves, backstopping the author’s careful analysis with their own validation.
Learning and teaching
Code review is one of our best opportunities as professional engineers for both learning and teaching our craft, and getting a little better at it every day. Reading other developers’ code and communicating with them about it exposes us to new knowledge and approaches, which we can adopt and incorporate into our own practice.
This kind of learning can take all sorts of forms. We can learn new APIs or language features we didn’t know, new patterns for building systems, new “gotchas” or obscure corners of the language. We also sometimes just gain unexpected insight into the shape of a problem or how someone thinks about it; we might discover patterns that we knew, but applied in a surprising way to solve a problem.
Learning can happen in both directions — the author can learn from the reviewer, and the reviewer can learn from the author. Additionally, while it will be common for a more-junior engineer to learn tricks or techniques from a more-senior engineer, you’d be surprised how often it can go in the other direction! Seniority is not a monotonic ladder of knowledge, and junior engineers can often have things to tech longer-tenured team members. One particularly common place I’ve seen this is that junior engineers may be more familiar with newer tools or frameworks or processes, having just learned them. They can point out tools that didn’t exist when a more senior engineer was learning a framework or learning an organization’s stack, helping tenured engineers keep their toolkit fresh.
Junior engineers can also help senior engineers see their own code through the eyes of someone new to the codebase! As a senior engineer on a mixed-seniority team, part of your job is absolutely to make sure that the system is usable and understandable by everyone on the team, not just those with deep understanding of every piece of arcana. Junior engineers asking “Hey, I don’t understand what’s going on here, can you explain it?” serve a valuable purpose in flagging places of potentially-unnecessary complexity or obscurity, and sometimes even help the author realize they were over-complicating a problem that could be solved in a much simpler fashion.
Conclusion
As I’ve gained experience and worked on larger systems, I find that my perspective has shifted away from the actual lines of code that we write, and more to the processes and structures that write that code, and the ideas and models that inform it. My opinions on code review share in this trend.
I’m also aware that what I describe here is a bit of an idealized vision, that glosses over a lot of implementation details and the details of any particular organization. I’m really not a believer in The One Perfect Software Engineering Paradigm, so I would expect what I say here to vary a lot in the details between organizations, even successful and healthy ones.
Additionally, while re-reading my old post on code review, I was fascinated to find that my 10-year-younger self expressed a number of similar ideas, but focused on my frustrations in practice with the delay and costs to focus and job satisfaction I experienced from mandatory code review policies. These are super real, and while I believe that healthy organizations can mitigate these quite a lot with good organization design and practices, I’ve largely glossed over those kinds of operational details here. Perhaps I’ll cover them in a future post!