Why Not Comments


                September 10, 2024


                Why not "why not" comments? Not why "not comments"

            Logic For Programmers v0.3
Now available! It's a light release as I learn more about formatting a nice-looking book. You can see some of the differences between v2 and v3 here.
Why Not Comments
Code is written in a structured machine language; comments are written in an expressive human language. The "human language" bit makes comments more expressive and communicative than code. Code has a limited amount of something like human language contained in identifiers. "Comment the why, not the what" means pushing as much information as possible into identifiers. Not all "what" can be embedded like this, but a lot can.
In recent years I see more people arguing that whys do not belong in comments either, that they can be embedded into LongFunctionNames or the names of test cases. Virtually all "self-documenting" codebases add documentation through the addition of identifiers.¹
So what's something in the range of human expression that cannot be represented with more code?
Negative information, drawing attention to what's not there. The "why nots" of the system.
A Recent Example
This one comes from Logic for Programmers. For convoluted technical reasons the epub build wasn't translating math notation (\forall) into symbols (∀). I wrote a script to manually go through and replace tokens in math strings with unicode equivalents. The easiest way to do this is to call string = string.replace(old, new) for each one of the 16 math symbols I need to replace (some math strings have multiple symbols).
This is incredibly inefficient and I could instead do all 16 replacements in a single pass. But that would be a more complicated solution. So I did the simple way with a comment:
    # Does 16 passes over each string
    # BUT there are only 25 math strings in the book so far and most are <5 characters.
    # So it's still fast enough.

You can think of this as a "why I'm using slow code", but you can also think of it as "why not fast code". It's calling attention to something that's not there.
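For concreteness, here is a minimal sketch of what the simple version might look like with the comment attached. The function name `fix_math` and the four-entry `MATH_SYMBOLS` table are hypothetical stand-ins, not the actual script, which covers 16 symbols.

```python
# Hypothetical subset of the symbol table; the post mentions 16 symbols in total.
MATH_SYMBOLS = {
    r"\forall": "∀",
    r"\exists": "∃",
    r"\land": "∧",
    r"\lor": "∨",
}


def fix_math(string: str) -> str:
    """Replace LaTeX-style math tokens with their unicode equivalents."""
    # Does one full pass over the string per symbol in MATH_SYMBOLS
    # BUT there are only a couple dozen math strings in the book and most are tiny.
    # So it's still fast enough.
    for old, new in MATH_SYMBOLS.items():
        string = string.replace(old, new)
    return string
```

The comment is the only place in this sketch where the rejected alternative shows up at all; nothing in the loop itself hints that a single-pass version was considered.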
Why the comment
If the slow code isn't causing any problems, why have a comment at all?

Well first of all the code might be a problem later. If a future version of LfP has hundreds of math strings instead of a couple dozen then this build step will bottleneck the whole build. Good to lay a signpost now so I know exactly what to fix later.
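If that day comes, the fix could look something like the sketch below: one regex alternation built from the symbol table, applied in a single pass. This is one plausible approach rather than the only one, and `fix_math_single_pass` and the abbreviated `MATH_SYMBOLS` table are hypothetical names for illustration.

```python
import re

# Hypothetical subset of the symbol table; the post mentions 16 symbols in total.
MATH_SYMBOLS = {
    r"\forall": "∀",
    r"\exists": "∃",
    r"\land": "∧",
    r"\lor": "∨",
}

# One alternation over every token. Longest tokens go first so that a token
# which is a prefix of another cannot shadow the longer one.
_TOKEN_PATTERN = re.compile(
    "|".join(
        re.escape(token)
        for token in sorted(MATH_SYMBOLS, key=len, reverse=True)
    )
)


def fix_math_single_pass(string: str) -> str:
    """Replace every known math token in one scan of the string."""
    return _TOKEN_PATTERN.sub(lambda match: MATH_SYMBOLS[match.group(0)], string)
```

It is not a lot of code, but it brings in escaping, ordering, and a callback, which is the extra complexity the simple loop avoids.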
But even if the code is fine forever, the comment still does something important: it shows I'm aware of the tradeoff. Say I come back to my project two years from now, open epub_math_fixer.py and see my terrible slow code. I ask "why did I write something so terrible?" Was it inexperience, time crunch, or just a random mistake?
The negative comment tells me that I knew this was slow code, looked into the alternatives, and decided against optimizing. I don't have to spend a bunch of time reinvestigating only to come to the same conclusion. 
Why this can't be self-documented
When I was first playing with this idea, someone told me that my negative comment isn't necessary: I could just name the function RunFewerTimesSlowerAndSimplerAlgorithmAfterConsideringTradeOffs. Aside from the issues of being long, not explaining the tradeoffs, and having to change it everywhere if I ever optimize the code... this would make the code less self-documenting. It doesn't tell you what the function actually does.
The core problem is that function and variable identifiers can only contain one clause of information. I can't store "what the function does" and "what tradeoffs it makes" in the same identifier. 
What about replacing the comment with a test? I guess you could make a test that greps for math blocks in the book and fails if there are more than 80. But that's not testing EpubMathFixer directly. There's nothing in the function itself you can hook into.
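A rough sketch of such a test, to show how indirect it is. The `$...$` delimiter for math blocks and the names `count_math_strings` and `check_math_budget` are assumptions made for illustration; the threshold of 80 comes from the paragraph above.

```python
import re

MAX_MATH_STRINGS = 80  # threshold mentioned in the text

# Assumption: math blocks in the book source are written as $...$.
MATH_BLOCK = re.compile(r"\$[^$]+\$")


def count_math_strings(text: str) -> int:
    """Count the math blocks found in the book source text."""
    return len(MATH_BLOCK.findall(text))


def check_math_budget(text: str) -> None:
    """Fail once the book has more math than the slow fixer was judged OK for."""
    count = count_math_strings(text)
    assert count <= MAX_MATH_STRINGS, (
        f"{count} math strings found; the multi-pass replacement "
        "may now be worth optimizing"
    )
```

Note that the check only looks at the book's text. It never calls the fixer, and the reason for the limit still ends up in a message string, which is a comment by another name.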
That's the fundamental problem with self-documenting negative information. "Self-documentation" rides along with written code, and so describes what the code is doing. Negative information is about what the code is not doing. 
End of newsletter speculation
I wonder if you can think of "why not" comments as a case of counterfactuals. If so, are "abstractions of human communication" impossible to self-document in general? Can you self-document an analogy? Uncertainty? An ethical claim?

¹ One interesting exception someone told me: they make code "more self-documenting" by turning comments into logging. I encouraged them to write it up as a blog post but so far they haven't. If they ever do I will link it here.



Join the discussion:

            Devam Manke

                Sep. 11, 2024, morning

I can't store "what the function does" and "what tradeoffs it makes" in the same identifier.

Why not?


            Glenn

                Sep. 18, 2024, morning

    Why would you even want to???  Comment syntax exists for a reason.


            Vic

                Sep. 11, 2024, morning

    Comments that explain 'why' are normally good and necessary. Either why or why not.
