Documentation could be so much better

November 10, 2021

A year or so back I wrote the newsletter post "Don't use Markdown for documentation", where I complained about its lack of extensibility and cross-project referencing. People's critique fell along two lines:

"Everybody on the team already knows and is willing to write markdown": fair enough— I mostly care about dense, complex technical writing by an enthusiastic writer.  If you need everybody to pitch in, you shouldn't force them to learn a completely new technology. 
"Markdown does everything I need", aka "better things aren't possible."

Documentation is bound to our idea of text, in particular physical text. A book is a single static thing; it cannot change to fit the needs of the user. Hypertext is a step beyond that: we can link references and ctrl-F.
Past that we… stop. We write the documentation as if it's just a bunch of unstructured text with hyperlinks in it. Any markup is just there for formatting. We wrap code in ```codeblocks``` in order to highlight and monospace something, not because the text should "know" it's a codeblock.
Any human text is going to be a curious mix of partially structured data, and documentation is no exception. Skimming some books I've got, I see a lot of structure:

Terms and definitions
Exercises and solutions
Admonitions: tips, warnings, notes, etc.
References
Code snippets
Code listings
Troubleshooting guides¹

There's also higher-order organization, like introductions, summaries, and examples. All of this is semantic information we can work with. Take definitions. In most cases a term is defined in one place, and if you're lucky there's also a glossary. But there's no reason the definition couldn't be a popup wherever the term appears in the text. Hover over the term and it shows a definition.
Pyret explores a similar idea. They draw connections between the jargon they use and the code it corresponds to. Then they can highlight both with the same colors, which helps students internalize the connections.

This is why, when I don't have to collaborate with others, I prefer to write documentation in reStructuredText. I can mark terms inline like :term:`this`. rST is a lot more heavyweight than markdown and has some really annoying rough edges, but it also has a lot more power.
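As a rough sketch of what marking terms buys us, here's how a build step could turn each marked term into a hover popup. The glossary contents and the `render_terms` helper are made up for illustration, and a real toolchain would do this through its own glossary machinery rather than a regex:

```python
import html
import re

# Hypothetical glossary; in a real project this would come from the docs themselves.
GLOSSARY = {
    "generator": "A function that yields values one at a time.",
}

# Matches the inline markup :term:`name`.
TERM_ROLE = re.compile(r":term:`([^`]+)`")


def render_terms(text, glossary):
    """Replace each :term:`name` with an <abbr> whose title is the definition.

    Only the term markup is handled; the surrounding text is passed through as-is.
    """

    def replace(match):
        name = match.group(1)
        definition = glossary.get(name.lower())
        if definition is None:
            # Unknown term: keep the word, skip the popup.
            return html.escape(name)
        return '<abbr title="{}">{}</abbr>'.format(
            html.escape(definition, quote=True), html.escape(name)
        )

    return TERM_ROLE.sub(replace, text)


print(render_terms("A :term:`generator` is lazy.", GLOSSARY))
```

The point isn't the HTML, it's that the text "knows" which words are terms, so the renderer gets to decide what to do with them.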
We can also encode intent into the documentation. We're already used to putting auxiliary information in admonitions to say it's useful but inessential. The problem is that, historically, we couldn't put too much information in admonitions: every tangent takes up valuable space and distracts the reader from the main text. With semantic information we can make it expandable, or hide it entirely for beginners. In rST, that could look like:
some core information

.. advanced::
  A complex section 
  useful for going further
  but not for newbies

Then we can hide that section in a <details> tag and only show it when the user intentionally clicks it.
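Here's a minimal sketch of that transformation in plain Python. It assumes the made-up `.. advanced::` directive from above, treats every other non-blank line as its own paragraph, and doesn't try to be a real rST parser; `render_advanced` is a hypothetical helper, not part of any existing tool:

```python
import html


def render_advanced(source):
    """Wrap each `.. advanced::` block in a collapsed <details> element."""
    out = []
    lines = source.splitlines()
    i = 0
    while i < len(lines):
        line = lines[i]
        if line.strip() == ".. advanced::":
            i += 1
            body = []
            # The directive's content is every following indented (or blank) line.
            while i < len(lines) and (lines[i].startswith(" ") or not lines[i].strip()):
                if lines[i].strip():
                    body.append(lines[i].strip())
                i += 1
            out.append("<details><summary>Advanced</summary>")
            out.append("<p>{}</p>".format(html.escape(" ".join(body))))
            out.append("</details>")
        elif line.strip():
            # Ordinary text: emit it as a paragraph.
            out.append("<p>{}</p>".format(html.escape(line.strip())))
            i += 1
        else:
            i += 1
    return "\n".join(out)


SOURCE = """some core information

.. advanced::
  A complex section
  useful for going further
  but not for newbies
"""

print(render_advanced(SOURCE))
```

The same marker could just as easily be dropped from a "beginner" build entirely, which is the advantage of recording intent instead of formatting.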
Another idea: instead of seeing the text as either linear or random-access, we can represent it as a DAG. The nodes are topics, the edges are knowledge dependencies. Each part of the text can list its dependencies:
Generators
----------

.. topic:: generators
  :covers: yield
  :requires: lists, functions, iterables

If the reader is reading linearly, we can guarantee on build that every topic is only introduced after all of its requirements have been covered. If the reader is doing random access, we can have each topic list the requirements and link to both the main text and a reference.
To be clear, none of these are radical. They push the boundaries on what we do but not what we can do. And there's plenty of other low-hanging fruit, too; this is just what I thought of on the spot. They are all, though, things that we cannot do without semantic information, which means rethinking how we write documentation.
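The build-time check for linear reading is small. This sketch assumes the topics have already been pulled out of the source as `(name, covers, requires)` tuples in document order; that extraction step, the `check_reading_order` helper, and the sample topics are all hypothetical:

```python
def check_reading_order(topics):
    """Return (topic, missing requirement) pairs for a linear read-through.

    `topics` is a list of (name, covers, requires) tuples in document order.
    A requirement counts as covered once an earlier topic has that name
    or lists it under `covers`.
    """
    covered = set()
    problems = []
    for name, covers, requires in topics:
        for requirement in requires:
            if requirement not in covered:
                problems.append((name, requirement))
        covered.add(name)
        covered.update(covers)
    return problems


TOPICS = [
    ("functions", [], []),
    ("lists", [], []),
    ("generators", ["yield"], ["lists", "functions", "iterables"]),
    ("iterables", [], ["lists"]),  # introduced too late for "generators"
]

print(check_reading_order(TOPICS))
# [('generators', 'iterables')]
```

A build could fail whenever the list is non-empty, the same way it would on a broken link.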
Layouts could be better
While I'm here, I've always thought that the way we present documentation webpages is really limited. It's just flat HTML pages linking to other pages. Microsoft HTML Help was way ahead of that in the '90s:

Hierarchical content, keyword index, and search, in a different frame than the content. You can click a link and change the content without losing your place in the navigation. And because it's all loaded at once, there's no latency at all.
HTML Help isn't a great system, just a better one than what we currently have. A couple of improvements I thought of: first, a history stack of navigated pages, something that makes it easy to jump multiple steps backward and forward. Second, two panes for content: a "main" pane and an "aux" pane. Footnotes, definitions, and references open in the aux pane, so you can follow something without losing your place.
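The history stack is the easier of the two to pin down. A minimal sketch, with a hypothetical `History` class and page names, assuming the usual browser rule that following a new link discards the "forward" pages:

```python
class History:
    """Browser-style history that can jump several steps at once."""

    def __init__(self, start):
        self.pages = [start]
        self.position = 0

    @property
    def current(self):
        return self.pages[self.position]

    def visit(self, page):
        # Following a new link discards anything "ahead" of the current page.
        del self.pages[self.position + 1:]
        self.pages.append(page)
        self.position += 1

    def jump(self, steps):
        # Negative steps go back, positive go forward; clamp to the ends.
        target = self.position + steps
        self.position = max(0, min(target, len(self.pages) - 1))
        return self.current


history = History("index")
for page in ["install", "tutorial", "api"]:
    history.visit(page)

print(history.jump(-2))  # install
print(history.jump(1))   # tutorial
```

Showing `history.pages` as a clickable list is what turns "back, back, back" into a single jump.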

Anyway, documentation is bad, and it could be so much better. My problem with markdown is that it limits what we can do, so we don't try to do anything new and exciting. This makes things much harder to learn than they need to be.
(Sorry this post was late. I had massive writer's block early in the week and figured you'd all prefer late to bad.)

Alloy 6
Alloy 6 is now out! It adds temporal logic operators, something Alloy's sorely needed. I don't have an opinion on it yet, but I'm playing with it so I can update the alloydocs. Hope to have a writeup by the end of December!

¹ Okay, this one comes from a cookbook, but more programming books should have troubleshooting sections.
