Teaching Accidental and Essential Complexity

formal specification

                October 18, 2022

            Teaching Accidental and Essential Complexity

            So you may know that I'm teaching a 1-day TLA+ workshop in December (just 10 slots left!) This is unlike my normal workshops because it's only 1 day and for 35 people instead of 3.¹ The two workshops share almost no content between them. To understand why, I need to go into a bit of teaching theory.
For the 700 or so new readers, TLA+ is a form of formal specification language. You can make a design for a system and then directly test the design itself for bugs. It's an extraordinarily powerful technique but also notoriously difficult to learn. There are two reasons why. First, TLA+ involves a lot of skills and concepts that people aren't familiar with, like formal logic and model checking. These concepts are all tightly interrelated and you need to understand them all to make TLA+ useful. This is essential difficulty: there's no easy way to simplify the language while maintaining its power. Second, the tooling is very unforgiving, and there's lots of unclear error messages and footguns. This is accidental difficulty: there's only a couple people working on the main tooling and they haven't had the time or resources to make things better.
I can tailor my workshops to focus on one of these two. To reduce the conceptual difficulty, I can "scaffold" the teaching, or introduce each topic in isolation and gradually build up the whole language. In TLA+, this is done by using a special language, called PlusCal, that compiles to TLA+. PlusCal is much easier for people to grok: 
\* TLA+

\E s \in Servers:
  /\ status' = [status EXCEPT ![s] = TRUE]

\* PlusCal
with s \in Servers do
  status[s] := TRUE;
end with;

The tradeoff is that PlusCal adds a lot more accidental complexity. People now have to learn two syntaxes and learn how to use the PlusCal compiler, which is as unforgiving as the rest of the TLA+ tooling.
Normally, I think that's a worthwhile tradeoff: it's more important to reduce the essential difficulty of the material than the accidental difficulty. And for my small workshops, it works very well. 
Accidental Difficulty Doesn't Scale
Here's why: when someone runs into a tooling issue, the only way to resolve it is to walk over, look at their screen, and fix the mistake for them. This works for 4 students.
It does not work for 35.
To scale the workshop, I needed to reduce the tooling friction as much as possible. So no PlusCal, everything has to be done in pure TLA+. Which means the students now face all of the essential complexity immediately. In TLA+, all the concepts are very tightly bound. Here's the "canonical example" in Lamport's book:
---- MODULE HourClock ----
EXTENDS Naturals
VARIABLE hr
HCini == hr \in (1 .. 12)
HCnxt == hr' = IF hr # 12 THEN hr + 1 ELSE 1
HC == HCini /\ [][HCnxt]_hr
----
THEOREM HC => []HCini
=====

What the heck is hr'? What's HCini /\ [][Hnxt]_hr mean? What's HC => []HCini? This isn't even a nondeterministic specification, much less a concurrent one!
Scaffolding is still necessary, just way harder when I can't rely on PlusCal. I spent like a month iterating on different progressions before settling on one that I thought worked. One thing that helped here: using an example that bootstrapped off an existing mental model people had.
Meatspace modeling
I've talked about this idea before. As a refresher: when you use analogize a concept to the real world situation, people can use their existing mental models to work through the nuances, making it easier to internalize.
For example, imagine a bunch of people all trying to find seats in an auditorium. Each person is nondeterministic: they can pick whatever seat they want, as long as it's not occupied. The system is concurrent: people are making decisions independently of each other and don't pick seats in a fixed order. With enough finesse, we can also intuitively introduce concepts like stuttering, fairness, and even refinement. Certainly beats the hour clock as the starting example.
I also found that the corresponding specification can nicely be translated to one on threads and semaphores, showing that people are, in fact, learning something useful, as well as that a high level the spec represents an abstraction— converting "sitting and standing" to "use a semaphore" is just a matter of changing all the operation names.
tl;dr

The difficulty in a subject can depend on both difficulties innate to the domain and incidental difficulties
You may have to prioritize one over the other for a given lesson
The structure of the workshop can affect which is the right choice to prioritize
You can reduce impact of the other with hard work and clever lesson design
You should totally sign up for my workshop to see this principles in action

The constraints are because I first ran it at Strangeloop. People seemed to really like it! ↩

            If you're reading this on the web, you can subscribe here. Updates are once a week. My main website is here.
My new book, Logic for Programmers, is now in early access! Get it here.

Don't miss what's next. Subscribe to Computer Things:

Start the conversation: