When Compositionality Fails

compositionality

                March 25, 2024

            When Compositionality Fails

            The idea of abstraction is that we can take some complex thing and present it as some simpler thing. Where people can ignore the aspects of the thing not relevant to what they're actually trying to do and only focus on the parts that are.
One of the tools we often use to accomplish this is compositionality. That we can build up the complex thing by combining a bunch of smaller things. We see this all the time: I import a bunch of libraries to make an application, the user of said application doesn't have to know anything about the libraries. I combine a bunch of parser combinators to build a parser where you can't actually observe the individual combinators any more.
I visualize this as a kind of opaque bubble wrapping up the things that we're combining in some way.

Once we've wrapped things up into a bundle, the new thing is all that exists, the concept of the interior things fades away and is only known to the implementer. 
A lot of the problems we tend to describe as hard—in various senses of the word—we do so precisely because they are resistant to being solved compositionally. 
Distributed Systems
The classical example of a system that (necessarily, not simply because of an inadequate abstraction!) resists abstraction is a distributed system. The existence of partial failure means that a user can observe the system in a degraded state where it still sort of works. This is unavoidable in a world where we want the positive aspects of a distributed system: fault tolerance, scalability, the ability of teams in a company to operate within their own fiefdoms (to some extent).
Even in systems that work very hard to "provide the appearance of a single-node system," like canonically consensus systems, there are still unavoidably situations that reveal to the outside world the fact that they are interacting with a distributed system: if I'm simultaneously (even indirectly) communicating with two sides of a network partition, then I can tell that only one half of the system is able to make any progress or do any work.
NP-Hard Problems
To me, NP-Hard problems are sort of a poster child for this kind of resistance as well. Take the case of join-ordering. Joins, as with every kind of relational algebra operation, are closed: a join is relations in, relations out. If I have two views, A and B, which are each defined as a join of some set of tables, and then I join A and B, it's entirely possible to compute that result without knowing anything about the definition of A and B. This is great! This is one of the big wins of the relational model: it gives us a universal representation that allows us to build abstractions over data. 
However! Compositionality of the join computation doesn't confer compositionality of the optimality of the plan. If we have an optimal plan for A, and an optimal plan for B, the plan A join B is not necessarily optimal:

To find an actually optimal join plan, we would have to decompose A and B into their constituent inputs, and then build an entirely new plan that is aware of all of them together.
This is a violation of compositionality: someone has to know how A was defined in order to use it effectively.
A lot of problems that are "hard" can be explained as such by thinking about the ways they resist abstraction and simplification. I think this model is helpful in explaining non-technical concepts as well. Why do companies have "skip-level" meetings, where your manager's manager talks to you? It's because a manager is not an airtight abstraction for the organization underneath them: to get an idea of what's going on, you have to peek behind the curtain (even though in many cases it's very useful to be able to provide that curtain).

Don't miss what's next. Subscribe to NULL BITMAP by Justin Jaffray:

Start the conversation: