NULL BITMAP by Justin Jaffray logo

NULL BITMAP by Justin Jaffray

Subscribe
Archives
November 17, 2025

Ordering and Partitioning

NULL BITMAP.png

I've recently been downsizing my Magic: The Gathering collection to cards that I really want to keep. I think a "collectible card game" is a good way to indulge a lot of my worse packratting tendencies, and living in a small New York apartment is not really all that great of a situation for being a hoarder. But it's a good impetus for me to ask which of these cards I actually want to continue to own.

When you sell Magic cards to an online store they want them sorted in a particular way (or, some will offer a reduced purchase price for unsorted cards). Having played Magic for like, six years at this point, I have developed a pretty good skill at sorting fistfuls of cards by colour and then alphabetical, which is the way that individual sets are usually sorted. Given a stack of cards all of the same colour I can do a pretty decent job of quicksort partitioning it alphabetically (until the piles get small enough that I can just do insertion sort).

I'm less good, however, at the ordering that the vendor I'm planning to sell my cards to wants, which is, predominantly, ordered by set name alphabetically.

So, my strategy for this has been largely to find something to partition the cards on (stuff like "did this card come from a set before or after I started playing Magic") until the piles were small enough to bucket sort them, and make a pile for each category, which in this case was the sets. At this point, I don't really have an easy way to order the sets: they'll need to be ordered alphabetically eventually but it's kind of a pain.

It occurred to me that I still had done something sort of useful and interesting just by doing this grouping though: even though the sets aren't ordered, they are grouped: there are no two cards from different sets that are not in the same group.

This is a concept that comes up in query planning, that of partitioning vs. ordering.

Given a set of columns in a relation, a peer group is a set of rows that all have the same values for those columns. A list of the rows in a relation is partitioned if all of the peer groups are clumped together.

If we have this relation:

A | B
--+--
1 |20
2 |20
1 |10

This order of rows is partitioned on {B}, because the two peer groups, 10 and 20, are all clumped together. It's not partitioned on A because the 1 peer group is split up. Perhaps interestingly, it is partitioned on {A,B}, because every peer group is trivial.

A list of rows is ordered on a list of columns if it's sorted lexicographically on those columns, according to some definition of how the various values in the columns are to be ordered. One thing that is fairly obvious is that ordering is a stronger property than partitioning: a list of rows that's ordered on a list of columns is also partitioned on that set of columns.

In a lot of cases, query engines eschew caring about partitioning outside of something like window functions and rely on just ordering. There's a well known technique for optimizing memory use for aggregations where we rely on the fact that our data must be ordered. Rather than build a hashmap over our groups, we just rely on the fact that ordering induces a partioning and we can just aggregate each group as we see them.

This uses a bit more power than is necessary though, to do this kind of streaming group-by, we don't actually need our data to be ordered, we only need it to be partitioned: the important property is that all the peer groups for the grouping columns are together.

Unfortunately I think it's not really practical to rely on this kind of partitioning rather than ordering in most real-world cases because in order to keep data partitioned we need to easily be able to find the relevant groups, and the easiest way to do that is for everyone to agree on an ordering of the data.

Don't miss what's next. Subscribe to NULL BITMAP by Justin Jaffray:
Start the conversation:
GitHub Bluesky X
Powered by Buttondown, the easiest way to start and grow your newsletter.