New Essay and a Very Peculiar Optimization
New Essay
J Notation as a Tool of Thought. Many of you already read it as a newsletter piece, but this new version is considerably expanded and has a lot of new stuff. Check it out!
There's one thing the newsletter had that the blog essay does not: when talking about grades in the old version I introduced something called the "anagram index". Each permutation vector of length N can be mapped to a single number, its anagram index, between zero and N!-1.
A. 0 1 2 3 4
0
A. 0 1 2 4 3
1
A. 1 0 4 2 3
28
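For the curious, the mapping itself is the Lehmer code, a.k.a. the factorial number system: each position contributes (rank of the element among the values not yet used) times (factorial of the positions remaining after it). A sketch in Python rather than J, with a function name of my own choosing:

```python
from math import factorial

def anagram_index(perm):
    """Map a permutation of 0..n-1 to its index in lexicographic
    order, via the Lehmer code / factorial number system."""
    remaining = sorted(perm)
    index = 0
    for i, p in enumerate(perm):
        # rank = how many still-unused values are smaller than p
        rank = remaining.index(p)
        index += rank * factorial(len(perm) - 1 - i)
        remaining.pop(rank)
    return index

print(anagram_index([0, 1, 2, 3, 4]))  # 0
print(anagram_index([0, 1, 2, 4, 3]))  # 1
print(anagram_index([1, 0, 4, 2, 3]))  # 28
```

This matches the J outputs above: the identity permutation maps to 0, swapping the last two elements maps to 1, and so on.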
It's neat but I didn't want to include it unless I had a use case. The good news is I found a use case. The bad news is it takes way too long to explain for an already 2000+ word blog post. So instead I'm dumping it here. Yay!
Storing Sorts
Say you have a list of a thousand elements. You want to sort it in a bunch of different ways to see different kinds of data, and you also want a history of how you sorted it. Recalculating the sort each time you go back in history is CPU-inefficient. Duplicating the entire list in each history view is memory-inefficient: if each element is one kilobyte, that's a meg of memory per history state.1
A better alternative to both is storing the permutation vector for each sorting. That would be faster than recalculating the sort, as it is only O(n) instead of O(n log n).
It also uses less space than duplicating the array: each number in the vector is a single integer, which on most machines is four bytes. So each sorting is only 4 kB.
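As a sketch of the scheme (Python rather than J, names mine): each history entry stores only the index vector, and replaying a view later is a single O(n) gather with no copy of the underlying records:

```python
data = ["b", "d", "a", "c"]  # stand-in for thousand-element records

# Sorting once produces a permutation vector...
perm = sorted(range(len(data)), key=lambda i: data[i])
print(perm)  # [2, 0, 3, 1]

# ...and replaying the view is just an O(n) gather over the
# original records, no re-sort and no duplication.
view = [data[i] for i in perm]
print(view)  # ['a', 'b', 'c', 'd']
```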
But we can do better than that, and here's where things get weird. As mentioned, each permutation maps to an anagram index. If we store the anagram index instead of the full permutation vector, we just need enough space to store a single number, which in the worst case is just under 1000!. Now that's a number with over 2500 digits, and it seems like storing that would be less efficient than just storing the vector.
The number of bits we need is log2(n!), a formula that comes up often enough that we have a robust approximation for it. According to Stirling's approximation, log2(n!) ≈ n * (log2(n) - log2(e)). In J, that is
NB. log2(e)
log2e =: 2 ^. 1x1
NB. Stirling's approximation
sta =: (* 2&^.) - *&log2e
NB. switchin' highlighters
(sta 1000); (2 ^. ! 1000x)
┌───────┬──────┐
│8523.09│8529.4│
└───────┴──────┘
For a thousand-element list, the approximation is only off by about six bits. Rounding up to the nearest byte, we can store a single permutation in:
NB. bytes is ceil(sta % 8)
bytes =: >.@(sta % 8:)
bytes 1000
1066
Just over one kilobyte. That's almost a 4x improvement on storing the entire permutation vector.
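For anyone who wants to double-check the numbers outside J, here is a direct Python translation of `sta` and `bytes` above (function names mine; `bytes` is renamed to avoid the Python builtin):

```python
import math

def sta(n):
    # Stirling's approximation to log2(n!)
    return n * (math.log2(n) - math.log2(math.e))

def bytes_needed(n):
    # smallest whole number of bytes for the anagram index
    return math.ceil(sta(n) / 8)

print(round(sta(1000), 2))  # 8523.09
print(bytes_needed(1000))   # 1066
```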
There are a couple of reasons we are able to be so much more efficient. The first is that the anagram index is a little bit denser than the permutation vector: we are storing the p-vector as an array, meaning our storage mechanism isn't "aware" that every element of the array is unique. That makes the representation a little bit less dense. The other reason is that we are cheating like crazy. We are rounding up to the nearest byte for the anagram index, while for the p-vector we are always using four bytes. We only need one byte to store the first 256 indices and two bytes to store the first 65,536.
If we wanted to be as space-efficient as possible, a better approximation for the permutation vector of length n would be n * ceil(log2(n)/8). This approximation is most inaccurate when we are just above a cutoff and most accurate when just below one. For a 1000-element array, we're looking at 2000 bytes for the permutation vector: still higher than the anagram index, but not dramatically so.
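To put the three representations side by side, here is a rough Python comparison using the formulas above (the `naive` term is the four-bytes-per-index layout from earlier):

```python
import math

n = 1000
naive = 4 * n                            # four bytes per index
tight = n * math.ceil(math.log2(n) / 8)  # smallest whole bytes per index
anagram = math.ceil(n * (math.log2(n) - math.log2(math.e)) / 8)

print(naive, tight, anagram)  # 4000 2000 1066
```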
Is storing the p-vector ever more efficient? The use of ceilings here makes the math tricky, but we can still do some rough calculations. We want some n for which

ceil(n/8 * (log2(n) - C)) > n * ceil(log2(n)/8)

where C = log2(e). Notice that n * ceil(x) > ceil(n * x), as the former "amplifies" the ceiling. Also notice that as n → ∞, ceil(x) → x, in the sense that rounding up 1.1 to 2 is a much bigger deal than rounding up 100.1 to 101. So we can make the two simplifications:
ceil(n/8 * (log2(n) - C)) > ceil(n/8 * log2(n))
n/8 * (log2(n) - C) > n/8 * log2(n)
-C > 0
Which is false for all n. This should make us suspect that the anagram index is always more efficient. Suspect, not know, because both sides were approximations. But both sides get progressively more accurate as n gets higher, so I'm inclined to think it's a safe assumption.
A bunch of math geekery
Note that if we did find a solution after our simplifications, we'd still have to test whether it was an actual solution! The replacement of n * ceil(x) with ceil(n * x) reduces the right-hand side, giving us an easier milestone to cross. Since there wasn't a solution to the weakened version, there isn't a solution to the stronger version. But if we had found a solution to the weaker version, we would still need to test it against the stronger version.
On the other hand: if we use the "less efficient" permutation representation from earlier, 4 bytes per index, we get a valid solution!

ceil(n/8 * (log2(n) - C)) > 4n
log2(n) - C > 32
n > 2^(32 + C)

This is because "4 bytes per index" only works if you have fewer than 2^32 elements. After that you need five bytes per index. Our naive representation overestimates the necessary space for small arrays but underestimates it for large ones.
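Plugging in C, that crossover sits around twelve billion elements, far beyond anything you'd store this way. A quick check in Python (used here just for the arithmetic):

```python
import math

C = math.log2(math.e)
crossover = 2 ** (32 + C)
print(crossover)  # about 1.17e10 elements
```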
Discussion
All of this is completely pointless because for small arrays the storage improvement is negligible and for large arrays you run into major implementation problems. Storing bignums with thousands of digits is going to wipe any efficiency gains. Inverting the anagram indices of bignums is going to wipe any efficiency gains.
Anyway that's why I left it out of the blog post: it took a thousand words to explain an extremely niche theoretical optimization that doesn't actually work in practice. But I had a lot of fun thinking about it and hope you had fun reading it!
-
Approximately. We'll say that 1024 = 1000 to avoid worrying about kilo- vs kibi- confusion. ↩