Space travel nostalgia, frugality and the XOR doubly linked list
Hello friend,
This is the 3rd edition of our hackerclub.io newsletter. In this one, we visit the exoteric XOR doubly linked list data structure and take weak stabs at humor. If you prefer to read it on your web browser, please use this link.
Space travel nostalgia, frugality and the XOR doubly linked list
Yes, yes, we are all very aware we put men on the moon with less computing power than we carry in our pockets. Lucky for us, it's been fifty years and spaceships can now afford to run JavaScript, thank you very much. Of course, whether you would rather board a rocket powered by hand-written assembly and magical wires or by zillions of C++ lines is a rather hot dinner party topic.
Either way, 1969's future has arrived, and if you overlook our constant anxiety, insufferable political polarization, a global pandemic and the general state of the world, things have never been better: computer RAM, for instance, has gotten very cheap.
Having inexpensive transistors means many of us don't need to optimize our programs to death all the time. We can get away with a little byte bloat and a little runtime inefficiency, if it means being more productive and having nicer abstractions to rely upon. We have automatic garbage collection, and we avoid reinventing the wheel by pulling (sometimes a lot) of third party libraries, even if we have to import the whole pork to make a single bacon call.
But cheap silicon also means that we start sprinkling the holy sand in new places: we have Wi-Fi enabled light bulbs, light switches, thermostats, sensors, fitness trackers. In these devices, memory and CPU are usually at a premium and so is energy, as many operate on batteries. In the embedded and internet of things worlds, being frugal pays off.
In this spirit of warm nostalgia and embedded frugality, we'll discuss a somewhat exoteric and intriguing data structure that provides an interesting space-time tradeoff: the XOR doubly linked list. If we do things right, we can halve the space requirements for pointers in a doubly linked list -- but at what cost?
As usual, we propose you get your hands dirty with a programming challenge by the end of the post.
A regular doubly linked list
Linked lists are a huge deal. They are used everywhere, from the linux kernel to the custom memory allocator in the original Doom game and everywhere in between.
struct Node {
int value;
Node *prev;
Node *next;
};
Stripped to the bare minimum, a node in a doubly linked list has an associated value and two pointers: prev
points to the node before it, next
points to the next node in the sequence.
Figure 1. Example of a regular doubly linked list with three nodes
A chain of nodes represents an ordered sequence of values. Traversing all nodes can be done by following each node's next
pointer. An example of this traversal is the PrintForward
function:
void PrintForward(Node *node) {
for (; node != nullptr; node = node->next) {
std::cout << node->value << "\n";
}
}
Analogously, we traverse it "backwards" by following the prev
pointer instead.
Populating the list is done iteratively, by appending one node at a time. Here's a possible implementation of the Append
function:
// Appends a new node with value `value` after `node`.
Node* Append(Node *node, int value) {
Node *new_node = new Node{value, node, nullptr};
node->next = new_node;
return new_node;
}
You can play with a full example in this repl.it we set up for you.
Like... just use an array?
You may be excused to wonder why we should go through all this trouble just to store a sequence of values. We could simply use an array and store all values in a contiguous chunk of memory - no pointers. In reality, arrays and linked lists offer different tradeoffs for common operations.
We will visit some of these tradeoffs in the programming challenge, but for now, you might want to think about the procedures for:
1. Accessing the kth
element
1. Appending a new element
1. Removing the kth
element
for both an array and a linked list.
The XOR doubly linked list
Surprisingly, there is a way of making a doubly linked list Node
using a single pointer, as opposed the our previous implementation that had two (prev
and next
).
struct Node {
int value;
Node* xaddr;
};
The trick is to store the XOR (exclusive OR - usually represented as operator ^
) of the prev
and next
nodes addresses into a single variable, xaddr
.
To make matters concrete, let's start with a single node n_0
, whose value
is set to 0
. Since it's the single node in the list, both its previous and next nodes are null
, which we interpret as 0
. By definition, then, the single node's xaddr
takes the value of null ^ null = 0 ^ 0 = 0
.
Figure 2. A single initial node in an XOR doubly linked list
Let's now append a new node at the end of the sequence. The new node, n_1
, comes right after n_0
. Here, I'm using the C notation &x
to refer to the address of x
. Note how the nodes' xaddr
s are updated:
Figure 3. A XOR doubly linked list with two nodes
Adding a third node after n_1
is more of the same:
Figure 4. A XOR doubly linked list with three nodes
Generalizing this procedure, whenever we append a new node n_k+1
after n_k
, we update n_k.xaddr
by XOR
ing it with n_k+1
address. In turn, n_k+1.xaddr
takes the value of &n_k
. Here's a possible implementation of Append
:
// Appends a new node with value `value` after `node`.
Node* Append(Node *node, int value) {
Node *new_node = new Node{value, node};
node->xaddr = XOR(node->xaddr, new_node);
return new_node;
}
Traversing an XOR doubly linked list
With our shiny XOR doubly linked list in hands, we turn to how to visit every value in the sequence.
Let's zoom in on a generic node somewhere in the middle of the list:
Figure 5. A node
n_k
in the middle of an XOR doubly linked list
To find the address of the next node in the sequence, &next
, we need to somehow extract it from n_k.xaddr
, essentially undoing the XOR between &prev
and &next
.
An interesting property of the XOR is that it is its own inverse: (a ^ b) ^ b = a
. In other words, if we had the address of the previous node, &prev
, we could extract the address of the next node, &next
, from n_k.xaddr
:
&next = n_k.xaddr ^ &prev
By definition, in general, we don't have the address of the previous node. Except, not surprisingly, for the first node in the sequence, n_0
:
Figure 6. The first node n_0
an XOR doubly linked list
We can use this knowledge and the address to n_0
to iteratively hop from one node to the next:
&n_1: n_0.xaddr ^ null = n_0.xaddr ^ 0 = n_0.xaddr
&n_2: n_1.xaddr ^ &n_0
&n_3: n_2.xaddr ^ &n_1
...
&n_k+1: n_k.xaddr ^ &n_k-1
This procedure is encoded in the following Traverse
function:
// Traverse all nodes in an XOR doubly linked list.
// `node` is assumed to be the first node in the list.
void Traverse(Node *node) {
Node *tmp, *prev = nullptr;
while (node != nullptr) {
std::cout << node->value << "\n";
tmp = node;
node = XOR(prev, node->xaddr);
prev = tmp;
}
}
We set up a runnable example with the Append
and Traverse
functions in this repl.it.
Programming challenge
- Implement the delete operation for both the regular and XOR doubly linked list. The function signature is
Delete(Node *node, value)
, and it should remove the node with valuevalue
from the list.node
points to the first element of the list (the head). You may use these repl.its as a starting point: regular, XOR.- In his TED talk, Linus Torvalds use the delete operation in a linked list to exemplify the subjective "good taste" in code. His positive example is interesting on its own, and it is discussed in detail here. Implement the elegant
Delete
operation.
- In his TED talk, Linus Torvalds use the delete operation in a linked list to exemplify the subjective "good taste" in code. His positive example is interesting on its own, and it is discussed in detail here. Implement the elegant
- Prove that XOR is it's own inverse. TIP: Consider all possible combinations of two bits (
a
,b
) and see if you can exhaustively show that(a ^ b) ^ b = a
. - The XOR doubly linked list saves up memory in comparison with the regular doubly linked list, at the cost of sacrificing some common operations. To understand this tradeoff, complete the following table with the Big-Oh time complexity of the referenced operations. If an operation is impossible, mark it so.
Regular doubly linked list | XOR doubly linked list | |
---|---|---|
Finding an element given the list head | ||
Removing an element given the list head | ||
Finding the next node given a pointer to a node in the middle of the list |