hackerclub.io

Subscribe
Archives
January 7, 2021

Space travel nostalgia, frugality and the XOR doubly linked list

Hello friend,

This is the 3rd edition of our hackerclub.io newsletter. In this one, we visit the exoteric XOR doubly linked list data structure and take weak stabs at humor. If you prefer to read it on your web browser, please use this link.


Space travel nostalgia, frugality and the XOR doubly linked list

Yes, yes, we are all very aware we put men on the moon with less computing power than we carry in our pockets. Lucky for us, it's been fifty years and spaceships can now afford to run JavaScript, thank you very much. Of course, whether you would rather board a rocket powered by hand-written assembly and magical wires or by zillions of C++ lines is a rather hot dinner party topic.

Either way, 1969's future has arrived, and if you overlook our constant anxiety, insufferable political polarization, a global pandemic and the general state of the world, things have never been better: computer RAM, for instance, has gotten very cheap.

Having inexpensive transistors means many of us don't need to optimize our programs to death all the time. We can get away with a little byte bloat and a little runtime inefficiency, if it means being more productive and having nicer abstractions to rely upon. We have automatic garbage collection, and we avoid reinventing the wheel by pulling (sometimes a lot) of third party libraries, even if we have to import the whole pork to make a single bacon call.

But cheap silicon also means that we start sprinkling the holy sand in new places: we have Wi-Fi enabled light bulbs, light switches, thermostats, sensors, fitness trackers. In these devices, memory and CPU are usually at a premium and so is energy, as many operate on batteries. In the embedded and internet of things worlds, being frugal pays off.

In this spirit of warm nostalgia and embedded frugality, we'll discuss a somewhat exoteric and intriguing data structure that provides an interesting space-time tradeoff: the XOR doubly linked list. If we do things right, we can halve the space requirements for pointers in a doubly linked list -- but at what cost?

As usual, we propose you get your hands dirty with a programming challenge by the end of the post.

A regular doubly linked list

Linked lists are a huge deal. They are used everywhere, from the linux kernel to the custom memory allocator in the original Doom game and everywhere in between.

struct Node {
  int value;
  Node *prev;
  Node *next;
};

Stripped to the bare minimum, a node in a doubly linked list has an associated value and two pointers: prev points to the node before it, next points to the next node in the sequence.

Example of a doubly linked list with three nodes Figure 1. Example of a regular doubly linked list with three nodes

A chain of nodes represents an ordered sequence of values. Traversing all nodes can be done by following each node's next pointer. An example of this traversal is the PrintForward function:

void PrintForward(Node *node) {
  for (; node != nullptr; node = node->next) {
    std::cout << node->value << "\n";
  }
}

Analogously, we traverse it "backwards" by following the prev pointer instead.

Populating the list is done iteratively, by appending one node at a time. Here's a possible implementation of the Append function:

// Appends a new node with value `value` after `node`.
Node* Append(Node *node, int value) {
  Node *new_node = new Node{value, node, nullptr};
  node->next = new_node;
  return new_node;
}

You can play with a full example in this repl.it we set up for you.

Like... just use an array?

You may be excused to wonder why we should go through all this trouble just to store a sequence of values. We could simply use an array and store all values in a contiguous chunk of memory - no pointers. In reality, arrays and linked lists offer different tradeoffs for common operations.

We will visit some of these tradeoffs in the programming challenge, but for now, you might want to think about the procedures for: 1. Accessing the kth element 1. Appending a new element 1. Removing the kth element

for both an array and a linked list.

The XOR doubly linked list

Surprisingly, there is a way of making a doubly linked list Node using a single pointer, as opposed the our previous implementation that had two (prev and next).

struct Node {
  int value;
  Node* xaddr;
};

The trick is to store the XOR (exclusive OR - usually represented as operator ^) of the prev and next nodes addresses into a single variable, xaddr.

To make matters concrete, let's start with a single node n_0, whose value is set to 0. Since it's the single node in the list, both its previous and next nodes are null, which we interpret as 0. By definition, then, the single node's xaddr takes the value of null ^ null = 0 ^ 0 = 0.

A single node in an XOR doubly linked list

Figure 2. A single initial node in an XOR doubly linked list

Let's now append a new node at the end of the sequence. The new node, n_1, comes right after n_0. Here, I'm using the C notation &x to refer to the address of x. Note how the nodes' xaddrs are updated:

A XOR doubly linked list with two nodes Figure 3. A XOR doubly linked list with two nodes

Adding a third node after n_1 is more of the same:

A XOR doubly linked list with three nodes Figure 4. A XOR doubly linked list with three nodes

Generalizing this procedure, whenever we append a new node n_k+1 after n_k, we update n_k.xaddr by XORing it with n_k+1 address. In turn, n_k+1.xaddr takes the value of &n_k. Here's a possible implementation of Append:

// Appends a new node with value `value` after `node`.
Node* Append(Node *node, int value) {
  Node *new_node = new Node{value, node};
  node->xaddr = XOR(node->xaddr, new_node);
  return new_node;
}

Traversing an XOR doubly linked list

With our shiny XOR doubly linked list in hands, we turn to how to visit every value in the sequence.

Let's zoom in on a generic node somewhere in the middle of the list:

A node in the midle of an XOR doubly linked list Figure 5. A node n_k in the middle of an XOR doubly linked list

To find the address of the next node in the sequence, &next, we need to somehow extract it from n_k.xaddr, essentially undoing the XOR between &prev and &next.

An interesting property of the XOR is that it is its own inverse: (a ^ b) ^ b = a. In other words, if we had the address of the previous node, &prev, we could extract the address of the next node, &next, from n_k.xaddr:

&next = n_k.xaddr ^ &prev

By definition, in general, we don't have the address of the previous node. Except, not surprisingly, for the first node in the sequence, n_0:

A node in the midle of an XOR doubly linked list

Figure 6. The first node n_0 an XOR doubly linked list

We can use this knowledge and the address to n_0 to iteratively hop from one node to the next:

&n_1: n_0.xaddr ^ null = n_0.xaddr ^ 0 = n_0.xaddr
&n_2: n_1.xaddr ^ &n_0
&n_3: n_2.xaddr ^ &n_1
...
&n_k+1: n_k.xaddr ^ &n_k-1

This procedure is encoded in the following Traverse function:

// Traverse all nodes in an XOR doubly linked list.
// `node` is assumed to be the first node in the list.
void Traverse(Node *node) {
  Node *tmp, *prev = nullptr;
  while (node != nullptr) {
    std::cout << node->value << "\n";
    tmp = node;
    node = XOR(prev, node->xaddr);
    prev = tmp;
  }
}

We set up a runnable example with the Append and Traverse functions in this repl.it.

Programming challenge

  1. Implement the delete operation for both the regular and XOR doubly linked list. The function signature is Delete(Node *node, value), and it should remove the node with value value from the list. node points to the first element of the list (the head). You may use these repl.its as a starting point: regular, XOR.
    • In his TED talk, Linus Torvalds use the delete operation in a linked list to exemplify the subjective "good taste" in code. His positive example is interesting on its own, and it is discussed in detail here. Implement the elegant Delete operation.
  2. Prove that XOR is it's own inverse. TIP: Consider all possible combinations of two bits (a, b) and see if you can exhaustively show that (a ^ b) ^ b = a.
  3. The XOR doubly linked list saves up memory in comparison with the regular doubly linked list, at the cost of sacrificing some common operations. To understand this tradeoff, complete the following table with the Big-Oh time complexity of the referenced operations. If an operation is impossible, mark it so.
Regular doubly linked list XOR doubly linked list
Finding an element given the list head
Removing an element given the list head
Finding the next node given a pointer to a node in the middle of the list
Don't miss what's next. Subscribe to hackerclub.io:
Powered by Buttondown, the easiest way to start and grow your newsletter.