NULL BITMAP by Justin Jaffray

December 1, 2025

More Data Independence and the History of the Relational Model

Last week we talked about some of the reasons why we want data independence. This week I want to talk about some of the historical steps that brought us to where we are today, and why the relational model is a good fit for modelling data.

The first real database is often thought to be IDS, the "integrated data store," which was designed by Charles Bachman.

Bachman won the Turing Award for his work on databases, and his Turing Award lecture, The Programmer as Navigator, is a really important and interesting piece of writing if you want to understand how people at the time were thinking about what databases even were and how they should work.

It's funny to read this sort of thing because you realize how much about computers we take for granted in 2025. In this piece, for example, he talks about hash tables as though they are a fancy novelty, and not one of the first things any new computer science student learns about.

In the piece, Bachman outlines what he sees as the major innovation of the modern era: moving beyond a "computer-centric" view of data management to a "data-centric" view.

In sequential file technology, search techniques are well established. Start with the value of the primary data key, of the record of interest, and pass each record in the file through core memory until the desired record, or one with a higher key, is found. (A primary data key is a field within a record which makes that record unique within the file.)

What Bachman is describing is the mindset that people were in at this time regarding what it meant to "process data" with a computer. We had computer programs, and if you wanted those programs to operate over data, you'd feed them the data you wanted them to process, and then you'd call it a day. Bachman's big point in this speech is an alternative model, brought about by the advent of "direct access storage," where programs are not merely fed the entirety of a dataset, but are instead granted the leverage to navigate through it. They could say, "I'm here at customer X, I'd like to now jump to customer X's orders, and from those to the products associated with those orders." He says that databases should all be adorned with pointers that connect records to other associated records, and that this sort of interface is the thing that will allow us to finally, once and for all, decouple the structure of data from the structure of our programs. This is what he means by the programmer as "navigator": a person who is navigating the space of records.
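To make the contrast concrete, here is a minimal sketch of the two styles in Python. Every name in it (`Customer`, `Order`, `Product`, `sequential_find`, `products_for`) is made up for illustration; this is not IDS's actual interface, just in-memory objects standing in for records and the pointers between them.

```python
from dataclasses import dataclass, field
from typing import List

# Hypothetical record types. The list fields play the role of the
# pointers that connect a record to its associated records.

@dataclass
class Product:
    name: str

@dataclass
class Order:
    order_id: int
    products: List[Product] = field(default_factory=list)

@dataclass
class Customer:
    key: int
    name: str
    orders: List[Order] = field(default_factory=list)


def sequential_find(records, key):
    """Sequential-file style: pass each record (sorted by key) through
    memory until the desired record, or one with a higher key, is found."""
    for record in records:
        if record.key == key:
            return record
        if record.key > key:
            return None  # we passed the spot where it would have been
    return None


def products_for(customer):
    """Navigational style: start at one record and follow its links,
    customer -> orders -> products."""
    return [p.name for order in customer.orders for p in order.products]


widget, gadget = Product("widget"), Product("gadget")
customers = [
    Customer(1, "Ada", [Order(100, [widget])]),
    Customer(3, "Grace", [Order(101, [widget, gadget]), Order(102, [gadget])]),
    Customer(7, "Edsger"),
]

grace = sequential_find(customers, 3)
print(products_for(grace))            # ['widget', 'gadget', 'gadget']
print(sequential_find(customers, 5))  # None (the scan stops at key 7)
```

Notice that `products_for` only works because the links happen to be there. The way the data is connected and the way the program reaches it are the same thing.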

Now, these sorts of things were all great ideas. Great ideas! The notion that programmers should work with something more abstract than "here are a bunch of bytes" was truly revolutionary. And this is why Bachman deserved the Turing award for his work.

That's why I think it is sort of funny that Ted Codd was not afraid of saying that he thought Bachman's network model was bad. In fact, the very next year at SIGFIDET (which would later become SIGMOD), he and C.J. Date had an entire paper about how they found Bachman's model to be deficient:

ABSTRACT: For some time now there has been considerable debate in the field of database systems over the fundamental question of the underlying design philosophy of such a system. The controversy has centered on the structure of the programmer interface, though of course the design chosen for this interface has repercussions throughout the rest of the system. Two approaches to this problem have received particular attention: the network approach, which is typified by the proposals of the CODASYL Data Base Task Group (DBTG), and the relational approach, which is advocated by the present authors (among others). The purpose of this paper is to give some comparisons between these two approaches (primarily from the application programming viewpoint), and to show what the authors believe to be the advantages of the relational approach. The reader is assumed to have a basic familiarity with the two approaches.

In 1981, eight years after Bachman's award, Codd won his own Turing Award for the relational model.

So, what exactly are the limitations of the network model that were, to Codd, so glaring that he had to invent the relational model to correct them?

You can of course read Codd's own writing on this; there's no dearth of it. But I think what it primarily comes down to is data independence, and the relational model's ability to decouple the logical schema from the physical one.

A cool thing about the relational model in particular is that it doesn't actually prescribe any particular kind of access method for a relation. You say "this is the data I have," and then, separately, you say "here are the ways to access my data." In the network model, by contrast, you'd have explicit links between records that were related, which meant that the only way to access them was by traversing those links. This, to Codd, violated data independence: in the network model, the link between two records is both the data itself (the fact that the records are related) and the access path (the way you get from one to the other). In the relational model, we have foreign keys, pieces of data that denote what the associated records are, and then, separately, we might have an index that provides quick access to related records. But that index is not an integral part of the interface exposed to users. They just write queries over flat tables.
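Here's a small sketch of that separation, using Python's built-in `sqlite3` module. The table, column, and index names are invented for the example, and SQLite is just a convenient stand-in for "a relational database." The point is that the query is written once, and adding an index afterwards changes how the data can be reached without changing the query or its answer.

```python
import sqlite3

# Table, column, and index names below are invented for this example.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (
        id INTEGER PRIMARY KEY,
        customer_id INTEGER REFERENCES customers(id),  -- the link is just a value
        product TEXT
    );
    INSERT INTO customers VALUES (1, 'Ada'), (2, 'Grace');
    INSERT INTO orders VALUES
        (100, 1, 'widget'), (101, 2, 'widget'), (102, 2, 'gadget');
""")

# The logical question: which products did this customer order?
QUERY = """
    SELECT o.product
    FROM customers c JOIN orders o ON o.customer_id = c.id
    WHERE c.name = ?
    ORDER BY o.id
"""

def ordered_products(name):
    return [row[0] for row in conn.execute(QUERY, (name,))]

before = ordered_products("Grace")

# A physical change only: add an access path. The query text is untouched.
conn.execute("CREATE INDEX orders_by_customer ON orders (customer_id)")

after = ordered_products("Grace")
print(before, after, before == after)
# ['widget', 'gadget'] ['widget', 'gadget'] True
```

Whether the database actually uses the index is up to its query planner, which is exactly the kind of decision the relational model keeps out of the user's query.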

Codd's Information Rule says:

All information in a relational data base is represented explicitly at the logical level and in exactly one way – by values in tables.

And this is the heart of data independence: there is no implicit structure baked into the logical data. It's just data.

I think all of the writing from this era is really interesting and provides a lot of perspective on why the systems we have today work the way they do. There are a lot of things we take for granted that are actually quite insightful solutions to problems you'd never realize existed unless you saw what came before.
