Why NULLs don't work the way you think in SQL
Answer: Ternary logic.
If you’ve learned programming through a normal programming language, say, Java, Javascript, C++, Python, … etc., you’ve been trained that null is a value, a special value. In your training, it means the absence.
However, in SQL, the situation is different: NULL doesn’t mean the absence, it means UNKNOWN in the so-called ternary/three-valued logic.
We’ll see what that means. But first, let’s see how NULL breaks our expectations in SQL.
5 != NULLis notTRUE! (but notFALSE, either. We’ll see)NULL = NULLis notTRUE! (but notFALSEeither)5 < NULL,5 = NULL, and5 > NULLare all notTRUE. (but notFALSEeither)
In SQL, NULL is not a special value. In SQL, NULL means we don’t know the value of this column for this row. That’s how SQL is specified.
For example, a user’s address being NULL in SQL doesn’t mean this user is homeless/doesn’t have an address; it means this user’s address is unknown in the system.
So, if you ask SQL if a user’s address being NULL is not equal to 221B Baker Street, SQL won’t answer with TRUE since it doesn’t know this user’s address to be able to tell that it’s not 221B Baker Street. However, it won’t answer with FALSE either; it doesn’t know the user’s address to tell you that it’s equal to 221B Baker Street. So, what answer will it give? … SQL’s answer: Don’t know!, which is NULL in SQL.
I.e., 5 != NULL, 5 = NULL, NULL = NULL, 5 < NULL, 5 > NULL, … etc. all evaluate to NULL in SQL because NULL means UNKNOWN and if you compare something to an unknown value, you can’t know the comparison result – hence, the NULL result.
This is because SQL is designed to implement the so-called ternary/three-valued logic instead of Boolean logic. Instead of a proposition being TRUE or FALSE, it can be TRUE, FALSE, or UNKNOWN. And it has its truth tables accordingly. The UNKNOWN I’ve been highlighting is UNKNOWN as in ternary logic. I.e., NULL in SQL is the equivalent of UNKNOWN in ternary logic.
Sample truth table in ternary logic:
| A | B | A OR B |
|---|---|---|
| TRUE | TRUE | TRUE |
| TRUE | UNKNOWN | TRUE |
| TRUE | FALSE | TRUE |
| UNKNOWN | TRUE | TRUE |
| UNKNOWN | UNKNOWN | UNKNOWN |
| UNKNOWN | FALSE | UNKNOWN |
| FALSE | TRUE | TRUE |
| FALSE | UNKNOWN | UNKNOWN |
| FALSE | FALSE | FALSE |
However, let’s be real: that’s now what we mean when we use NULL is SQL. We usually mean absence of value and/or that NULL is a special value.
For example, what we mean with a tweet’s reply_to being NULL is this tweet is a brand new tweet, not a reply to another tweet, not we don’t know what this tweet is replying to as SQL thinks.
So, how do we survive in this reality?
- Operators
IS NULLandIS NOT NULLare your friend. You probably already know what they mean so won’t waste your time here. But the conclusion: always check withIS NULL/IS NOT NULLfirst if the column is nullable. Example:address IS NULL OR address != '221B Baker Street. - If you find yourself in a situation where you’re using some custom wrapper that doesn’t allow you to write full equivalents of SQL, e.g., one that just takes
filters: Array<{columnName: string, filterValue: string}>and the column is nullable, make sure to check theNULLcases by hand and see if they’ll evaluate to what you want.
Forgot to tell you: If the result of the WHERE clause for a row is NULL, SQL treats that as FALSE and thus won’t include this row in the result set.
Also, don’t fully depend on the ternary-logic thing in SQL, unfortunately. For example, UNION(NULL, NULL) should be {NULL, NULL} but it’s just one NULL…
I.e., use SQL‘s NULL as we mean it: absence of value. But make sure to use IS NULL/IS NOT NULL properly.
Ternary logic might be useful and might be a good design paradigm for some use-cases, but I wouldn’t depend on SQL being a ternary-logic system – I’d implement my own on top of it/use another library.
Resources
- Oracle DB’s page on NULLs, very nice and to the point.
- Wikipedia’s section on SQL being three-valued logic application.
- Julia Evans‘s NULL surprises, a page from her zine Become a SELECT star!.
Disclaimers
- For literary purposes (or otherwise), I may not have described everything accurately. Or may have made mistakes.
- I’m not responsible for anything in this article. Read on your own risk.
Privacy notes
- Buttondown tracks who opens, clicks, or forwards my emails, and shares this data with me. If you want to delete this data, email me and I’ll try my best!
- Buttondown adds “UTM sources” to my links to let websites I’ve linked to know that the source of the new traffic is my newsletter (as far as I understand).
- Buttondown gives me the choice to opt out of both points above, but I’ve chosen to opt in.
Copyright
This is an All-Rights Reserved work. Please, don’t rehost it.
Linking to it is naturally allowed (and appreciated!), but rehosting is not. Reasonable sharing like forwarding to a few friends or a quote screenshot is fair use, I believe.
You can reply directly to this email.