Kicking the Tires on CedarDB's SQL
Last week it was my birthday and I had a bunch of stuff going on over the weekend so a short and simple post this week!
If you are a database sicko (and I can only assume you are if you are reading NULL BITMAP), you are likely familiar with the Technical University of Munich (TUM) as the premier source of database research. CedarDB is a new database that has spun out of a bunch of the research done at TUM and is an attempt to turn it into a commercial Postgres-compatible database.
As an aficionado of SQL engines I threw a couple queries at this thing to see what I could possibly divine about its implementation. Here are some of the things I learned.
Some of these observations could be interpreted as deficiencies; I want to be clear that none of this is intended to be critical, this is an early release of a very complicated, very ambitious piece of software. I am broadly very positive about the future of CedarDB, these are the ravings of someone who has dealt with the pain of implementing a SQL optimizer.
CedarDB has the best implementation of query decorrelation I've seen. Query decorrelation is the process of taking subqueries containing columns from some outer scope and turning them into joins.
Fully general decorrelation, to my knowledge, requires DAG-shaped plans. This was a conclusion reached independently by Jamie Brandon and the TUM folks themselves. This tends to make doing decorrelation well particularly difficult, because once you've taken a query plan that was a tree and turned it into a DAG, it becomes much, much more difficult to optimize well. This means there is some trickiness in deciding how and when to do the decorrelation, and to what extent you want to try optimizing the undecorrelated queries.
I stole some hairy correlated queries from an old job and ran them through CedarDB. We can see a handful of interesting things in its handling of them. Don't worry about what this query is actually doing, it's mostly nonsense, but we can learn some stuff from the output of the plan.
explain (verbose) SELECT *
FROM ab
WHERE EXISTS
(
SELECT * FROM xy INNER JOIN (SELECT count(DISTINCT uv.v) AS cnt, sum(v) FROM uv WHERE a=5 GROUP BY v) ON x=cnt
);
plan
---------------------------------------------------------
🖩 OUTPUT (Estimate: 1) +
│ Output columns: 5 as "a", b +
⨝ JOIN (leftsemi, bnljoin) (Estimate: 1) +
│ cross join +
├───🗐 TABLESCAN on ab (Estimate: 1) +
│ Attributes: b ▍ Restrictions: a = 5 +
├───⨝ JOIN (inner, hashjoin) (Estimate: 3) +
│ │ Join condition: x = "count(v)" +
│ ├───𝚪 GROUP BY (Estimate: 1) +
│ │ │ Aggregates: count(v) as "count(v)" ▍ Key: v5+
│ │ 𝚪 GROUP BY (Estimate: 1) +
│ │ │ Key: v6, v6 +
│ │ ⨝ JOIN (inner, singletonjoin) (Estimate: 1) +
│ │ │ cross join +
│ │ ├───σ SELECT (Estimate: 1) +
│ │ │ │ (a = 5) and (a is not null) +
│ │ │ 🗏 CTE Scan on CTE Alder (Estimate: 1) +
│ │ └───🗐 TABLESCAN on uv (Estimate: 1) +
│ │ Attributes: v +
│ └───🗐 TABLESCAN on xy (Estimate: 9) +
│ Attributes: x ▍ Restrictions: x is not null +
└───𝚪 GROUP BY (Estimate: 1) +
Key: 5 +
+
This plan uses:
(1 row)
Here's a couple observations:
- I like the plan format! They use a bunch of fun greek symbols like 𝚪 for aggregation and some others that don't appear to render on my computer (lol).
- They have a daggy plan by way of introducing a CTE, which they name
Alder
. I reached out to someone in Discord to ask them what "Alder" was, and they pointed me at this Wikipedia page. So I suppose that CedarDB, when it has to introduce CTEs for its planning, names them sequentially and alphabetically after trees. (Once they're doing 26 of them they just start using numbers:)
...
│ └───𝚪 GROUP BY (Estimate: 1) +
└───⨝ JOIN (leftsemi, hashjoin) (Estimate: 1) +
├───🗐 TABLESCAN on a (Estimate: 1) +
├───⨝ JOIN (leftsemi, bnljoin) (Estimate: 1) +
│ ├───σ SELECT (Estimate: 1) +
│ │ 𝚾 MAP (Estimate: 1) +
│ │ ⨝ JOIN (fullouter, hashjoin) (Estimate: 1) +
│ │ ├───𝚾 MAP (Estimate: 1) +
│ │ │ ⨝ JOIN (inner, bnljoin) (Estimate: 1) +
│ │ │ ├───𝚾 MAP (Estimate: 1) +
│ │ │ │ σ SELECT (Estimate: 1) +
│ │ │ │ 🗏 CTE Scan on CTE 45 (Estimate: 1) +
│ │ │ └───🗐 TABLESCAN on uv (Estimate: 1) +
│ │ └───⨝ JOIN (inner, bnljoin) (Estimate: 1) +
│ │ ├───🗏 CTE Scan on CTE 45 (Estimate: 1) +
│ │ └───🗐 TABLESCAN on xy (Estimate: 1) +
│ └───⨝ JOIN (inner, bnljoin) (Estimate: 1) +
│ ├───σ SELECT (Estimate: 1) +
│ │ 🗏 CTE Scan on CTE 45 (Estimate: 1) +
│ └───🗏 CTE Scan on CTE Birch (Estimate: 1) +
└───𝚪 GROUP BY (Estimate: 1) +
...
- There's some weirdness with the
GROUP BY
s here. Notice that one of them has the same column twice in its key:v6, v6
. This can be simplified, and I'm surprised to see the column set seems like its not represented as a set at all. - There appears to be a bit of a bug where the definition of
Alder
is not shown, as it should, under "this plan uses." Some of the hairier queries I wrote enumerated at least some of their CTEs in this section. - They have some plans categorized as "singleton joins," which is cool, presumably it means that join is 1:1, which means that you can remove things from your hash table as you find matches (or, more likely, if you find a second match, you error, because it means a subquery logically returned multiple rows).
Can Something be More Than True
Here's a question: can this query ever return more than one row?
SELECT 1 WHERE p
As far as I know, in Postgres, the answer is no. In CedarDB (and other databases that also do more sophisticated decorrelation than Postgres does), the answer is...sometimes:
postgres=# SELECT 1 WHERE (VALUES (true), (true));
?column?
----------
1
1
(2 rows)
The reason this might ever be true is that if you have to do query decorrelation it's often convenient to translate WHERE
clauses into joins:
SELECT ...something cols... FROM something WHERE (SELECT 1 FROM something_else WHERE p)
=>
SELECT ...something cols... FROM something INNER JOIN something_else ON p
(subject to some plan-or-run-time checks that all the semantics are proper)
CedarDB does do these checks in some other contexts:
postgres=# CREATE TABLE one (x int);
CREATE TABLE
postgres=# insert into one values (1);
INSERT 0 1
postgres=# SELECT x FROM one WHERE (VALUES (true), (true));
ERROR: more than one row returned by a subquery used as an expression
Presumably there's some kind of constant-folding going on in the first case that's dodging the "more than one row" check. I don't think any of this has consequences for real-world use but it does perhaps give some insight into the implementation.
Other Stuff
I also found a couple assertion failures and a crash, which have been reported to CedarDB but are mostly uninteresting to enumerate here.
I'm looking forward to experimenting with CedarDB more at some point and getting a better feel for its query planner. If anyone from CedarDB wants to chat about how their query planner works for NULL BITMAP please reach out :)