Predicate, NOT IN, and Casting Text To Bool (Again)
#003 — AUGUST 20, 2025
Django Contributions Weekly - Y25 W33 - Aug 11 - 17
Overview
There were a total of 30 modified tickets last week. By the end of the week, 14 of those were open, and 16 were closed.
Components
Number of modified tickets by component.
- Database layer (models, ORM): 10
- contrib.admin: 9
- Core (Management commands): 2
- contrib.staticfiles: 1
- Migrations: 1
- Documentation: 1
- Uncategorized: 1
- Internationalization: 1
- Utilities: 1
- Error reporting: 1
- contrib.contenttypes: 1
- Generic views: 1
The distribution of components doesn't change much from week to week. Perhaps this information can be paired with the distribution of tickets that have been abandoned, specifically tickets sitting in the Needs Patch, Needs Review, and Needs Improvement queues. Or perhaps a percentage change is a better measure than the absolute count.
Activities
Count of activity changes to the ticket statuses last week.
- assigned: 3
- reviewed: 4
- coded: 5
- discussion: 46
It's becoming apparent that analyzing the "activities" using tickets as the only source of data is not always meaningful, because work done on PRs are not always captured by the tickets. Scraping GitHub recent PRs could be a good step to fill in this information gap.
Contributors
There were 16 contributors last week.
- Mariusz Felisiak (felixxm), 1
- Jacob Walls (jacobtylerwalls), 1
- umesh singh bisht (Visiter01), 1
- khosro_o (khosro_o), 1
- pTr (ptrgits), 1
- Jason Hall (Jkhall81), 1
- Siburg (Siburg), 1
- Will-Ruddick (Will-Ruddick), 1
- suhail vs (suhailvs), 1
- john-parton (john-parton), 1
- blighj (blighj), 1
- David Smith (smithdc1), 1
- Florian Demmer (fdemmer), 1
- Natalia Bidart (nessita), 6
- Antoliny (Antoliny0919), 5
- Sarah Boyce (sarahboyce), 16
Events
- PyLadies Con 2025 takes place Dec 5-7. The Call for Proposal is open now until Aug 22.
- DjangoCon US 2025 announces Keynote Speaker Zags (Marlene Mhangami)! The conference takes place Sep 8-12 in Chicago, IL, US.
Predicate
If the word predicate in the context of databases sounds scary to you, don't worry, it's not. It's something you probably work with on a regular basis without knowing the name of it. I heard of the term when I was watching the talk Multidimensional Search Strategies For Composite B-Tree Indexes by Peter Geoghegan. I didn't know what it was, so I looked it up.
A predicate is an expression that evaluates to a boolean. Some definitions will describe a predicate as a function. Whether it is an expression or a function, a predicate essentially takes in input values and returns a boolean value. Wikipedia's definition is a bit mathematical and abstract, but SQL Server is more on point.
To give a concrete example, consider the following table:
create table book(
id int8,
author varchar(128),
title varchar(128),
published_date timestamp,
pages int8
)
You might have a query that filters by pages
:
select * from book where pages > 100
In this case, pages > 100
is a predicate. In the context of the multidimensional search strategies, the speaker was talking about optimizing the SQL query planner. If the predicate involves a column that has an index, the predicate can be used to determine which indexes to search and retrieve, and which indexes to ignore.
NOT IN
NULL
is in interesting data type, and it brings some surprising results in SQL. In last week's email, we saw that casting NULL::bool
results in NULL
. This week, I picked up the book "PostgreSQL Mistakes and How to Avoid Them" by Jimmy Angelakos. The first mistake mentioned in the book involves the NOT IN
predicate. When the values in the NOT IN
expression includes at least one NULL
value, the predicate will evaluate to NULL
.
For example, using the book
table from above, if we have the following rows in the table,
id | author | title | published_date | pages
---+-----------------+-----------------------+----------------+------
1 | Lea Verou | CSS Secrets | 2015-07-28 | 388
2 | Robert Nystrom | Crafting Interpreters | 2021-07-28 | 640
3 | Jimmy Angelakos | PostgreSQL Mistakes | 2025-06-01 | 232
4 | Chip Huyen | AI Engineering | 2025-01-07 | 532
5 | | Beowulf | |
and we want to know all the books that were not written by Jimmy Angelakos or Robert Nystrom, we might write a query as follows:
select * from book where author NOT IN ('Jimmy Angelakos', 'Robert Nystrom')
This would return 2 results:
id | author | title | published_date | pages
---+-----------------+-----------------------+----------------+------
1 | Lea Verou | CSS Secrets | 2015-07-28 | 388
4 | Chip Huyen | AI Engineering | 2025-01-07 | 532
Notice how the book Beowulf was not included in the results? This is because the author of Beowulf is NULL
, and evaluating the row against the NOT IN
expression results in NULL
, so the book Beowulf is not included in the results.
If we want to know all the books that were not written by Jimmy Angelakos, and we also want to exclude books with NULL
authors, we might naively write the query as follows:
select * from book where author NOT IN ('Jimmy Angelakos', NULL)
In this query, the NOT IN
predicate will also evaluate to NULL
, and no rows will be returned.
PostgreSQL explicitly recommends against using NOT IN in their documentation. They recommend using NOT EXISTS
instead. In these two examples, the values of NOT IN
are constants. The same applies when the values are evaluated from a subquery as well.
If you're curious about real world issues related to NOT IN
or NOT EXISTS
, you can search for filed bugs in major projects. For example, Django ticket #34597 is an open ticket that touches on performance issues on the database level as well as the ORM compiler level. The comments include lots of interesting resources about databases.
Cast Text To Bool (Again)
Last week's email mentioned the valid and invalid inputs for casting text to boolean in Postgres. The valid values were found by trial and error. We saw that 'true' and 'false' are one pair of valid input, along with their prefix variation 't' and 'f'. We also saw that '1' and '0' (text, not integers) are another pair of valid values.
This week, I looked up the documentation for the boolean data type. Turns out, there are another two pairs of valid input values:
- 'on' and 'off'
- 'yes' and 'no'
I made a 5-minute video walking through the valid and invalid values. You can also cast integers to boolean, but I'll leave that exercise (aka rabbithole) to you.
Artificial Coding Friends - by Matt Stein
Matt Stein writes his thoughts on AI and agents. I found this perspective interesting to understand the expectations and misunderstandings that people have when they use the tools.
Survey of SQL:2011 Temporal Features - by Paul Jungwirth
In this post from 2019, Paul Jungwirth walks us through the implementation details of temporal features across multiple databases, specifically MariaDB, IBM DB2, Oracle, and MS SQL Server. This work is the precursor to the implementation of temporal features in PostgreSQL.
Tutorials, Testathons, Sprints, Space Reviewers - by Ontowhee
Self plug here. I reflect on my personal growth through the lens of open source coding events.
Codeberg Pages And Custom Domain
I wanted to set up my personal page on Codeberg. This guide helped me set up the apex domain using the A/AAAA record approach. There are more things you can configure, such as pages for your projects (within a project repository), although I haven't needed that yet! Hope this helps anyone who is looking to set up a personal page or project.
Codeberg's documentation provides an example .domains file. I found the examples extremely helpful, because it was concrete and very easy to follow. The example was working off a pages
branch within the pages
repository. On the other hand, I was working off the main
branch in my pages
repository, so I adapted it accordingly. You can find my initial .domains file in commit 47a83d25ec:
ontowhee.com
ontowhee.codeberg.page
pages.ontowhee.codeberg.page
main.pages.ontowhee.codeberg.page
However, I ran into an error message with that file: Misdirected Request (Error 421)!
, "domain not specified in .domains
file". This was telling me there was no domain in the .domains file, even though I had added the values in there following the example.
I ended up simplifying the content in .domains to just the first two items, in commit 718488c5dc:
ontowhee.com
ontowhee.codeberg.page
I came to this conclusion because I tried to access pages.ontowhee.codeberg.page directly, and it didn't seem to have SSL installed at that location. Similarly for main.pages.ontowhee.codeberg.page.
I don't fully understand the configuration, but it works now. I wonder if specifying the path to the branch and project folders are more for repositories that contain both code and documentation. Perhaps the page
repository is a special case that doesn't require this level of configuration.