PostgreSQL Tips, Pitfalls, and Job Queues
Lots of PostgreSQL content this week. Let’s take a look!
Recommended Reading and Videos
The Founder Trap
I know at least a couple of founders running successful, ten or hundred-million-dollar businesses, who are miserable. They don’t like the company anymore and aren’t inspired by their work, but they’re trapped in a cage of their own design. They can’t walk in and say “I quit” without the company falling apart, and they haven’t been able to replace themselves.
A short, insightful article about how the unexciting parts of running a business, having employees, and so on, can overwhelm the good parts of the experience. Read this not as discouragement, but as encouragement to give this issue some extra forethought.
If our goal is to create fulfilling work, asking “what happens if this succeeds” is as essential as asking “how do I make this work?”
100 psql tips
If you use PostgreSQL at all, you probably use the psql command-line client sometimes, and so this is worth a look. A lot of the tips are somewhat redundant, but it’s still worth browsing through to pick up some new tricks.
10 Things I Hate About PostgreSQL
Continuing the PostgreSQL theme, this article is not so much a diatribe as well-intended criticism from someone who is deeply experienced with PostgreSQL. The author mentions that he has managed installations with a million different processes accessing a database at once; it doesn’t get much bigger than that.
What I’m Working On
Continuing work on Ergo, I spent this week adding job queues for both incoming events and actions triggered by tasks. In the learning spirit of the project, I’m implementing them mostly from scratch.
The queueing code uses Redis as the backing store but is otherwise totally custom. It’s only partially done so far, but it’s coming along well. One fun part is that Redis runs Lua scripts atomically, which is a nice property to have when moving items between the job queue’s data structures.
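As a sketch of why atomic Lua execution helps, the script below moves a job from a pending list to a “processing” set with an expiration score in a single step, so there is no window where a crash leaves the job in neither place. The key names and data layout here are my own illustration, not Ergo’s actual code; a pure-Python model of the same operation follows so the logic is easy to see.

```python
# Hypothetical Lua script: pop the oldest pending job and record it as
# "processing" with an expiry timestamp, all as one atomic Redis operation.
MOVE_TO_PROCESSING_LUA = """
local job = redis.call('RPOP', KEYS[1])        -- pending list
if job then
  redis.call('ZADD', KEYS[2], ARGV[1], job)    -- processing zset, score = expiry
end
return job
"""

def move_to_processing(pending: list, processing: dict, expiry: float):
    """Pure-Python model of the Lua script above, for illustration only.

    A real client would run the script via EVAL/EVALSHA so Redis executes
    both steps atomically.
    """
    if not pending:
        return None
    job = pending.pop(0)        # take the oldest pending job (the RPOP)
    processing[job] = expiry    # record it with its expiry (the ZADD)
    return job
```

A consumer that later finishes the job removes it from the processing set; a reaper can re-enqueue any job whose expiry score has passed, which is what makes the atomic move worth having.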
Jobs are primarily enqueued through a transactionally-staged job drain, which is a fancy term for inserting your jobs into a Postgres table, from which a “drain” process shortly thereafter moves them to the actual job queue. This saves a ton of complexity by allowing the enqueue operation to be part of a SQL transaction.
Without this property, you have to be very careful about when you add jobs to the queue. If your code encounters an error, the database transaction will be rolled back but the enqueued jobs will still be in the queue. And even if it succeeds, your enqueued jobs may execute before the transaction finishes, so the jobs may look for data in the database that hasn’t been committed yet.
Adding the jobs to a SQL table as part of a transaction takes care of both of these considerations. If the transaction fails, then the jobs are just never enqueued, and when the transaction succeeds, the jobs are added to Postgres at the same time as any data they rely upon. So everything just works and you don’t have to take special care to coordinate your database transactions with your job enqueueing code.
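Here is a minimal sketch of the staging side, using Python’s stdlib sqlite3 as a stand-in for Postgres (the table names and columns are assumptions for illustration). The business data and the staged job are written in the same transaction, so they commit or roll back together:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (item TEXT)")
conn.execute("CREATE TABLE staged_jobs (payload TEXT)")

def place_order(item: str, fail: bool = False):
    # `with conn` wraps the block in one transaction: it commits on success
    # and rolls back if an exception escapes. Both the business row and the
    # staged job therefore appear together, or not at all.
    with conn:
        conn.execute("INSERT INTO orders (item) VALUES (?)", (item,))
        conn.execute("INSERT INTO staged_jobs (payload) VALUES (?)", (item,))
        if fail:
            raise RuntimeError("simulated failure after staging the job")
```

If `place_order` fails partway through, the rollback discards both rows, so no job ever runs against data that was never committed.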
Furthermore, the drain process can use transactions of its own when emptying the Postgres table, so that jobs are only removed from the table once they have actually been successfully submitted to the job queue. This concentrates the retry logic and error handling for talking to the queue in a single place and relieves the rest of the code of having to deal with it.
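The drain side can be sketched the same way, again with sqlite3 standing in for Postgres and a plain list standing in for the Redis-backed queue (all names here are illustrative). The deletes happen in the same transaction as the queue submissions, so a failed submission rolls the rows back for a later retry. Note that the rollback cannot un-push anything already sent to the queue, which is why this pattern gives at-least-once rather than exactly-once delivery:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE staged_jobs (id INTEGER PRIMARY KEY, payload TEXT)")
conn.executemany("INSERT INTO staged_jobs (payload) VALUES (?)", [("a",), ("b",)])
conn.commit()

queue = []  # stand-in for the real Redis-backed job queue

def push_to_queue(payload: str, fail: bool = False):
    if fail:
        raise ConnectionError("queue unavailable")
    queue.append(payload)

def drain_once(fail: bool = False):
    # Submit each staged job and delete its row inside one transaction.
    # If submission fails, the rollback restores the rows, and a later
    # drain pass retries them.
    try:
        with conn:
            rows = conn.execute(
                "SELECT id, payload FROM staged_jobs ORDER BY id").fetchall()
            for job_id, payload in rows:
                push_to_queue(payload, fail=fail)
                conn.execute("DELETE FROM staged_jobs WHERE id = ?", (job_id,))
    except ConnectionError:
        pass  # rows remain staged for the next attempt
```

Running `drain_once` while the queue is down leaves both rows staged; a later successful pass delivers them and empties the table.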
I should finish up the queueing side of the project next week, and I plan to build this short story into a full blog post about the queue once I have it finalized and working.
If you enjoyed this, I’d love it if you shared it with a friend (sign up here) or just replied to this email with your thoughts. Thanks for reading!