May 8, 2026

Ignore previous directions 12: System programming experiments with AI

Nature

It has been the season when the hares hang out.

Events

I will be at a few events in the next few months. In just over a week I will be at Open Source Founders Summit in Paris. Then I am speaking at AI DevCon on June 2 in London (also streaming online), about high quality systems programming with AI at scale. Then on June 5 I will be speaking at State of Open Edinburgh. Hope to see you at one of those!

Experiments

Modern AI coding models are quite good at systems programming problems, if properly directed. They are extremely good at experiments, where you want to get a feel for whether something makes sense. I think we could see a golden age of new things in systems programming: a lot of this work is heavily detailed and time consuming, but actually quite well defined, and AI can be very helpful with it. I wanted to show some examples of how I go about these experiments and what is working well for me.

The first thing is that I do these on the side, mostly while working on something else. Unlike production code, where I care more about the detail, I don't mind much about code quality here. The aim is just to decide whether I like the thing: often, whether it is easy to use, and how much complexity there is in the build, so I can estimate what a production version might look like.

For example, while I write this I have Codex building a hypervisor. It only works on AMD, as I am not interested in portability, just in something that runs, and it is using QEMU to test it. Within a few hours, given the AMD manual and a copy of the Linux source code, it had Linux booting to userspace, all from scratch in Rust with no dependencies. Last time I tried this, some months back, I didn't get nearly that far. SMP guest support is partly done now, and the guests can see some PCI devices, even if they can't yet use them. It will be a while before it can answer the question I want answered, about whether a particular architecture is feasible, but that is ok: it is a question I can't actually answer with existing hypervisors, and this is all very low cost and does not even require much thought, just a little steering.

I am going to run through another example that I just finished, which is on GitHub here. Recently I spent some time working with systemd's mini OS, ParticleOS. This is basically an immutable OS you can build from a choice of Linux distros, with A/B updates and full support for secure boot and TPM-based unlock of the root filesystem. It is pretty nice and easy to use if you are basically assembling from existing distro packages, which is what the mkosi tool was originally designed for. But once you start building code as well, it still works, but the developer experience gets worse. It can set up a development environment for compiling, and has some caching, but this is harder to manage, and the builds get slower. When I built LinuxKit years ago we used container images for building code, but all the code was Go, which tends to have a simpler, more lightweight build environment than, say, a patched version of systemd. And managing the build containers and keeping them updated has turned out to be one of the big problems in LinuxKit.

So I decided to see what would happen if I used Nix as the build and caching system for a similar OS. The advantage of Nix is that you can build the same artefacts, with any set of dependencies, directly on the host: just install the nix package from your distro. Well, on a Linux host anyway, but mkosi was also Linux focused. We made sure LinuxKit worked on a Mac too, but that's another story, and here I decided not to focus on cross compilation or virtualisation/containers.

So I basically wrote up this pitch on OS usability and design in a pretty vague way, and gave the AI some source trees to examine for related systems. That was my main contribution, other than a bit of steering and some discussion to clarify. The AI (Codex) came up with the overview plan after reviewing the code and the pitch, and some discussion. The main questions there were how to do A/B boot partitions with Nix, and whether they should share a /nix store; we decided no, to make them simpler and more reliable, so Nix is really a build plane, not so much exposed at runtime. We also looked at sysext modularity, which we did try later but discarded as confusing and not really useful. We explicitly wanted to build some code (a trivial Rust application) and some code to manage updates through a trivial API. Notionally we wanted to test what the experience of patching systemd was like, and we later ended up customising the kernel config.
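The build-plane idea can be sketched as a Nix fragment. This is my own hypothetical illustration, not code from the actual repository: the package name, source path, and hash are all made up.

```nix
# Hypothetical sketch: the trivial Rust application becomes just another Nix
# derivation, so the OS image build can depend on it like any other package.
{ pkgs }:
pkgs.rustPlatform.buildRustPackage {
  pname = "update-api";   # made-up name for the trivial update API service
  version = "0.1.0";
  src = ./update-api;     # assumed local source tree
  # Pinning the cargo dependency hash keeps the build reproducible and lets
  # Nix cache it; the value here is only a placeholder.
  cargoHash = "sha256-AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA=";
}
```

Because the result is an ordinary store path, the image build, the tests, and the caching all treat it uniformly, which is much of the appeal over a separate build-container workflow.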

We ended up building and testing pretty much the whole set of interesting features that ParticleOS supports, just using the systemd model: secure boot, signed immutable erofs partitions with A/B update, and the data partition unlocked with TPM keys. I in no way validated that any of this is correct and secure, as this was really a developer experience experiment, but it is pretty close to the upstream implementation, so if there are issues they should be fixable. And there is no attempt at key management. At one point the AI decided that systemd was defaulting to being a bit lax and would mount unsigned erofs by default, so it tightened that up, but mostly I was happy for it to run its tests.
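As a rough sketch of how an immutable slot fits the Nix model, something like the following builds an erofs image as an ordinary derivation, so the two A/B slots are just two store paths. Again this is my illustration, not the repository's code: `rootTree` is an assumed derivation holding the root filesystem contents, and verity signing would be a further step on top.

```nix
# Hypothetical sketch: an immutable erofs root image as a Nix derivation.
{ pkgs, rootTree }:
pkgs.runCommand "root.erofs" {
  nativeBuildInputs = [ pkgs.erofs-utils ];
} ''
  # mkfs.erofs <image> <source-dir>; the result is read-only by construction
  mkfs.erofs "$out" ${rootTree}
''
```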

Ah yes, its tests. I was not expecting it to write all its tests as Nix flakes, but yes, the entire experiment is basically 1500 lines of flake.nix with tests in, like this one, plus some small modules. Is this how flake-maxxing people write tests? It worked pretty well though and makes sense, it was just not exactly what I was expecting.
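For anyone who has not seen the style, a flake-based test looks roughly like this. This is a hypothetical sketch in the shape of the nixpkgs NixOS VM test framework (which drives QEMU), not one of the actual tests from the repository:

```nix
# Hypothetical sketch of a boot test exposed as a flake check.
{
  checks.x86_64-linux.boots = pkgs.testers.runNixOSTest {
    name = "boots-to-userspace";
    nodes.machine = { ... }: { };  # minimal guest configuration
    testScript = ''
      # Python test script run against the QEMU VM
      machine.wait_for_unit("multi-user.target")
      machine.succeed("test -f /etc/os-release")
    '';
  };
}
```

Running `nix flake check` then builds and runs each check in a VM, and the results are cached like any other derivation.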

So what was the outcome? Definitely nicer to use, and a better developer experience than the mkosi stack, I would say. It was pretty easy to do the things needed, and the testing was reasonably quick and effective for something that uses QEMU for tests. Obviously kernel compiles were fairly slow, but the caching was effective once built. The Nix store was not small (13GB or so from memory, just for this), and the image was about 250MB, with only small efforts to size it down by removing some of the unnecessary stuff. If I wanted to ship something soon that needed all the secure and trusted boot features, I would probably build this out.

However, my next experiment in this space is to build something more minimal. A number of the things I am building really need very little userspace anyway. Can I build out sufficient library userspace to meet the needs I have? We will see. And then back to unikernels? With the current rate of security vulnerabilities in C code and Linux there are advantages.

I think AI coding has an affinity with the approach of just building what you need now, rather than building for the general purpose, broad use cases that open source has pushed us towards. If code is cheaper, everyone can solve their specific problems, rather than sharing a more complex general solution. At some level of complexity we want to reuse, but we might want to reuse in a different way. Chuck Moore said:

Do not put code in your program that might be used. Do not leave hooks on which you can hang extensions. The things you might want to do are infinite; that means that each one has 0 probability of realization. If you need an extension later, you can code it later - and probably do a better job than if you did it now. And if someone else adds the extension, will they notice the hooks you left? Will you document that aspect of your program?

I think this type of philosophy is a good starting point for these experiments, but also for how we should use these new coding tools.
