
Damian Galarza | AI Engineering

March 11, 2026

Extending Claude Code's native worktree support

The WorktreeCreate hook for database isolation, the agent-skills and codebase-readiness open-source releases, and what agent-ready development actually looks like at scale.

Anthropic shipped native git worktree support in Claude Code. The --worktree flag gets you most of the way there: file isolation, automatic branch creation, sub-agent support out of the box. But if you're running a Rails app with multiple parallel agents, they'll still collide on the same database. This week I show how to close that gap with the WorktreeCreate hook.

Watch the video →

Native Claude Code Worktrees


The hook source

As promised in the video, the WorktreeCreate hook script is in this week's blog post.

What it does: reads the session JSON from stdin, creates the worktree, copies gitignored files via .worktreeinclude, writes a .env.local with unique database names derived from the branch name, runs bin/setup, then prints the worktree path to stdout. That last step is the contract. Claude uses whatever path you print to know where to work.
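Here's the shape of it as a script. This is a minimal sketch rather than the exact file from the post: the branch_name field, the ../worktrees/ location, and the DATABASE_NAME / TEST_DATABASE_NAME variable names are my assumptions, so adjust them to whatever your session payload and database.yml actually use.

```bash
#!/usr/bin/env bash
# Minimal WorktreeCreate hook sketch. Logs go to stderr; stdout is reserved
# for the worktree path, which is the contract Claude relies on.
set -euo pipefail

input=$(cat)                                    # session JSON from stdin
branch=$(echo "$input" | jq -r '.branch_name')  # assumed field name
repo_root=$(git rev-parse --show-toplevel)
worktree_path="$repo_root/../worktrees/$branch"

# Create the worktree on a new branch
git worktree add "$worktree_path" -b "$branch" >&2

# Copy gitignored files listed in .worktreeinclude (e.g. .env, config/master.key)
if [ -f "$repo_root/.worktreeinclude" ]; then
  while IFS= read -r file; do
    [ -z "$file" ] && continue
    mkdir -p "$worktree_path/$(dirname "$file")"
    cp "$repo_root/$file" "$worktree_path/$file"
  done < "$repo_root/.worktreeinclude"
fi

# Give this worktree its own databases, derived from the branch name
db_suffix=$(printf '%s' "$branch" | tr -c '[:alnum:]' '_')
cat > "$worktree_path/.env.local" <<EOF
DATABASE_NAME=myapp_development_${db_suffix}
TEST_DATABASE_NAME=myapp_test_${db_suffix}
EOF

# Install dependencies and create/migrate the new databases
(cd "$worktree_path" && bin/setup >&2)

# The contract: print the worktree path so Claude knows where to work
echo "$worktree_path"
```

The database names only isolate anything if config/database.yml actually reads them, e.g. database: <%= ENV.fetch("DATABASE_NAME", "myapp_development") %> in the development section.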

The WorktreeRemove counterpart is in there too. It mirrors the creation logic in reverse, dropping both the dev and test databases when the worktree is cleaned up.
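A sketch of that removal side, under the same caveats: worktree_path is an assumed field name, and dropdb assumes Postgres, so swap in your adapter's equivalent.

```bash
#!/usr/bin/env bash
# Minimal WorktreeRemove hook sketch: undo what WorktreeCreate set up.
set -euo pipefail

input=$(cat)
worktree_path=$(echo "$input" | jq -r '.worktree_path')  # assumed field name
branch=$(basename "$worktree_path")
db_suffix=$(printf '%s' "$branch" | tr -c '[:alnum:]' '_')

# Drop the per-worktree databases created during setup (Postgres shown)
dropdb --if-exists "myapp_development_${db_suffix}" >&2
dropdb --if-exists "myapp_test_${db_suffix}" >&2

# Remove the worktree itself
git worktree remove --force "$worktree_path" >&2
```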

The hook was originally designed for teams not using git at all (SVN, Mercurial, no VCS). That's why it replaces the entire worktree workflow rather than extending it. Worth knowing if you read the Anthropic docs and wonder why the framing feels a bit off.


New open-source repo: agent-skills

I published a new repo this week: agent-skills. It's a collection of Claude Code skills I'm building out publicly. Right now it has one: a Buffer skill for scheduling social media posts directly from a Claude Code session. More coming as I pull skills out of my own workflows.


What agent-ready development actually looks like

In February, OpenAI published Leveraging Codex in an Agent-First World, a writeup on how their Harness Engineering team built a production product with zero manually written code: 1 million lines, 1,500 PRs, 7 engineers, 5 months. I've been sitting with it for a few weeks and keep coming back to it.

A few things stood out that apply directly to anyone running Claude Code on a real codebase.

AGENTS.md as a table of contents, not an encyclopedia. They tried the one-big-instructions-file approach first. It turns out this introduces its own set of problems: context is scarce, stale rules accumulate, agents end up pattern-matching locally instead of navigating intentionally. Their fix was a short AGENTS.md (~100 lines) that acts as a map, with pointers into a structured docs/ directory maintained as the actual system of record. Agents start with a small entry point and are taught where to look next.

Architecture you'd normally put off until you had 100 engineers becomes a prerequisite. Strict layering, enforced dependency directions, mechanical linting. In a human team, these rules feel pedantic early on. With agents, they're what prevents compounding drift. Agents replicate whatever patterns already exist in the codebase, good or bad. If you don't encode constraints mechanically, entropy wins. They also write linter error messages that inject remediation instructions directly into agent context, so the agent knows how to fix what it broke.

Worktrees per task. Relevant to this week's video: they made the app bootable per git worktree so each Codex instance could launch and drive its own isolated copy, including logs, metrics, and ephemeral observability that gets torn down when the task completes. The same isolation gap I covered in the video, closed at a much larger scale.

Garbage collection as a recurring process. Every Friday used to be cleanup day — AI-generated drift, addressed in bulk. That didn't scale. Their fix: encode golden principles once, then run background agents on a cadence to enforce them, with targeted refactoring PRs that automerge when they pass. Technical debt treated like a high-interest loan, paid continuously rather than in painful bursts.

The throughput numbers are striking. What I find more interesting is the shift in what engineering work actually is when agents are doing the coding. Designing environments, specifying intent, building feedback loops. The hard work moves into the scaffolding.

This connects to something I've been thinking about: most teams struggling with AI agents aren't dealing with a model problem. They're dealing with a codebase problem. More on that in a future video.


New plugin: codebase-readiness

This week I added a new plugin to my claude-code-workflows repo: codebase-readiness. It scores your repo across 8 dimensions (including test foundation, architecture clarity, type safety, and feedback loops) against the benchmark of teams shipping 1,000+ AI-generated PRs per week. You get a score (0–100), a band rating, and a prioritized improvement roadmap.

/plugin marketplace add dgalarza/claude-code-workflows
/plugin install codebase-readiness@dgalarza-workflows
/codebase-readiness

If the assessment surfaces gaps you want to close at the team level, the AI Workflow Enablement Program works through exactly that: CLAUDE.md system, shared skills, workshops built on your actual codebase.

What's the first thing that broke when you tried running Claude Code on a real project? Reply and let me know. I read every response.

Damian
