
Damian Galarza | AI Engineering

April 22, 2026

Governing AI agents without killing them

The governance post landed this week. Three failure patterns I've either hit or narrowly avoided, what governance-as-code actually looks like in TypeScript, and the scorecard that tells you which layer to fix first.

Most governance advice for AI agents gets written for the org chart, not the agent. Review boards, RACIs, standing committees. The problem is that nothing in that stack runs on the same machine as your agent. The agent reads a prompt, picks a tool, sends an email. Whatever governance you wrote in Confluence doesn't get consulted.

This week's post is about the other version. Governance that lives in code. Tool sets an agent can't call past. Guardrail processors that abort a send when the recipient looks fabricated. Approval gates that pause execution at the framework level, not the prompt level. Structured decision logs that answer "why did the agent do that?" the way a regulator would want it answered.
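To make that concrete before you click through, here's a minimal, framework-agnostic sketch of the guardrail idea: check the recipient against a known-contacts set and abort the send if it isn't there. The types and names here are illustrative, not the code from the post.

```typescript
// Illustrative only: abort an outgoing email when the recipient isn't a
// known contact. The OutgoingEmail shape and the allowlist are hypothetical.
interface OutgoingEmail {
  to: string;
  subject: string;
  body: string;
}

class GuardrailAbort extends Error {}

function assertKnownRecipient(email: OutgoingEmail, knownRecipients: Set<string>): void {
  const recipient = email.to.trim().toLowerCase();
  if (!knownRecipients.has(recipient)) {
    // Fail closed: the send never happens, and the reason is explicit.
    throw new GuardrailAbort(`Recipient "${recipient}" is not a known contact; send blocked.`);
  }
}

// Wire it between "agent drafted an email" and "email actually goes out".
const contacts = new Set(["alex@client.example", "sam@client.example"]);
const draft: OutgoingEmail = {
  to: "billing@unknown-vendor.example",
  subject: "Invoice",
  body: "Hi, attaching the invoice we discussed.",
};
assertKnownRecipient(draft, contacts); // throws GuardrailAbort
```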

Read the full post →

The piece covers three failure patterns I've either hit in production or narrowly avoided in the multi-agent system I run for my consulting business:

  1. Tool sprawl widens the blast radius and degrades tool selection at the same time. One supervisor with access to everything is a governance problem and a performance problem from the same root cause.
  2. Decision logs vs. tracing. Phoenix, Langfuse, and Mastra Studio answer "how did the model reason." Governance needs a second layer that answers "what did the system decide, under which rules, with what confidence," and stays queryable years later.
  3. Rubber-stamp approval checkpoints. Anthropic's own Claude Code research found users approving 93% of permission prompts. That's muscle memory, not review. Tiered approval with full context is the version that actually adds signal.

Each pattern has TypeScript examples in the post. Mastra agents with scoped tool sets, a processOutputStep guardrail that aborts on fabricated recipients, and an approval flow where the reviewer sees the whole email, not a generic prompt.
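If you want a rough feel for the tool-scoping pattern without reading the whole post, here's a sketch in the shape of Mastra's createTool and Agent APIs. The signatures are my assumption of the current Mastra docs, and the tool and agent config are made up for illustration; this is not the code from the post.

```typescript
// Sketch: one narrow agent per capability instead of a supervisor that can
// call everything. Mastra API shapes assumed; the tool and agent are made up.
import { Agent } from "@mastra/core/agent";
import { createTool } from "@mastra/core/tools";
import { openai } from "@ai-sdk/openai";
import { z } from "zod";

const draftEmail = createTool({
  id: "draft-email",
  description: "Draft a client email for human review. Never sends.",
  inputSchema: z.object({
    to: z.string(),
    subject: z.string(),
    body: z.string(),
  }),
  execute: async ({ context }) => ({ draftId: `draft-${Date.now()}`, ...context }),
});

// This agent gets exactly one tool. Calendar, CRM, and billing tools live on
// other agents, so a bad tool choice here can't reach them.
export const emailAgent = new Agent({
  name: "email-agent",
  instructions: "You draft client emails. You never send anything yourself.",
  model: openai("gpt-4o-mini"),
  tools: { draftEmail },
});
```

The governance win and the performance win come from the same move: a smaller tool set is both a smaller blast radius and an easier choice for the model.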

If you'd rather start with a diagnostic than a long read, I built a companion Agent Governance Scorecard. 30 yes/no questions across four dimensions (tool access, observability, human-in-the-loop, defense in depth). Ten minutes gets you a number that tells you which layer to fix first.


This week's video: the capstone of the Agent Quality series

Part 1 of the Agent Quality trilogy was the eval framework. Part 2 was observability. Part 3 closes the loop on a real Mastra agent: add a custom LLM-as-judge scorer, catch an action item the agent fabricated from a meeting transcript, fix the prompt, and watch the score move from 0.83 to 1.00 on the next run.

The Quality Loop Your AI Agent Is Missing (Evals + Tracing)

Watch the video →
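For a rough picture of what the judge step is doing (this is not Mastra's scorer API, which is what the video builds), here's a generic sketch: hand a second model the transcript plus the agent's action items and ask whether each item is actually grounded in the transcript. The callJudge parameter is a stand-in for whatever model call you'd use.

```typescript
// Generic LLM-as-judge sketch: score action items against the source
// transcript. callJudge is a stand-in for any model call that returns text.
type JudgeCall = (prompt: string) => Promise<string>;

interface ScorerResult {
  score: number;        // fraction of action items grounded in the transcript
  ungrounded: string[]; // the items the judge flagged as unsupported
}

async function scoreActionItems(
  transcript: string,
  actionItems: string[],
  callJudge: JudgeCall,
): Promise<ScorerResult> {
  const ungrounded: string[] = [];
  for (const item of actionItems) {
    const verdict = await callJudge(
      `Transcript:\n${transcript}\n\nAction item: "${item}"\n` +
        `Is this action item supported by the transcript? Answer YES or NO.`,
    );
    if (!verdict.trim().toUpperCase().startsWith("YES")) {
      ungrounded.push(item);
    }
  }
  const score =
    actionItems.length === 0 ? 1 : (actionItems.length - ungrounded.length) / actionItems.length;
  return { score, ungrounded };
}
```

A ratio-style score like this is one way a number like 0.83 falls out (five grounded items out of six, say); the video's version uses Mastra's native scorer API rather than this sketch.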

The quality thread and the governance thread are closer than they look. Evals tell you when the agent got something wrong. Decision logs tell you what it decided and why. Approval gates catch the ones you can't afford to get wrong. Same problem space, different layers.
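Since decision logs keep coming up in both threads, here's the kind of record I mean, as a hypothetical shape rather than the schema from the post: what the system decided, under which rule, with what confidence, and what happened next.

```typescript
// Hypothetical decision-log record. The point is the questions it answers,
// not this exact schema: what was decided, under which rule, how confident,
// and whether the action ran, was blocked, or went to a human.
interface DecisionLogEntry {
  timestamp: string;                              // ISO 8601, queryable years later
  agent: string;                                  // which agent acted
  action: string;                                 // what the system decided to do
  rule: string;                                   // which governance rule applied
  confidence: number;                             // 0..1, the system's own confidence
  inputsRef: string;                              // pointer to the inputs, not raw content
  outcome: "executed" | "blocked" | "escalated";  // what actually happened
}

const entry: DecisionLogEntry = {
  timestamp: new Date().toISOString(),
  agent: "email-agent",
  action: "send follow-up email to alex@client.example",
  rule: "recipient-allowlist-v2",
  confidence: 0.91,
  inputsRef: "trace/2026-04-22/run-184",
  outcome: "escalated",
};
```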

This video is a paid partnership with Mastra. The instrumentation and scorer APIs shown are both native to the framework.

Resources mentioned:

  • Demo repo (Mastra meeting assistant)
  • Mastra tracing docs
  • Mastra scorers docs

One more thing

I also shipped a second post this week on adding audio narration to the blog. Short version: I cloned my voice with ElevenLabs and pointed it at my backlog. The pipeline worked, my writing didn't. Colons introducing questions, arrow characters, dense tables. All patterns that read clean on the page and fall apart in speech. If you write technical prose and have ever thought about audio, the transforms at the end are reusable.


Worth reading this week

Anthropic recently shipped Claude Code Auto Mode to address the approval-fatigue problem, and published a post walking through the design. I've been using it on my Max plan since launch and it's been a real upgrade. The part I didn't expect to enjoy: watching Auto Mode reject an action and reroute the agent to a safer path. Before this, I'd been running with --dangerously-skip-permissions, which lived up to the name. Auto Mode is the version where I stopped bypassing safety to get work done.


Reply and tell me which governance layer you'd tackle first if you had an afternoon. Tool scoping, decision logs, approval flow, or the defense-in-depth wiring around one high-stakes action. I'm curious where the actual gap is for people reading this.

If you're leading a team with agents in production and "it works, but I can't explain why it did that" sounds familiar, let's talk. 30 minutes, concrete roadmap, no slides.

Damian
