Practical AI

May 20, 2026

Approval gates that actually hold

Why 'always confirm before sending' doesn't hold (and what does), plus three more places agent governance is showing up this month: Mastra's runtime layer, OpenAI's on-device PII filter, and Anthropic's Wall Street agent rollout.

This week's video

It's tempting to put governance in the system prompt. Something like "always confirm before sending email." It reads like a rule. It runs like a suggestion. Most of the time the model confirms. Some of the time it doesn't. The bug always shows up the week you stop watching.

This week's video walks through wiring a real approval gate into a Mastra agent. The sendMockEmail tool gets requireApproval: true, and the framework refuses to run the tool until a human approval comes back. The model can't reason past it, because the gate doesn't live in the prompt. It lives in the runtime.

Building Approval Gates for AI Agents That Actually Hold

Watch the video →

The video extends the meeting-assistant agent from the last Mastra build, demos the approval flow end-to-end in Mastra Studio, walks the trace to show where the pause happens, and ends on approvals routed through Mastra Channels (Slack, Telegram) so a human can approve from wherever they already work.

This is a paid partnership with Mastra. The primitives shown are native to the framework, and I picked them because they're the cleanest implementation I've seen of the pattern.

Resources mentioned:

  • Demo repo (meeting-assistant + approval gates)
  • Mastra agent approval docs
  • Governing AI Agents Without Killing Them — the strategic case behind the build

Where the gate has to live

Prompt-based approval fails because the model is doing two jobs at once: deciding what to do, and deciding whether to ask first. Both decisions happen in the same context, so both are negotiable. A long context, a confident plan, a tool description that reads "use this to send the final message": any of those can quietly outweigh the "always confirm" line. You don't notice until something irreversible ships without a human eye on it.

The version that holds moves the second decision out of the model. The framework asks: "is this tool gated?" If yes, the tool call halts, the agent yields, and a human signal is the only thing that can resume it. The model isn't being asked to be disciplined. It's being structurally prevented from going further.
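
To make that mechanism concrete, here is a minimal, framework-agnostic sketch in Python. It is not Mastra's API (Mastra is a TypeScript framework, and the video uses its native requireApproval option). The names GatedRuntime, PendingApproval, register, call, and resume are all hypothetical. What it tries to show is the shape: the runtime, not the model, checks whether a tool is gated, parks the call, and only runs it once a human decision arrives.

```python
import itertools
from dataclasses import dataclass, field
from typing import Any, Callable, Dict


@dataclass
class PendingApproval:
    """A tool call the runtime has parked until a human decides."""
    approval_id: int
    tool_name: str
    args: Dict[str, Any]


@dataclass
class GatedRuntime:
    """Hypothetical runtime: the gate is enforced here, outside the model."""
    tools: Dict[str, Callable[..., Any]] = field(default_factory=dict)
    gated: set = field(default_factory=set)
    pending: Dict[int, PendingApproval] = field(default_factory=dict)
    _ids: Any = field(default_factory=lambda: itertools.count(1))

    def register(self, name, fn, require_approval=False):
        self.tools[name] = fn
        if require_approval:
            self.gated.add(name)

    def call(self, name, **args):
        """Entry point for model-requested tool calls."""
        if name in self.gated:
            # Halt: record the request and return without executing the tool.
            approval = PendingApproval(next(self._ids), name, args)
            self.pending[approval.approval_id] = approval
            return approval
        return self.tools[name](**args)

    def resume(self, approval_id, approved):
        """Called by the human-facing side only; the model never reaches it."""
        approval = self.pending.pop(approval_id)
        if not approved:
            return None  # Declined: the tool never runs.
        return self.tools[approval.tool_name](**approval.args)


sent = []  # Stand-in for an outbox, so the side effect is observable.


def send_mock_email(to, body):
    sent.append((to, body))
    return f"sent to {to}"


runtime = GatedRuntime()
runtime.register("sendMockEmail", send_mock_email, require_approval=True)

# The model asks for the tool; the runtime parks the call instead of running it.
request = runtime.call("sendMockEmail", to="team@example.com", body="Notes attached")

# A human approves, and only then does the email go out.
result = runtime.resume(request.approval_id, approved=True)
```

The prompt never appears in this code path. Whatever the model decides, call cannot reach send_mock_email for a gated tool; only resume can.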

Once you've seen the pause fire at the framework layer, prompt-based approval starts looking like what it is: a polite request the model agreed to consider.

If your agent system has any tool calls that are expensive, irreversible, or externally visible, human-in-the-loop approval is the dimension worth auditing first. The free Agent Governance Scorecard walks a team through the five dimensions of agent governance in about ten minutes; human-in-the-loop is dimension three.


Governance is showing up everywhere

OpenBox AI x Mastra: runtime governance, one line. Mastra and OpenBox AI announced a partnership earlier this month that adds runtime governance to the framework with a one-line opt-in. Every tool call, workflow step, sub-agent message, and inter-agent handoff gets scored against vulnerability standards, with PII detection, content moderation, and human-in-the-loop approvals carrying cryptographic attestation. The video this week shows the primitive: a gate at the tool layer. This is the productized version of the same idea, with audit trails across the whole runtime. Retrofitting governance onto a runtime is meaningfully harder than starting with it.
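
As a rough illustration of what "cryptographic attestation" on an approval can mean, the sketch below tags an approval record with an HMAC so a later audit can detect tampering. This is a generic technique shown with Python's standard library, not OpenBox's or Mastra's implementation. The record fields, the function names, and the hard-coded key are assumptions for illustration, and a production system would more likely use asymmetric signatures so verifiers don't hold the signing key.

```python
import hashlib
import hmac
import json

# Assumption: a real key would come from a secrets manager, not source code.
SIGNING_KEY = b"demo-key-do-not-use"


def attest(record: dict) -> str:
    """Return a hex HMAC-SHA256 tag over a canonical JSON encoding of the record."""
    payload = json.dumps(record, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return hmac.new(SIGNING_KEY, payload, hashlib.sha256).hexdigest()


def verify(record: dict, tag: str) -> bool:
    """Recompute the tag and compare in constant time."""
    return hmac.compare_digest(attest(record), tag)


# Hypothetical approval record written at the moment a human approves.
approval = {
    "tool": "sendMockEmail",
    "approver": "reviewer@example.com",
    "decision": "approved",
}
tag = attest(approval)

# Any later edit to the record no longer matches the stored tag.
tampered = dict(approval, decision="denied")
```

Here verify(approval, tag) holds and verify(tampered, tag) does not, which is the property an audit trail needs: the log can show that an approval was recorded and has not been rewritten since.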

OpenAI Privacy Filter. OpenAI quietly shipped an open-weight model in late April for detecting and redacting PII before it ever reaches a cloud server. 1.5B parameters, runs locally on a laptop or in the browser, Apache 2.0 license. It sorts sensitive data into eight categories (names, addresses, emails, phone numbers, URLs, dates, account numbers, secrets) and scores 96% F1 on the standard PII-masking benchmark. The connection to this week's theme is direct. Approval gates control what an agent is allowed to do. Privacy filtering controls what data the model is allowed to see. Both are governance that runs in code, not in policy.
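
The "filter before the model sees it" pattern can be sketched in a few lines. The regexes below are a deliberately crude stand-in for a real detector such as the model described above. They cover only two of the eight categories (emails and phone numbers) and miss names entirely, which is exactly the gap a trained model is meant to close. The function name and placeholder format are assumptions. The point is where the filter sits: between the raw text and anything that leaves the machine.

```python
import re

# Crude stand-in patterns; a real PII model would replace these.
PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"),
    "PHONE": re.compile(r"\+?\d[\d\s().-]{7,}\d"),
}


def redact(text: str) -> str:
    """Replace each match with a category placeholder before text leaves the machine."""
    for label, pattern in PATTERNS.items():
        text = pattern.sub(f"[{label}]", text)
    return text


raw = "Reach Ana at ana@example.com or 415-555-0134 about the invoice."
safe = redact(raw)  # Only `safe` would be passed on to a hosted model.
# safe == "Reach Ana at [EMAIL] or [PHONE] about the invoice."
```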

Anthropic ships 10 pre-built agents for Wall Street. On May 5, Anthropic announced a batch of reference-architecture agents for financial services: pitchbooks, earnings analysis, credit memos, KYC screening, month-end close, statement audits, insurance claims. The part worth pulling out isn't the breadth. It's that each one configures to a firm's existing modeling conventions, risk policies, and internal approval chains. That's the version that survives compliance review. Anthropic isn't pitching "smart autocomplete for finance"; it's pitching agents that fit into the governance scaffolding the buyer already has. JPMorgan, Goldman, Citi, AIG, and Visa are all named as running production deployments, and FactSet dropped 8.1% on the news. The market signaled which features it thinks matter.


If you're somewhere between "we know we need human-in-the-loop" and "we know it's working" and want a second pair of eyes on where the gates should live, reply to this email or book an intro call. Always happy to trade notes.

Damian
