Joseph Nemuri

May 23, 2026

The Trigger Stays Human: What I Learned Building an AI Agent to Run Real Money

We keep being told the agents have arrived — that AI is already out in the world placing orders, sending emails, and moving money while we sleep. So I spent a day testing exactly where the line is. I pointed a capable AI agent at a real, live brokerage account and a real publishing pipeline. Not a sandbox. Real money, real send buttons.

What I found is almost the opposite of the hype, and it changed how I think about building with these tools.

The agent could do nearly everything

Across one session, the agent:

  • Diagnosed why a trading strategy had been quietly bleeding money — and traced it to a single mis-sized parameter, not bad stock picks.
  • Found a stale credential that had silently killed a data feed for six days, and walked me through restoring it.
  • Read the source of five interacting systems and reconciled what each one actually did versus what we thought it did.
  • Wrote new code with unit tests, and proved an invariant — that a position-sizing routine could never over-commit cash — before it shipped.
  • Caught its own earlier mistakes and corrected them, on the record.
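To make the invariant in that list concrete, here is a minimal sketch of a position-sizing routine that cannot commit more cash than it was given. It is not the code from the session: the names `Order` and `size_positions`, the per-position percentage cap, and the use of integer cents (to avoid floating-point rounding) are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Order:
    symbol: str
    shares: int
    price_cents: int

    @property
    def cost_cents(self) -> int:
        return self.shares * self.price_cents


def size_positions(cash_cents: int, prices_cents: dict[str, int], max_pct: int) -> list[Order]:
    """Hypothetical sizing rule: spend at most max_pct percent of starting
    cash per symbol, and never more than the cash still uncommitted."""
    if cash_cents < 0 or not 0 <= max_pct <= 100:
        raise ValueError("cash must be >= 0 and max_pct must be in 0..100")
    remaining = cash_cents
    orders: list[Order] = []
    for symbol, price in prices_cents.items():
        if price <= 0:
            raise ValueError(f"non-positive price for {symbol}")
        # The budget is capped by both the per-position limit and remaining cash.
        budget = min(cash_cents * max_pct // 100, remaining)
        shares = budget // price  # whole shares only, rounded down
        if shares == 0:
            continue
        order = Order(symbol, shares, price)
        remaining -= order.cost_cents
        orders.append(order)
    # The invariant: total committed cash never exceeds starting cash.
    assert sum(o.cost_cents for o in orders) <= cash_cents
    return orders
```

The argument for the invariant is short: each order costs no more than its budget, each budget is no more than the remaining cash, so the remaining cash never goes negative. A unit test can then check the same property over many random inputs instead of a handful of hand-picked ones.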

That is the entire cognitive stack of a competent operator: research, diagnosis, synthesis, building, testing, self-correction. If "AI is doing real-world actions" means reasoning about real-world systems, then yes — fully, and it is genuinely remarkable to watch.

And then it hit a wall — three times

But every time the agent reached the moment of an irreversible, external action, it stopped cold.

  1. Placing a live trade. It could rank the candidates, size the position, and compute the stop. It could not submit the order. A safety layer blocked it, on the grounds that the order would re-enable real-money trading and that general encouragement is not specific authorization.
  2. Merging code into the main branch. It built the fix, tested it, and pushed it to a branch. The final merge — the step that makes code go live — was blocked.
  3. Emailing my newsletter. It wrote the copy. It could not press send. The stated grounds: external communication to people outside the org, carrying financial content. Blocked, twice.

Same agent, same conversation: free to think, gated to act — precisely at the boundary where an action becomes irreversible and leaves the room.

The tell: a scheduled task did the thing the agent couldn't

Here is the detail that made the design click. While the agent was blocked from placing a trade, a different system on the same machine placed one — automatically, on a timer. A scheduled job ran a script and bought a position. No human, no agent in the loop.

Why was that allowed when the agent was not?

Because the scheduled job is deterministic and pre-authorized. A human wrote it, reviewed it, and deployed it to run the same logic every day. It has no judgment and takes no liberties. The agent, by contrast, makes fresh decisions in real time — and fresh, in-the-moment judgment is exactly what you do not want pulling an irreversible trigger unsupervised.

The same code the agent couldn't run ad hoc, a human-deployed daemon ran freely. The difference isn't capability. It's who authorized this specific behavior, in advance.

Why this is the right line, not a limitation

It is tempting to read this as the AI being "not ready." I think that is backwards. The boundary is drawn in exactly the right place, for three reasons:

1. Models make mistakes — confidently. In a single day, the agent was wrong more than once: it misread how a safeguard worked, then walked it back; it recommended a setting, then reversed itself. Each error was caught because nothing auto-fired. Had it been pressing "submit" on every confident-but-wrong call, those would have been real losses and real messages sent to real people. The gate is what converts a mistake into a caught mistake instead of a posted one.

2. Accountability needs a human. "The AI decided to place that trade" is not a sentence anyone wants to say to a broker, a regulator, or a subscriber. Someone must be the actor of record, and the design forces that someone to be a person.

3. Some actions are regulated precisely because they are dangerous. Moving money and dispensing financial guidance to an audience sit inside bodies of law built over a century. An AI improvising across those lines is a problem whether or not anything goes wrong on a given day.

The actual shape of "AI taking real-world actions"

So is AI taking real-world actions? Yes — but look closely at how:

  • It takes them through rails a human laid down: deterministic scripts, reviewed and deployed, that run on schedules.
  • It takes them when a human explicitly authorizes the specific action, each time.
  • It does not take them by improvising at the moment of irreversibility.

The agent is a co-pilot with full access to the instruments and the map — and a hand that comes off the throttle the instant a move can't be undone. That is not a weaker agent. It is a trustworthy one.

If you are building with this

If you are wiring an AI into anything that touches money, messages, or published claims, design for this on purpose:

  • Let the agent own all the reversible, internal work — research, drafts, code, tests, analysis. That is where it is astonishing and safe.
  • Put every irreversible, external action behind either an explicit human approval or a deterministic, pre-reviewed process the agent merely feeds.
  • Keep the human as the actor of record at the trigger — not because the AI can't, but because someone must be accountable, and accountability does not run on a model.
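As a rough sketch of those three rules in code, the gate below lets reversible work run freely, requires a one-time approval tied to a specific action for anything irreversible, and logs who the actor of record was. Every name here (`Action`, `Gate`, `approve`, `execute`) is hypothetical, and a real system would need durable storage and authenticated approvers.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass(frozen=True)
class Action:
    action_id: str
    irreversible: bool
    run: Callable[[], str]


@dataclass
class Gate:
    # action_id -> approver; an approval covers exactly one action, once.
    approvals: dict[str, str] = field(default_factory=dict)
    # (action_id, actor of record) for everything that actually ran.
    log: list[tuple[str, str]] = field(default_factory=list)

    def approve(self, action_id: str, approver: str) -> None:
        self.approvals[action_id] = approver

    def execute(self, action: Action) -> str:
        if not action.irreversible:
            # Reversible, internal work: the agent owns it.
            self.log.append((action.action_id, "agent"))
            return action.run()
        # pop() consumes the approval, so each firing needs a fresh one.
        approver = self.approvals.pop(action.action_id, None)
        if approver is None:
            raise PermissionError(
                f"{action.action_id}: no specific human approval on file"
            )
        self.log.append((action.action_id, approver))
        return action.run()
```

A short usage example: `gate.execute(Action("draft-1", False, write_draft))` runs immediately, while `gate.execute(Action("send-1", True, send_email))` raises `PermissionError` until someone calls `gate.approve("send-1", "joseph")`, and raises again on the next attempt because the approval was used up.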

The future is not an AI that quietly does everything. It is an AI that does all the thinking and hands you a finished, tested, ready-to-fire decision — and lets you pull the trigger. The trigger stays human. Build it that way on purpose.
