The MCP Tool Bridge: How We Gave AI Agent Subprocesses Access to 50+ Tools Without Breaking Everything
Issue #26 — The Rocky Relay Architecture Series
Last week I showed you how we evolved from 6 chatbots to a 16-agent AI operating system. I ended with a teaser about the hardest problem we solved: giving a Claude Code subprocess access to the relay's tools without SQLite contention or shared memory.
This is that story.
The Problem
Our architecture has two execution modes. Relay mode agents run inside the main process — they can call tools directly because they share memory with the tool registry. Easy.
Channels mode agents (Rocky, Edison, Warhol) run as persistent Claude Code subprocesses. They're separate OS processes. They can't import the tool registry. They can't touch the SQLite database directly. If they did, we'd get:
- SQLite multi-process contention — WAL mode helps, but concurrent writes from multiple processes still cause SQLITE_BUSY errors under load
- State corruption — tools that modify shared state (task pipeline, agent memory, budget tracking) need to run in one place
- Dependency explosion — the tool registry imports the entire warroom: database, Telegram clients, API keys, LLM providers. A subprocess importing all that would double our memory footprint
We needed a way for channel subprocesses to call 50+ tools that live in the relay process, without importing any of them.
The Solution: An HTTP Bridge Disguised as MCP
The answer is embarrassingly simple once you see it:
Claude Code Process (channels mode)
↓ MCP stdio protocol
tools-server.ts (thin shim)
↓ HTTP POST
localhost:3847/api/internal/tools/rocky/task_create
↓
Relay process (where tools actually live)
↓
executeAgentTool() → runs in-process
Three layers. Let me break them down.
Layer 1: The MCP Stdio Server (tools-server.ts)
When a channel subprocess starts, it spawns a tiny MCP server via --mcp-config:
// tools-server.ts — The bridge between Claude Code and the relay
import { McpServer } from "@modelcontextprotocol/sdk/server/mcp.js";
import { StdioServerTransport } from "@modelcontextprotocol/sdk/server/stdio.js";

const RELAY_BASE = `http://localhost:${process.env.RELAY_PORT}`;
const AGENT_NAME = process.env.AGENT_NAME;

const server = new McpServer({ name: "relay-tools", version: "1.0.0" });

// Step 1: Fetch available tools from relay's HTTP API
const res = await fetch(`${RELAY_BASE}/api/internal/tools/${AGENT_NAME}`);
const tools = await res.json();

// Step 2: Register each tool with the MCP server
for (const tool of tools) {
  server.tool(tool.name, tool.description, tool.inputSchema, async (params) => {
    // Step 3: Execute by calling back to the relay
    const result = await fetch(`${RELAY_BASE}/api/internal/tools/${AGENT_NAME}/${tool.name}`, {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify(params),
    });
    return { content: [{ type: "text", text: await result.text() }] };
  });
}

// Step 4: Connect over stdio so Claude Code can discover and call these tools
await server.connect(new StdioServerTransport());
This server does nothing itself. It's a passthrough. It asks the relay "what tools does this agent have?", registers them as MCP tools, and when Claude Code calls one, it POSTs back to the relay.
Why this works: Claude Code natively supports MCP. It already knows how to discover and call MCP tools. We just made an MCP server whose backend is an HTTP API instead of local functions.
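For context, here's roughly what gets handed to `--mcp-config`. The `mcpServers` shape is Claude Code's standard MCP config format; the server name, runtime, and file paths below are illustrative assumptions, not our actual config:

```typescript
// Hypothetical sketch of building the MCP config passed to Claude Code.
function buildMcpConfig(agentName: string, relayPort: number) {
  return {
    mcpServers: {
      "relay-tools": {
        command: "bun",                  // whatever runtime executes the shim
        args: ["tools-server.ts"],
        env: {
          AGENT_NAME: agentName,         // which agent's tool subset to fetch
          RELAY_PORT: String(relayPort), // where the relay's HTTP API listens
        },
      },
    },
  };
}

// Serialized to disk, then passed as: claude --mcp-config mcp-config.json
const configJson = JSON.stringify(buildMcpConfig("rocky", 3847), null, 2);
```

The env vars are the whole trick: the same shim binary serves every agent, because the relay tells it at startup which agent it's fronting for.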
Layer 2: The HTTP API (routes.ts)
The relay exposes two internal endpoints:
// GET /api/internal/tools/:agent — List available tools
router.get("/api/internal/tools/:agent", (req) => {
  const tools = getAgentRawTools(req.params.agent);
  return Response.json(tools.map(t => ({
    name: t.name,
    description: t.description,
    inputSchema: zodToJsonSchema(t.schema), // Zod → JSON Schema for MCP
  })));
});

// POST /api/internal/tools/:agent/:tool — Execute a tool
router.post("/api/internal/tools/:agent/:tool", async (req) => {
  const args = await req.json();
  const result = await executeAgentTool(req.params.agent, req.params.tool, args);
  return new Response(JSON.stringify(result));
});
The key detail: executeAgentTool() runs in the relay process. It has access to the SQLite database, the Telegram client, the budget tracker, everything. The channel subprocess never touches any of it directly.
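To make that concrete, here's a minimal sketch of what executeAgentTool does conceptually — look up the tool in the agent's registry, validate the arguments at the process boundary, run the handler in-process. The registry contents and the plain `validate` function are illustrative stand-ins (the real code validates with Zod), not our actual implementation:

```typescript
// Illustrative sketch, not the real warroom code.
type RegisteredTool = {
  name: string;
  validate: (args: any) => void;                       // stand-in for the Zod schema check
  handler: (args: any) => Promise<unknown> | unknown;  // runs with full relay access
};

const toolRegistry: Record<string, RegisteredTool[]> = {
  rocky: [{
    name: "task_create",
    validate: (args) => {
      if (typeof args?.title !== "string") throw new Error("title must be a string");
    },
    handler: (args) => ({ ok: true, task: args.title }),
  }],
};

async function executeAgentTool(agent: string, toolName: string, args: unknown) {
  const tool = toolRegistry[agent]?.find((t) => t.name === toolName);
  if (!tool) throw new Error(`Agent ${agent} has no tool ${toolName}`);
  tool.validate(args);        // reject malformed args at the process boundary
  return tool.handler(args);  // executes inside the relay — never in the subprocess
}
```

The subprocess only ever sees JSON in and JSON out; everything stateful happens on this side of the HTTP boundary.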
Layer 3: The Tool Registry (tools/index.ts)
This is where agent-specific tool selection happens:
function getAgentRawTools(agentName: string): Tool[] {
  const allTools = [
    taskCreate, taskList, taskUpdate,  // Task pipeline
    memoryRead, memoryWrite,           // Agent memory
    webSearch, webFetch,               // Web access
    sendTelegram, sendEmail,           // Communication
    budgetCheck, budgetReport,         // Cost tracking
    // ... 40+ more tools
  ];
  // Each agent gets a curated subset
  return filterToolsForAgent(agentName, allTools);
}
Not every agent gets every tool. We learned (from Riley's research) that LLM performance degrades with too many tools. So each agent gets 7-10 tools max, matched to their role:
- Rocky (Chief of Staff): task management, delegation, memory, communication
- Drucker (Research): web search, web fetch, file operations
- Warhol (Content): email, web search, file operations, newsletter API
- Burry (Finance): budget tracking, spreadsheet tools, calculation
We also have tool profiles — full for interactive sessions and lean for autonomous cron jobs. The lean profile strips tools the agent shouldn't use unsupervised (like sending emails or posting to social media).
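Here's a sketch of how that filtering could look. The per-agent allowlists and the blocklist contents below are made up for illustration — the article only commits to "7-10 tools per agent" and "lean strips unsupervised-risky tools":

```typescript
// Hypothetical per-agent tool selection with "full" vs "lean" profiles.
type Profile = "full" | "lean";

const AGENT_TOOLS: Record<string, string[]> = {
  rocky:   ["task_create", "task_list", "task_update", "memory_read", "memory_write", "send_telegram"],
  drucker: ["web_search", "web_fetch", "file_read", "file_write", "memory_read"],
};

// Tools that should never run without a human in the loop
const UNSUPERVISED_BLOCKLIST = new Set(["send_email", "send_telegram", "social_post"]);

function filterToolsForAgent(agent: string, allTools: string[], profile: Profile = "full"): string[] {
  const allowed = new Set(AGENT_TOOLS[agent] ?? []);
  let tools = allTools.filter((t) => allowed.has(t));
  if (profile === "lean") {
    tools = tools.filter((t) => !UNSUPERVISED_BLOCKLIST.has(t)); // cron jobs can't email anyone
  }
  return tools;
}
```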
The Concurrency Problem We Didn't Expect
The HTTP bridge solved SQLite contention. But it introduced a new problem: response callback collision.
A channel subprocess can receive messages from two sources simultaneously:
1. A user DM via Telegram
2. A cron-injected prompt
Both call sendToClaude(), which writes to the subprocess's stdin and sets up a response callback. If two calls overlap, the second callback overwrites the first. The first caller never gets a response.
The fix was a request queue:
class RequestQueue {
  private queue: Array<{ prompt: string; resolve: (response: string) => void }> = [];
  private processing = false;

  async enqueue(prompt: string): Promise<string> {
    return new Promise((resolve) => {
      this.queue.push({ prompt, resolve });
      if (!this.processing) this.processNext();
    });
  }

  private async processNext() {
    if (this.queue.length === 0) { this.processing = false; return; }
    this.processing = true;
    const { prompt, resolve } = this.queue.shift()!;
    try {
      resolve(await this.sendToClaude(prompt));
    } finally {
      this.processNext(); // Process next in queue — even if sendToClaude threw
    }
  }
}
This serializes all prompts to a channel subprocess. DMs and cron jobs take turns. No more lost responses.
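The serialization is easy to see with a self-contained toy version — same queue shape, but with a mock sendToClaude that just records the order prompts reach the "subprocess":

```typescript
// Toy reproduction of the queue, for illustration only.
const order: string[] = [];

class ToyQueue {
  private queue: Array<{ prompt: string; resolve: (r: string) => void }> = [];
  private processing = false;

  async enqueue(prompt: string): Promise<string> {
    return new Promise((resolve) => {
      this.queue.push({ prompt, resolve });
      if (!this.processing) this.processNext();
    });
  }

  private async processNext(): Promise<void> {
    if (this.queue.length === 0) { this.processing = false; return; }
    this.processing = true;
    const { prompt, resolve } = this.queue.shift()!;
    resolve(await this.mockSendToClaude(prompt));
    this.processNext();
  }

  // Stand-in for writing to the subprocess's stdin and awaiting its reply
  private async mockSendToClaude(prompt: string): Promise<string> {
    order.push(prompt);
    await new Promise((r) => setTimeout(r, 5)); // simulate the model thinking
    return `reply:${prompt}`;
  }
}

// Two overlapping callers — a DM and a cron prompt — take turns:
//   const q = new ToyQueue();
//   await Promise.all([q.enqueue("dm"), q.enqueue("cron")]);
//   // order is now ["dm", "cron"] — the cron prompt waited its turn
```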
The Idle Timeout Dance
Channel subprocesses are expensive — each Claude Code instance uses ~200MB. We can't keep them alive forever for agents that get 5 messages a day.
But we also can't kill them mid-thought. If an agent is executing a 10-step tool chain, we need to wait for it to finish.
Our solution: activity-based idle timeout with per-agent tuning.
const IDLE_TIMEOUTS: Record<string, number> = {
  drucker: 180, // Research = long tool chains
  grove: 150,   // Code review = deep analysis
  edison: 150,  // Building tools = iterative
  rocky: 120,   // Chief of staff = moderate
  tars: 120,    // Engineering = moderate
  default: 90,  // Most agents = short
};

let idleTimer: ReturnType<typeof setTimeout>;

// Reset the timeout on ANY activity from the subprocess
function onActivity() {
  clearTimeout(idleTimer);
  idleTimer = setTimeout(() => {
    subprocess.kill();
    fallbackToRelayMode(agentName);
  }, (IDLE_TIMEOUTS[agentName] ?? IDLE_TIMEOUTS.default) * 1000);
}
And the safety net: a hard cap of 10 minutes regardless of activity. This prevents runaway agents from holding a subprocess hostage.
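The hard cap is just a second timer layered over the idle one — armed once per prompt, never reset by activity, only disarmed on normal completion. A minimal sketch (armHardCap and its wiring are hypothetical names, not the actual code):

```typescript
// Hypothetical sketch of the non-resetting hard cap.
const HARD_CAP_MS = 10 * 60 * 1000; // 10 minutes, no exceptions

function armHardCap(kill: () => void): () => void {
  const timer = setTimeout(kill, HARD_CAP_MS); // fires regardless of activity
  return () => clearTimeout(timer);            // call on normal completion to disarm
}
```

The idle timer says "you've gone quiet, goodbye"; the hard cap says "you've been busy too long, goodbye anyway."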
When a channel subprocess dies (timeout, crash, or OOM), the system automatically falls back to relay mode for that agent. The next message gets processed via the lightweight per-message SDK path. No downtime, no lost messages.
What We'd Do Differently
1. Start with HTTP bridge from day one. We wasted a week trying to make direct SQLite access work from subprocesses. WAL mode, busy timeouts, retry logic — none of it was reliable under real load. The HTTP bridge is simpler and more correct.
2. Schema translation is annoying. Our tools use Zod schemas internally. MCP uses JSON Schema. The translation layer (zodToJsonSchema) handles 90% of cases but occasionally mangles complex union types. If starting over, we'd pick one schema format and stick with it.
3. Per-agent tool budgets are essential. We initially gave every agent every tool. Performance cratered. The sweet spot is 7-10 tools per agent. More than that and the LLM spends tokens deliberating over tool selection instead of executing.
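On point 2: schema translation looks trivial until unions show up. This toy converter for a tiny schema subset (emphatically not zodToJsonSchema itself) shows where the trouble starts — unions become `anyOf`, and that's exactly where real translators diverge on discriminators, nullability shortcuts, and defaults inside branches:

```typescript
// Toy schema-to-JSON-Schema converter, for illustration only.
type MiniSchema =
  | { kind: "string" }
  | { kind: "number" }
  | { kind: "object"; props: Record<string, MiniSchema> }
  | { kind: "union"; options: MiniSchema[] };

function toJsonSchema(s: MiniSchema): any {
  switch (s.kind) {
    case "string": return { type: "string" };
    case "number": return { type: "number" };
    case "object":
      return {
        type: "object",
        properties: Object.fromEntries(
          Object.entries(s.props).map(([k, v]) => [k, toJsonSchema(v)])
        ),
      };
    case "union":
      // The awkward case: anyOf is the only escape hatch JSON Schema offers,
      // and consumers disagree on how to interpret nested/discriminated unions
      return { anyOf: s.options.map(toJsonSchema) };
  }
}
```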
The Numbers
| Metric | Before Bridge | After Bridge |
|---|---|---|
| SQLite BUSY errors | 15-20/day | 0 |
| Channel subprocess crashes | 3-5/day | <1/day |
| Tool call latency (p50) | 45ms direct | 52ms via HTTP |
| Tool call latency (p99) | 2.3s (contention) | 89ms |
| Memory per channel process | ~350MB (tools imported) | ~200MB (tools via HTTP) |
The HTTP bridge adds ~7ms of latency at p50. It eliminates contention-driven tail latency entirely. Worth it.
Next Week
The Budget System — how we built daily token limits with automatic LLM downgrading. When Claude's budget is exhausted, agents automatically fall back to DeepSeek, then Grok, then Ollama. The economics of running 16 agents on $200/month.
Part 2 of a weekly technical series on building production AI agent systems. Written from the trenches — 16 agents, one Mac Mini, Cebu City, Philippines.
Subscribe: buttondown.com/the200dollarceo Full series: buttondown.com/the200dollarceo/archive