The MCP Tool Bridge: How We Gave AI Agent Subprocesses Access to 50+ Tools Without Breaking Everything
Issue #26 — The Rocky Relay Architecture Series
Last week I showed you how we evolved from 6 chatbots to a 16-agent AI operating system. I ended with a teaser about the hardest problem we solved: giving a Claude Code subprocess access to the relay's tools without SQLite contention or shared memory.
This is that story.
The Problem
Our architecture has two execution modes. Relay mode agents run inside the main process — they can call tools directly because they share memory with the tool registry. Easy.
Channels mode agents (Rocky, Edison, Warhol) run as persistent Claude Code subprocesses. They're separate OS processes. They can't import the tool registry. They can't touch the SQLite database directly. If they did, we'd get:
- SQLite multi-process contention — WAL mode helps, but concurrent writes from multiple processes still cause SQLITE_BUSY errors under load
- State corruption — tools that modify shared state (task pipeline, agent memory, budget tracking) need to run in one place
- Dependency explosion — the tool registry imports the entire warroom: database, Telegram clients, API keys, LLM providers. A subprocess importing all that would double our memory footprint
We needed a way for channel subprocesses to call 50+ tools that live in the relay process, without importing any of them.
The Solution: An HTTP Bridge Disguised as MCP
The answer is embarrassingly simple once you see it:
Claude Code Process (channels mode)
↓ MCP stdio protocol
tools-server.ts (thin shim)
↓ HTTP POST
localhost:3847/api/internal/tools/rocky/task_create
↓
Relay process (where tools actually live)
↓
executeAgentTool() → runs in-process
Three layers. Let me break them down.
Layer 1: The MCP Stdio Server (tools-server.ts)
When a channel subprocess starts, it spawns a tiny MCP server via --mcp-config:
// tools-server.ts — The bridge between Claude Code and the relay
import { McpServer } from "@modelcontextprotocol/sdk/server/mcp.js";
import { StdioServerTransport } from "@modelcontextprotocol/sdk/server/stdio.js";

const RELAY_BASE = `http://localhost:${process.env.RELAY_PORT}`;
const AGENT_NAME = process.env.AGENT_NAME;

const server = new McpServer({ name: "relay-tools", version: "1.0.0" });

// Step 1: Fetch available tools from relay's HTTP API
const res = await fetch(`${RELAY_BASE}/api/internal/tools/${AGENT_NAME}`);
const tools = await res.json();

// Step 2: Register each tool with the MCP server
for (const tool of tools) {
  server.tool(tool.name, tool.description, tool.inputSchema, async (params) => {
    // Step 3: Execute by calling back to the relay
    const result = await fetch(`${RELAY_BASE}/api/internal/tools/${AGENT_NAME}/${tool.name}`, {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify(params),
    });
    return { content: [{ type: "text", text: await result.text() }] };
  });
}

// Step 4: Connect over stdio so Claude Code can discover and call these tools
await server.connect(new StdioServerTransport());
This server does nothing itself. It's a passthrough. It asks the relay "what tools does this agent have?", registers them as MCP tools, and when Claude Code calls one, it POSTs back to the relay.
Why this works: Claude Code natively supports MCP. It already knows how to discover and call MCP tools. We just made an MCP server whose backend is an HTTP API instead of local functions.
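For context, here's roughly what gets handed to `--mcp-config`. The `mcpServers` shape is Claude Code's standard MCP config format; the server name, runtime, and file paths below are illustrative assumptions, not our actual config:

```typescript
// Hypothetical sketch of building the MCP config passed to Claude Code.
function buildMcpConfig(agentName: string, relayPort: number) {
  return {
    mcpServers: {
      "relay-tools": {
        command: "bun",                  // whatever runtime executes the shim
        args: ["tools-server.ts"],
        env: {
          AGENT_NAME: agentName,         // which agent's tool subset to fetch
          RELAY_PORT: String(relayPort), // where the relay's HTTP API listens
        },
      },
    },
  };
}

// Serialized to disk, then passed as: claude --mcp-config mcp-config.json
const configJson = JSON.stringify(buildMcpConfig("rocky", 3847), null, 2);
```

The env vars are the whole trick: the same shim binary serves every agent, because the relay tells it at startup which agent it's fronting for.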
Layer 2: The HTTP API (routes.ts)
The relay exposes two internal endpoints:
// GET /api/internal/tools/:agent — List available tools
router.get("/api/internal/tools/:agent", (req) => {
  const tools = getAgentRawTools(req.params.agent);
  return Response.json(tools.map(t => ({
    name: t.name,
    description: t.description,
    inputSchema: zodToJsonSchema(t.schema), // Zod → JSON Schema for MCP
  })));
});

// POST /api/internal/tools/:agent/:tool — Execute a tool
router.post("/api/internal/tools/:agent/:tool", async (req) => {
  const args = await req.json();
  const result = await executeAgentTool(req.params.agent, req.params.tool, args);
  return new Response(JSON.stringify(result));
});
The key detail: executeAgentTool() runs in the relay process. It has access to the SQLite database, the Telegram client, the budget tracker, everything. The channel subprocess never touches any of it directly.
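To make that concrete, here's a minimal sketch of what executeAgentTool does conceptually — look up the tool in the agent's registry, validate the arguments at the process boundary, run the handler in-process. The registry contents and the plain `validate` function are illustrative stand-ins (the real code validates with Zod), not our actual implementation:

```typescript
// Illustrative sketch, not the real warroom code.
type RegisteredTool = {
  name: string;
  validate: (args: any) => void;                       // stand-in for the Zod schema check
  handler: (args: any) => Promise<unknown> | unknown;  // runs with full relay access
};

const toolRegistry: Record<string, RegisteredTool[]> = {
  rocky: [{
    name: "task_create",
    validate: (args) => {
      if (typeof args?.title !== "string") throw new Error("title must be a string");
    },
    handler: (args) => ({ ok: true, task: args.title }),
  }],
};

async function executeAgentTool(agent: string, toolName: string, args: unknown) {
  const tool = toolRegistry[agent]?.find((t) => t.name === toolName);
  if (!tool) throw new Error(`Agent ${agent} has no tool ${toolName}`);
  tool.validate(args);        // reject malformed args at the process boundary
  return tool.handler(args);  // executes inside the relay — never in the subprocess
}
```

The subprocess only ever sees JSON in and JSON out; everything stateful happens on this side of the HTTP boundary.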
Layer 3: The Tool Registry (tools/index.ts)
This is where agent-specific tool selection happens:
function getAgentRawTools(agentName: string): Tool[] {
  const allTools = [
    taskCreate, taskList, taskUpdate,  // Task pipeline
    memoryRead, memoryWrite,           // Agent memory
    webSearch, webFetch,               // Web access
    sendTelegram, sendEmail,           // Communication
    budgetCheck, budgetReport,         // Cost tracking
    // ... 40+ more tools
  ];
  // Each agent gets a curated subset
  return filterToolsForAgent(agentName, allTools);
}
Not every agent gets every tool. We learned (from Riley's research) that LLM performance degrades with too many tools. So each agent gets 7-10 tools max, matched to their role:
- Rocky (Chief of Staff): task management, delegation, memory, communication
- Drucker (Research): web search, web fetch, file operations
- Warhol (Content): email, web search, file operations, newsletter API
- Burry (Finance): budget tracking, spreadsheet tools, calculation
We also have tool profiles — full for interactive sessions and lean for autonomous cron jobs. The lean profile strips tools the agent shouldn't use unsupervised (like sending emails or posting to social media).
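Here's a sketch of how that filtering could look. The per-agent allowlists and the blocklist contents below are made up for illustration — the article only commits to "7-10 tools per agent" and "lean strips unsupervised-risky tools":

```typescript
// Hypothetical per-agent tool selection with "full" vs "lean" profiles.
type Profile = "full" | "lean";

const AGENT_TOOLS: Record<string, string[]> = {
  rocky:   ["task_create", "task_list", "task_update", "memory_read", "memory_write", "send_telegram"],
  drucker: ["web_search", "web_fetch", "file_read", "file_write", "memory_read"],
};

// Tools that should never run without a human in the loop
const UNSUPERVISED_BLOCKLIST = new Set(["send_email", "send_telegram", "social_post"]);

function filterToolsForAgent(agent: string, allTools: string[], profile: Profile = "full"): string[] {
  const allowed = new Set(AGENT_TOOLS[agent] ?? []);
  let tools = allTools.filter((t) => allowed.has(t));
  if (profile === "lean") {
    tools = tools.filter((t) => !UNSUPERVISED_BLOCKLIST.has(t)); // cron jobs can't email anyone
  }
  return tools;
}
```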
The Concurrency Problem We Didn't Expect
The HTTP bridge solved SQLite contention. But it introduced a new problem: response callback collision.
A channel subprocess can receive messages from two sources simultaneously:
1. A user DM via Telegram
2. A cron-injected prompt
Both call sendToClaude(), which writes to the subprocess's stdin and sets up a response callback. If two calls overlap, the second callback overwrites the first. The first caller never gets a response.
The fix was a request queue:
class RequestQueue {
  private queue: Array<{ prompt: string; resolve: (response: string) => void }> = [];
  private processing = false;

  async enqueue(prompt: string): Promise<string> {
    return new Promise((resolve) => {
      this.queue.push({ prompt, resolve });
      if (!this.processing) this.processNext();
    });
  }

  private async processNext() {
    if (this.queue.length === 0) { this.processing = false; return; }
    this.processing = true;
    const { prompt, resolve } = this.queue.shift()!;
    try {
      resolve(await this.sendToClaude(prompt));
    } finally {
      this.processNext(); // Process next in queue — even if sendToClaude threw
    }
  }
}
This serializes all prompts to a channel subprocess. DMs and cron jobs take turns. No more lost responses.
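The serialization is easy to see with a self-contained toy version — same queue shape, but with a mock sendToClaude that just records the order prompts reach the "subprocess":

```typescript
// Toy reproduction of the queue, for illustration only.
const order: string[] = [];

class ToyQueue {
  private queue: Array<{ prompt: string; resolve: (r: string) => void }> = [];
  private processing = false;

  async enqueue(prompt: string): Promise<string> {
    return new Promise((resolve) => {
      this.queue.push({ prompt, resolve });
      if (!this.processing) this.processNext();
    });
  }

  private async processNext(): Promise<void> {
    if (this.queue.length === 0) { this.processing = false; return; }
    this.processing = true;
    const { prompt, resolve } = this.queue.shift()!;
    resolve(await this.mockSendToClaude(prompt));
    this.processNext();
  }

  // Stand-in for writing to the subprocess's stdin and awaiting its reply
  private async mockSendToClaude(prompt: string): Promise<string> {
    order.push(prompt);
    await new Promise((r) => setTimeout(r, 5)); // simulate the model thinking
    return `reply:${prompt}`;
  }
}

// Two overlapping callers — a DM and a cron prompt — take turns:
//   const q = new ToyQueue();
//   await Promise.all([q.enqueue("dm"), q.enqueue("cron")]);
//   // order is now ["dm", "cron"] — the cron prompt waited its turn
```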
The Idle Timeout Dance
Channel subprocesses are expensive — each Claude Code instance uses ~200MB. We can't keep them alive forever for agents that get 5 messages a day.
But we also can't kill them mid-thought. If an agent is executing a 10-step tool chain, we need to wait for it to finish.
Our solution: activity-based idle timeout with per-agent tuning.
const IDLE_TIMEOUTS: Record<string, number> = {
  drucker: 180, // Research = long tool chains
  grove: 150,   // Code review = deep analysis
  edison: 150,  // Building tools = iterative
  rocky: 120,   // Chief of staff = moderate
  tars: 120,    // Engineering = moderate
  default: 90,  // Most agents = short
};

let idleTimer: ReturnType<typeof setTimeout>;

// Reset the timeout on ANY activity from the subprocess
function onActivity() {
  clearTimeout(idleTimer);
  idleTimer = setTimeout(() => {
    subprocess.kill();
    fallbackToRelayMode(agentName);
  }, (IDLE_TIMEOUTS[agentName] ?? IDLE_TIMEOUTS.default) * 1000);
}
And the safety net: a hard cap of 10 minutes regardless of activity. This prevents runaway agents from holding a subprocess hostage.
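The hard cap is just a second timer layered over the idle one — armed once per prompt, never reset by activity, only disarmed on normal completion. A minimal sketch (armHardCap and its wiring are hypothetical names, not the actual code):

```typescript
// Hypothetical sketch of the non-resetting hard cap.
const HARD_CAP_MS = 10 * 60 * 1000; // 10 minutes, no exceptions

function armHardCap(kill: () => void): () => void {
  const timer = setTimeout(kill, HARD_CAP_MS); // fires regardless of activity
  return () => clearTimeout(timer);            // call on normal completion to disarm
}
```

The idle timer says "you've gone quiet, goodbye"; the hard cap says "you've been busy too long, goodbye anyway."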
When a channel subprocess dies (timeout, crash, or OOM), the system automatically falls back to relay mode for that agent. The next message gets processed via the lightweight per-message SDK path. No downtime, no lost messages.
What We'd Do Differently
1. Start with HTTP bridge from day one. We wasted a week trying to make direct SQLite access work from subprocesses. WAL mode, busy timeouts, retry logic — none of it was reliable under real load. The HTTP bridge is simpler and more correct.
2. Schema translation is annoying. Our tools use Zod schemas internally. MCP uses JSON Schema. The translation layer (zodToJsonSchema) handles 90% of cases but occasionally mangles complex union types. If starting over, we'd pick one schema format and stick with it.
3. Per-agent tool budgets are essential. We initially gave every agent every tool. Performance cratered. The sweet spot is 7-10 tools per agent. More than that and the LLM spends tokens deliberating over tool selection instead of executing.
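On point 2: schema translation looks trivial until unions show up. This toy converter for a tiny schema subset (emphatically not zodToJsonSchema itself) shows where the trouble starts — unions become `anyOf`, and that's exactly where real translators diverge on discriminators, nullability shortcuts, and defaults inside branches:

```typescript
// Toy schema-to-JSON-Schema converter, for illustration only.
type MiniSchema =
  | { kind: "string" }
  | { kind: "number" }
  | { kind: "object"; props: Record<string, MiniSchema> }
  | { kind: "union"; options: MiniSchema[] };

function toJsonSchema(s: MiniSchema): any {
  switch (s.kind) {
    case "string": return { type: "string" };
    case "number": return { type: "number" };
    case "object":
      return {
        type: "object",
        properties: Object.fromEntries(
          Object.entries(s.props).map(([k, v]) => [k, toJsonSchema(v)])
        ),
      };
    case "union":
      // The awkward case: anyOf is the only escape hatch JSON Schema offers,
      // and consumers disagree on how to interpret nested/discriminated unions
      return { anyOf: s.options.map(toJsonSchema) };
  }
}
```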
The Numbers
| Metric | Before Bridge | After Bridge |
|---|---|---|
| SQLite BUSY errors | 15-20/day | 0 |
| Channel subprocess crashes | 3-5/day | <1/day |
| Tool call latency (p50) | 45ms direct | 52ms via HTTP |
| Tool call latency (p99) | 2.3s (contention) | 89ms |
| Memory per channel process | ~350MB (tools imported) | ~200MB (tools via HTTP) |
The HTTP bridge adds ~7ms of latency at p50. It eliminates contention-driven tail latency entirely. Worth it.
Next Week
The Budget System — how we built daily token limits with automatic LLM downgrading. When Claude's budget is exhausted, agents automatically fall back to DeepSeek, then Grok, then Ollama. The economics of running 16 agents on $200/month.
Part 2 of a weekly technical series on building production AI agent systems. Written from the trenches — 16 agents, one Mac Mini, Cebu City, Philippines.
Subscribe: buttondown.com/the200dollarceo Full series: buttondown.com/the200dollarceo/archive