Stop giving your agent every tool
This week's video breaks down tool search, context bloat, and the search-load-call pattern for large agent tool surfaces. Plus two related reads on why tool retrieval is becoming a core architecture layer.
This week's video
A lot of agent builds follow the same path. You add one tool, then another, then a few MCP servers, and before long the model is dragging around a catalog it barely needs. Nothing obviously breaks. The agent just gets worse at choosing the right tool while spending more context on schemas than on the task.
This week's video covers the pattern that fixes this. Tool search separates discovery from loading, so the agent does not need every tool definition in the prompt on every turn. I show the first version I built in Emma, why I replaced it, and what the loop looks like once Mastra's ToolSearchProcessor is in place.
You'll see the exact shift: the agent starts without direct access to the full catalog, calls search_tools, loads the specific tool it needs with load_tool, then uses that tool normally. In the demo, that means finding search-web only when the task actually requires it. The important point is not Mastra itself. It's the move from dumping every capability into the prompt to retrieving only what the task needs.
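To make the search, load, call loop concrete, here is a minimal TypeScript sketch. It does not use Mastra's actual API. The catalog, searchTools, loadTool, and callTool below are hypothetical stand-ins, and the keyword matching is a placeholder for whatever retrieval a real implementation uses.

```typescript
// Hypothetical sketch of the search-load-call loop. Not Mastra's API.
type Tool = {
  name: string;
  description: string;
  run: (args: Record<string, string>) => string;
};

// Full catalog: available to the harness, but not visible to the model.
const catalog: Tool[] = [
  { name: "search-web", description: "search the web for current information", run: (a) => `results for ${a.query}` },
  { name: "read-file", description: "read a file from local disk", run: (a) => `contents of ${a.path}` },
  { name: "send-email", description: "send an email message to a recipient", run: (a) => `sent to ${a.to}` },
];

// Tools the model can currently see and call.
const loaded = new Map<string, Tool>();

// Step 1: search. Naive keyword overlap; returns names and descriptions only.
function searchTools(query: string, limit = 3): { name: string; description: string }[] {
  const terms = query.toLowerCase().split(/\s+/).filter(Boolean);
  return catalog
    .map((tool) => {
      const haystack = `${tool.name} ${tool.description}`.toLowerCase();
      const score = terms.filter((t) => haystack.includes(t)).length;
      return { tool, score };
    })
    .filter((entry) => entry.score > 0)
    .sort((a, b) => b.score - a.score)
    .slice(0, limit)
    .map(({ tool }) => ({ name: tool.name, description: tool.description }));
}

// Step 2: load. Only now does the full definition become visible.
function loadTool(name: string): boolean {
  const tool = catalog.find((t) => t.name === name);
  if (!tool) return false;
  loaded.set(name, tool);
  return true;
}

// Step 3: call. Refuses tools that were never loaded.
function callTool(name: string, args: Record<string, string>): string {
  const tool = loaded.get(name);
  if (!tool) throw new Error(`tool not loaded: ${name}`);
  return tool.run(args);
}

// The loop for a task that needs web search.
const matches = searchTools("search the web");
loadTool(matches[0].name);
const output = callTool(matches[0].name, { query: "tool search" });
```

The design choice that matters here is that callTool only sees the loaded map. The catalog can grow without changing what the model has to read on each turn.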
Resources mentioned:
- Mastra
- Mastra ToolSearchProcessor docs
- Anthropic tool search docs
- Anthropic: Introducing advanced tool use
The real constraint is not tool count, it's prompt visibility
Tool availability and tool visibility are different problems.
A framework can make hundreds of tools available to an agent. That does not mean the model should see all of them at once. Once the tool surface gets large enough, you pay twice. First in raw context cost. Then in worse selection quality. The model has more choices, more descriptions to parse, and more chances to grab the wrong thing.
Anthropic's tool search docs make this concrete. In a typical multi-server setup, tool definitions alone can eat tens of thousands of tokens before the agent does any useful work. That is not a model problem. It is a harness problem. If you treat tools as a retrieval layer instead of a static appendix to the prompt, the system gets cleaner fast.
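If you want a rough feel for this cost in your own setup, a back-of-the-envelope estimate is enough to start. The sketch below assumes about 4 characters per token, which is only a heuristic and varies by tokenizer, and it uses a made-up catalog of 200 similar tools. The ToolDefinition shape and estimateTokens helper are hypothetical, not part of any library.

```typescript
// Hypothetical shape for a tool definition as it would appear in the prompt.
type ToolDefinition = {
  name: string;
  description: string;
  inputSchema: object;
};

// Assumption: roughly 4 characters per token. Real tokenizers differ.
const CHARS_PER_TOKEN = 4;

// Estimate the prompt cost of a set of tool definitions from their JSON size.
function estimateTokens(defs: ToolDefinition[]): number {
  const chars = defs.reduce((sum, def) => sum + JSON.stringify(def).length, 0);
  return Math.ceil(chars / CHARS_PER_TOKEN);
}

// Made-up catalog: 200 tools with similar schemas.
const fullCatalog: ToolDefinition[] = Array.from({ length: 200 }, (_, i) => ({
  name: `tool-${i}`,
  description: "Does one specific thing and takes a couple of structured arguments.",
  inputSchema: {
    type: "object",
    properties: {
      query: { type: "string", description: "What to look up" },
      limit: { type: "number", description: "Maximum number of results" },
    },
    required: ["query"],
  },
}));

// What the model sees after search and load: only the tools it asked for.
const visibleAfterLoad = fullCatalog.slice(0, 2);

const fullCost = estimateTokens(fullCatalog);
const loadedCost = estimateTokens(visibleAfterLoad);
```

Comparing fullCost with loadedCost on your real definitions gives a quick sense of how much of each turn goes to schemas rather than the task.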
That is also why I like tracing this loop instead of treating it as magic. Search, load, call. You can inspect each step. You can see whether the agent searched badly, loaded the wrong tool, or used the right tool with the wrong arguments. Once the loop is visible, it becomes fixable.
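One way to make that inspection routine is to record each step as a small event and look for the first one that failed. This is a sketch under my own assumptions: the TraceEvent shape and the helper names are hypothetical, and a real harness would likely pull this from its existing tracing or logging layer.

```typescript
// The three steps of the loop.
type Step = "search" | "load" | "call";

// Hypothetical trace record for one step of the loop.
type TraceEvent = {
  step: Step;
  input: string;
  ok: boolean;
  detail: string;
};

// Find the earliest step that went wrong, if any.
function firstFailure(events: TraceEvent[]): TraceEvent | undefined {
  return events.find((event) => !event.ok);
}

// Turn a trace into a one-line diagnosis.
function summarizeTrace(events: TraceEvent[]): string {
  const failure = firstFailure(events);
  if (!failure) return "all steps succeeded";
  return `${failure.step} step failed: ${failure.detail}`;
}

// Example: right tool found and loaded, but called with the wrong arguments.
const exampleTrace: TraceEvent[] = [
  { step: "search", input: "find recent news", ok: true, detail: "matched search-web" },
  { step: "load", input: "search-web", ok: true, detail: "loaded" },
  { step: "call", input: '{"q":"news"}', ok: false, detail: "missing required argument: query" },
];

const diagnosis = summarizeTrace(exampleTrace);
```

A failure at the search step points at retrieval quality, one at the load step points at catalog or naming issues, and one at the call step points at the tool's schema or description.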
Two related reads worth your time
Anthropic's tool search documentation. This is the clearest vendor writeup I've seen on the scaling problem itself. The value is not just the feature. It is the framing: large tool surfaces create both context bloat and selection degradation, and those problems compound. If you're building agents with more than a handful of tools, this is worth reading closely.
Google's MCP whitepaper on tools and interoperability. One of the strongest sections is the warning about context window bloat. Loading every tool definition from every connected server does not scale, and a retrieval layer for tools is the natural next step. The same paper is also a good reminder that dynamic tool discovery changes the system's risk profile. When tools can appear or expand over time, governance matters as much as convenience.
If you're building agents with growing tool surfaces, reply and tell me where the pain shows up first: context cost, bad tool selection, or lack of visibility into what the agent is doing. If you want a structured way to diagnose the harness, the 4 Levers Agent Diagnostic is a good place to start.
If your team is dealing with this in production and wants help tightening the harness, improving tool selection, or designing a cleaner architecture around large tool catalogs, book an intro call.
Damian
