How Claude Code’s New MCP Tool Search Slashes Tokens and Solves Context Explosion

Claude Code introduces MCP Tool Search, a lazy-loading mechanism that loads only the tools a task needs. In large MCP setups, where tool definitions alone can consume more than 67,000 tokens, it prevents context bloat and improves performance, and it gives developers regex and BM25 search variants with defer_loading support.

AI Insight Log

Jan 15, 2026

Claude Code has launched a major update called MCP Tool Search, which adds a lazy‑loading strategy for tools used by AI agents.

When developers attach many MCP servers (seven or more), the tool definitions can consume more than 67,000 tokens, leading to context bloat, higher costs, and degraded instruction‑following performance because the model’s attention is split across a massive prompt.

Thariq noted on social media that “as MCP becomes more popular, we see servers containing 50+ tools that occupy a large portion of the context.”

The solution is on‑demand loading (lazy loading). Claude now monitors how much of the context is taken up by tool descriptions and triggers a loading mechanism when the usage exceeds 10% of the context window.

Auto detection: When tool descriptions exceed the 10% threshold, the system activates.

Dynamic search: Instead of stuffing all tools into the prompt, Claude first searches for relevant tools based on the user's command and only then loads the matching ones into the context.
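The threshold check can be sketched in a few lines. This is a minimal illustration, not Claude Code's actual implementation: the function name should_defer_tools is hypothetical, and the 200,000-token context window is an assumption chosen only to make the arithmetic concrete. The 10% threshold and the 67,000-token figure come from the article.

```python
def should_defer_tools(tool_description_tokens: int,
                       context_window_tokens: int,
                       threshold: float = 0.10) -> bool:
    """Return True when tool descriptions exceed the threshold share of the window."""
    return tool_description_tokens > threshold * context_window_tokens


# Assumed context window size (not stated in the article).
CONTEXT_WINDOW = 200_000

# 67,000 tokens of tool definitions is about a third of that window.
share = 67_000 / CONTEXT_WINDOW
print(f"Tool definitions use {share:.1%} of the context")

print(should_defer_tools(67_000, CONTEXT_WINDOW))  # well above the 10% threshold
print(should_defer_tools(15_000, CONTEXT_WINDOW))  # below the threshold
```

Under these assumptions, a seven-server setup sits far above the threshold, while a small setup keeps loading its tools up front as before.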

This approach is likened to moving a toolbox from “always carried” to a “cloud warehouse” – you summon the right wrench only when you need it.

Advice for MCP server developers: In the new "tool search" mode, Server Instructions become crucial. Developers must write clear instructions that tell Claude when to search for a particular tool, effectively providing an index or skill summary for the toolset.

Advice for MCP client developers: Implement the ToolSearchTool component, which is documented in the official release, to enable client-side tool searching.
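To make the server-side advice concrete, here is a rough sketch of what a server's initialize response might look like with an instructions field that works as an index for its tools. The field names follow the MCP initialize result as I understand it. The server name, version, and instruction text are hypothetical, and build_initialize_result is an illustrative helper rather than part of any SDK.

```python
import json

# Hypothetical instructions: say what the tools cover and when to search for them.
SERVER_INSTRUCTIONS = (
    "Tools for querying a sales database. "
    "Search for these tools when the user asks about orders, invoices, "
    "or revenue reports."
)


def build_initialize_result(name: str, version: str, instructions: str) -> dict:
    """Assemble an MCP-style initialize result that carries server instructions."""
    return {
        "protocolVersion": "2025-06-18",
        "capabilities": {"tools": {}},
        "serverInfo": {"name": name, "version": version},
        "instructions": instructions,
    }


result = build_initialize_result("sales-db", "0.1.0", SERVER_INSTRUCTIONS)
print(json.dumps(result, indent=2))
```

The instruction text names the domain and the trigger conditions, which is the kind of summary a model can use to decide whether a tool search is worth running.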

The team experimented with several approaches, including programmatic tool calling, but concluded that reducing context consumption was the top priority, leading to the launch of tool search.

Technical Details: Regex vs BM25

Tool Search supports two variants:

Regex variant (tool_search_tool_regex): Claude builds a Python regular expression (e.g., get_.*_data) to locate tools. This method is precise and token-efficient.

BM25 variant (tool_search_tool_bm25): Claude issues a natural-language query, which allows more flexible matching at the cost of some of the regex variant's precision.
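The difference between the two variants can be illustrated with a small local sketch. This is not the hosted tool's implementation. The catalog is hypothetical, matching against tool names and descriptions is an assumption, and the BM25 scoring is a plain Okapi BM25 with conventional default parameters (k1 = 1.5, b = 0.75).

```python
import math
import re
from collections import Counter

# Hypothetical tool catalog: name -> description.
TOOLS = {
    "get_weather_data": "Fetch current weather data for a city",
    "get_stock_data": "Fetch the latest stock price data for a ticker",
    "send_email": "Send an email message to a recipient",
    "create_calendar_event": "Create a calendar event with a title and time",
}


def regex_search(pattern: str, tools: dict[str, str]) -> list[str]:
    """Return tool names whose name or description matches the pattern."""
    compiled = re.compile(pattern)
    return [name for name, desc in tools.items()
            if compiled.search(name) or compiled.search(desc)]


def tokenize(text: str) -> list[str]:
    """Lowercase and split on anything that is not a letter or digit."""
    return re.findall(r"[a-z0-9]+", text.lower())


def bm25_search(query: str, tools: dict[str, str], top_k: int = 3,
                k1: float = 1.5, b: float = 0.75) -> list[str]:
    """Rank tools against a natural-language query with Okapi BM25."""
    docs = {name: tokenize(name + " " + desc) for name, desc in tools.items()}
    n_docs = len(docs)
    avg_len = sum(len(tokens) for tokens in docs.values()) / n_docs
    # Number of documents that contain each term.
    doc_freq = Counter(term for tokens in docs.values() for term in set(tokens))

    scores = {}
    for name, tokens in docs.items():
        tf = Counter(tokens)
        score = 0.0
        for term in tokenize(query):
            if term not in tf:
                continue
            idf = math.log(1 + (n_docs - doc_freq[term] + 0.5) / (doc_freq[term] + 0.5))
            norm = tf[term] + k1 * (1 - b + b * len(tokens) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        if score > 0:
            scores[name] = score
    return sorted(scores, key=scores.get, reverse=True)[:top_k]


# The regex from the article selects exactly the two get_*_data tools.
print(regex_search(r"get_.*_data", TOOLS))

# A natural-language query ranks send_email first.
print(bm25_search("send an email to a recipient", TOOLS))
```

The regex call either matches a tool or it does not, while the BM25 call returns a ranked list that can include weak matches on common words, which is the precision-for-flexibility trade the two variants represent.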

Setting the parameter defer_loading: true marks tools for lazy loading, so they are only added to the context after a successful search.
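As a rough sketch of what the flag implies, the snippet below marks some tool definitions with defer_loading and separates what enters the context up front from what waits for a search hit. The tool names and the helpers split_by_loading and load_after_search are hypothetical. Only the defer_loading flag itself comes from the article.

```python
# Hypothetical tool definitions in the usual name/description/input_schema shape.
TOOL_DEFINITIONS = [
    {
        "name": "list_projects",
        "description": "List the user's projects",
        "input_schema": {"type": "object", "properties": {}},
    },
    {
        "name": "get_weather_data",
        "description": "Fetch current weather data for a city",
        "input_schema": {"type": "object", "properties": {"city": {"type": "string"}}},
        "defer_loading": True,  # kept out of the context until a search finds it
    },
    {
        "name": "send_email",
        "description": "Send an email message to a recipient",
        "input_schema": {"type": "object", "properties": {"to": {"type": "string"}}},
        "defer_loading": True,
    },
]


def split_by_loading(tools: list[dict]) -> tuple[list[dict], list[dict]]:
    """Separate tools loaded up front from tools marked for lazy loading."""
    upfront = [t for t in tools if not t.get("defer_loading", False)]
    deferred = [t for t in tools if t.get("defer_loading", False)]
    return upfront, deferred


def load_after_search(matched_names: set[str], deferred: list[dict]) -> list[dict]:
    """Return only the deferred tools that a search actually matched."""
    return [t for t in deferred if t["name"] in matched_names]


upfront, deferred = split_by_loading(TOOL_DEFINITIONS)
loaded = load_after_search({"send_email"}, deferred)

print([t["name"] for t in upfront])
print([t["name"] for t in deferred])
print([t["name"] for t in loaded])
```

Tools without the flag behave as before, so a setup can keep a few frequently used tools always loaded and defer the long tail.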

For ordinary users, this means they can attach many more MCP servers without fearing token explosion; for developers, it opens the possibility of building massive, scalable agent systems.

In short, the update eliminates token anxiety and streamlines context management for Claude Code.

Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contact us and we will review it promptly.
