How Anthropic’s Advisor Strategy Boosts Sonnet Scores by 2.7% While Cutting Costs 12%
Anthropic’s new advisor strategy flips the traditional multi‑agent model: a cheap front‑line model drives the task and calls Opus for advice only when it gets stuck. The payoff is a 2.7 percentage‑point score lift on SWE‑bench, a 12% cut in per‑task cost, and an integration that amounts to one extra tool entry in the API call. This article walks through the design, the benchmark numbers, and the strategy's limits.
A reverse multi‑agent design
To understand the advisor strategy, first review the usual multi‑agent pattern where a large model acts as an orchestrator and delegates subtasks to smaller worker models. Frameworks such as LangGraph, CrewAI, and AutoGen help write this orchestration logic.
The advisor strategy inverts this: the small model (Sonnet or Haiku) drives the task end‑to‑end, calling tools and iterating on its own. Only when it encounters uncertainty does it pause and ask Opus for advice. Opus never calls tools or returns output to the user; it only supplies a suggestion or plan.
This inversion eliminates the need for explicit task decomposition, worker‑pool scheduling, and manual context synchronization. The entire flow completes in a single API request, with the front‑line model deciding when to summon the advisor.
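As a mental model, the flow can be sketched client‑side as a simple loop, even though the real coordination happens on Anthropic's servers inside a single request. Everything below — the step list, the confidence scores, and the helper names — is illustrative, not the actual API:

```python
# Toy sketch of the advisor pattern: the executor drives every step and
# escalates only when its own confidence is low. All names and numbers
# here are illustrative assumptions, not the real API.

ADVISOR_BUDGET = 3  # mirrors max_uses: a hard cap on escalations

def run_with_advisor(steps, threshold=0.5):
    advisor_calls = 0
    plan = []
    for step, confidence in steps:
        if confidence < threshold and advisor_calls < ADVISOR_BUDGET:
            # Executor pauses and asks the advisor for a suggestion;
            # the advisor never touches tools or user-facing output.
            advisor_calls += 1
            plan.append(f"{step} (with advisor hint)")
        else:
            plan.append(step)  # executor handles it solo
    return plan, advisor_calls

# Most steps are easy; only the genuinely hard one triggers an escalation.
plan, calls = run_with_advisor([
    ("read failing test", 0.9),
    ("locate bug", 0.8),
    ("choose fix architecture", 0.3),  # uncertain -> ask the advisor
    ("apply patch", 0.9),
])
```

The point of the sketch is the shape of the control flow: one driver, zero orchestration code, and escalation as an exception rather than the rule.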
Anthropic summarizes the benefit as “a smaller, more cost‑effective model drives and escalates without decomposition, a worker pool, or orchestration logic.”
Benchmark data: it actually works
Anthropic provides comparative numbers on the SWE‑bench Multilingual benchmark:
Sonnet 4.6 High (solo): 72.1 % score, $1.09 per task
Sonnet 4.6 High + Opus advisor: 74.8 % score, $0.96 per task
The addition of the Opus advisor raises the score by 2.7 percentage points while reducing per‑task cost by 11.9 %.
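The reported deltas check out against the per‑task figures:

```python
# Figures from the SWE-bench Multilingual comparison above
solo_score, advisor_score = 72.1, 74.8   # percent
solo_cost, advisor_cost = 1.09, 0.96     # dollars per task

lift_pp = round(advisor_score - solo_score, 1)                     # percentage points
cost_cut = round((solo_cost - advisor_cost) / solo_cost * 100, 1)  # percent

print(lift_pp)   # 2.7
print(cost_cut)  # 11.9
```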
On the BrowseComp test, Haiku alone scores 19.7 %, whereas Haiku + Opus advisor reaches 41.2 %, more than double the score, at roughly 15 % of the cost of running Sonnet.
Why not just use Opus?
Opus is much more expensive than Sonnet; its per‑token pricing is several times higher. If Sonnet can handle 95 % of a task on its own, running the entire job on Opus wastes money.
The advisor strategy ensures that frontier‑level reasoning is charged only when the executor truly needs it; the rest of the run stays at executor‑level cost, mirroring how junior staff consult senior advisors only for difficult problems.
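A rough cost model shows why the split pays off. The token counts and per‑million‑token rates below are made‑up placeholders for illustration, not Anthropic's actual pricing:

```python
# Hypothetical blended rates in $ per 1M tokens (placeholders, not real pricing)
OPUS_RATE = 30.0
SONNET_RATE = 6.0

def task_cost(total_tokens, advisor_tokens):
    """Executor handles everything; only advisor_tokens bill at Opus rates."""
    executor_tokens = total_tokens - advisor_tokens
    return (executor_tokens * SONNET_RATE + advisor_tokens * OPUS_RATE) / 1_000_000

total = 200_000  # tokens consumed by one long agentic task

all_opus = total * OPUS_RATE / 1_000_000          # run the whole job on Opus
with_advisor = task_cost(total, advisor_tokens=10_000)  # ~5% escalated

print(all_opus)      # 6.0
print(with_advisor)  # 1.44
```

Under these placeholder numbers, escalating 5 % of the tokens costs roughly a quarter of running everything on Opus; the exact ratio depends on real pricing and on how often the executor actually escalates.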
Technical implementation: just one line of code
Integrating the advisor is a single tool entry in an ordinary API call. Example code:

```python
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

response = client.messages.create(
    model="claude-sonnet-4-6",  # executor
    tools=[
        {
            "type": "advisor_20260301",
            "name": "advisor",
            "model": "claude-opus-4-6",
            "max_uses": 3,
        },
        # ... other tools you already use
    ],
    messages=[...],
)
```

The advisor is defined as a special tool. The type field identifies the advisor, model selects the Opus model to consult, and max_uses caps how many times the advisor can be invoked per request, preventing runaway costs.
After the request is sent, Anthropic’s servers handle all coordination. The front‑line model decides when to call the advisor; the platform routes context to Opus, receives a 400‑700‑token suggestion, and returns it to the executor—all within a single /v1/messages request, with no extra network round‑trips or session state to manage.
Billing is split: the advisor portion is charged at Opus rates, the executor portion at Sonnet/Haiku rates, and the usage field reports advisor token consumption separately.
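That split makes per‑request cost accounting straightforward. The field names in the sample dict below are assumptions about the response shape, not documented API fields, and the rates are placeholders:

```python
# Assumed shape of the usage report; actual field names may differ.
usage = {
    "input_tokens": 120_000, "output_tokens": 8_000,               # executor (Sonnet)
    "advisor_input_tokens": 15_000, "advisor_output_tokens": 600,  # advisor (Opus)
}

# Placeholder (input, output) rates in $ per 1M tokens, not real pricing
RATES = {"sonnet": (3.0, 15.0), "opus": (15.0, 75.0)}

def split_bill(u):
    """Price executor tokens at Sonnet rates and advisor tokens at Opus rates."""
    s_in, s_out = RATES["sonnet"]
    o_in, o_out = RATES["opus"]
    executor = (u["input_tokens"] * s_in + u["output_tokens"] * s_out) / 1e6
    advisor = (u["advisor_input_tokens"] * o_in + u["advisor_output_tokens"] * o_out) / 1e6
    return executor, advisor

executor_cost, advisor_cost = split_bill(usage)
```

With these sample numbers the advisor accounts for a minority of the bill even though every escalated token is priced at Opus rates.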
Boundaries of this strategy
The key unknown is the front‑line model’s “self‑awareness” – its ability to know when to ask for help. If it’s overconfident, it may skip needed advice; if overly cautious, costs rise. The max_uses limit is a safety valve, but optimal settings require user experimentation.
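One thing max_uses does guarantee is a bounded worst case. Taking the 400‑700‑token suggestion size mentioned earlier and a placeholder Opus output rate (not real pricing), the ceiling on advisor output spend per request is easy to estimate:

```python
MAX_USES = 3
MAX_SUGGESTION_TOKENS = 700   # upper end of the 400-700 range
OPUS_OUTPUT_RATE = 75.0       # placeholder $ per 1M output tokens, not real pricing

# Worst-case advisor *output* spend per request; the context tokens routed
# to the advisor as input would add to this and are excluded here.
worst_case = MAX_USES * MAX_SUGGESTION_TOKENS * OPUS_OUTPUT_RATE / 1_000_000
print(worst_case)  # 0.1575
```

So even an over‑cautious executor cannot blow the budget on advisor output; the open question is only whether it escalates at the right moments.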
For simple tasks where Sonnet alone suffices, adding an advisor only adds cost. Anthropic cites Bolt CEO Eric Simmons: “Better architectural decisions on complex tasks, no extra overhead on simple ones,” assuming the model can correctly judge task complexity.
The feature is still in beta; edge cases such as misguided advice can still occur and need real‑world tuning.
Embedding the advisor ties your code to Anthropic’s model stack, so a future vendor switch would require a rewrite; that lock‑in is part of what Anthropic views as an ecosystem moat.
The model‑layering era is arriving
The advisor strategy is part of Anthropic’s broader model‑layering roadmap, which includes Opus 4.6, Sonnet 4.6, Haiku 4.5, Managed Agents, long context, and prompt caching. The idea is to match tasks and budgets with appropriate model intelligence.
Other providers have similar tiered offerings (OpenAI’s GPT‑5 series, Google Gemini’s Pro/Flash/Nano), but Anthropic’s advisor embeds model collaboration directly into the API, eliminating the need for developers to write orchestration code.