Tagged articles
6 articles
ShiZhen AI
Mar 6, 2026 · Artificial Intelligence

GPT-5.4 Beats Human Baseline and Cuts Agent Token Use by Half

OpenAI's newly released GPT-5.4 integrates reasoning, coding, computer use, and agent tool calls. It achieves a 75% success rate on OSWorld-Verified tasks, surpassing the human baseline, while its Tool Search feature reduces agent token consumption by 47%. The model also supports a context of up to 1 million tokens for long‑running workflows.

AI model · Agent · Computer Use
0 likes · 15 min read
AI Explorer
Mar 6, 2026 · Artificial Intelligence

GPT-5.4 Unveiled: 1M‑Token Context Window and Native Computer Control

OpenAI's GPT-5.4 launch introduces three model tiers, a 1 million‑token context window, native computer‑use abilities, higher factual accuracy, and a new Tool Search feature. Together these reshape enterprise AI capabilities and intensify competition with Anthropic and Google.

AI benchmarks · Computer Use · GPT-5.4
0 likes · 9 min read
AI Insight Log
Jan 15, 2026 · Artificial Intelligence

How Claude Code’s New MCP Tool Search Slashes Tokens and Solves Context Explosion

Claude Code introduces MCP Tool Search, a lazy‑loading mechanism that dynamically loads only the tools a task needs. In large MCP setups it cuts token usage by over 67,000 tokens, which prevents context bloat and improves performance. Developers can choose between regex and BM25 search options, with defer_loading support.

BM25 · Claude Code · Context management
0 likes · 6 min read
AI Tech Publishing
Nov 25, 2025 · Artificial Intelligence

Three New Ways to Tackle Agent Context Engineering with Claude’s Tools

Anthropic’s recent release introduces three advanced capabilities: Tool Search, Programmatic Tool Calling, and Tool Use Examples. They reduce token consumption, avoid context pollution, and improve tool‑calling accuracy for AI agents. The article includes detailed benchmarks, code samples, and guidance on when each feature is most effective.

AI Agents · Claude · Context Engineering
0 likes · 24 min read
Amazon Cloud Developers
Nov 25, 2025 · Artificial Intelligence

Flagship AI Performance at One‑Third Cost: Claude Opus 4.5 on Amazon Bedrock

Claude Opus 4.5, now on Amazon Bedrock, delivers flagship‑level AI capabilities for coding, agent development, and office automation at roughly one‑third the cost of its predecessor. It outperforms Sonnet 4.5 and Opus 4.1 on benchmarks such as SWE‑bench (80.9%) and MMMU (80.7%), and offers tool‑search, tool‑example support, and flexible effort settings for production‑grade agents.

AI Agents · Amazon Bedrock · Claude Opus 4.5
0 likes · 14 min read