Tagged articles

token cost

27 articles · Page 1 of 1
Su San Talks Tech
Jul 1, 2026 · Artificial Intelligence

Which Domestic Multimodal LLM Is the Most Efficient for Production?

The article benchmarks three Chinese multimodal large models—Step 3.7 Flash, MiniMax M3, and Qwen 3.6‑flash—across two real‑world tasks, measuring output quality, API latency, and token cost, and concludes that Step 3.7 Flash consistently offers the best speed‑cost trade‑off for production use.

API latency · Benchmark · MiniMax M3
0 likes · 10 min read
DataFunSummit
Jul 1, 2026 · Artificial Intelligence

Deploying AI Agents: Protocols, Costs, and Evolution from Demo to Production

A 90‑minute live discussion with three industry experts dissects why AI agents often stall after a successful demo, examining protocol collaboration, self‑evolution capabilities, and token‑cost control, while offering concrete engineering, management, and business‑value insights for enterprise AI adoption.

AI agents · AI coding · Enterprise AI
0 likes · 18 min read
AI Engineering
Jul 1, 2026 · Artificial Intelligence

Claude Sonnet 5 Is Stronger Yet Costlier—Per‑Task Cost Beats Opus 4.8

Anthropic’s newly released Claude Sonnet 5 scores 53 on the Artificial Analysis intelligence index, surpassing Sonnet 4.6 and matching GPT‑5.5, but its per‑task cost rises to $2.29, 15% higher than Opus 4.8, because it emits roughly 40% more output tokens and runs more agentic interaction rounds.

AI model benchmark · Anthropic · Claude Sonnet 5
0 likes · 5 min read
AI Architecture Hub
Jun 25, 2026 · Artificial Intelligence

Loop Engineering: The Essential Skill Every AI Developer Needs by 2026

The article explains how AI developers must move from manually feeding prompts to building automated feedback loops—called loop engineering—detailing token cost challenges, loop architectures, open vs. closed designs, six core modules, and practical examples that illustrate this shift.

AI agents · Automation · Claude
0 likes · 14 min read
21CTO
Jun 17, 2026 · Artificial Intelligence

Why Claude Code’s Lead Abandoned Prompts for Loop Engineering

Loop engineering, an automated agent workflow that replaces manual prompting, is reshaping how developers use Claude Code and OpenAI Codex in 2026, introducing six core building blocks, token‑cost trade‑offs, and a new emphasis on validation and comprehension debt.

AI agents · Automation · Claude Code
0 likes · 8 min read
AI Architecture Hub
Jun 17, 2026 · Artificial Intelligence

Stop Misusing AI Agent Loops: Why Most Fail Early and How to Use Them Correctly

The article explains the two main AI Agent Loop patterns—human‑in‑the‑loop and fully autonomous agentic loops—highlights the hidden costs, product‑drift risks, and budget limits of the latter, and provides concrete, low‑risk scenarios and a step‑by‑step code‑review loop that keeps humans in control.

AI Agent Loop · AI productivity · Agentic Loop
0 likes · 9 min read
SuanNi
Jun 13, 2026 · Artificial Intelligence

Why You Should Stop Hand‑Writing Prompts: Loop Engineering Lets AI Run Itself

The article explains Loop Engineering—a three‑layered approach that moves AI from manual prompt writing to autonomous loops, detailing its core components, practical implementations in Codex and Claude Code, and the trade‑offs such as token cost, comprehension debt, and design complexity.

AI agents · Automation · Loop Engineering
0 likes · 12 min read
Java Tech Enthusiast
Jun 5, 2026 · Artificial Intelligence

Which AI Coding Agent Reigns Supreme in 2026? A Comparative Ranking of Cursor, Claude Code, and Codex

The article presents a detailed 2026 benchmark of major AI coding agents—Cursor CLI, Claude Code, OpenAI Codex and others—evaluating them across performance, token consumption, cost per task and execution time, and reveals that the top three differ by only one point, shifting the competition toward efficiency and latency.

AI coding agents · Claude Code · Cursor CLI
0 likes · 7 min read
IT Services Circle
Jun 1, 2026 · Artificial Intelligence

Why Developers Are Abandoning Markdown for HTML in the AI Era

In the era of AI agents like Claude Code, developers are shifting from Markdown to single‑file HTML because Markdown cannot efficiently convey complex architecture diagrams, high‑density information, or interactive UI elements, leading to slower workflows and higher token costs.

AI agents · Claude · HTML
0 likes · 10 min read
Geek Labs
May 29, 2026 · Artificial Intelligence

How Much Do AI Coding Tools Really Cost? Compare cc-statistics and AgentsView

This article introduces two open‑source projects—cc-statistics and AgentsView—that locally track token usage, costs, and session history across popular AI coding tools, compares their features in detail, provides quick‑start commands, and advises which tool fits different workflows.

AI coding tools · Open-source · Web UI
0 likes · 9 min read
SuanNi
May 28, 2026 · Industry Insights

Xiaomi Slashes Token Prices by Up to 99% to Match DeepSeek’s API Pricing

The article analyzes the recent AI API price war, detailing DeepSeek’s step‑by‑step token‑price reductions, Xiaomi’s 99% cut that aligns its MiMo‑V2.5 Pro tier with DeepSeek, the underlying technical optimizations that enable lower costs, and the broader market shift toward cost‑driven competition.

AI pricing · API competition · DeepSeek
0 likes · 7 min read
James' Growth Diary
May 25, 2026 · Artificial Intelligence

Practical Agent Performance Tuning: Slash Latency 75%, Cut Token Costs 71%, Boost Throughput 217%

The article walks through a systematic performance map of LangChain agents and demonstrates concrete latency, token‑usage, and concurrency optimizations—streaming responses, Redis caching, model routing, prompt trimming, context summarisation, dynamic tool selection, parallel graph nodes and batch processing—showing real‑world gains of up to 75% lower latency, 71% fewer tokens and a 217% throughput increase.

Agent Optimization · LangChain · LangGraph
0 likes · 30 min read
IT Services Circle
May 19, 2026 · Artificial Intelligence

Peter Steinberger’s $1.3 M Monthly Token Bill: OpenAI’s Subsidy Powers a 100‑Agent OpenClaw

Peter Steinberger revealed that his OpenAI API usage cost $1.3 million in the past 30 days, consuming 6 030 billion tokens across 7.6 million requests, most of which power a cloud‑run fleet of about 100 Codex agents that automate OpenClaw development, prompting a debate on AI‑driven software costs.

AI Engineering · Codex · OpenAI
0 likes · 7 min read
Linyb Geek Road
May 9, 2026 · Artificial Intelligence

Why Overly Long Context Files Reduce AI Agent Success by 3% and Raise Token Cost 20%

The article shows that adding redundant context to AI agents like Claude harms efficiency: each extra 50 lines dilutes attention, lowers task success by about 3%, and inflates token usage by roughly 20%. Because the model’s instruction budget is capped at 150‑200 tokens, context files must be concise and focused on non‑derivable information.

AGENTS.md · AI agents · CLAUDE.md
0 likes · 17 min read
Old Meng AI Explorer
May 4, 2026 · Artificial Intelligence

8 Essential Claude Code Slash Commands to Double Your Productivity

The article reveals eight high‑frequency slash commands for Claude Code—/init, /clear, /compact, /model, /permissions, /review, /cost, and /memory—explaining when and how to use each, providing concrete code examples, and showing how they can dramatically improve development efficiency and reduce token costs.

AI coding assistant · Claude Code · Slash Commands
0 likes · 11 min read
AI Engineering
Apr 13, 2026 · Artificial Intelligence

Why Your Tokens Burn Money Fast and How a Four‑Tier Model Stack Can Cut Costs

The article examines the rapid token consumption problem caused by popular LLM agents, proposes a four‑tier model hierarchy and concrete routing rules, and offers short‑term, long‑term, and budget‑friendly deployment recommendations to reduce expenses while maintaining performance.

LLM · Multi‑model deployment · model tiering
0 likes · 7 min read
AndroidPub
Apr 13, 2026 · Artificial Intelligence

Why AI Agents Are Abandoning Model Context Protocol for CLI‑First Toolchains

In early 2026 the AI community witnessed a sharp shift away from Model Context Protocol (MCP) toward CLI‑first approaches, driven by token‑cost inflation, fragmented authentication, and loss of composability, with developers favoring the lightweight, text‑based nature of command‑line tools for building robust agent pipelines.

CLI · Model Context Protocol · authentication
0 likes · 15 min read
Machine Heart
Mar 30, 2026 · Artificial Intelligence

Is OpenClaw the Early Linux of AI Agents? A Deep Dive into Its Real Challenges

The article analyzes OpenClaw’s rapid rise, arguing that its impact stems from engineering integration that lowers the usability threshold for AI agents, while highlighting core bottlenecks such as reliability, long‑task execution, token cost, memory architecture, and the need for device‑cloud collaboration.

AI agents · OpenClaw · agent operating system
0 likes · 24 min read
Baidu Geek Talk
Mar 25, 2026 · Artificial Intelligence

Master OpenClaw: Install, Configure, and Scale Multi‑Agent AI Automation

An in‑depth guide walks you through installing OpenClaw, understanding its gateway, channel, agent, tool, and skill architecture, managing token costs, creating multi‑agent workflows, securing API keys, and troubleshooting common issues, empowering developers to build scalable AI‑driven automation.

Automation · Installation · OpenClaw
0 likes · 26 min read
AI Step-by-Step
Mar 14, 2026 · Cloud Computing

Choosing Between OpenClaw and Platform‑Built xxclaw for Cloud Deployment

The article compares deploying the OpenClaw open‑source core versus using a platform’s built‑in xxclaw, evaluating server ownership, token costs, capability boundaries, and long‑term controllability to help readers decide which cloud route best fits their needs.

OpenClaw · Server Management · Tencent Cloud
0 likes · 11 min read
Ximalaya Technology Team
Aug 22, 2023 · Artificial Intelligence

Guidelines and Best Practices for Prompt Engineering with Large Language Models

The guide outlines prompt‑engineering best practices for large language models, distinguishing base and instruction‑tuned LLMs, emphasizing clear, structured, step‑by‑step prompts, handling hallucinations, iterating through idea‑code‑data cycles, applying techniques to summarization, reasoning and expansion, managing token costs, and providing concrete OpenAI API examples.

AI · API usage · LLM
0 likes · 14 min read