Tagged articles

token cost

27 articles · Page 1 of 1

Jul 1, 2026 · Artificial Intelligence

Which Domestic Multimodal LLM Is the Most Efficient for Production?

The article benchmarks three Chinese multimodal large models—Step 3.7 Flash, MiniMax M3, and Qwen 3.6‑flash—across two real‑world tasks, measuring output quality, API latency, and token cost, and concludes that Step 3.7 Flash consistently offers the best speed‑cost trade‑off for production use.

API latency · Benchmark · MiniMax M3

0 likes · 10 min read

DataFunSummit

Jul 1, 2026 · Artificial Intelligence

Deploying AI Agents: Protocols, Costs, and Evolution from Demo to Production

A 90‑minute live discussion with three industry experts dissects why AI agents often stall after a successful demo, examining protocol collaboration, self‑evolution capabilities, and token‑cost control, while offering concrete engineering, management, and business‑value insights for enterprise AI adoption.

AI agents · AI coding · Enterprise AI

0 likes · 18 min read

AI Engineering

Jul 1, 2026 · Artificial Intelligence

Claude Sonnet 5 Is Stronger Yet Costlier—Per‑Task Cost Beats Opus 4.8

Anthropic’s newly released Claude Sonnet 5 scores 53 on the Artificial Analysis intelligence index, surpassing Sonnet 4.6 and matching GPT‑5.5, but its per‑task cost rises to $2.29, 15% higher than Opus 4.8, because it emits roughly 40% more output tokens and runs more agentic interaction rounds.

AI model benchmark · Anthropic · Claude Sonnet 5

0 likes · 5 min read

DataFunSummit

Jun 27, 2026 · Artificial Intelligence

Uncovering the Realities of Enterprise Agent Deployment: Protocols, Costs, and Evolution

In a 90‑minute panel, three industry experts dissect the practical gaps of moving AI agents from demo to production, highlighting protocol coordination, hidden token costs, workflow redesign, and the evolving role of engineering and product teams in enterprise AI adoption.

AI agents · AI coding · Software Architecture

0 likes · 19 min read

AI Architecture Hub

Jun 25, 2026 · Artificial Intelligence

Loop Engineering: The Essential Skill Every AI Developer Needs by 2026

The article explains how AI developers must move from manually feeding prompts to building automated feedback loops—called loop engineering—detailing token cost challenges, loop architectures, open vs. closed designs, six core modules, and practical examples that illustrate this shift.

AI agents · Automation · Claude

0 likes · 14 min read

21CTO

Jun 17, 2026 · Artificial Intelligence

Why Claude Code’s Lead Abandoned Prompts for Loop Engineering

Loop engineering, an automated agent workflow that replaces manual prompting, is reshaping how developers use Claude Code and OpenAI Codex in 2026, introducing six core building blocks, token‑cost trade‑offs, and a new emphasis on validation and comprehension debt.

AI agents · Automation · Claude Code

0 likes · 8 min read

AI Architecture Hub

Jun 17, 2026 · Artificial Intelligence

Stop Misusing AI Agent Loops: Why Most Fail Early and How to Use Them Correctly

The article explains the two main AI Agent Loop patterns—human‑in‑the‑loop and fully autonomous agentic loops—highlights the hidden costs, product‑drift risks, and budget limits of the latter, and provides concrete, low‑risk scenarios and a step‑by‑step code‑review loop that keeps humans in control.

AI Agent Loop · AI productivity · Agentic Loop

0 likes · 9 min read

SuanNi

Jun 13, 2026 · Artificial Intelligence

Why You Should Stop Hand‑Writing Prompts: Loop Engineering Lets AI Run Itself

The article explains Loop Engineering—a three‑layered approach that moves AI from manual prompt writing to autonomous loops, detailing its core components, practical implementations in Codex and Claude Code, and the trade‑offs such as token cost, comprehension debt, and design complexity.

AI agents · Automation · Loop Engineering

0 likes · 12 min read

Fighter's World

Jun 6, 2026 · Industry Insights

Inference Foundry: Token Physical Cost and Exploding Demand Force Heterogeneous Division

The article analyzes how the immutable physical cost of each AI token and the exponential rise in inference demand outpace hardware improvements, driving a shift toward heterogeneous compute architectures, disaggregation, and ultimately an inference foundry model exemplified by NVIDIA's rapid acquisition of Groq.

AI inference · Groq · NVIDIA

0 likes · 26 min read

Java Tech Enthusiast

Jun 5, 2026 · Artificial Intelligence

Which AI Coding Agent Reigns Supreme in 2026? A Comparative Ranking of Cursor, Claude Code, and Codex

The article presents a detailed 2026 benchmark of major AI coding agents—Cursor CLI, Claude Code, OpenAI Codex and others—evaluating them across performance, token consumption, cost per task and execution time, and reveals that the top three differ by only one point, shifting the competition toward efficiency and latency.

AI coding agents · Claude Code · Cursor CLI

0 likes · 7 min read

IT Services Circle

Jun 1, 2026 · Artificial Intelligence

Why Developers Are Abandoning Markdown for HTML in the AI Era

In the era of AI agents like Claude Code, developers are shifting from Markdown to single‑file HTML because Markdown cannot efficiently convey complex architecture diagrams, high‑density information, or interactive UI elements, leading to slower workflows and higher token costs.

AI agents · Claude · HTML

0 likes · 10 min read

Geek Labs

May 29, 2026 · Artificial Intelligence

How Much Do AI Coding Tools Really Cost? Compare cc-statistics and AgentsView

This article introduces two open‑source projects—cc-statistics and AgentsView—that locally track token usage, costs, and session history across popular AI coding tools, compares their features in detail, provides quick‑start commands, and advises which tool fits different workflows.

AI coding tools · Open-source · Web UI

0 likes · 9 min read

SuanNi

May 28, 2026 · Industry Insights

Xiaomi Slashes Token Prices by Up to 99% to Match DeepSeek’s API Pricing

The article analyzes the recent AI API price war, detailing DeepSeek’s step‑by‑step token‑price reductions, Xiaomi’s 99% cut that aligns its MiMo‑V2.5 Pro tier with DeepSeek, the underlying technical optimizations that enable lower costs, and the broader market shift toward cost‑driven competition.

AI pricing · API competition · DeepSeek

0 likes · 7 min read

James' Growth Diary

May 25, 2026 · Artificial Intelligence

Practical Agent Performance Tuning: Slash Latency 75%, Cut Token Costs 71%, Boost Throughput 217%

The article walks through a systematic performance map of LangChain agents and demonstrates concrete latency, token‑usage, and concurrency optimizations—streaming responses, Redis caching, model routing, prompt trimming, context summarisation, dynamic tool selection, parallel graph nodes and batch processing—showing real‑world gains of up to 75% lower latency, 71% fewer tokens and a 217% throughput increase.

Agent Optimization · LangChain · LangGraph

0 likes · 30 min read

IT Services Circle

May 19, 2026 · Artificial Intelligence

Peter Steinberger’s $1.3 M Monthly Token Bill: OpenAI’s Subsidy Powers a 100‑Agent OpenClaw

Peter Steinberger revealed that his OpenAI API usage cost $1.3 million in the past 30 days, consuming 6,030 billion tokens across 7.6 million requests, most of which power a cloud‑run fleet of about 100 Codex agents that automate OpenClaw development, prompting a debate on AI‑driven software costs.

AI Engineering · Codex · OpenAI

0 likes · 7 min read

Frontend AI Walk

May 16, 2026 · Industry Insights

Do You Really Need All These AI Coding Frameworks? Tackling Tool‑Learning Anxiety

The article critically examines the rapid rise of AI coding frameworks such as OpenSpec, Superpowers, GStack, GSD, and Agent Skills, exposing their hidden cognitive and token costs, comparing their core philosophies, and offering a principled strategy for selecting the right tool based on task scale.

AI coding · Software engineering · TDD

0 likes · 15 min read

Linyb Geek Road

May 9, 2026 · Artificial Intelligence

Why Overly Long Context Files Reduce AI Agent Success by 3% and Raise Token Cost 20%

The article shows that adding redundant context to AI agents like Claude harms efficiency: each extra 50 lines dilutes attention, lowers task success by about 3%, and inflates token usage by roughly 20%, because the model’s instruction budget is capped at 150‑200 tokens. Context files should therefore be concise and focused on non‑derivable information.

AGENTS.md · AI agents · CLAUDE.md

0 likes · 17 min read

Old Meng AI Explorer

May 4, 2026 · Artificial Intelligence

8 Essential Claude Code Slash Commands to Double Your Productivity

The article reveals eight high‑frequency slash commands for Claude Code—/init, /clear, /compact, /model, /permissions, /review, /cost, and /memory—explaining when and how to use each, providing concrete code examples, and showing how they can dramatically improve development efficiency and reduce token costs.

AI coding assistant · Claude Code · Slash Commands

0 likes · 11 min read

Model Perspective

May 4, 2026 · Industry Insights

Why Doubao’s Price Hike Is Inevitable: Cost, Market Saturation, and Monetization Paths

The article analyzes Doubao’s new subscription tiers, reveals that its free service costs billions of yuan daily, explains the market‑saturation dynamics that force a shift to paid models, and compares two monetization routes while forecasting the broader impact on China’s AI industry.

AI industry · AI pricing · Doubao

0 likes · 8 min read

AI Engineering

Apr 13, 2026 · Artificial Intelligence

Why Your Tokens Burn Money Fast and How a Four‑Tier Model Stack Can Cut Costs

The article examines the rapid token consumption problem caused by popular LLM agents, proposes a four‑tier model hierarchy and concrete routing rules, and offers short‑term, long‑term, and budget‑friendly deployment recommendations to reduce expenses while maintaining performance.

LLM · Multi‑model deployment · model tiering

0 likes · 7 min read

AndroidPub

Apr 13, 2026 · Artificial Intelligence

Why AI Agents Are Abandoning Model Context Protocol for CLI‑First Toolchains

In early 2026 the AI community witnessed a sharp shift away from Model Context Protocol (MCP) toward CLI‑first approaches, driven by token‑cost inflation, fragmented authentication, and loss of composability, with developers favoring the lightweight, text‑based nature of command‑line tools for building robust agent pipelines.

CLI · Model Context Protocol · authentication

0 likes · 15 min read

Machine Heart

Apr 7, 2026 · Artificial Intelligence

Why Claude Code burns half its weekly quota in a day – 7 reverse‑engineered bugs

A developer reverse‑engineered Claude Code and identified seven inter‑related bugs, most notably an Extra Usage mode that silently reduces cache TTL to five minutes, which together make usage up to 1.8× more expensive, drain token quotas, and trigger a costly “death‑spiral” for CLI users.

AI · Bug Analysis · CLI

0 likes · 10 min read

Machine Heart

Mar 30, 2026 · Artificial Intelligence

Is OpenClaw the Early Linux of AI Agents? A Deep Dive into Its Real Challenges

The article analyzes OpenClaw’s rapid rise, arguing that its impact stems from engineering integration that lowers the usability threshold for AI agents, while highlighting core bottlenecks such as reliability, long‑task execution, token cost, memory architecture, and the need for end‑cloud collaboration.

AI agents · OpenClaw · agent operating system

0 likes · 24 min read

Baidu Geek Talk

Mar 25, 2026 · Artificial Intelligence

Master OpenClaw: Install, Configure, and Scale Multi‑Agent AI Automation

An in‑depth guide walks you through installing OpenClaw, understanding its gateway, channel, agent, tool, and skill architecture, managing token costs, creating multi‑agent workflows, securing API keys, and troubleshooting common issues, empowering developers to build scalable AI‑driven automation.

Automation · Installation · OpenClaw

0 likes · 26 min read

AI Step-by-Step

Mar 14, 2026 · Cloud Computing

Choosing Between OpenClaw and Platform‑Built xxclaw for Cloud Deployment

The article compares deploying the OpenClaw open‑source core versus using a platform’s built‑in xxclaw, evaluating server ownership, token costs, capability boundaries, and long‑term controllability to help readers decide which cloud route best fits their needs.

OpenClaw · Server Management · Tencent Cloud

0 likes · 11 min read

PMTalk Product Manager Community

Jan 31, 2026 · Industry Insights

Why Token Costs Matter: A Product Manager’s Guide to AI Scaling and Efficiency

The article analyzes how scaling laws still drive AI progress while product focus shifts toward low‑cost inference, explains how reasoning abilities create a positive feedback loop, and shows why token and power consumption have become the decisive factors for competitive AI services.

AI scaling · Industry insight · Power Consumption

0 likes · 9 min read

Ximalaya Technology Team

Aug 22, 2023 · Artificial Intelligence

Guidelines and Best Practices for Prompt Engineering with Large Language Models

The guide outlines prompt‑engineering best practices for large language models, distinguishing base and instruction‑tuned LLMs, emphasizing clear, structured, step‑by‑step prompts, handling hallucinations, iterating through idea‑code‑data cycles, applying techniques to summarization, reasoning and expansion, managing token costs, and providing concrete OpenAI API examples.

AI · API usage · LLM

0 likes · 14 min read
