Cut Token Usage by Up to 80% in Claude Code, Codex, and OpenCode
This article explains how to dramatically reduce token consumption in Claude Code, Codex (GitHub Copilot), and the open-source OpenCode by tightly controlling input, trimming context, filtering files, leveraging tools, caching, and selecting models. It offers concrete commands, configuration files, and a ten-step checklist that together can cut usage by up to 80%.
Token consumption fundamentals
Billing formula: Total cost = Input Tokens × input price + Output Tokens × output price.
Input tokens (70-90%): commands, conversation history, project files, tool output, system prompts.
Output tokens (10-30%): code, explanations, and logs returned by the model.
The biggest black hole: automatic project-file reading can consume up to 80% of input tokens in a single interaction.
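The billing formula can be turned into a quick estimator for comparing a session before and after optimization. A minimal sketch; the per-million-token prices are placeholders, not actual vendor rates:

```python
def session_cost(input_tokens: int, output_tokens: int,
                 input_price_per_m: float, output_price_per_m: float) -> float:
    """Total cost = input tokens x input price + output tokens x output price.
    Prices are quoted per million tokens."""
    return (input_tokens / 1_000_000) * input_price_per_m \
         + (output_tokens / 1_000_000) * output_price_per_m

# Placeholder prices: $3 per 1M input tokens, $15 per 1M output tokens.
before = session_cost(150_000, 20_000, 3.0, 15.0)  # unfiltered project reads
after = session_cost(60_000, 20_000, 3.0, 15.0)    # after file filtering
print(f"before: ${before:.2f}, after: ${after:.2f}")
```

Because input dominates the token count, even this toy calculation shows why input-side optimizations pay off first.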
Claude Code token‑saving techniques
1. File filtering with .claudeignore
Create a .claudeignore file in the project root; syntax mirrors .gitignore. Example patterns:
# Dependencies & builds (largest black holes)
node_modules/
dist/
build/
.next/
__pycache__/
# Lock files / logs
*.lock
package-lock.json
*.log
# VCS / IDE
.git/
.idea/
.vscode/
# Assets / caches
*.png
*.jpg
*.svg
*.ico
.cache/
coverage/
Effect: a single interaction drops from 150k tokens to 60k (≈60% reduction).
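To sanity-check which paths a pattern list would exclude before committing it, the matching can be approximated locally. This sketch uses Python's fnmatch as a rough stand-in for .gitignore-style semantics (real ignore matching has extra rules for directory anchors and negation):

```python
import fnmatch

# Subset of the template patterns above, expressed as globs.
IGNORE_PATTERNS = ["node_modules/*", "dist/*", "*.lock", "*.log", "*.png", ".git/*"]

def is_ignored(path: str) -> bool:
    """Rough ignore-file check: match the path against each glob pattern."""
    return any(fnmatch.fnmatch(path, pat) for pat in IGNORE_PATTERNS)

files = ["src/app/page.tsx", "node_modules/react/index.js", "pnpm.lock", "logo.png"]
kept = [f for f in files if not is_ignored(f)]
print(kept)  # only the source file survives: ['src/app/page.tsx']
```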
2. Context compression with /compact
Manual: run /compact at logical checkpoints (e.g., after completing a feature).
Command‑level: /compact keeps code changes and file paths, discarding analysis.
Automatic: enable via /config Auto-compact enabled. Effect: 25 k → 3 k tokens (88% saved).
3. Documentation‑driven approach with CLAUDE.md
Place a CLAUDE.md at the project root describing overview, tech stack, directory layout, and development commands. Example:
# Project Overview
Next.js 14 + TypeScript + Prisma + PostgreSQL SaaS
# Directory Structure
src/app/ # App Router
src/components/ # Components
src/lib/ # Utilities
src/server/ # Server side
# Development Commands
pnpm dev
pnpm build
Effect: reduces exploratory cat/find/grep operations, saving >30% of unnecessary tokens.
4. Memory management with /memory
Store fixed information:
/memory The project uses Next.js 14 + TypeScript; API conventions are in docs/api.md
View stored items: /memory list
Delete a key: /memory delete [key]
Effect: avoids repeatedly pasting the same information, saving >40% of repeated input.
5. Plan Mode (Shift+Tab)
Ask the AI to produce an execution plan first, confirm it, then run.
Effect: reduces trial-and-error, saving >20% of wasted tokens.
6. Output trimming
Enable tool-output trimming via /config to strip ANSI color codes, progress bars, and empty lines.
Truncate long logs, keeping only error stacks and failing cases.
Effect: npm test output shrinks from 25 k to 2.5 k tokens (90% saved).
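The same trimming can be applied before pasting logs into a prompt yourself. A minimal sketch; the error-detection regex is a simplistic assumption, not a robust parser:

```python
import re

ANSI_RE = re.compile(r"\x1b\[[0-9;]*m")  # color/style escape sequences

def trim_log(raw: str, tail: int = 10) -> str:
    """Strip ANSI codes and blank lines; keep error-looking lines plus the last `tail` lines."""
    lines = [ANSI_RE.sub("", ln).rstrip() for ln in raw.splitlines()]
    lines = [ln for ln in lines if ln]  # drop empty lines
    errors = [ln for ln in lines if re.search(r"\b(error|fail|FAIL|Traceback)\b", ln)]
    seen, out = set(), []
    for ln in errors + lines[-tail:]:  # de-duplicate while preserving order
        if ln not in seen:
            seen.add(ln)
            out.append(ln)
    return "\n".join(out)

log = "\x1b[32mPASS\x1b[0m test_a\n\n\x1b[31mFAIL\x1b[0m test_b\nAssertionError: 1 != 2\n"
print(trim_log(log))
```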
7. Model switching with /model
Simple tasks (syntax, small functions): /model haiku (lowest price).
Complex tasks (architecture, multi‑file): /model sonnet.
Ultra‑complex: /model opus (use only when necessary).
Effect: task cost reduced 30%–80%.
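One way to make the tier choice systematic is to classify the task before switching models; the keyword heuristic below is purely illustrative, not a recommended classifier:

```python
def pick_model(task: str) -> str:
    """Map a task description to a Claude tier (illustrative heuristic only)."""
    t = task.lower()
    if any(k in t for k in ("architecture", "refactor", "multi-file", "design")):
        # Reserve the top tier for system-wide work, mid tier for the rest.
        return "opus" if "architecture" in t else "sonnet"
    return "haiku"  # default: syntax fixes, small functions, one-liners

print(pick_model("fix this typo"))                  # haiku
print(pick_model("refactor the auth module"))       # sonnet
print(pick_model("design the system architecture")) # opus
```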
Codex (GitHub Copilot) token‑saving techniques
1. IDE configuration: limit max file context
In VS Code settings set GitHub Copilot → Max File Context to 3–5 files.
Effect: Input reduced >50%.
2. Prompt shortening with comments
Bad prompt:
Help me write a backend login API with Node.js + Express, JWT, password hashing, error handling
Good prompt:
// Node.js Express login API JWT bcrypt
Effect: Input reduced >40%.
3. Disable unnecessary features
Turn off auto‑completion and real‑time suggestions, enable only when needed.
Disable multi‑file indexing except during refactoring.
Effect: reduces background scanning token consumption.
4. File‑by‑file development
Develop one file per function, avoid large cross‑file logic; manually copy snippets when needed.
Effect: context size reduced >60%.
OpenCode (self‑hosted) token‑saving techniques
1. Precise context limits via config.json
{
"model": {
"name": "deepseek-v3",
"input_limit": 128000,
"output_limit": 80000
}
}
Effect: fully utilizes the context window, avoiding automatic truncation and duplicate requests, saving >30%.
2. File filtering with .opencodeignore
Same pattern list as .claudeignore to exclude dependencies, build artifacts, logs, and resource files.
3. Manual history management
Periodically run /clear to reset context and prevent multi‑task buildup.
Start a new session for each distinct function to avoid mixing histories.
Effect: avoids history bloat, saving >50% of useless context.
4. Low‑cost model selection
Simple tasks: Qwen 7B, Llama 3 8B (local or cheap API).
Complex tasks: DeepSeek V3, Qwen Max (switch as needed).
Effect: unit price reduced 70%–95%.
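The claimed 70%-95% reduction is just a ratio of unit prices. A sketch with placeholder per-1M-token prices (substitute your provider's actual rates; the figures below are assumptions for illustration):

```python
# Placeholder per-1M-token prices; replace with your provider's real rates.
PRICES = {
    "qwen-7b-local": 0.0,    # local inference, no per-token charge
    "llama-3-8b": 0.05,
    "deepseek-v3": 0.27,
    "frontier-model": 5.0,   # hypothetical top-tier baseline
}

def saving_vs(baseline: str, candidate: str) -> float:
    """Percent unit-price reduction from routing a task to a cheaper model."""
    base, cand = PRICES[baseline], PRICES[candidate]
    return 100.0 * (base - cand) / base

print(f"{saving_vs('frontier-model', 'deepseek-v3'):.0f}% cheaper per token")
```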
Practical 10‑step token‑saving checklist
Create .claudeignore / .opencodeignore in the project root and copy the template patterns.
Add a CLAUDE.md file describing the tech stack, directory layout, and commands.
Enable automatic compression ( /config Auto-compact for Claude).
For long conversations, manually run /compact at logical checkpoints.
Store project configuration and conventions with /memory to avoid repeated input.
Use Plan Mode (Shift+Tab) for complex tasks, planning before execution.
Switch models per task: simple tasks use /model haiku, complex tasks use /model sonnet.
Disable unnecessary automatic features such as real‑time completion and full‑project scanning.
Separate development into distinct sessions or files to prevent history accumulation.
Regularly check token usage ( /usage) to locate black holes and adjust settings.
Key reminders
Input is the core: prioritize optimizing file reads, context size, and prompt length.
Prefer over-exclusion: excluding files is cheaper than scanning them.
Timely cleanup: compress or clear long conversations and multi-task histories to avoid context bloat.
Model matching: choose the appropriate model tier for each task instead of defaulting to the most powerful one.
This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contact us and we will review it promptly.
IoT Full-Stack Technology
Dedicated to sharing IoT cloud services, embedded systems, and mobile client technology, with no spam ads.
