
How a Terminal AI Agent Achieves a 99.82% Cache Hit Rate with DeepSeek API

DeepSeek-Reasonix, a terminal‑based AI coding agent tightly integrated with the DeepSeek API, delivers a 99.82% prefix‑cache hit rate that cuts daily token costs from $61 to $1.38, while offering file editing, command execution, memory, hooks, MCP support, and a preview Tauri desktop client.

Java Companion

May 26, 2026

I recently used Claude Code for a Java audit‑skills project and found the token cost prohibitive; a single day consumed 435 M tokens costing $61. Switching to DeepSeek‑Reasonix dramatically reduced the expense.

A user's single‑day bill: 435 M input tokens, a 99.82% cache hit rate, and an actual cost of $1.38. Without caching, the same workload would cost $61.
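The bill above implies an effective price per token. The following is a back‑of‑envelope sketch only: it assumes the whole bill is input tokens, and the per‑token prices it prints are derived from the article's own figures rather than taken from DeepSeek's price list.

```python
# Back-of-envelope check of the bill quoted above.
# Assumption: the whole bill is input tokens; output tokens are ignored.

INPUT_TOKENS_M = 435.0      # million input tokens (from the bill)
CACHE_HIT_RATE = 0.9982     # fraction of input tokens served from cache
COST_UNCACHED = 61.00       # USD, same workload with no caching
COST_ACTUAL = 1.38          # USD, what was actually billed

# Price per million tokens implied by the uncached figure.
miss_price = COST_UNCACHED / INPUT_TOKENS_M

# Split the input into cache hits and misses.
hit_tokens_m = INPUT_TOKENS_M * CACHE_HIT_RATE
miss_tokens_m = INPUT_TOKENS_M - hit_tokens_m

# Solve actual = hits * hit_price + misses * miss_price for hit_price.
hit_price = (COST_ACTUAL - miss_tokens_m * miss_price) / hit_tokens_m

savings = 1 - COST_ACTUAL / COST_UNCACHED

print(f"implied miss price: ${miss_price:.4f} per M tokens")
print(f"implied hit price:  ${hit_price:.4f} per M tokens")
print(f"savings vs. no cache: {savings:.1%}")
```

Under these assumptions the saving is about 97.7%, and almost all of the remaining cost comes from tokens billed at the cache‑hit rate.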

What DeepSeek‑Reasonix Can Do

Reasonix is a terminal‑run AI coding agent whose backend is locked to DeepSeek, a design choice that enables a 99.82% prefix‑cache hit rate. Its goal is a coding agent whose running cost is low enough to be left unattended.

Code Mode

reasonix code my-project

This launches a full‑featured agent mode with file‑system and shell tools: the model can read and write files and execute commands. Edits are proposed as SEARCH/REPLACE blocks and are written to disk only after the user confirms them with /apply; until then the files on disk are unchanged.
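The propose‑then‑apply flow can be sketched as follows. This is a minimal illustration, not Reasonix's implementation: the function names are hypothetical, and the rule that a search string must match exactly once is an assumption made here so that ambiguous edits are rejected.

```python
def apply_search_replace(text: str, search: str, replace: str) -> str:
    """Return text with the single occurrence of `search` replaced.

    Raises ValueError if `search` is missing or ambiguous, so a stale or
    imprecise edit is rejected instead of being applied in the wrong place.
    """
    count = text.count(search)
    if count != 1:
        raise ValueError(f"expected exactly 1 match, found {count}")
    return text.replace(search, replace, 1)


def review_edit(text: str, search: str, replace: str, approved: bool) -> str:
    """Compute the edit in memory; hand back the new text only if approved."""
    proposed = apply_search_replace(text, search, replace)
    return proposed if approved else text


original = "def greet():\n    print('hi')\n"
pending = review_edit(original, "print('hi')", "print('hello')", approved=False)
applied = review_edit(original, "print('hi')", "print('hello')", approved=True)
```

Here `pending` is still the original text, while `applied` contains the edit. A real agent would write `applied` to disk only at that point.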

Three trust levels control how much the model can act autonomously:

review: all edits and shell commands require manual confirmation (default).

auto: edits are applied automatically; shell commands still need confirmation.

yolo: no confirmations; intended for sandbox use only.

Switch between review and auto with Shift+Tab without re‑typing commands.
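The three trust levels amount to a small decision table. A sketch of that table, with hypothetical enum and function names that are not part of Reasonix:

```python
from enum import Enum


class Trust(Enum):
    REVIEW = "review"
    AUTO = "auto"
    YOLO = "yolo"


class Action(Enum):
    EDIT = "edit"
    SHELL = "shell"


def needs_confirmation(trust: Trust, action: Action) -> bool:
    """Mirror the three trust levels described above."""
    if trust is Trust.YOLO:
        return False                    # nothing is confirmed
    if trust is Trust.AUTO:
        return action is Action.SHELL   # edits auto-apply, shell still asks
    return True                         # REVIEW: confirm everything
```

Keeping the check in one function means every tool call goes through the same gate, whichever mode is active.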

Plan Mode

Enabling plan mode (/plan on) puts the model into read‑only mode: it scans the project, builds a concrete execution plan, and presents it for user approval before any changes are made. After each phase you can snapshot the project state with /checkpoint and roll back with /restore <name>.

Memory System

Reasonix provides two levels of user memory: a global one stored in ~/.reasonix/memory/global/ and a project‑level one. Memory is injected at the front of each request, so it becomes part of the cached prefix, and repeated requests are billed for it at the cache‑hit rate rather than the full input price.
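Why the ordering matters can be shown with a toy prompt builder. This is a sketch under stated assumptions: the function names are hypothetical, and characters stand in for tokens, whereas a real prefix cache works on tokens.

```python
def build_prompt(system: str, memory: list[str], history: list[str]) -> str:
    """Stable content first (system, memory), growing content last (history)."""
    return "\n".join([system, *memory, *history])


def shared_prefix_len(a: str, b: str) -> int:
    """Length of the common leading run of characters in a and b."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


system = "You are a coding agent."
memory = ["Prefer Vitest over Jest for unit tests."]
turn1 = build_prompt(system, memory, ["user: add a test"])
turn2 = build_prompt(
    system, memory, ["user: add a test", "assistant: done", "user: run it"]
)

# The whole first request is a prefix of the second, so it is cacheable.
reused = shared_prefix_len(turn1, turn2)
print(f"{reused}/{len(turn2)} characters of turn 2 repeat turn 1")
```

If memory were appended after the history instead, each new turn would change the text before it and the shared prefix would end at the first difference.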

Memory entries are created by speaking naturally, e.g.:

Remember that I use Vitest instead of Jest for unit tests

The model calls the scaffold_memory tool and generates a memory entry that is applied with /apply. Four memory types are supported: user, feedback, project, and reference.

Skills

A skill is a Markdown file with front‑matter describing its purpose and execution command. Example:

/skill new security-audit   # generate template in .reasonix/skills/
/skill security-audit       # invoke directly

Skills run either inline (in the current context) or as a subagent (in an isolated sub‑agent, useful for repetitive checks).

Reasonix natively supports Claude Code‑style skill files placed in .claude/skills/ without format conversion.

Hooks

Hooks are lifecycle callbacks that can run custom shell commands at four points:

PreToolUse: runs before a tool; returning exit code 2 aborts the call.

PostToolUse: runs after a tool; useful for linting or logging.

UserPromptSubmit: runs after user input but before model processing; can also abort with exit code 2.

Stop: runs when the session ends for cleanup.

Hooks are configured in settings.json, with project‑level settings overriding global ones. Example:

{
  "hooks": {
    "PostToolUse": [
      {"command": "npx prettier --write . 2>/dev/null"}
    ]
  }
}

This runs Prettier automatically after each model‑generated edit.
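The exit‑code convention for blocking hooks can be sketched with a small runner. The function names are hypothetical, and treating every exit code other than 2 as non‑blocking is an assumption: the description above only says that code 2 aborts.

```python
import subprocess

ABORT_EXIT_CODE = 2  # per the hook description above


def run_pre_hooks(commands: list[str]) -> bool:
    """Run each hook command in order; return False if any asks to abort."""
    for command in commands:
        result = subprocess.run(command, shell=True)
        if result.returncode == ABORT_EXIT_CODE:
            return False  # the hook vetoed the tool call
    return True  # assumption: other exit codes do not block


def guarded_call(tool, pre_hooks: list[str]):
    """Invoke tool() only when no PreToolUse hook returned exit code 2."""
    if not run_pre_hooks(pre_hooks):
        return None
    return tool()
```

For example, `guarded_call(write_file, ["./check-policy.sh"])` would skip `write_file` whenever the (hypothetical) script exits with 2.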

MCP Support

Reasonix can connect to MCP servers via three transports: stdio, SSE, and Streamable HTTP (added to the MCP specification in March 2025). Configuration lives in ~/.reasonix/config.json:

{
  "mcpServers": {
    "github": {
      "command": "npx",
      "args": ["-y", "@modelcontextprotocol/server-github"],
      "env": {"GITHUB_TOKEN": "ghp_***"}
    }
  }
}

After setting this, the model can call the GitHub API directly. Similar entries can be added for PostgreSQL, Jira, local files, or custom APIs. The interactive hub is opened with /mcp, and services can be hot‑reconnected with /mcp reconnect <name>.
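For a stdio server, an entry like the one above boils down to a command line plus some environment variables. A minimal sketch of that mapping follows; the function name is hypothetical, and how Reasonix actually spawns servers is not described in the source.

```python
import json
import os


def resolve_stdio_server(
    config_text: str, name: str
) -> tuple[list[str], dict[str, str]]:
    """Turn one mcpServers entry into (argv, environment) for a child process."""
    entry = json.loads(config_text)["mcpServers"][name]
    argv = [entry["command"], *entry.get("args", [])]
    # Server-specific variables are layered over the parent's environment.
    env = {**os.environ, **entry.get("env", {})}
    return argv, env


config = """
{
  "mcpServers": {
    "github": {
      "command": "npx",
      "args": ["-y", "@modelcontextprotocol/server-github"],
      "env": {"GITHUB_TOKEN": "ghp_***"}
    }
  }
}
"""
argv, env = resolve_stdio_server(config, "github")
print(argv)
```

The resulting `argv` and `env` are what a client would pass to a process launcher before speaking MCP over the child's stdin and stdout.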

Built‑in Web Search

Enabling "search": true in config.json adds web_search and web_fetch tools. The default engine is Mojeek (no API key needed). Users can run a local SearXNG instance:

podman run -d --name searxng -p 8080:8080 docker.io/searxng/searxng
/search-engine searxng

Chinese users may use Metaso, which offers 100 free searches per day.

Semantic Local Index

The reasonix index command builds a semantic vector index of the project, allowing the model to search code semantically instead of recursively scanning directories.

Embedding engines can be local Ollama (nomic-embed-text) or any OpenAI‑compatible API. Example configuration:

{
  "semantic": {
    "provider": "ollama",
    "ollama": {
      "baseUrl": "http://localhost:11434",
      "model": "nomic-embed-text"
    }
  }
}

This is especially effective for large codebases, enabling the model to pinpoint exact locations.
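The lookup side of a semantic index can be sketched as a cosine‑similarity ranking. The file names and three‑dimensional vectors below are made up for illustration; a real index would store embeddings produced by the configured model, and Reasonix's own storage format is not described in the source.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity of two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm


def search(index: dict[str, list[float]], query: list[float], k: int = 2) -> list[str]:
    """Return the k chunk names whose vectors are most similar to query."""
    ranked = sorted(index, key=lambda name: cosine(index[name], query), reverse=True)
    return ranked[:k]


# Toy 3-dimensional vectors standing in for real embeddings.
index = {
    "auth/login.py": [0.9, 0.1, 0.0],
    "db/pool.py": [0.1, 0.9, 0.1],
    "ui/button.tsx": [0.0, 0.1, 0.9],
}
query = [0.8, 0.2, 0.0]  # pretend embedding of "where is the login check?"
top = search(index, query, k=1)
```

With these toy vectors the query ranks `auth/login.py` first, which is the behaviour that lets the model jump to a location instead of walking the directory tree.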

Session Persistence

Each workspace session is automatically saved. Subsequent runs can continue with --continue, preserving the prefix cache.

reasonix code --continue          # resume latest session
reasonix code --session my-task   # attach to a named session
reasonix code --budget 5.00       # set USD budget for the session

The --budget flag warns at 80% usage and aborts at 100%, providing cost control. Sessions can be listed with reasonix sessions and pruned with reasonix prune-sessions --days 7.
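The warn‑at‑80%, abort‑at‑100% behaviour can be sketched as a small tracker. The class and method names are hypothetical, and the choice to warn only once is an assumption made here.

```python
class BudgetExceeded(RuntimeError):
    """Raised when spend reaches the session limit."""


class Budget:
    """Track spend against a USD limit: warn at 80%, abort at 100%."""

    WARN_FRACTION = 0.8

    def __init__(self, limit_usd: float) -> None:
        self.limit = limit_usd
        self.spent = 0.0
        self.warned = False

    def charge(self, cost_usd: float) -> bool:
        """Record a cost; return True the first time the warning level is crossed."""
        self.spent += cost_usd
        if self.spent >= self.limit:
            raise BudgetExceeded(f"spent ${self.spent:.2f} of ${self.limit:.2f}")
        if not self.warned and self.spent >= self.WARN_FRACTION * self.limit:
            self.warned = True
            return True
        return False
```

A caller would invoke `charge()` after each API response and stop the session on `BudgetExceeded`.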

QQ Channel

Running /qq connect opens a persistent session that forwards messages between the terminal and a QQ channel, allowing the agent to be controlled remotely without being at the computer.

Tauri Desktop Client (Preview)

A native Tauri desktop client bundles a Node runtime, offering a multi‑tab UI, a side panel showing files read/modified in the current session, and live cost, cache‑hit, and token counters. The same DeepSeek API key and ~/.reasonix configuration are shared between CLI and desktop.

Installation notes: the preview build is unsigned; on macOS run xattr -dr com.apple.quarantine /Applications/Reasonix.app. Windows users dismiss the SmartScreen warning; Linux users can install the .deb or AppImage directly.

Installation

npm install -g reasonix
reasonix code my-project   # first run prompts for DeepSeek API key, then persists

For project‑local use, run cd my-project && npx reasonix code. The short alias dsnix works the same way after npm install -g dsnix. Node ≥ 22 is required; macOS, Linux, and Windows are supported.

Comparison

Claude Code uses Anthropic’s closed‑source backend with premium pricing; it excels at complex reasoning but is costly for routine coding.

Aider is open‑source and model‑agnostic via OpenRouter, but its design prevents deep binding to DeepSeek’s prefix cache, yielding only 30‑60% cache‑hit rates and higher costs.

In practice, DeepSeek‑Reasonix handled extensive Java code scanning, memory‑operation analysis, and PoC skeleton generation with stable cache hits after a few rounds, keeping token spend low.

Measured Conclusion

DeepSeek‑Reasonix substantially reduces cost, though it is not yet ready to replace mainstream coding assistants for all tasks; it shines on simpler, repetitive jobs where its cheap, high‑cache operation yields noticeable savings.

https://github.com/esengine/DeepSeek-Reasonix
