How Rainman Translate Book Cuts Translation API Costs by 90% with 8‑Thread Parallelism

Rainman Translate Book targets four pain points of full-book translation: context loss, slow speed, inconsistent terminology, and costly API retries. It splits the text into chunks, runs eight isolated Claude sub-agents in parallel, enforces a global glossary, and uses SHA-256 hash checkpoints to enable incremental re-translation. It also produces multi-format output.

AI Architecture Path

Problem with single‑session book translation

Translating a whole technical book or novel in a single session of Claude's web UI leads to context drift, token overflow, and manual post-processing, and any quota already spent is wasted when the process crashes.

Comparison: single‑session vs Rainman parallel chunk translation

Context logic: a single session stacks all text, so later sections become incoherent; Rainman gives each ~6000-character chunk an independent context plus a 300-character read-only window from each adjacent chunk.

Speed: serial processing is very slow; Rainman runs eight Claude sub-agents in parallel for a 4-8× speedup.

Term consistency: a single session has no unified constraints; Rainman uses a global glossary.json that forces a single translation per term.

Interrupt tolerance: a crash forces a full restart in a single session; Rainman uses SHA-256 hash checkpointing to re-translate only changed or missing chunks, saving ~90% of API usage.

Integrity: a single session has no verification; Rainman verifies the 1:1 source-to-translation mapping with SHA-256 hashes.

Output formats: a single session yields plain text only; Rainman automatically generates HTML with a floating TOC, DOCX, EPUB, and PDF.

Modification cost: in a single session, changing a term requires full re-translation; Rainman re-translates only the affected chunks after glossary edits.

Core mechanisms

Independent sub‑agent parallelism : eight Claude sub‑agents each translate one ~6000‑character chunk. Each agent reads the last 300 characters of the previous chunk and the first 300 of the next chunk as read‑only context.
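The chunk-plus-window idea can be illustrated with a minimal sketch. It assumes chunks are split on blank-line paragraph boundaries, which the article does not confirm; the names `ChunkJob`, `split_into_chunks`, and `build_jobs` are hypothetical and not taken from the project.

```python
from dataclasses import dataclass

CHUNK_SIZE = 6000    # approximate chunk length in characters, per the article
CONTEXT_SIZE = 300   # read-only window borrowed from each neighbour


@dataclass
class ChunkJob:
    index: int
    text: str          # the text this agent must translate
    prev_context: str  # tail of the previous chunk (read-only)
    next_context: str  # head of the next chunk (read-only)


def split_into_chunks(text: str, size: int = CHUNK_SIZE) -> list[str]:
    """Group paragraphs into chunks of roughly `size` characters.

    Assumption: paragraphs are separated by blank lines. A single
    paragraph longer than `size` is kept whole rather than cut.
    """
    chunks, current = [], ""
    for para in text.split("\n\n"):
        candidate = f"{current}\n\n{para}" if current else para
        if current and len(candidate) > size:
            chunks.append(current)
            current = para
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks


def build_jobs(chunks: list[str], window: int = CONTEXT_SIZE) -> list[ChunkJob]:
    """Attach the neighbouring read-only windows to every chunk."""
    jobs = []
    for i, chunk in enumerate(chunks):
        prev_ctx, next_ctx = "", ""
        if i > 0:
            prev = chunks[i - 1]
            prev_ctx = prev[max(0, len(prev) - window):]
        if i + 1 < len(chunks):
            next_ctx = chunks[i + 1][:window]
        jobs.append(ChunkJob(i, chunk, prev_ctx, next_ctx))
    return jobs
```

Each `ChunkJob` carries everything one agent needs, so no agent has to see the whole book. Only `text` is meant to be translated, while the two context fields exist to resolve pronouns and names that cross a chunk boundary.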

Global glossary (v2) : five representative chunks are sampled, domain terms are extracted, and glossary.json records original term, aliases, preferred translation, category and confidence. The glossary is injected into every chunk; editing it triggers re‑translation only of impacted chunks.
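As a rough illustration of how a glossary edit could be narrowed down to the impacted chunks, the sketch below compares two glossary versions and finds the chunks that mention a changed term or one of its aliases. The entry keys (`source`, `aliases`, `target`, `category`, `confidence`) mirror the fields listed above, but the exact key names and the substring matching are assumptions, not the project's actual schema or logic.

```python
def index_by_source(entries: list[dict]) -> dict[str, dict]:
    """Map each entry's source term to the full entry."""
    return {entry["source"]: entry for entry in entries}


def changed_sources(old: list[dict], new: list[dict]) -> set[str]:
    """Source terms that were added, removed, or edited between versions."""
    old_ix, new_ix = index_by_source(old), index_by_source(new)
    return {
        term
        for term in old_ix.keys() | new_ix.keys()
        if old_ix.get(term) != new_ix.get(term)
    }


def impacted_chunks(chunks: list[str], old: list[dict], new: list[dict]) -> list[int]:
    """Indices of chunks mentioning any changed term or alias.

    Matching is a case-insensitive substring test, which is an
    assumption made to keep the sketch short.
    """
    old_ix, new_ix = index_by_source(old), index_by_source(new)
    needles = set()
    for term in changed_sources(old, new):
        for entry in (old_ix.get(term), new_ix.get(term)):
            if entry is not None:
                needles.add(entry["source"].lower())
                needles.update(alias.lower() for alias in entry.get("aliases", []))
    return [
        i for i, chunk in enumerate(chunks)
        if any(needle in chunk.lower() for needle in needles)
    ]
```

For example, changing only the preferred translation of one term marks just the chunks that contain that term, and every other chunk keeps its existing translation.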

Hash checkpointing : manifest.json stores SHA‑256 hashes for each source chunk; run_state.json records translation status. After a crash or term change the pipeline re‑hashes and processes only missing or outdated chunks.
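A minimal sketch of the resume logic, assuming the manifest maps a chunk ID to its SHA-256 hash and the run state maps a chunk ID to a record with `source_hash` and `status` fields. Those layouts and the function names are illustrative guesses, not the project's real file formats.

```python
import hashlib


def sha256_text(text: str) -> str:
    """Hex SHA-256 digest of a chunk's UTF-8 bytes."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def build_manifest(chunks: dict[str, str]) -> dict[str, str]:
    """Map chunk ID -> hash of the current source text."""
    return {chunk_id: sha256_text(text) for chunk_id, text in chunks.items()}


def chunks_to_translate(manifest: dict[str, str], run_state: dict[str, dict]) -> list[str]:
    """Chunk IDs that are missing, unfinished, or translated from stale source."""
    pending = []
    for chunk_id, source_hash in manifest.items():
        record = run_state.get(chunk_id)
        if (
            record is None
            or record.get("status") != "done"
            or record.get("source_hash") != source_hash
        ):
            pending.append(chunk_id)
    return pending
```

Under this scheme a crash costs only the chunks that had not reached a finished state. If, say, 10 of 100 chunks are pending after a restart, the other 90 are skipped, which is the kind of saving the ~90% figure describes.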

Execution pipeline

Input file (PDF/DOCX/EPUB)
↓
Calibre ebook-convert → HTMLZ → split into Markdown chunks (~6000 chars)
↓
manifest.json records SHA‑256 of each chunk
↓
Sample 5 chunks → generate global glossary (glossary.json v2)
↓
8 sub‑agents translate in parallel (auto rate‑limit)
↓
Inject glossary + surrounding 300‑char context into each chunk
↓
Produce output chunk + meta, update run_state.json
↓
Hash verification of source and translated chunks
↓
Pandoc merges all chunks → HTML with floating TOC
↓
Calibre exports DOCX / EPUB / PDF
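The conversion stages at both ends of the pipeline are plain command-line calls. The sketch below only builds the argument lists and does not run anything; the intermediate file name `input.htmlz` and the function name are assumptions. Note that pandoc's `--toc` flag produces an ordinary table of contents, so the floating TOC is presumably added by the project's own template or styling.

```python
from pathlib import Path


def build_commands(source: Path, chunk_files: list[Path], workdir: Path) -> list[list[str]]:
    """Return the external-tool commands for conversion, merge, and export."""
    html = workdir / "book.html"
    commands = [
        # Calibre: convert the input book to HTMLZ (hypothetical file name).
        ["ebook-convert", str(source), str(workdir / "input.htmlz")],
        # Pandoc: merge translated Markdown chunks into one standalone HTML file.
        ["pandoc", *map(str, chunk_files), "--standalone", "--toc", "-o", str(html)],
    ]
    # Calibre: export the merged HTML to the remaining formats.
    for ext in ("docx", "epub", "pdf"):
        commands.append(["ebook-convert", str(html), str(workdir / f"book.{ext}")])
    return commands
```

Each list can then be passed to `subprocess.run(cmd, check=True)`, so that a failing tool stops the pipeline instead of being silently ignored.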

Feature details

Parallel translation: default 8 agents, automatic throttling, one retry per failed chunk.

Glossary v2: supports aliases, categories, frequency stats; flags ambiguous terms for manual disambiguation.

Read‑only adjacent context prevents pronoun and character reference errors across chunks.

Hash‑based resume guarantees 1:1 source‑translation mapping and skips unchanged chunks.

Multi-format output: output.md, book.html, book.docx, book.epub, and book.pdf are placed in the {BookTitle}_temp directory.

Customizable CLI flags: cover image, temporary directory, export name, page‑number stripping, language selection, etc.
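The parallel-translation behaviour listed above (eight workers, one retry per failed chunk) can be approximated with a thread pool. This is only an analogy: the real tool dispatches Claude Code sub-agents, not threads, and `translate` here is a hypothetical callable standing in for one agent call. Rate limiting is left out of the sketch.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Optional

MAX_AGENTS = 8   # default degree of parallelism, per the article
MAX_RETRIES = 1  # one retry per failed chunk, per the article


def translate_with_retry(
    translate: Callable[[str], str], chunk: str, retries: int = MAX_RETRIES
) -> Optional[str]:
    """Call `translate`, retrying on any exception; None marks a failed chunk."""
    for attempt in range(retries + 1):
        try:
            return translate(chunk)
        except Exception:
            if attempt == retries:
                return None
    return None


def translate_all(
    chunks: list[str], translate: Callable[[str], str], workers: int = MAX_AGENTS
) -> list[Optional[str]]:
    """Translate chunks concurrently; results keep the input order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda c: translate_with_retry(translate, c), chunks))
```

Returning `None` for a chunk that fails twice, rather than raising, lets the remaining chunks finish. A checkpointing step can then pick up the failed ones on the next run.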

Installation

Required tools: Claude Code CLI, Calibre (provides ebook-convert), Pandoc, Python 3 with pypandoc and beautifulsoup4.

npx:

npx skills add deusyu/translate-book -a claude-code -g

ClawHub:

clawhub install translate-book

Git clone:

git clone https://github.com/deusyu/translate-book.git ~/.claude/skills/translate-book

Typical usage

Slash command example:

/translate-book translate /path/to/book.epub to Japanese

After editing glossary.json, re‑run the same command; only affected chunks are re‑translated.

Common error handling

“Calibre ebook-convert not found” – add Calibre to PATH.

“Manifest validation failed” – re‑run the conversion script to rebuild chunk hashes.

Missing source chunk – re‑execute the translation command to regenerate chunks.

Glossary duplicate source – disambiguate terms (e.g., Apple → Apple (tech company)) and reload.

PDF generation failure – ensure Calibre’s PDF export component is installed.
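The "not found" error above can be caught before any work starts with a small pre-flight check. The sketch uses the standard library's `shutil.which`; the function name and the tool list are illustrative, with the executable names taken from the required tools (Calibre's `ebook-convert` and `pandoc`).

```python
import shutil
from typing import Callable, Iterable, Optional

REQUIRED_TOOLS = ("ebook-convert", "pandoc")


def missing_tools(
    tools: Iterable[str] = REQUIRED_TOOLS,
    which: Callable[[str], Optional[str]] = shutil.which,
) -> list[str]:
    """Return the required executables that cannot be found on PATH."""
    return [tool for tool in tools if which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Missing from PATH:", ", ".join(missing))
```

The `which` parameter exists only so that the lookup can be replaced in tests.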

Design principles

Deterministic tasks (hash verification, state tracking, file I/O) are handled by Python; semantic translation is delegated to the LLM.

Single-writer state files (glossary.json, run_state.json) avoid the need for file locking.

Conservative merge rules: term conflicts are flagged rather than silently overwritten.
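A conservative merge, in the sense described above, could look like the following sketch. It assumes a simplified glossary that maps each source term to one preferred translation; the function name and the conflict tuple layout are hypothetical.

```python
def merge_glossaries(
    base: dict[str, str], incoming: dict[str, str]
) -> tuple[dict[str, str], list[tuple[str, str, str]]]:
    """Merge `incoming` into `base` without overwriting existing terms.

    Returns the merged glossary and a list of conflicts as
    (term, existing_translation, proposed_translation) tuples.
    """
    merged = dict(base)
    conflicts = []
    for term, translation in incoming.items():
        if term not in merged:
            merged[term] = translation  # new term: safe to add
        elif merged[term] != translation:
            # Disagreement: keep the existing entry and flag it for review.
            conflicts.append((term, merged[term], translation))
    return merged, conflicts
```

Keeping the existing translation and reporting the disagreement means a human decides which rendering wins, which is the point of flagging a conflict instead of overwriting it.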

Release phases

Glossary v2 & sub‑agent metadata.

Read‑only adjacent context.

Precise local re‑translation on glossary changes.

Cold‑start pre‑heat strategy based on real‑book data.

Project URL

https://github.com/deusyu/translate-book
Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contact admin@besthub.dev and we will review it promptly.
