How Lalamove Built a Multi‑Agent AI Framework to Cut Translation Costs by 90%
Lalamove tackled the massive multilingual translation workload of its global app and website by designing a three‑layer, multi‑agent AI framework that combines specialized translation, quality scoring, and compliance agents, achieving rapid, native‑like output while slashing costs and turnaround time.
Background
More than 7,000 languages are spoken worldwide, yet most people master only two or three. Traditional neural machine translation tools such as DeepL handle about 30 languages, mainstream large language models (LLMs) cover 50 or more, and specialized models such as Meta's MMS claim support for over 1,000 languages.
Lalamove Translation Challenge
Lalamove, a cross‑border logistics platform, must translate tens of thousands of UI strings, marketing copy, and documentation for each new market. The volume, mixed content types, and need for linguistic consistency make manual translation slow, costly, and error‑prone.
Translation Goals
Accuracy, Fluency, Elegance ("Xin Da Ya") – ensure translations are correct, idiomatic, and aesthetically appropriate.
Objective Quality Evaluation – develop a standardized, automated way to assess translation quality.
Safety and Compliance – prevent hallucinations and filter politically or culturally sensitive content.
Why a Multi‑Agent Framework?
Simple prompt engineering cannot meet all three goals simultaneously. Lalamove therefore designed a coordinated set of agents that treat translation as an end‑to‑end workflow rather than isolated text processing.
Three‑Layer Architecture
The system consists of:
Application Layer – interfaces with business users and quality‑control reviewers.
Core Layer – runs on Lalamove’s internal "Wukong" LLM platform, providing the core translation, scoring, and compliance functions.
Data Layer – stores terminology, reference translations, and evaluation metrics for reuse.
Core Agents
Translation Agent (Senior Translator) – generates precise language conversions.
Translation Quality Scoring Agent (Professional Proofreader) – assigns multi‑dimensional scores to filter low‑quality outputs.
Sensitive‑Info Detection Agent (Compliance Reviewer) – removes political, religious, or culturally taboo material.
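As a rough illustration of how the three agents could be chained, the sketch below routes each string to automatic release, human post-editing, or local compliance review. The function names, the callable signatures, the 0.8 cutoff, and the route labels are all assumptions made for this example and are not taken from Lalamove's implementation.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical agent interfaces; a real agent would wrap an LLM call.
Translate = Callable[[str, str], str]      # (source, target_lang) -> translation
Score = Callable[[str, str], float]        # (source, translation) -> quality in [0, 1]
IsSensitive = Callable[[str, str], bool]   # (translation, target_lang) -> flagged?


@dataclass
class PipelineResult:
    translation: str
    quality: float
    sensitive: bool
    route: str  # "auto_release", "human_post_edit", or "local_review"


def run_pipeline(source: str, target_lang: str, translate: Translate,
                 score: Score, is_sensitive: IsSensitive,
                 min_quality: float = 0.8) -> PipelineResult:
    """Run one string through the three agents and decide who sees it next."""
    translation = translate(source, target_lang)
    quality = score(source, translation)
    sensitive = is_sensitive(translation, target_lang)

    if sensitive:
        route = "local_review"       # compliance flags take priority
    elif quality < min_quality:
        route = "human_post_edit"    # low score: send to a human editor
    else:
        route = "auto_release"
    return PipelineResult(translation, quality, sensitive, route)
```

Passing the agents in as plain callables keeps the routing logic testable with stubs, independent of whichever model sits behind each agent.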
Key Benefits
Speed – the LLM handles the bulk of the work, and human review is limited to low‑scoring samples, reducing translation cycles from months to days.
Native‑Like Expression – contextual prompts and domain‑specific knowledge bases produce idiomatic output.
Cost Reduction – AI performs roughly 90% of the work, cutting labor expenses dramatically.
Technical Deep Dive: Translation Agent
The agent combines three techniques:
Domain Terminology Knowledge Base – a lightweight repository of logistics‑specific jargon that forces the LLM to prioritize standardized terms.
Few‑Shot Reference Translations – existing high‑quality human translations of the same content in other languages are supplied as examples, helping the model lock onto the correct sense of ambiguous words (e.g., "order").
Context Injection – the agent feeds UI context (screen, button purpose, surrounding text) to the model, enabling situational translation rather than isolated sentence conversion.
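The three techniques above can be pictured as sections of a single prompt. The following is a minimal sketch under stated assumptions: the function name, the prompt wording, the glossary format (source term mapped to target term), and the sample data are all hypothetical, and the glossary lookup is a naive substring match rather than whatever retrieval the real knowledge base uses.

```python
from typing import Dict, List, Tuple


def build_translation_prompt(
    source_text: str,
    target_lang: str,
    glossary: Dict[str, str],
    references: List[Tuple[str, str]],
    ui_context: Dict[str, str],
) -> str:
    """Assemble a prompt from glossary terms, reference translations, and UI context."""
    lines = [f"Translate the UI string below into {target_lang}."]

    # 1. Terminology: include only glossary entries that occur in the source.
    matched = {term: target for term, target in glossary.items()
               if term.lower() in source_text.lower()}
    if matched:
        lines.append("Use these standardized terms:")
        lines.extend(f"- {term} -> {target}" for term, target in matched.items())

    # 2. Few-shot references: human translations already done in other languages.
    if references:
        lines.append("Existing human translations of the same string:")
        lines.extend(f"- [{lang}] {text}" for lang, text in references)

    # 3. Context injection: where the string appears and what it does.
    if ui_context:
        lines.append("UI context:")
        lines.extend(f"- {key}: {value}" for key, value in ui_context.items())

    lines.append(f"Source: {source_text}")
    return "\n".join(lines)


# Example: "order" is pinned to its purchase sense by glossary and context.
prompt = build_translation_prompt(
    "Cancel order",
    "Spanish",
    glossary={"order": "pedido", "driver": "conductor"},
    references=[("zh-CN", "取消订单")],
    ui_context={"screen": "Order details", "element": "button"},
)
```

Only the glossary entries that occur in the source string are injected, which keeps the prompt short when the terminology base grows.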
Translation Quality Scoring Agent
Quality assessment blends automatic metrics with human oversight:
Semantic similarity scores (COMET, BERTScore) and lexical overlap metrics (BLEU) are computed for each output.
A threshold‑based filter flags low‑scoring samples for human post‑editing, ensuring that human effort focuses on the hardest cases.
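The threshold filter might look something like the sketch below. It assumes the metric scores have already been computed elsewhere and scaled to the range 0 to 1. The threshold values, function names, and sample IDs are illustrative assumptions, not figures from Lalamove.

```python
from typing import Dict, List, Tuple

# Illustrative minimums; assumes every metric has been scaled to [0, 1].
THRESHOLDS: Dict[str, float] = {"comet": 0.80, "bertscore": 0.85, "bleu": 0.30}


def failing_metrics(scores: Dict[str, float],
                    thresholds: Dict[str, float] = THRESHOLDS) -> List[str]:
    """Return names of metrics below their minimum (missing metrics count as failing)."""
    return sorted(name for name, minimum in thresholds.items()
                  if scores.get(name, 0.0) < minimum)


def split_for_review(samples: Dict[str, Dict[str, float]]
                     ) -> Tuple[List[str], Dict[str, List[str]]]:
    """Split sample IDs into auto-accepted ones and ones needing human post-editing."""
    accepted: List[str] = []
    flagged: Dict[str, List[str]] = {}
    for sample_id, scores in samples.items():
        failed = failing_metrics(scores)
        if failed:
            flagged[sample_id] = failed   # record why it was flagged
        else:
            accepted.append(sample_id)
    return accepted, flagged


# Hypothetical scores for two strings.
accepted, flagged = split_for_review({
    "btn_cancel": {"comet": 0.91, "bertscore": 0.93, "bleu": 0.55},
    "faq_12": {"comet": 0.62, "bertscore": 0.88, "bleu": 0.21},
})
```

Flagging a sample when any single metric falls short is a conservative choice. A weighted average would send fewer samples to reviewers, at the risk of letting a weak score on one dimension slip through.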
Sensitive‑Info Detection Agent
Two‑stage compliance checks are applied:
General Safety Scan – automatically blocks content containing violence, pornography, or hate speech.
Region‑Specific Compliance – filters political, religious, or ethnic sensitivities tailored to each target market.
Any sample flagged by this agent must be reviewed by local business owners before release.
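A simplified picture of the two-stage check is sketched below. Keyword lists stand in for the actual detection, which in the real agent is presumably LLM-based. The function names, the data shapes, the placeholder terms, and the market codes are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List

TermLists = Dict[str, List[str]]  # category -> lowercase terms to look for


@dataclass
class ComplianceResult:
    reasons: List[str] = field(default_factory=list)

    @property
    def needs_local_review(self) -> bool:
        # Any hit from either stage sends the sample to local reviewers.
        return bool(self.reasons)


def _hits(text: str, term_lists: TermLists, prefix: str) -> List[str]:
    """Return 'prefix:category' for every category with a term found in the text."""
    lowered = text.lower()
    return [f"{prefix}:{category}" for category, terms in term_lists.items()
            if any(term in lowered for term in terms)]


def check_compliance(text: str, market: str, general: TermLists,
                     regional: Dict[str, TermLists]) -> ComplianceResult:
    # Stage 1: general safety scan, identical for every market.
    reasons = _hits(text, general, "general")
    # Stage 2: rules specific to the target market (none if the market is unknown).
    reasons += _hits(text, regional.get(market, {}), market)
    return ComplianceResult(reasons)
```

Both stages run even when the first one already matches, so that the reviewer sees every reason a sample was flagged rather than only the first.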
Conclusion
By integrating translation, evaluation, and compliance agents into a unified framework, Lalamove achieved a 90% cost cut and reduced translation turnaround from months to days while maintaining high linguistic quality and regulatory safety. The approach provides a reusable vertical‑AI template for other business domains.
