How Alibaba’s Logistics AI Overcame B2B Large Model Challenges
Alibaba’s logistics AI team shares their year‑long journey building a vertical‑domain large language model for logistics, detailing model alignment, Text2API, RAG, SFT techniques, challenges like accuracy and knowledge‑base maintenance, and showcasing real‑world applications such as chatbots, DingTalk assistants, and custom AI assistants.
Background
Alibaba’s logistics technology team built a vertical‑domain large language model called "Logistics AI" to handle consumer logistics queries, commercial Q&A, and internal ticket support. They also created a Logistics AI platform that lets users define a scenario and launch a custom logistics assistant in 1–2 minutes.
Characteristics of Vertical‑Domain Large Models
Vertical‑domain LLMs use a general base model fine‑tuned with domain‑specific knowledge, offering higher expertise, better output quality, and superior task performance. B2B scenarios introduce challenges such as strict accuracy requirements, frequent knowledge‑base updates, and limited applicability across unrelated domains.
Domain expertise: better understanding of industry terminology and context.
High‑quality output: optimized for the specific field.
Task‑specific performance: superior results on domain‑focused tasks.
1. Alignment Enhancement (BPO)
The team applied Black‑Box Prompt Optimization (BPO) to improve question understanding and answer accuracy. The process involves:
Give a base model an initial instruction and have it generate a good answer (A1) and a bad answer (A1').
Use GPT‑4 to compare the good and bad answers against the question and produce a tuned instruction.
Train a seq2seq model that maps a question to the tuned instruction.
Deploy the seq2seq model so every user query is prefixed with the generated instruction before being fed to the LLM.
This alignment step raised answer accuracy by 1.8%.
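The deployment step above can be sketched in a few lines of Python. This is a minimal illustration, not the team's implementation: `toy_rewriter` is a hypothetical stand-in for the trained seq2seq model, and `llm` is any callable that maps a prompt string to an answer.

```python
from typing import Callable


def build_aligned_prompt(question: str, rewriter: Callable[[str], str]) -> str:
    """Prefix the user question with the instruction produced by the rewriter."""
    instruction = rewriter(question).strip()
    if not instruction:
        # Fall back to the raw question if the rewriter returns nothing usable.
        return question
    return f"{instruction}\n\n{question}"


def toy_rewriter(question: str) -> str:
    """Hypothetical stand-in for the trained seq2seq model (assumption)."""
    if "refund" in question.lower():
        return "Answer using the refund policy and state the next step explicitly."
    return "Answer concisely and name the logistics term involved."


def answer(question: str, rewriter: Callable[[str], str], llm: Callable[[str], str]) -> str:
    """Run the full path: rewrite, prefix, then call the LLM."""
    return llm(build_aligned_prompt(question, rewriter))
```

Because the rewriter only changes the prompt, the underlying LLM is treated as a black box and needs no retraining.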
2. Text2API
Logistics AI acts as an agent that calls over 1,000 internal APIs. Initial attempts with LangChain’s ReAct framework suffered from hallucinated parameters and long call chains. Switching to the Reflexion framework, which adds self‑reflection and memory modules, improved API selection accuracy by 4%.
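The Reflexion idea (act, evaluate, reflect on failure, store the lesson, retry) can be sketched as a small loop. Everything below is illustrative: the API names, the keyword-based evaluator, and the toy actor are assumptions, whereas in a real system each of the three callables would be backed by an LLM or a verifier.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional

# Hypothetical API names, for illustration only.
APIS = ["get_shipping_status", "detect_abnormal_logistics"]


@dataclass
class ReflexionAgent:
    act: Callable[[str, List[str]], str]   # picks an API given the query and past reflections
    evaluate: Callable[[str, str], bool]   # judges whether the chosen API fits the query
    reflect: Callable[[str, str], str]     # turns a failed attempt into a textual lesson
    memory: List[str] = field(default_factory=list)

    def run(self, query: str, max_trials: int = 3) -> Optional[str]:
        for _ in range(max_trials):
            choice = self.act(query, self.memory)
            if self.evaluate(query, choice):
                return choice
            # Store the lesson so the next attempt can avoid the same mistake.
            self.memory.append(self.reflect(query, choice))
        return None


def toy_act(query: str, reflections: List[str]) -> str:
    """Pick the first API that no stored reflection has ruled out."""
    for api in APIS:
        if not any(api in note for note in reflections):
            return api
    return APIS[-1]


def toy_evaluate(query: str, choice: str) -> bool:
    """Toy rule: questions about delays need the anomaly-detection API."""
    wanted = "detect_abnormal_logistics" if "delayed" in query else "get_shipping_status"
    return choice == wanted


def toy_reflect(query: str, choice: str) -> str:
    return f"{choice} did not fit the query: {query}"


agent = ReflexionAgent(toy_act, toy_evaluate, toy_reflect)
selected = agent.run("Why is my parcel delayed?")
```

In this toy run the first attempt picks the wrong API, the reflection is written to memory, and the second attempt selects the other one.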
3. Retrieval‑Augmented Generation (RAG)
RAG combines proprietary knowledge‑base retrieval with LLM generation. The team tackled diverse source materials (PDF contracts, flowcharts, screenshots) by:
Using ChatGPT to describe flowcharts, then manually reviewing ~1,000 results for SFT.
Employing both transformer‑based and rule‑based text splitting, carefully choosing chunk sizes.
Clustering chunks semantically and recursively summarizing them to form a hierarchical knowledge tree.
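To make the chunk-size trade-off concrete, here is a minimal sketch of the rule-based half of the splitting step, assuming sentence boundaries are marked by `.`, `!`, or `?` followed by whitespace. The function name and the `max_chars` parameter are illustrative; the transformer-based splitter, the clustering, and the recursive summarization are not shown.

```python
import re
from typing import List


def split_text(text: str, max_chars: int = 200) -> List[str]:
    """Pack whole sentences into chunks of at most max_chars characters.

    A single sentence longer than max_chars is kept as one oversized chunk
    rather than being cut mid-sentence.
    """
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    chunks: List[str] = []
    current = ""
    for sentence in sentences:
        candidate = f"{current} {sentence}".strip()
        if len(candidate) <= max_chars or not current:
            current = candidate
        else:
            # Adding this sentence would overflow the chunk, so start a new one.
            chunks.append(current)
            current = sentence
    if current:
        chunks.append(current)
    return chunks


chunks = split_text(
    "Parcel picked up. Parcel in transit. Parcel delivered.", max_chars=40
)
```

A smaller `max_chars` gives more precise retrieval hits but less context per chunk; a larger value does the opposite.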
4. Supervised Fine‑Tuning (SFT)
The team collected tens of thousands of labeled logistics scenarios and evaluated many open‑source base models and fine‑tuning methods using embedding similarity, human scoring, and GPT‑4 scoring. Public datasets (COIG‑CQIA, alpaca‑gpt4‑data‑cn) were added to mitigate the degradation caused by domain‑specific training. Incorporating ORPO (Odds Ratio Preference Optimization) into SFT yielded a 5.2% overall boost in answer quality.
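For readers unfamiliar with ORPO, the objective adds an odds-ratio penalty to the usual SFT loss, so a preferred answer is pushed above a rejected one in a single training stage. The sketch below computes that loss for one example from length-averaged log-probabilities. The weight `lam` and the input values are illustrative assumptions, not the team's settings.

```python
import math


def log_odds(avg_logp: float) -> float:
    """log(p / (1 - p)) where p = exp(avg_logp); requires avg_logp < 0."""
    return avg_logp - math.log1p(-math.exp(avg_logp))


def orpo_loss(chosen_avg_logp: float, rejected_avg_logp: float, lam: float = 0.1) -> float:
    """SFT negative log-likelihood on the chosen answer plus the odds-ratio term."""
    nll = -chosen_avg_logp
    log_odds_ratio = log_odds(chosen_avg_logp) - log_odds(rejected_avg_logp)
    # -log(sigmoid(x)) written as log(1 + exp(-x)).
    odds_ratio_term = math.log1p(math.exp(-log_odds_ratio))
    return nll + lam * odds_ratio_term


# The loss is lower when the model already prefers the chosen answer.
good = orpo_loss(chosen_avg_logp=-0.5, rejected_avg_logp=-2.0)
bad = orpo_loss(chosen_avg_logp=-2.0, rejected_avg_logp=-0.5)
```

In practice the log-probabilities come from the model being trained, and the loss is averaged over a batch of (prompt, chosen, rejected) triples.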
Deployed Projects and Results
Logistics XiaoMi: A multimodal assistant that processes images and text, achieving 1.7 s average latency, <1% failure rate, and high accuracy in attribute extraction and intent recognition.
DingTalk Logistics Robot: Provides real‑time answers to merchants in over 20 DingTalk groups, serving more than 10,000 merchants and reducing support costs.
QianNiu Logistics Backend (in development): Planned integration of the assistant into the QianNiu logistics tab for merchant queries.
Logistics AI Product
The product enables users to create a custom logistics assistant in three steps: upload a knowledge base (documents, PDFs, PPTs), bind the required APIs (e.g., abnormal logistics detection, shipping status), and deploy the assistant to DingTalk or expose it via API.
Typical use cases include viewing product logistics attributes and tracking order trajectories with anomaly alerts.
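{
'Problem': 'The core problem the user faces is that the handmade underwear has quality defects, such as uneven lengths and splitting after being worn, and the user has already washed it, so it cannot be returned',
'Request': 'The user wants to apply for a refund only, because the product has quality defects and cannot be returned after washing',
'Agreement reached': 'No'
}
Conclusion & Acknowledgements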
The logistics AI team reflects on a year of experimentation, noting breakthroughs in model alignment, Text2API, RAG, and SFT, while acknowledging remaining challenges and inviting collaboration across teams.
This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contact us and we will review it promptly.
Alibaba Cloud Developer
Alibaba's official tech channel, featuring all of its technology innovations.
