Yunli XiaoZhi: An AI‑Powered Intelligent Assistant for Knowledge Q&A and Data Analysis in Logistics Operations
This article describes the design, implementation, and operational results of Yunli XiaoZhi, an AI-driven portable knowledge-base and data-analysis chatbot for logistics staff. It consolidates SOPs, manuals, and real-time information, and uses LangChain-based RAG, a vector database, and large-model prompting to speed up queries, push proactive alerts, and generate reports for multiple user groups.
Background & Problems
Operational staff, frontline workers, and managers face fragmented knowledge sources, time‑consuming document retrieval, repetitive Q&A, and lack of unified reporting tools, leading to low efficiency and poor user experience, especially on mobile devices.
Measures & Goals
Build "Yunli XiaoZhi", a portable AI‑enabled chatbot that integrates knowledge Q&A and data analysis, covering SOPs, system issues, real‑time queries (weather, safety), and report generation, aiming to reduce knowledge‑acquisition cost and improve incident management.
Key Functions
Intelligent Q&A: multi‑turn dialogue with retrieval‑augmented generation (RAG) to answer common operational questions and retrieve data.
Proactive Alerts: push notifications, voice messages, and scheduled reports to designated users or groups.
Implementation Details
1. Knowledge Q&A
The system uses the open‑source LangChain framework and internal large‑model APIs to build a RAG chatbot. Two knowledge bases are created: a QA‑pair store and a document store (PDF, DOCX, PPTX). Documents are parsed, tables are extracted, and all texts are vectorized and stored in JD’s Vearch vector database.
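To make the "parse, then vectorize" step concrete, here is a minimal sketch that turns parser output of the shape shown in this section into text chunks ready for embedding. It is not the authors' pipeline: the function names `table_to_text` and `build_chunks` are hypothetical, and reading each table cell's four numbers as `[row_start, col_start, row_end, col_end]` is an assumption, since the article does not document them. The embedding call and the write to Vearch are left out on purpose.

```python
def table_to_text(cells: list) -> str:
    """Render table cells as pipe-separated rows.

    Assumption: each cell is [[row_start, col_start, row_end, col_end], text].
    """
    rows = {}
    for span, text in cells:
        row_start, col_start = span[0], span[1]
        rows.setdefault(row_start, []).append((col_start, text))
    lines = []
    for row in sorted(rows):
        # Order the cells of one row by their starting column.
        ordered = [text for _, text in sorted(rows[row])]
        lines.append(" | ".join(ordered))
    return "\n".join(lines)


def build_chunks(parsed: dict) -> list:
    """Turn parser blocks into {"text", "metadata"} records for embedding."""
    chapters = parsed.get("chapters", {})
    chunks = []
    for block in parsed.get("context", []):
        if block["type"] == "table":
            body = table_to_text(block["text"])
        else:
            body = block["text"]
        title = chapters.get(block["cid"], "")
        # Prefix the chapter title so each chunk keeps its context.
        if title and title != "[CHAPTER_ROOT]":
            text = f"{title}\n{body}"
        else:
            text = body
        chunks.append({
            "text": text,
            "metadata": {"cid": block["cid"], "pid": block["pid"], "type": block["type"]},
        })
    return chunks


# Hypothetical sample data in the same shape as the parser output.
parsed_example = {
    "chapters": {"1": "[CHAPTER_ROOT]", "1.8": "Article 8"},
    "context": [
        {"text": "General rules", "type": "text", "pid": 1, "sid": 1, "cid": "1"},
        {"text": [[[0, 0, 1, 1], "Name"], [[0, 1, 1, 2], "Size"],
                  [[1, 0, 2, 1], "Box A"], [[1, 1, 2, 2], "30 cm"]],
         "type": "table", "pid": 89, "sid": 111, "cid": "1.8"},
    ],
}
chunks = build_chunks(parsed_example)
```

Carrying `cid` and `pid` in the metadata is one way to let an answer link back to the chapter and paragraph it came from, which matches the document-link behavior described later in the article.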
Example of a PDF parsing result:
{
  "metadata": {"footers": [], "headers": [], "catalogs": []},
  "chapters": {"1": "[CHAPTER_ROOT]", "1.1": "Article 1 xxx", "1.2": "Article 2 xxxx", "1.3": "Article 3 xxxx"},
  "context": [{"text": "JDLxxxx regulations", "type": "text", "pid": 1, "sid": 1, "metadata": {"section_range": []}, "cid": "1"}, ...]
}
Table block example:
{
  "text": [
    [[0,0,1,1], "Name"],
    [[0,1,1,2], "Size"],
    ...
  ],
  "type": "table",
  "pid": 89,
  "sid": 111,
  "metadata": {"section_range": []},
  "cid": "1.8"
}
Question condensation chain (code snippet):
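from langchain import PromptTemplate
from langchain.chains import LLMChain
from langchain.chat_models import ChatOpenAI

def get_condense_question_chain(self):
    """Build the question-condensing chain."""
    # Rewrites a follow-up question into a standalone one using the chat history.
    CONDENSE_QUESTION_PROMPT = PromptTemplate.from_template(
        """Given the chat history and a follow-up question, rewrite the follow-up question as a standalone question in its original language, making sure to avoid any unclear pronouns.\nChat history: {chat_history}\nFollow-up input: {question}\nStandalone question:"""
    )
    condense_question_chain = LLMChain(
        # Model name, temperature, API key, and base URL are left blank in the source;
        # temperature must be a number once filled in.
        llm=ChatOpenAI(model="", temperature="", openai_api_key="", openai_api_base=""),
        prompt=CONDENSE_QUESTION_PROMPT,
    )
    return condense_question_chain
2. Data Analysis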
A NoETL derived logical model asset is defined so that metric definitions can be reused across time grains and dimensions; it is stored as JSON metadata. A semantic knowledge graph is built from the model metadata, so that natural-language queries can be mapped to logical models and translated into SQL automatically.
Sample logical model metadata (the literal '全部' in the "where" clause means "all"):
{
  "uid": "742250d1dd9f457aa",
  "name": "offline_low_loading_line_ratio_daily_3",
  "nodes": [{"id": "98579cdb14b44423ace0", "data": {"viewUid": "e246257e141e4fe78", "viewSql": "SELECT dt, trans_type_new_name AS trans_type_name , ..."}, "type": "fact"}],
  "where": "trans_type_name <> '全部' AND ...",
  "measures": [{"id": 99, "names": ["low-loading line ratio"], "sql": "SUM(low_loading_plink_cnt)/SUM(plink_cnt)", "type": "float", "format": "percentage", "sort": 1}],
  "dimensions": [{"id": 1, "names": ["region"], "field": "transport_org_name", "type": "str", "format": "text", "description": "region"}, ...]
}
The AI model generates SQL from natural language, executes it, and provides analytical insights, reducing reliance on manual data analysts.
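As a rough illustration of how logical-model metadata could drive SQL generation, the sketch below assembles a query once a measure, a set of dimensions, and a time filter have been chosen (in the article, that choice is what the large model and the semantic knowledge graph provide). The functions `find_by_name` and `build_metric_sql`, the table name `hypothetical_line_fact`, and the simplified metadata are all assumptions for illustration; the real system's generation logic is not shown in the source.

```python
def find_by_name(items: list, name: str) -> dict:
    """Return the measure or dimension whose "names" list contains name."""
    for item in items:
        if name in item["names"]:
            return item
    raise KeyError(f"unknown name: {name}")


def build_metric_sql(model: dict, measure_name: str,
                     dimension_names: list, time_filter: str = "") -> str:
    """Assemble an aggregate query from logical-model metadata."""
    measure = find_by_name(model["measures"], measure_name)
    fields = [find_by_name(model["dimensions"], n)["field"] for n in dimension_names]
    select_items = fields + [f'{measure["sql"]} AS metric_value']
    # Combine the model's built-in filter with the caller's time filter.
    conditions = [c for c in (model.get("where", ""), time_filter) if c]
    view_sql = model["nodes"][0]["data"]["viewSql"]
    sql = f"SELECT {', '.join(select_items)}\nFROM ({view_sql}) AS fact"
    if conditions:
        sql += "\nWHERE " + " AND ".join(f"({c})" for c in conditions)
    if fields:
        sql += "\nGROUP BY " + ", ".join(fields)
    return sql


# Hypothetical, simplified metadata in the shape of the sample above.
model = {
    "nodes": [{"data": {"viewSql": (
        "SELECT dt, transport_org_name, trans_type_name, "
        "low_loading_plink_cnt, plink_cnt FROM hypothetical_line_fact")},
        "type": "fact"}],
    "where": "trans_type_name <> 'ALL'",
    "measures": [{"id": 99, "names": ["low_loading_line_ratio"],
                  "sql": "SUM(low_loading_plink_cnt)/SUM(plink_cnt)"}],
    "dimensions": [{"id": 1, "names": ["region"], "field": "transport_org_name"}],
}
sql = build_metric_sql(model, "low_loading_line_ratio", ["region"], "dt = '2024-01-01'")
print(sql)
```

Because the SQL fragments come from curated metadata rather than free-form model output, the model only has to pick names that exist in the metadata. The `time_filter` string is trusted in this sketch; a production version would need to validate or parameterize it.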
Capability Demonstration
Feature 1: Metric Query
Users can ask for metrics (e.g., loading rate, on‑time rate) with time, region, and chart type, receiving instant visualizations via the chatbot.
Feature 2: Knowledge Q&A
Provides instant answers to SOPs, system issues, daily reports, and links to documents, dramatically shortening information‑retrieval time.
Feature 3: Trajectory Query
Allows operators to retrieve vehicle trajectory by TW number in a single step, cutting query time from 2‑3 minutes to under one minute.
Feature 4: Driving‑License Image Query
Enables retrieval of driving‑license images by license plate, improving incident verification efficiency.
Feature 5: Report Push
Supports scheduled and alert‑based report pushes to groups in JD‑ME, with configurable rules.
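The article does not show how push rules are configured, so the following is only a sketch of what an alert-based rule check might look like. `PushRule`, `evaluate_rules`, the metric names, and the group identifiers are hypothetical, and the `send` callback stands in for whatever messaging interface JD-ME actually exposes.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class PushRule:
    name: str         # label shown in the alert message
    metric: str       # key to look up in the latest metric values
    threshold: float  # alert fires when the metric exceeds this value
    group_id: str     # hypothetical chat-group identifier


def evaluate_rules(rules: List[PushRule], metrics: Dict[str, float],
                   send: Callable[[str, str], None]) -> List[str]:
    """Call send(group_id, message) for each rule whose metric exceeds its threshold."""
    fired = []
    for rule in rules:
        value = metrics.get(rule.metric)
        if value is not None and value > rule.threshold:
            message = (f"[{rule.name}] {rule.metric} = {value:.1%} "
                       f"(threshold {rule.threshold:.1%})")
            send(rule.group_id, message)
            fired.append(rule.name)
    return fired


# Usage with an in-memory stand-in for the real push channel.
sent = []
rules = [
    PushRule("Low-loading alert", "low_loading_line_ratio", 0.20, "group-001"),
    PushRule("Late-arrival alert", "late_arrival_ratio", 0.10, "group-002"),
]
metrics = {"low_loading_line_ratio": 0.25, "late_arrival_ratio": 0.05}
fired = evaluate_rules(rules, metrics, lambda group, msg: sent.append((group, msg)))
```

Scheduled pushes could reuse the same `send` callback from a timer or cron-style job, with the rule check skipped.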
Feature 6: Information Push
Automates weekly announcements and survey distribution.
Results
Weekly active users: 50-100; total consultations: more than 500; coverage: 154 organizations and 68 roles. Trajectory query time fell from 2-3 minutes to under 1 minute, with 1,230 queries handled. Driving-license image query time fell from about 10 minutes to under 1 minute. Success rates after user onboarding: about 70% for data-analysis queries and about 50% for knowledge Q&A.
Conclusion & Future Plan
Yunli XiaoZhi combines large‑model AI with RAG and semantic knowledge graphs to deliver a portable, unified knowledge‑and‑data assistant for logistics operations. Q2 focuses on delivering interactive data analysis, knowledge search, and report push, while future work will enhance retrieval stability, scale the solution, and expand AI‑driven functionalities.
JD Tech Talk
Official JD Tech public account delivering best practices and technology innovation.