
Yunli XiaoZhi: An AI‑Powered Intelligent Assistant for Knowledge Q&A and Data Analysis in Logistics Operations

This article describes the design, implementation, and operational results of Yunli XiaoZhi, an AI‑driven portable knowledge‑base and data‑analysis chatbot for logistics staff. It consolidates SOPs, manuals, and real‑time information, and uses LangChain‑based RAG, vector databases, and large‑model prompting to improve query efficiency, proactive alerts, and reporting across multiple user groups.

JD Tech Talk

Background & Problems

Operational staff, frontline workers, and managers face fragmented knowledge sources, time‑consuming document retrieval, repetitive Q&A, and lack of unified reporting tools, leading to low efficiency and poor user experience, especially on mobile devices.

Measures & Goals

Build "Yunli XiaoZhi", a portable AI‑enabled chatbot that integrates knowledge Q&A and data analysis, covering SOPs, system issues, real‑time queries (weather, safety), and report generation, aiming to reduce knowledge‑acquisition cost and improve incident management.

Key Functions

Intelligent Q&A: multi‑turn dialogue with retrieval‑augmented generation (RAG) to answer common operational questions and retrieve data.

Proactive Alerts: push notifications, voice messages, and scheduled reports to designated users or groups.

Implementation Details

1. Knowledge Q&A

The system uses the open‑source LangChain framework and internal large‑model APIs to build a RAG chatbot. Two knowledge bases are created: a QA‑pair store and a document store (PDF, DOCX, PPTX). Documents are parsed, tables are extracted, and all texts are vectorized and stored in JD’s Vearch vector database.
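The ingest-and-retrieve loop can be sketched as follows. This is a toy stand-in, not the production pipeline: the real system parses PDF/DOCX/PPTX, embeds text via internal large-model APIs, and stores vectors in JD's Vearch database; here a bag-of-words "embedding" and an in-memory list substitute for both, purely to illustrate the data flow.

```python
import math
from collections import Counter

def embed(text: str) -> Counter:
    # Stand-in for the internal embedding service: bag-of-words term counts.
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    # Cosine similarity between two sparse term-count vectors.
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

class ToyVectorStore:
    """In-memory stand-in for the Vearch vector database."""

    def __init__(self):
        self.items = []  # list of (vector, chunk) pairs

    def add(self, chunks):
        for chunk in chunks:
            self.items.append((embed(chunk), chunk))

    def search(self, query: str, k: int = 1):
        # Rank stored chunks by similarity to the query vector.
        qv = embed(query)
        ranked = sorted(self.items, key=lambda it: cosine(qv, it[0]), reverse=True)
        return [chunk for _, chunk in ranked[:k]]

store = ToyVectorStore()
store.add([
    "SOP: abnormal parcels must be reported within 30 minutes.",
    "Vehicle trajectory can be queried by TW number.",
])
print(store.search("abnormal parcel report", k=1)[0])
```

In production the same three steps (chunk, embed, store) run over parsed documents and QA pairs, with retrieval feeding the RAG prompt.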

Example of a PDF parsing result:

{
    "metadata": {"footers": [], "headers": [], "catalogs": []},
    "chapters": {"1": "[CHAPTER_ROOT]", "1.1": "第一条 xxx", "1.2": "第二条 xxxx", "1.3": "第三条 xxxx"},
    "context": [{"text": "JDLxxxx规定", "type": "text", "pid": 1, "sid": 1, "metadata": {"section_range": []}, "cid": "1"}, ...]
}

Table block example:

{
    "text": [
        [[0,0,1,1], "名称"],
        [[0,1,1,2], "尺寸"],
        ...
    ],
    "type": "table",
    "pid": 89,
    "sid": 111,
    "metadata": {"section_range": []},
    "cid": "1.8"
}
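The table block above stores each cell as a `[row_start, col_start, row_end, col_end]` span plus its text. Before vectorization, such blocks typically need to be linearized into plain text. A minimal sketch of that step (the cell layout is inferred from the example above; the actual parser's linearization logic is not shown in the source):

```python
def linearize_table(block: dict) -> str:
    """Flatten a parsed table block into pipe-separated rows of text."""
    rows: dict[int, dict[int, str]] = {}
    for (r0, c0, _r1, _c1), text in block["text"]:
        # Group cells by starting row, then by starting column.
        rows.setdefault(r0, {})[c0] = text
    lines = []
    for r in sorted(rows):
        cells = rows[r]
        lines.append(" | ".join(cells[c] for c in sorted(cells)))
    return "\n".join(lines)

block = {
    "text": [
        [[0, 0, 1, 1], "名称"],    # header cell: "name"
        [[0, 1, 1, 2], "尺寸"],    # header cell: "size"
        [[1, 0, 2, 1], "A"],
        [[1, 1, 2, 2], "60x40"],
    ],
    "type": "table",
}
print(linearize_table(block))
```

The resulting text ("名称 | 尺寸" on one row, values on the next) can then be embedded alongside ordinary paragraphs.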

Question condensation chain (code snippet):

from langchain import PromptTemplate
from langchain.chains import LLMChain
from langchain.chat_models import ChatOpenAI

def get_condense_question_chain(self):
    """Question-condensation chain."""
    # Rewrite a follow-up question into a standalone question, keeping its
    # original language and avoiding any ambiguous pronouns.
    CONDENSE_QUESTION_PROMPT = PromptTemplate.from_template(
        "Given the chat history and a follow-up question, rewrite the "
        "follow-up as a standalone question in its original language, "
        "making sure to avoid any unclear pronouns.\n"
        "Chat history: {chat_history}\n"
        "Follow-up input: {question}\n"
        "Standalone question:"
    )
    condense_question_chain = LLMChain(
        llm=ChatOpenAI(
            model="",            # internal model name (omitted in source)
            temperature=0,       # deterministic rewriting
            openai_api_key="",   # credentials omitted in source
            openai_api_base="",  # internal API endpoint (omitted in source)
        ),
        prompt=CONDENSE_QUESTION_PROMPT,
    )
    return condense_question_chain
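The condensation step is only the first stage of the RAG flow: the standalone question then drives retrieval, and the retrieved context fills the answer prompt. A minimal sketch of that flow, with `toy_llm`, `toy_retrieve`, and `toy_condense` as stand-ins for the LangChain chains and internal model APIs the real system uses:

```python
ANSWER_PROMPT = (
    "Answer the question using only the context below.\n"
    "Context: {context}\n"
    "Question: {question}\n"
    "Answer:"
)

def rag_answer(llm, retrieve, condense, chat_history, question):
    standalone = condense(chat_history, question)  # step 1: rewrite follow-up
    context = "\n".join(retrieve(standalone))      # step 2: retrieve chunks
    return llm(ANSWER_PROMPT.format(context=context, question=standalone))

# Toy plumbing, just to show the end-to-end data flow:
def toy_condense(history, q):
    return q if not history else f"{q} (re: {history[-1]})"

def toy_retrieve(q):
    return ["SOP-12: report abnormal parcels within 30 minutes."]

def toy_llm(prompt):
    # Pretend model: echo the context line back as the answer.
    return prompt.splitlines()[1].removeprefix("Context: ")

print(rag_answer(toy_llm, toy_retrieve, toy_condense, [], "How fast must I report?"))
```

Swapping the three toy callables for the LLMChain above, a vector-store retriever, and an answer chain yields the multi-turn Q&A described earlier.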

2. Data Analysis

A NoETL derived logical model asset is defined to generate reusable metric definitions across time grains and dimensions, stored as JSON metadata. A semantic knowledge graph is built from model metadata, enabling natural‑language queries to be mapped to logical models and automatically generate SQL.

Sample logical model metadata:

{
    "uid": "742250d1dd9f457aa",
    "name": "离线_低装载线路占比_日_3",
    "nodes": [{"id": "98579cdb14b44423ace0", "data": {"viewUid": "e246257e141e4fe78", "viewSql": "SELECT dt, trans_type_new_name AS trans_type_name , ..."}, "type": "fact"}],
    "where": "trans_type_name <> '全部' AND ...",
    "measures": [{"id": 99, "names": ["低装载线路占比"], "sql": "SUM(low_loading_plink_cnt)/SUM(plink_cnt)", "type": "float", "format": "percentage", "sort": 1}],
    "dimensions": [{"id": 1, "names": ["区域"], "field": "transport_org_name", "type": "str", "format": "text", "description": "区域"}, ...]
}
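Once the user's question has been mapped to a logical model, SQL can be assembled mechanically from the model's metadata: matched measures become aggregations, matched dimensions become SELECT/GROUP BY columns, and the model's `where` clause is carried over. A minimal sketch, using a simplified version of the metadata above (the semantic-graph matching step itself is out of scope here):

```python
def build_sql(model: dict, measure_name: str, dim_names: list[str]) -> str:
    """Assemble SQL for one measure grouped by the requested dimensions."""
    measure = next(m for m in model["measures"] if measure_name in m["names"])
    dims = [d for d in model["dimensions"] if set(d["names"]) & set(dim_names)]
    cols = [d["field"] for d in dims]
    select = ", ".join(cols + [f'{measure["sql"]} AS "{measure["names"][0]}"'])
    sql = f'SELECT {select} FROM ({model["view_sql"]}) t'
    if model.get("where"):
        sql += f' WHERE {model["where"]}'
    if cols:
        sql += f' GROUP BY {", ".join(cols)}'
    return sql

model = {
    # Simplified stand-in for nodes[0].data.viewSql in the sample metadata.
    "view_sql": "SELECT * FROM fact_loading",
    "where": "trans_type_name <> '全部'",
    "measures": [{"names": ["低装载线路占比"],
                  "sql": "SUM(low_loading_plink_cnt)/SUM(plink_cnt)"}],
    "dimensions": [{"names": ["区域"], "field": "transport_org_name"}],
}
print(build_sql(model, "低装载线路占比", ["区域"]))
```

This yields a query grouping the low-loading-route ratio by region, ready to execute against the underlying fact view.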

The AI model generates SQL from natural language, executes it, and provides analytical insights, reducing reliance on manual data analysts.

Capability Demonstration

Feature 1: Metric Query

Users can ask for metrics (e.g., loading rate, on‑time rate) with time, region, and chart type, receiving instant visualizations via the chatbot.

Feature 2: Knowledge Q&A

Provides instant answers to SOPs, system issues, daily reports, and links to documents, dramatically shortening information‑retrieval time.

Feature 3: Trajectory Query

Allows operators to retrieve vehicle trajectory by TW number in a single step, cutting query time from 2‑3 minutes to under one minute.

Feature 4: Driving‑License Image Query

Enables retrieval of driving‑license images by license plate, improving incident verification efficiency.

Feature 5: Report Push

Supports scheduled and alert‑based report pushes to groups in JD‑ME, with configurable rules.
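A rule-based push like this pairs a trigger (a schedule or an alert condition) with a target group. A minimal sketch under stated assumptions: `post_to_group` is a hypothetical stand-in for the JD‑ME group-message interface, and the rule shapes are illustrative, not the product's actual configuration format.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Callable

@dataclass
class PushRule:
    name: str
    should_fire: Callable[[datetime, dict], bool]  # schedule or alert check
    group: str                                     # target JD-ME group

def run_rules(rules, now, metrics, post_to_group):
    """Evaluate each rule; push a report to its group when it fires."""
    fired = []
    for rule in rules:
        if rule.should_fire(now, metrics):
            post_to_group(rule.group, f"[{rule.name}] report at {now:%H:%M}")
            fired.append(rule.name)
    return fired

rules = [
    # Scheduled push: every day at 09:00.
    PushRule("daily-loading", lambda t, m: t.hour == 9, "ops-group"),
    # Alert push: fires whenever the metric crosses a threshold.
    PushRule("low-loading-alert", lambda t, m: m["low_loading_ratio"] > 0.2,
             "alert-group"),
]
sent = []
run_rules(rules, datetime(2024, 5, 1, 9, 0), {"low_loading_ratio": 0.25},
          lambda g, msg: sent.append((g, msg)))
print(sent)
```

Separating the trigger predicate from the delivery call keeps the rules configurable without touching the push plumbing.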

Feature 6: Information Push

Automates weekly announcements and survey distribution.

Results

Active users per week: 50‑100; total consultations >500; coverage across 154 organizations and 68 roles. Trajectory queries reduced from 2‑3 minutes to <1 minute, with 1,230 queries handled. Driving‑license image queries reduced from ~10 minutes to <1 minute. Success rates: data‑analysis queries ~70%, knowledge‑Q&A ~50% after user onboarding.

Conclusion & Future Plan

Yunli XiaoZhi combines large‑model AI with RAG and semantic knowledge graphs to deliver a portable, unified knowledge‑and‑data assistant for logistics operations. Q2 focuses on delivering interactive data analysis, knowledge search, and report push, while future work will enhance retrieval stability, scale the solution, and expand AI‑driven functionalities.

Tags: AI, RAG, Data Analysis, Logistics, Knowledge Base, Chatbot
Written by JD Tech Talk
Official JD Tech public account delivering best practices and technology innovation.