Design and Implementation of an Automated Logistics QA Bot Using Retrieval, Rerank, and Data Augmentation Techniques
This article describes a low‑cost, privacy‑preserving chatbot for logistics that combines data cleaning, large‑model‑based data augmentation, BM25 and vector retrieval, a DNN rerank model, and LLM‑driven answer rewriting to deliver accurate, compliant automated responses.
Business Background

To support a private-domain logistics operation, an automatic-reply bot was needed across multiple WeChat groups to answer user queries at minimal cost while preserving data privacy, keeping answers accurate, and avoiding hallucinations that could create legal risk.
Technical Solution Overview

The project starts from roughly 200 existing QA pairs and applies a three-stage pipeline: recall, rerank, and answer rewriting. The goal is to match each user query to the most similar entry in the QA knowledge base and return a polished response.
(1) Data Cleaning

The raw Excel data from the business side is unstructured; it is transformed into a standardized {"query": "...", "answer": "..."} format so that positive and negative samples for DNN training can be generated easily.
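A minimal sketch of this cleaning step, assuming the Excel export has already been read into (question, answer) tuples; the function and field names here are illustrative, not the project's actual code:

```python
import random

def clean_rows(rows):
    """Normalize raw (question, answer) rows into the standard QA format,
    dropping empty or partial entries."""
    qa_pairs = []
    for q, a in rows:
        q, a = q.strip(), a.strip()
        if q and a:
            qa_pairs.append({"query": q, "answer": a})
    return qa_pairs

def build_samples(qa_pairs, neg_per_pos=1, seed=42):
    """Pair each query with its own answer (label 1) and with randomly
    chosen other answers (label 0) to create DNN training samples."""
    rng = random.Random(seed)
    samples = []
    for i, pair in enumerate(qa_pairs):
        samples.append((pair["query"], pair["answer"], 1))
        for _ in range(neg_per_pos):
            j = rng.choice([k for k in range(len(qa_pairs)) if k != i])
            samples.append((pair["query"], qa_pairs[j]["answer"], 0))
    return samples
```

Once the data is in this shape, negative sampling is just a matter of pairing a query with a different entry's answer.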
(2) Data Augmentation

Because the original dataset is small and user queries are diverse, a large model is used to rewrite and expand the data. An example prompt template:
zh_prompt_template = """
如下三个反引号中是{product}的相关知识信息, 请基于这部分知识信息自动生成{question_num}个问题以及对应答案
```
{knowledge}
```
要求尽可能详细全面, 并且遵循如下规则:
1. 生成的内容不要超出反引号中信息的范围
2. 问题部分需要以"Question:"开始
3. 答案部分需要以"Answer:"开始
"""

(In English: "The three backticks below contain knowledge about {product}. Based on this knowledge, automatically generate {question_num} questions with corresponding answers. Be as detailed and comprehensive as possible, and follow these rules: 1. Do not go beyond the information inside the backticks. 2. Each question must begin with \"Question:\". 3. Each answer must begin with \"Answer:\".")

The augmented data is then split into training and test sets. Because rewrites generated from the same source QA pair are highly similar, the split must keep them on the same side; otherwise near-duplicates leak across the split and inflate test scores through over-fitting.
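A sketch of parsing the generated text and splitting it safely; it assumes each pair carries a hypothetical `source_id` field recording which original QA item it was rewritten from (that field is my addition for illustration):

```python
import random

def parse_generated(text):
    """Parse LLM output where questions start with 'Question:' and
    answers start with 'Answer:' into QA dicts."""
    pairs, q = [], None
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("Question:"):
            q = line[len("Question:"):].strip()
        elif line.startswith("Answer:") and q:
            pairs.append({"query": q, "answer": line[len("Answer:"):].strip()})
            q = None
    return pairs

def split_by_source(pairs, test_ratio=0.2, seed=0):
    """Split at the level of source knowledge items so that rewrites of
    the same original QA never land in both train and test."""
    sources = sorted({p["source_id"] for p in pairs})
    rng = random.Random(seed)
    rng.shuffle(sources)
    n_test = max(1, int(len(sources) * test_ratio))
    test_ids = set(sources[:n_test])
    train = [p for p in pairs if p["source_id"] not in test_ids]
    test = [p for p in pairs if p["source_id"] in test_ids]
    return train, test
```

Splitting by source rather than by individual pair is what prevents the near-duplicate leakage described above.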
(3) Model Training

The system consists of two parts: recall and rerank. Recall combines traditional BM25 inverted-index retrieval with vector-based retrieval. The rerank stage uses a higher-capacity DNN model to rescore the candidates; the top answer is then passed as background knowledge to a large language model for final rewriting.
BM25 Recall

Traditional keyword-based retrieval offers fast, interpretable results but lacks semantic understanding: a query phrased with synonyms of the stored answer may miss entirely.
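For concreteness, a self-contained Okapi BM25 over whitespace-tokenized text (a minimal sketch; a production system would use a real inverted index and a proper tokenizer, especially for Chinese):

```python
import math
from collections import Counter

class BM25:
    """Minimal Okapi BM25 scorer over whitespace-tokenized documents."""

    def __init__(self, docs, k1=1.5, b=0.75):
        self.docs = [d.lower().split() for d in docs]
        self.k1, self.b = k1, b
        self.avgdl = sum(len(d) for d in self.docs) / len(self.docs)
        self.tfs = [Counter(d) for d in self.docs]
        n = len(self.docs)
        df = Counter(t for d in self.docs for t in set(d))
        # Smoothed IDF, kept non-negative via the +1 inside the log.
        self.idf = {t: math.log((n - c + 0.5) / (c + 0.5) + 1) for t, c in df.items()}

    def score(self, query, idx):
        tf, dl = self.tfs[idx], len(self.docs[idx])
        s = 0.0
        for t in query.lower().split():
            if t in tf:
                f = tf[t]
                norm = f + self.k1 * (1 - self.b + self.b * dl / self.avgdl)
                s += self.idf[t] * f * (self.k1 + 1) / norm
        return s

    def top_k(self, query, k=3):
        scored = [(self.score(query, i), i) for i in range(len(self.docs))]
        return sorted(scored, reverse=True)[:k]
```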
Vector Recall

Embedding-based retrieval (e.g., Sentence-BERT) captures semantic similarity, complementing BM25. The model computes query and answer embeddings and uses cosine similarity, or concatenation of the two embeddings, for downstream scoring.
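The scoring side of vector recall reduces to cosine similarity over embeddings. A sketch with plain Python lists standing in for the Sentence-BERT vectors (in the real pipeline the vectors come from the encoder, and a vector index would replace the linear scan):

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def vector_recall(query_vec, answer_vecs, k=3):
    """Rank stored answer embeddings by cosine similarity to the query
    embedding and return the top-k (score, index) pairs."""
    scored = ((cosine(query_vec, v), i) for i, v in enumerate(answer_vecs))
    return sorted(scored, reverse=True)[:k]
```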
Rerank Model

A DNN reranker refines the multi-source recall results: it reconciles the different scoring scales of BM25 and vector retrieval and produces a final ranking, a cost that is affordable because the candidate set at this stage is small.
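The scale-reconciliation step can be illustrated with a toy stand-in for the DNN: normalize each recall channel's scores to a common range, then fuse them. The weights and the sigmoid fusion below are purely illustrative of what a trained reranker learns, not the project's model:

```python
import math

def minmax(scores):
    """Map heterogeneous recall scores (BM25 is unbounded, cosine lies
    in [-1, 1]) onto a common [0, 1] scale."""
    lo, hi = min(scores), max(scores)
    if hi == lo:
        return [0.5] * len(scores)
    return [(s - lo) / (hi - lo) for s in scores]

def rerank(candidates, w_bm25=0.4, w_vec=0.6):
    """Fuse normalized BM25 and vector scores with a sigmoid and sort.
    Each candidate is a dict with 'bm25', 'vec', and 'answer' keys."""
    bm25 = minmax([c["bm25"] for c in candidates])
    vec = minmax([c["vec"] for c in candidates])
    ranked = []
    for c, b, v in zip(candidates, bm25, vec):
        logit = w_bm25 * b + w_vec * v
        ranked.append((1 / (1 + math.exp(-logit)), c["answer"]))
    return sorted(ranked, reverse=True)
```

A real DNN reranker would additionally consume the query and candidate text, but the normalization concern it solves is the same.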
Output Stage

To avoid "answer not found" or stiff, templated responses, the selected QA pair and the user's original query are fed to a large language model for second-stage rewriting, producing a natural final answer. A fallback "refuse to answer" path is triggered when the top similarity score falls below a threshold.
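The gate-then-rewrite flow can be sketched as follows; the threshold value, prompt wording, and `call_llm` hook are all hypothetical placeholders:

```python
REFUSE_THRESHOLD = 0.5  # illustrative; tune on a validation set
FALLBACK = "Sorry, I couldn't find a reliable answer; a human agent will follow up."

def build_rewrite_prompt(user_query, qa_pair):
    """Ask the LLM to restate the retrieved answer naturally without
    adding facts beyond it, guarding against hallucination."""
    return (
        "User question: %s\n" % user_query
        + "Reference Q: %s\nReference A: %s\n" % (qa_pair["query"], qa_pair["answer"])
        + "Rewrite the reference answer to address the user's question. "
        + "Do not add any information that is not in the reference."
    )

def respond(user_query, best_pair, best_score, call_llm):
    """Refuse when the top rerank score is below threshold; otherwise
    hand the QA pair to the LLM for second-stage rewriting."""
    if best_pair is None or best_score < REFUSE_THRESHOLD:
        return FALLBACK
    return call_llm(build_rewrite_prompt(user_query, best_pair))
```

Constraining the rewrite to the reference answer is what keeps the LLM from introducing unsupported claims in a legally sensitive setting.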
(4) Evaluation

Score distributions of the vector-recall model before and after fine-tuning show a clearer separation between positive and negative pairs afterward, indicating improved retrieval quality.
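Two simple ways to quantify that separation, sketched here as assumptions about how one might measure it (the article only shows the distributions):

```python
def score_separation(pos_scores, neg_scores):
    """Gap between mean positive-pair and mean negative-pair similarity;
    a larger gap after fine-tuning means cleaner separation."""
    mean_pos = sum(pos_scores) / len(pos_scores)
    mean_neg = sum(neg_scores) / len(neg_scores)
    return mean_pos - mean_neg

def pairwise_auc(pos_scores, neg_scores):
    """Probability that a random positive pair outscores a random
    negative pair (equivalent to ROC-AUC for the two groups)."""
    wins = sum((p > n) + 0.5 * (p == n) for p in pos_scores for n in neg_scores)
    return wins / (len(pos_scores) * len(neg_scores))
```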
Online Deployment

Because the pipeline spans multiple components (retrievers, reranker, LLM calls), the solution is packaged as a container image and deployed via the internal platform rather than exported as a single TorchScript or ONNX model.
References

The approach draws on Amazon's public algorithmic solutions, including related AWS blog posts, and on the Sentence-BERT paper (https://arxiv.org/pdf/1908.10084).
JD Tech Talk
Official JD Tech public account delivering best practices and technology innovation.