
Multi‑Agent Architecture for an E‑Commerce Business Assistant: Design, Planning, Evaluation, and Sample Generation

The document describes the evolution, design principles, key technologies, online inference workflow, evaluation methods, and sample‑generation techniques of a large‑language‑model‑based multi‑agent system that powers a 24/7 e‑commerce merchant assistant, highlighting its benefits, challenges, and future work.

JD Tech Talk

Introduction – The merchant assistant is built on a multi‑agent system driven by large language models (LLMs) that mimics the collaborative workflow of real‑world e‑commerce teams. It offers 24/7 business support through natural‑language interaction and has evolved through three stages, culminating in a master‑plus‑sub‑agents architecture that significantly improves accuracy.

From Real‑World Business to Multi‑Agent Space – The system maps multiple real‑world merchant roles to agents, providing a generic, open host for capabilities such as sales forecasting, marketing, pricing, and keyword recommendation. Tools (agents, APIs) can be added at any development stage.

2.1 Agent Construction – ReAct Paradigm with Multi‑Model Integration – Four model types are combined: LLM for goal extraction and validation, Embedding for fast tool matching, Tools DAG for multi‑path reverse reasoning, and Operations Research optimization for planning efficiency. ReAct enables dynamic planning updates after each execution step.
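The ReAct loop described above can be sketched as follows. This is a minimal illustration, not JD's implementation: the `llm`, `embed`, and tool functions are hypothetical stand‑ins, and only the embedding‑based tool matching and the thought/action/observation cycle from the text are modeled.

```python
import numpy as np

def cosine(a, b):
    """Cosine similarity between two embedding vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def match_tool(query_vec, tool_index):
    """Fast tool selection: pick the tool whose embedding is closest
    to the embedding of the current thought."""
    return max(tool_index, key=lambda name: cosine(query_vec, tool_index[name]))

def react_loop(goal, llm, embed, tools, tool_index, max_steps=5):
    """Thought -> Action -> Observation loop; the plan is revised after
    every observation instead of being fixed up front."""
    history = []
    for _ in range(max_steps):
        thought = llm(f"Goal: {goal}\nHistory: {history}\nNext thought:")
        if "FINISH" in thought:          # LLM signals the goal is met
            break
        tool_name = match_tool(embed(thought), tool_index)
        observation = tools[tool_name](thought)   # execute the chosen action
        history.append((thought, tool_name, observation))
    return history
```

In this sketch the DAG‑based reverse reasoning and operations‑research planner from the text would sit behind `match_tool`, pruning which tools are even eligible at each step.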

2.2 Multi‑Agent Online Inference – A master agent decomposes complex tasks into sub‑agents that perform hierarchical dynamic planning and distributed scheduling. Communication follows a standard protocol, supporting multi‑step coordination and global chain‑of‑thought planning.
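A standardized message along these lines might carry each subtask from the master to a sub‑agent. The field names below are assumptions for illustration; the point is that every subtask records its parent and its dependencies, which is what lets the master schedule independent subtasks in parallel while chaining dependent ones.

```python
import uuid

def make_task_message(parent_id, agent, goal, depends_on=()):
    """One message in a hypothetical master->sub-agent protocol."""
    return {
        "task_id": str(uuid.uuid4()),
        "parent_id": parent_id,
        "agent": agent,
        "goal": goal,
        "depends_on": list(depends_on),  # task_ids that must finish first
        "status": "pending",
    }

# Master decomposes "boost sales of SKU 123" into dependent subtasks.
root = "root-001"
forecast = make_task_message(root, "sales_forecast",
                             "forecast demand for SKU 123")
pricing = make_task_message(root, "pricing",
                            "suggest a price given the demand forecast",
                            depends_on=[forecast["task_id"]])
plan = [forecast, pricing]
```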

2.2.1 Technical Features – Task‑layered planning, distributed collaboration, and a standardized communication protocol ensure efficient cooperation among agents.

2.2.2 Demonstration – A video showcases the end‑to‑end online inference process of the assistant.

2.3 Full‑Chain ReAct Evaluation – System‑wide evaluation aggregates weighted scores of each agent, while local evaluation uses a Reward Model to assess thought/action/observation quality, identifying bottlenecks.
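The two evaluation levels can be sketched in a few lines. The weights and scores below are invented for illustration; the system‑wide score is a weighted average of per‑agent reward‑model scores, and the local view simply surfaces the weakest agent as the bottleneck.

```python
def system_score(agent_scores, weights):
    """System-wide evaluation: weighted average of per-agent scores.
    Weights are assumed to reflect each agent's importance or traffic share."""
    total_w = sum(weights.values())
    return sum(agent_scores[a] * weights[a] for a in agent_scores) / total_w

def find_bottleneck(agent_scores):
    """Local evaluation: the lowest-scoring agent in the chain."""
    return min(agent_scores, key=agent_scores.get)

scores = {"master": 0.92, "sales_forecast": 0.81, "pricing": 0.67}
weights = {"master": 0.5, "sales_forecast": 0.3, "pricing": 0.2}
print(round(system_score(scores, weights), 3))  # → 0.837
print(find_bottleneck(scores))                  # → pricing
```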

2.4 Reward Model Variants – Supports custom business rules, leverages existing SOTA LLMs, and allows training of dedicated reward models. An example prompt (translated from the original Chinese) for evaluating intent‑summarization quality is shown below:

"The goal of the input‑summarization model is to analyze the user's specific intent from the historical conversation records and the current question. As a core step in the Master Agent's reasoning, the quality of its intent summarization must be evaluated."

2.5 LLM Offline Sample Enhancement – Standardized business data are used to automatically generate and expand training samples for LLMs, while online inference data are continuously labeled via reward‑model strategies, enriching the sample pool.
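The online half of this loop might look like the sketch below: label inference traces with a reward model and keep the high‑scoring ones as training samples. The trace schema, threshold, and toy reward model are all assumptions for illustration.

```python
def enrich_sample_pool(online_traces, reward_model, pool, threshold=0.8):
    """Label online inference traces with a reward model and add
    high-scoring (prompt, response) pairs to the training pool."""
    for trace in online_traces:
        score = reward_model(trace["prompt"], trace["response"])
        if score >= threshold:
            pool.append({"prompt": trace["prompt"],
                         "response": trace["response"],
                         "reward": score})
    return pool

# Toy reward model: prefers responses that cite a concrete number.
reward = lambda prompt, resp: 0.9 if any(c.isdigit() for c in resp) else 0.4
traces = [{"prompt": "forecast?", "response": "about 120 units"},
          {"prompt": "price?", "response": "it depends"}]
pool = enrich_sample_pool(traces, reward, [])
```

Run offline, the same idea expands standardized business data into synthetic samples; run online, it continuously converts production traffic into labeled training data.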

Challenges & Benefits – The architecture improves planning efficiency, reduces inference cost, enhances stability, mitigates LLM hallucination, lowers sample engineering effort, and enables rapid iteration. Remaining issues include longer response times for complex queries and error accumulation in chained reasoning, which are being addressed through multi‑agent joint learning.

References – The document lists the full interaction protocol (request, planning, reasoning, tool calls, logging, and response) and provides QR‑code links for a technical community.

Tags: e-commerce, LLM, ReAct, Reward Model, multi-agent, online inference, AI planning
Written by JD Tech Talk, the official JD Tech public account delivering best practices and technology innovation.
