How We Built a Scalable Smart Customer Service System for an Activity Platform

This article details the end‑to‑end design, implementation, and operational results of a smart customer‑service platform that automates FAQ capture, leverages both Elasticsearch and LLM‑based models, and provides a low‑code, multi‑team backend for rapid issue resolution.

Architect

Jan 27, 2024

Problem

Operators of the activity platform raised ad-hoc questions in a "fire-fighting" chat group, and on-call developers logged the issues by hand in an Excel sheet each week. This caused three pain points: (1) FAQs were not collected automatically, (2) routing a question to the right owner required deep domain knowledge, and (3) noisy group traffic caused tickets to be missed.

Solution Overview

We built an intelligent customer-service system with five modules: the dialog interface, the conversation state machine, the data-source model, the statistics and reporting backend, and integration configuration. The architecture follows domain-driven microservices and a minimal viable architecture (MVA) approach, and it mixes synchronous and asynchronous flows.

Dialog Interface

User research showed that operators prefer to interact natively inside WeChat Work (企微). The service-account mode lacks callbacks and cannot coexist with human agents, so we chose the "application account" mode as the entry point.

Message Reception & Parsing

We bound a callback API to the application account, so every user action (including proactive messages) is received, decrypted, and processed. In the test environment, external requests to the UAT domain were blocked, so a proxy layer forwards external traffic to the internal UAT host and adds a department flag for data isolation.
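As a minimal sketch of the first step in handling a callback, the snippet below checks the request signature before any decryption is attempted. It assumes the scheme described in WeChat Work's callback documentation (SHA-1 over the sorted token, timestamp, nonce, and encrypted payload); AES decryption is left out because it needs a third-party library. The `tag_department` helper and the `X-Department` header name are hypothetical and only illustrate the proxy's department flag.

```python
import hashlib
import hmac


def compute_signature(token: str, timestamp: str, nonce: str, encrypted: str) -> str:
    """SHA-1 hex digest over the sorted parts, joined without separators."""
    parts = sorted([token, timestamp, nonce, encrypted])
    return hashlib.sha1("".join(parts).encode("utf-8")).hexdigest()


def is_authentic(token: str, timestamp: str, nonce: str,
                 encrypted: str, msg_signature: str) -> bool:
    """Compare the expected and received signatures in constant time."""
    expected = compute_signature(token, timestamp, nonce, encrypted)
    return hmac.compare_digest(expected, msg_signature)


def tag_department(headers: dict, department: str) -> dict:
    """Proxy-side step (hypothetical): copy headers and add a department flag."""
    tagged = dict(headers)
    tagged["X-Department"] = department  # header name is an assumption
    return tagged
```

A request whose signature does not match can be rejected at the proxy or the callback handler before it reaches session processing.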

Conversation State Machine

Lifecycle: opened → in‑progress (auto‑transfer or manual) → closed. A new session is created when an operator sends a message. The state machine handles session creation, information collection, FAQ matching, answer generation, one‑click group creation, and termination. It also maintains a delayed‑message queue to re‑confirm after long inactivity and automatically closes stale sessions.
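The lifecycle and the inactivity handling can be sketched as a small state machine. This is an illustration under assumptions, not the production implementation: the class and method names, the `mode` values, and the two timeout constants are hypothetical, and the delayed-message queue is reduced to a timer callback that reports which action is due.

```python
from dataclasses import dataclass
from enum import Enum


class State(Enum):
    OPENED = "opened"
    IN_PROGRESS = "in_progress"
    CLOSED = "closed"


# Legal transitions for opened -> in-progress -> closed.
ALLOWED = {
    State.OPENED: {State.IN_PROGRESS, State.CLOSED},
    State.IN_PROGRESS: {State.CLOSED},
    State.CLOSED: set(),
}

RECONFIRM_AFTER = 600   # seconds of inactivity before re-confirming (assumed)
CLOSE_AFTER = 1800      # seconds of inactivity before auto-closing (assumed)


@dataclass
class Session:
    operator: str
    last_activity: float
    state: State = State.OPENED
    mode: str = ""               # "auto" or "manual" once in progress
    reconfirm_sent: bool = False

    def transition(self, target: State) -> None:
        if target not in ALLOWED[self.state]:
            raise ValueError(f"illegal transition {self.state.value} -> {target.value}")
        self.state = target

    def start_handling(self, mode: str) -> None:
        """Move to in-progress via auto-transfer or manual pickup."""
        self.transition(State.IN_PROGRESS)
        self.mode = mode

    def on_message(self, now: float) -> None:
        """Any operator message resets the inactivity clock."""
        self.last_activity = now
        self.reconfirm_sent = False

    def on_timer(self, now: float) -> str:
        """Return the action due at time `now`: 'none', 'reconfirm', or 'closed'."""
        if self.state is State.CLOSED:
            return "none"
        idle = now - self.last_activity
        if idle >= CLOSE_AFTER:
            self.transition(State.CLOSED)
            return "closed"
        if idle >= RECONFIRM_AFTER and not self.reconfirm_sent:
            self.reconfirm_sent = True
            return "reconfirm"
        return "none"
```

Keeping the allowed transitions in one table makes an illegal move, such as reopening a closed session, fail loudly instead of silently corrupting session state.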

Data‑Source Model

Two answer engines:

Elasticsearch (ES) search: uses built-in tokenizers and token filters to split the query into terms and retrieve the most similar FAQ entry.

LLM-based model: initially a generic ChatGPT-style model, later replaced by a domain-specific SimBERT model for higher accuracy. The LLM yields more natural answers but can hallucinate when the FAQ corpus is sparse.

An offline pipeline syncs collected FAQs and conversation logs back to the training job, continuously improving the LLM.
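The article does not say how the two engines are combined at query time, so the following is only one possible arrangement: try ES first and fall back to the model when no FAQ scores high enough. The request body is a standard Elasticsearch `match` query; the field names `question` and `answer`, the `min_score` threshold, and the `search` and `generate` callables are assumptions made for illustration.

```python
from typing import Callable, Optional, Tuple


def build_faq_query(question: str, size: int = 1) -> dict:
    """Request body for an Elasticsearch full-text `match` query."""
    return {"size": size, "query": {"match": {"question": {"query": question}}}}


def best_hit(response: dict, min_score: float) -> Optional[str]:
    """Return the top FAQ answer if its relevance score clears the threshold."""
    hits = response.get("hits", {}).get("hits", [])
    if not hits:
        return None
    top = hits[0]  # ES returns hits sorted by descending score by default
    if (top.get("_score") or 0.0) < min_score:
        return None
    return top["_source"]["answer"]


def answer(question: str,
           search: Callable[[dict], dict],
           generate: Callable[[str], str],
           min_score: float = 5.0) -> Tuple[str, str]:
    """Return (engine, answer): ES when confident, otherwise the model."""
    faq = best_hit(search(build_faq_query(question)), min_score)
    if faq is not None:
        return ("es", faq)
    return ("model", generate(question))
```

Passing `search` and `generate` as callables keeps the routing logic testable without a live cluster or model endpoint.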

Statistics & Reporting Backend

The backend is a low-code visual console built with the internal LEGO system. Features:

One‑click cloning of the admin UI for new teams.

Conversation detail view for review.

Inline remarks and status transitions.

One‑click FAQ upload that feeds both ES and the LLM pipeline.
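To illustrate how a single FAQ upload could feed both destinations, the sketch below renders the same records into two payloads: a newline-delimited body in the format the Elasticsearch Bulk API expects (an action line followed by a document line), and a JSONL file for the training job. The index name `faq`, the record fields, and the JSONL layout for training are assumptions; the article does not describe the actual formats.

```python
import json
from typing import Dict, Iterable, List


def to_bulk_ndjson(faqs: Iterable[Dict[str, str]], index: str = "faq") -> str:
    """Render FAQs as an Elasticsearch Bulk API body (newline-terminated NDJSON)."""
    lines: List[str] = []
    for faq in faqs:
        # Action/metadata line, then the document source line.
        lines.append(json.dumps({"index": {"_index": index, "_id": faq["id"]}}))
        lines.append(json.dumps(
            {"question": faq["question"], "answer": faq["answer"]},
            ensure_ascii=False,
        ))
    return "".join(line + "\n" for line in lines)


def to_training_jsonl(faqs: Iterable[Dict[str, str]]) -> str:
    """Render FAQs as one JSON object per line for the (hypothetical) training job."""
    return "".join(
        json.dumps({"text": faq["question"], "label": faq["answer"]},
                   ensure_ascii=False) + "\n"
        for faq in faqs
    )


def publish_faq(faqs: List[Dict[str, str]]) -> Dict[str, str]:
    """Build both payloads from one upload so the two stores stay in step."""
    return {"es_bulk": to_bulk_ndjson(faqs), "training": to_training_jsonl(faqs)}
```

Generating both payloads from the same list in one call avoids the two stores drifting apart when an FAQ is added through the console.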

Integration Configuration

Three steps for other teams:

Provide configuration data (e.g., WeChat Work AppSecret, on‑call roster).

Bind the unified callback API to the team’s application account or custom WebSocket endpoint.

Clone the admin console, adjust UI elements, and deploy.
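The configuration a team hands over in the first two steps could be captured and validated along these lines. Every field name here is hypothetical; the article only names the AppSecret, the on-call roster, and the choice between an application account and a custom WebSocket endpoint.

```python
from dataclasses import dataclass
from typing import List

CALLBACK_KINDS = ("application_account", "websocket")


@dataclass
class TeamIntegration:
    team: str
    app_secret: str
    on_call_roster: List[str]
    callback_kind: str = "application_account"
    websocket_url: str = ""

    def problems(self) -> List[str]:
        """Return human-readable validation errors; an empty list means OK."""
        issues: List[str] = []
        if not self.app_secret:
            issues.append("app_secret is required")
        if not self.on_call_roster:
            issues.append("on_call_roster must list at least one person")
        if self.callback_kind not in CALLBACK_KINDS:
            issues.append(f"unknown callback_kind: {self.callback_kind}")
        if self.callback_kind == "websocket" and not self.websocket_url.startswith(
            ("ws://", "wss://")
        ):
            issues.append("websocket_url must start with ws:// or wss://")
        return issues
```

Returning a list of problems rather than raising on the first one lets an onboarding form show a team everything it needs to fix in a single pass.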

Practice & Results

The system grew from a single-department prototype into a multi-department service. The data source evolved from pure ES to a ChatGPT/SimBERT hybrid. It is deployed in several internal applications and has handled ~1,000 operator tickets to date, averaging five resolutions per day. After the LLM was enabled, the automated resolution rate increased by ~7%.

Outlook

Planned next steps:

Expose the platform as an SDK for custom dialog UI.

Automate the integration workflow with an online portal and approval checks.

Negotiate richer in‑app hand‑off capabilities with WeChat Work to reduce external group creation.

Architectural Reflections

The system uses domain‑driven microservices with clear interface boundaries. Session data is stored in partitioned tables (session, conversation record, FAQ). Synchronous user‑interface calls are decoupled from asynchronous session processing, preserving single‑responsibility per component while enabling long‑term extensibility.
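One common way to decouple the synchronous call from asynchronous session processing is an in-process queue with a background worker, sketched below. This is an assumption about the pattern, not a description of the actual services, which may well use a message broker instead; `SessionWorker` and its method names are hypothetical.

```python
import queue
import threading
from typing import Any, Callable, List


class SessionWorker:
    """Accept messages synchronously; process them on a background thread."""

    def __init__(self, handler: Callable[[dict], Any]) -> None:
        self._queue: "queue.Queue[dict]" = queue.Queue()
        self._handler = handler
        self.processed: List[Any] = []
        self.errors: List[Exception] = []
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def accept(self, message: dict) -> dict:
        """Synchronous path: enqueue and acknowledge immediately."""
        self._queue.put(message)
        return {"status": "accepted"}

    def _run(self) -> None:
        """Asynchronous path: handle messages one by one, in arrival order."""
        while True:
            message = self._queue.get()
            try:
                self.processed.append(self._handler(message))
            except Exception as exc:  # keep the worker alive on handler errors
                self.errors.append(exc)
            finally:
                self._queue.task_done()

    def drain(self) -> None:
        """Block until every queued message has been handled."""
        self._queue.join()
```

The callback endpoint only needs `accept`, so it can answer quickly while FAQ matching and answer generation run behind the queue.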
