
Complete Solution of 58.com Human-Machine Voice Dialogue Robot: Architecture, Core Modules, and Application Scenarios

This article presents 58.com's end-to-end voice dialogue robot solution. It details the overall architecture, the intelligent outbound call process, and core functions such as call dialing, call status recognition, dialogue management, and intent detection, and it showcases real-world application scenarios that improve sales, operations, and customer service efficiency.

DataFunTalk

Background: 58.com is China’s largest life‑information service platform, covering recruitment, automotive, finance, local services, second‑hand goods, etc. Phone communication is a crucial channel for many business processes such as recruitment interview confirmations.

To better serve B‑side merchants and C‑side users and fulfill the mission “make life simple and beautiful”, a voice robot was developed to reduce manual call workload and improve service quality.

Overall Architecture:

The architecture consists of five layers:

Access layer: Provides API interfaces for business systems. After a call ends, the result is delivered asynchronously via WMB, and the business side can send back follow-up information that is used for model optimization.

Web management layer: Handles script configuration, permission control, batch dialing, anti‑spam strategy settings, and data visualization.

Logic layer: Core control layer of the robot, analogous to a brain, ensuring the dialogue flow is complete.

Editing/annotation layer: Used for data labeling, which feeds model iteration and online performance evaluation.

Basic services layer: Includes SIP telephony resources and speech‑recognition/synthesis interfaces (e.g., Alibaba, Tencent). SIP resources provide call establishment and release; third‑party ASR services convert speech to text.
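The access layer's contract (submit a call now, receive the result asynchronously later) can be sketched as follows. This is a minimal illustration, not 58.com's actual API: the class names, field names, and task id format are hypothetical, and an in-process `queue.Queue` stands in for WMB, whose real interface the article does not describe.

```python
import queue
from dataclasses import dataclass, field


@dataclass
class OutboundRequest:
    """What a business system submits (fields are illustrative assumptions)."""
    called_number: str
    scenario: str  # e.g. "interview_confirmation"


@dataclass
class CallResult:
    """What the robot reports once the call has ended (fields are illustrative)."""
    called_number: str
    scenario: str
    call_status: str
    overall_intent: str
    transcript: list = field(default_factory=list)


# In-process stand-in for the message bus; the article says results go out via WMB.
result_bus = queue.Queue()


def submit_call(request: OutboundRequest) -> str:
    """Accept a request and return a task id immediately; the result comes later."""
    return f"{request.scenario}:{request.called_number}"


def publish_result(result: CallResult) -> None:
    """Robot side: push the finished call's result onto the bus."""
    result_bus.put(result)


def consume_result() -> CallResult:
    """Business side: pick up the next result, waiting at most one second."""
    return result_bus.get(timeout=1)
```

The point of the split is that the caller never blocks for the duration of a phone call: `submit_call` returns at once, and the business system reads `CallResult` objects from the bus whenever they arrive.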

Intelligent Outbound Process:

The process is divided into three stages: pre‑call, in‑call, and post‑call.

Pre‑call: The caller passes the called number, business scenario, etc., to the outbound engine, which sets anti‑spam logic, selects an appropriate SIP provider, establishes SIP communication, and loads the relevant script.

In‑call: After the opening greeting, the system sends synthesized speech via SIP to the user. The user’s response is captured, streamed to a third‑party ASR service, converted to text, analyzed, and routed according to dialogue logic to generate appropriate replies.

Post-call: The robot evaluates the call status, performs whole-round intent recognition, stores the data, and calls back the business side via WMB.
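The three stages can be sketched as one small function. Everything here is an assumption made for illustration: the anti-spam rule (a per-number daily limit), the script contents, and the keyword routing are hypothetical, and already-recognized text is passed in where the real system would stream audio to a third-party ASR service over SIP.

```python
import time
from collections import defaultdict

MAX_CALLS_PER_DAY = 2          # assumed anti-spam limit, not from the article
_call_log = defaultdict(list)  # number -> timestamps of earlier calls


def allowed_by_anti_spam(number, now=None):
    """Pre-call: refuse a number already called MAX_CALLS_PER_DAY times in 24 hours."""
    now = time.time() if now is None else now
    recent = [t for t in _call_log[number] if now - t < 86400]
    _call_log[number] = recent
    if len(recent) >= MAX_CALLS_PER_DAY:
        return False
    recent.append(now)
    return True


# Hypothetical script: the opening line plus one reply per routing label.
SCRIPT = {
    "opening": "Hello, can you still attend the interview tomorrow?",
    "yes": "Great, see you then. Goodbye.",
    "no": "Understood, we will cancel it. Goodbye.",
    "unknown": "Sorry, I did not catch that. Can you attend, yes or no?",
}


def route(text):
    """In-call: keyword stand-in for the analysis step applied to ASR text."""
    words = text.lower().split()
    if "yes" in words:
        return "yes"
    if "no" in words:
        return "no"
    return "unknown"


def run_call(number, asr_texts, now=None):
    """Run the pre-call, in-call and post-call stages for one number."""
    # Pre-call: anti-spam check (provider selection and script loading omitted).
    if not allowed_by_anti_spam(number, now):
        return {"status": "blocked", "turns": []}
    turns = [("robot", SCRIPT["opening"])]
    # In-call: each recognized user reply is routed to the next scripted line.
    for text in asr_texts:
        turns.append(("user", text))
        label = route(text)
        turns.append(("robot", SCRIPT[label]))
        if label != "unknown":
            break  # a clear answer ends the dialogue
    # Post-call: return what would be stored and sent back to the business side.
    return {"status": "completed", "turns": turns}
```

Calling `run_call("10000000000", ["hmm what", "yes I can"])` walks through one re-prompt and then the confirmation line. A third call to the same number within 24 hours comes back with status `"blocked"`.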

Core Functions:

Call dialing service: Establishes telephone connections with customers.

Call status recognition: Determines the state of the called number (e.g., unallocated or busy) using SIP response codes and ring-tone classification.

Intelligent dialogue interaction: Includes dialogue management, DTMF capture, single‑sentence intent recognition (TextCNN, BERT), standard question matching (Bi‑LSTM‑DSSM, BERT), and slot extraction (IDCNN+CRF).

Whole‑round intent recognition: Aggregates all user utterances in a session and classifies the overall intent (e.g., SUCCESS, CENTRAL, REFUSED) using TextCNN and model ensembles.
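Two of the core functions above lend themselves to a short sketch: mapping SIP response codes to a call status, and voting over several models for whole-round intent. The status codes are standard SIP codes, but the status names and the choice of which code maps to which status are assumptions. The three keyword "models" are toy stand-ins for the TextCNN-based ensemble, which the article does not specify. The intent labels are the ones the article lists.

```python
from collections import Counter

# Standard SIP response codes; the status names on the right are assumed.
SIP_STATUS = {
    200: "answered",        # OK
    404: "invalid_number",  # Not Found
    604: "invalid_number",  # Does Not Exist Anywhere
    486: "busy",            # Busy Here
    600: "busy",            # Busy Everywhere
    480: "unavailable",     # Temporarily Unavailable
    408: "no_answer",       # Request Timeout
    603: "rejected",        # Decline
}


def call_status(sip_code):
    """Map a final SIP response code to a coarse call status."""
    return SIP_STATUS.get(sip_code, "unknown")


NEGATIVE = {"no", "not", "busy"}
POSITIVE = {"yes", "ok", "sure"}


def _label(words):
    """Toy rule: any refusal word wins, then any acceptance word."""
    if NEGATIVE & set(words):
        return "REFUSED"
    if POSITIVE & set(words):
        return "SUCCESS"
    return "CENTRAL"


def model_whole_session(utterances):
    """Classify all user utterances joined together."""
    return _label(" ".join(utterances).lower().split())


def model_last_turn(utterances):
    """Classify only the user's final utterance."""
    last = utterances[-1] if utterances else ""
    return _label(last.lower().split())


def model_word_balance(utterances):
    """Compare counts of acceptance and refusal words across the session."""
    words = " ".join(utterances).lower().split()
    pos = sum(w in POSITIVE for w in words)
    neg = sum(w in NEGATIVE for w in words)
    if pos > neg:
        return "SUCCESS"
    if neg > pos:
        return "REFUSED"
    return "CENTRAL"


MODELS = [model_whole_session, model_last_turn, model_word_balance]


def whole_round_intent(utterances):
    """Majority vote over the models; on a tie the earliest model's label wins."""
    votes = Counter(model(utterances) for model in MODELS)
    return votes.most_common(1)[0][0]
```

Voting over the whole session instead of a single sentence lets a late change of mind override an early "yes", which is the practical reason for classifying the full round after the call ends. The code-based mapping covers only half of call status recognition: ring-tone classification, which the article also mentions, would need an audio model and is left out here.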

Application Scenarios:

Notification – informing users of changes via voice calls.

Satisfaction survey – post‑service satisfaction collection.

Information verification – confirming the authenticity of user data.

Sales – identifying and nurturing potential customers.

Alarm – notifying responsible personnel of system anomalies.

Sales‑customer service training – using the robot for preliminary training before human agents take over.

Four concrete cases are illustrated: campus recruitment efficiency, customer‑service efficiency, operation efficiency, and sales efficiency, each showing how the voice robot improves productivity and reduces manual effort.

Conclusion: Voice dialogue technology is deployed not only in sales but also in notifications, internal alerts, and other business modules. The article systematically introduces the robot's overall architecture, core capabilities, and practical deployment scenarios at 58.com.

Tags: AI, speech recognition, Dialogue Management, Telephony, intent detection, voice chatbot