Advances in Voice Interaction: 360's Intelligent Dialogue System Architecture and Core Technologies
This article presents a comprehensive overview of 360's voice interaction platform, detailing dialogue system fundamentals, platform architecture, and core technologies such as semantic understanding, dialog management, and question answering, all driven by deep learning and multimodal innovations.
With the rapid development of voice interaction technology, dialogue systems have become increasingly mature, largely driven by deep learning techniques that leverage large-scale data for feature representation and response generation, enhancing user experience.
This article shares the practical implementation of voice interaction technology at 360, covering its deployment in products such as the 360 smart speaker, children’s smartwatch, and security software.
1. Fundamentals of Dialogue Systems
A typical dialogue system pipeline consists of Automatic Speech Recognition (ASR) to convert speech to text, Natural Language Understanding (NLU) to interpret intent and slots, Dialog Manager (DM) for state tracking and policy decision, and Natural Language Generation (NLG) plus Text‑to‑Speech (TTS) for output.
Dialogue systems fall into three broad categories: task‑oriented, question‑answering (QA), and chit‑chat systems.
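The pipeline above can be sketched as a chain of stages. The function bodies here are toy stand‑ins (a real ASR model, NLU classifier, and TTS engine would replace them), and the weather intent, slot names, and replies are invented for illustration:

```python
from dataclasses import dataclass, field

@dataclass
class DialogTurn:
    text: str = ""
    intent: str = ""
    slots: dict = field(default_factory=dict)
    reply: str = ""

def asr(audio: bytes) -> str:
    # Stand-in for a real speech recognizer: pretend the audio decodes to text.
    return audio.decode("utf-8")

def nlu(text: str) -> DialogTurn:
    # Toy intent and slot extraction for a single weather intent.
    turn = DialogTurn(text=text)
    if "weather" in text:
        turn.intent = "query_weather"
        for city in ("Beijing", "Shanghai"):
            if city in text:
                turn.slots["city"] = city
    return turn

def dm(turn: DialogTurn) -> DialogTurn:
    # Policy: answer if the required slot is filled, otherwise ask for it.
    if turn.intent == "query_weather":
        if "city" in turn.slots:
            turn.reply = f"Sunny in {turn.slots['city']} today."
        else:
            turn.reply = "Which city do you mean?"
    else:
        turn.reply = "Sorry, I didn't catch that."
    return turn

def nlg_tts(turn: DialogTurn) -> str:
    # NLG is trivial here; a real system would hand the text to a TTS engine.
    return turn.reply

def pipeline(audio: bytes) -> str:
    return nlg_tts(dm(nlu(asr(audio))))
```

For example, `pipeline(b"weather in Beijing")` yields a direct answer, while a query missing the city slot triggers a clarifying question.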
2. 360 Intelligent Voice Interaction Platform
The platform adopts a modular architecture that decouples business logic from the core engine, enabling rapid skill development (≈1 week for simple skills, ≈2 weeks for complex ones) and supporting 82 built‑in skills across multiple products.
A key innovation is the multimodal access layer, which introduces “events” to handle non‑textual inputs such as camera‑based mask detection.
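One way to model this access layer is a router that dispatches both text utterances and non‑textual events to registered skill handlers. The class, skill predicates, and event type below are hypothetical sketches, not the platform's actual API:

```python
class SkillRouter:
    """Routes text queries and non-textual 'events' to registered skills."""

    def __init__(self):
        self._text_skills = []   # list of (predicate, handler) pairs
        self._event_skills = {}  # event_type -> handler

    def register_text_skill(self, predicate, handler):
        self._text_skills.append((predicate, handler))

    def register_event_skill(self, event_type, handler):
        self._event_skills[event_type] = handler

    def handle_text(self, utterance: str) -> str:
        for predicate, handler in self._text_skills:
            if predicate(utterance):
                return handler(utterance)
        return "fallback: no skill matched"

    def handle_event(self, event_type: str, payload: dict) -> str:
        handler = self._event_skills.get(event_type)
        return handler(payload) if handler else "fallback: unknown event"

router = SkillRouter()
router.register_text_skill(
    lambda u: "music" in u,
    lambda u: "playing music",
)
# A camera-based detection arrives as an event rather than as text.
router.register_event_skill(
    "mask_detected",
    lambda p: "Thank you." if p["wearing_mask"] else "Please wear your mask.",
)
```

Decoupling skills behind a registration interface like this is what allows new skills to be added without touching the core engine.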
3. Core Technologies
3.1 Semantic Understanding
Task‑type NLU extracts domain, intent, and slots; challenges include error propagation, lack of external knowledge, and OOV handling. Solutions explored include rule‑based matching, generative n‑gram models, similarity‑based retrieval, and deep models such as SF‑ID (joint slot filling and intent detection) with attention, CRF, and slot‑gate mechanisms.
External knowledge vectors improve accuracy by roughly 15%.
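Of the approaches listed, rule‑based matching is the simplest to illustrate. This sketch extracts domain, intent, and slots with regex patterns; the rules, labels, and slot names are invented for illustration and are not the production rule set:

```python
import re

# Each rule maps a regex with named slot groups to a (domain, intent) pair.
RULES = [
    (re.compile(r"set an alarm (?:for|at) (?P<time>\d{1,2}(?::\d{2})?)"),
     ("alarm", "set_alarm")),
    (re.compile(r"play (?P<song>.+) by (?P<artist>.+)"),
     ("music", "play_song")),
]

def parse(utterance: str) -> dict:
    """Return domain, intent, and slots for the first matching rule."""
    for pattern, (domain, intent) in RULES:
        m = pattern.search(utterance.lower())
        if m:
            return {"domain": domain, "intent": intent, "slots": m.groupdict()}
    # No rule fired: fall through to the unknown domain (OOV handling would
    # hand this off to the retrieval- or model-based matchers instead).
    return {"domain": "unknown", "intent": "unknown", "slots": {}}
```

Rules give high precision on scripted phrasings but brittle recall, which is exactly why the platform backs them with similarity‑based retrieval and deep joint models such as SF‑ID.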
3.2 Dialogue Management
Dialogue management is implemented with Dialogue State Tracking (DST) and Dialogue Policy (DP). Two approaches are used: frame‑based (slot filling for task‑oriented dialogues) and FSM‑based (finite‑state machines for scripted tasks). The system also maintains contextual memory across turns, enabling cross‑scenario information inheritance.
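The FSM‑based approach can be sketched as a small state machine driven by a transition table. The states, user acts, and replies below model a hypothetical scripted ordering task, not an actual 360 skill:

```python
class FSMDialogManager:
    """Minimal finite-state dialogue manager for a scripted task.

    The transition table is hardcoded here; a real system would load
    it from a script definition so skills stay data-driven.
    """

    # state -> {user_act: (next_state, system_reply)}
    TRANSITIONS = {
        "start": {
            "order_food": ("ask_dish", "What would you like to order?"),
        },
        "ask_dish": {
            "give_dish": ("confirm", "Shall I place the order?"),
        },
        "confirm": {
            "yes": ("done", "Order placed."),
            "no": ("ask_dish", "Okay, what would you like instead?"),
        },
    }

    def __init__(self):
        self.state = "start"

    def step(self, user_act: str) -> str:
        # Unknown acts keep the current state and ask the user to retry.
        next_state, reply = self.TRANSITIONS.get(self.state, {}).get(
            user_act, (self.state, "Sorry, I didn't understand.")
        )
        self.state = next_state
        return reply
```

The frame‑based approach differs in that state is a set of slot values to fill rather than a single node, but the policy step (pick the next system act from the current state) has the same shape.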
3.3 Question Answering (QA)
The QA pipeline consists of query preprocessing, coarse retrieval (keyword‑based via Elasticsearch and embedding‑based via Faiss), fine‑ranking with an LSTM‑DSSM model, and business‑logic filtering. The LSTM‑DSSM outperforms BERT in this scenario while being computationally cheaper.
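The coarse‑retrieve‑then‑fine‑rank shape of this pipeline can be sketched in miniature. Here toy bag‑of‑words vectors stand in for the Elasticsearch/Faiss indices, a cosine score stands in for the LSTM‑DSSM ranker, and the FAQ entries and threshold are invented for illustration:

```python
import math
from collections import Counter

# Hypothetical FAQ knowledge base: question -> answer.
FAQ = {
    "How do I reset the speaker?": "Hold the power button for ten seconds.",
    "How do I pair the watch with my phone?": "Open the app and scan the QR code.",
}

def bow(text: str) -> Counter:
    # Toy bag-of-words "embedding"; a real system uses learned vectors.
    return Counter(text.lower().replace("?", "").split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def answer(query: str, threshold: float = 0.3) -> str:
    q = bow(query)
    # Coarse retrieval: keep candidates sharing at least one term with the query.
    candidates = [k for k in FAQ if q & bow(k)]
    if not candidates:
        return "No answer found."
    # Fine ranking: score candidates (cosine stands in for the LSTM-DSSM).
    best = max(candidates, key=lambda k: cosine(q, bow(k)))
    # Business-logic filtering: reject low-confidence matches.
    return FAQ[best] if cosine(q, bow(best)) >= threshold else "No answer found."
```

The two‑stage design matters for cost: cheap retrieval prunes the candidate set so the expensive ranking model only scores a handful of pairs per query.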
Conclusion
The article introduced the basics and workflow of voice interaction systems, described the 360 intelligent voice platform architecture, and detailed core technologies including SF‑ID semantic understanding, dialog management strategies, and QA retrieval and ranking methods.
DataFunTalk
Dedicated to sharing and discussing big data and AI technology applications, aiming to empower a million data scientists. Regularly hosts live tech talks and curates articles on big data, recommendation/search algorithms, advertising algorithms, NLP, intelligent risk control, autonomous driving, and machine learning/deep learning.