Complex Semantic Representation in Voice Assistants: NLP Layers, DIS Limitations, and the CMRL Schema
This article explains how voice assistants rely on a three‑layer NLP pipeline (lexical, syntactic, and semantic analysis), discusses the shortcomings of the traditional DIS (Domain‑Intent‑Slot) structure for complex commands, and introduces the hierarchical CMRL schema along with two neural models (copy‑and‑write seq2seq and seq2tree) for converting natural language into structured logical expressions.
The talk, presented by Alibaba algorithm expert Wang Chenglong, focuses on handling complex semantic expressions in voice assistants, where natural language processing (NLP) is essential. Although NLP has matured, understanding intricate text still poses challenges.
NLP Three Layers: Voice assistants process user input through lexical analysis (tokenization, POS tagging, NER), syntactic analysis (phrase‑structure and dependency parsing), and finally semantic analysis, which aims to capture the relationships among linguistic components.
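The three layers can be sketched as successive transformations of the same command. The tags, entity labels, and rules below are toy stand‑ins chosen for illustration, not the talk's actual models:

```python
# Toy walk-through of the three NLP layers for "turn on the living-room AC".
# POS tags, NER labels, and the parse rule are illustrative assumptions.

def lexical_analysis(text):
    """Layer 1: tokenize, then attach toy POS tags and entity labels."""
    pos = {"turn": "VERB", "on": "PART", "the": "DET",
           "living-room": "NOUN", "AC": "NOUN"}
    ner = {"living-room": "LOCATION", "AC": "DEVICE"}
    return [(tok, pos.get(tok, "X"), ner.get(tok, "O"))
            for tok in text.split()]

def syntactic_analysis(tagged):
    """Layer 2: a toy dependency parse — every token depends on the verb."""
    head = next(i for i, (_, p, _) in enumerate(tagged) if p == "VERB")
    return [(i, -1 if i == head else head) for i in range(len(tagged))]

def semantic_analysis(tagged):
    """Layer 3: collapse the analysis into a predicate plus arguments."""
    predicate = [t for t, p, _ in tagged if p in ("VERB", "PART")]
    arguments = {label: t for t, _, label in tagged if label != "O"}
    return {"predicate": " ".join(predicate), "arguments": arguments}

tagged = lexical_analysis("turn on the living-room AC")
frame = semantic_analysis(tagged)
print(frame)
# e.g. {'predicate': 'turn on',
#       'arguments': {'LOCATION': 'living-room', 'DEVICE': 'AC'}}
```

The point of the layering is that each stage consumes the previous one's output: the semantic layer never sees raw text, only tagged tokens.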
Shallow Semantic Analysis: This stage identifies predicates and their arguments, typically using Semantic Role Labeling (SRL), without constructing a full logical representation.
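A hand‑written sketch of what an SRL output might look like for a thermostat command. The role labels follow the common PropBank‑style convention (A1, AM‑LOC, A4), but the frame itself is constructed by hand for illustration, not produced by a labeler:

```python
# Hypothetical SRL frame for "turn the AC in the bedroom to 26 degrees".
# Note there is no logical form here — just predicate and role-labeled
# argument spans, which is why this stage is called "shallow".
srl_frame = {
    "predicate": "turn",
    "A1": "the AC",                # the thing acted upon
    "AM-LOC": "in the bedroom",    # location modifier
    "A4": "to 26 degrees",         # end state of the action
}
print(srl_frame["predicate"], "->", srl_frame["A4"])
```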
DIS (Domain‑Intent‑Slot) Structure: The widely used DIS model represents a command as a triple of domain, intent, and slots (entities). While simple commands (e.g., “turn on the living‑room AC”) fit this schema, the article lists six major limitations: domain ambiguity, inability to handle cross‑domain commands, lack of multi‑entity relational representation, inability to express intent relationships, difficulty representing implicit semantics, and inability to capture fuzzy meanings.
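A minimal sketch of the DIS triple makes the cross‑domain limitation concrete. The field names and domain/intent strings are illustrative assumptions:

```python
from dataclasses import dataclass, field

# Minimal DIS triple: one domain, one intent, flat slots.
@dataclass
class DIS:
    domain: str
    intent: str
    slots: dict = field(default_factory=dict)

# A simple command fits cleanly into a single triple:
simple = DIS(domain="aircon", intent="turn_on",
             slots={"location": "living room"})

# But a cross-domain command such as
#   "turn off the lights and play some music"
# needs two domains and two intents at once. One DIS triple has no
# place for the second domain, the second intent, or the relation
# between them — the cross-domain and intent-relationship limitations
# listed above.
broken_attempt = DIS(domain="lights???", intent="turn_off???",
                     slots={"also": "play music"})  # no principled encoding
print(simple)
```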
To overcome these issues, the authors propose a new hierarchical schema called CMRL (Context‑aware Meaning Representation Language). CMRL defines six element types: Intent, Thing (object), Enum, Operator, Property, and Joiner. Each element can be nested, allowing complex logical expressions that capture multi‑intent, multi‑entity, and implicit semantic information.
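The nesting is what distinguishes CMRL from the flat DIS triple. The six element names come from the schema as described above; the concrete JSON‑style shape below is an assumption made for illustration:

```python
# Hypothetical nested CMRL-style expression for
# "if the temperature is above 30, turn on the AC".
# Element names (Intent, Thing, Enum, Operator, Property, Joiner)
# are from the schema; the exact structure is an illustrative guess.
cmrl = {
    "Joiner": "if-then",          # relates two sub-expressions
    "condition": {
        "Operator": ">",          # relational operator from the schema
        "left": {"Thing": "sensor", "Property": "temperature"},
        "right": {"Enum": "30"},
    },
    "action": {
        "Intent": "turn_on",      # intent reusable across domains
        "Thing": "AC",
    },
}
print(cmrl["Joiner"], cmrl["condition"]["Operator"])
```

Because every element can contain other elements, a single expression can hold multiple intents, relate multiple entities, and make implicit conditions explicit, which the flat DIS triple cannot.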
Advantages of CMRL include intent reuse across domains, support for cross‑domain commands, expressive multi‑entity relationships, ordering of intents, representation of implicit and ambiguous meanings, and richer relational operators (>, <, ∈, ∉, etc.).
Semantic Parsing Algorithms: Converting natural language into CMRL expressions is treated as a translation problem. Two models are presented: (1) a copy‑and‑write seq2seq model that restricts the decoder vocabulary to schema keywords and copies tokens from the input, dramatically reducing the search space; (2) a seq2tree model that generates a hierarchical tree structure, guaranteeing syntactic correctness of the output logical form.
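The vocabulary restriction behind the copy‑and‑write idea can be shown in a few lines: at each decoding step the model either "writes" a schema keyword or "copies" a token from the input, so the effective per‑step vocabulary is tiny. The keyword list and the full‑vocabulary size below are illustrative assumptions, not figures from the talk:

```python
# Sketch of the copy-and-write vocabulary restriction.
# Keywords and the 50k full-vocab size are illustrative assumptions.

SCHEMA_KEYWORDS = ["Intent", "Thing", "Enum", "Operator",
                   "Property", "Joiner", "(", ")"]

def decoder_vocab(input_tokens, full_vocab_size=50_000):
    """Per-example decoder vocabulary: schema keywords ("write")
    plus the unique input tokens ("copy")."""
    restricted = SCHEMA_KEYWORDS + list(dict.fromkeys(input_tokens))
    reduction = full_vocab_size / len(restricted)
    return restricted, reduction

vocab, factor = decoder_vocab("turn on the living room AC".split())
print(f"{len(vocab)} choices per step, ~{factor:.0f}x smaller search space")
```

The seq2tree model attacks the other half of the problem: instead of shrinking the choice set, it emits the logical form top‑down as a tree, so every output is structurally well‑formed by construction.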
By combining these models, the system can accurately parse complex voice‑assistant commands into CMRL, enabling more robust understanding and execution of user intents.
The presentation concludes with acknowledgments and community information.
DataFunSummit