Tagged articles
3 articles
Weekly Large Model Application
May 1, 2026 · Artificial Intelligence

How Speech Models Turn Waveforms into Computable Tokens

The article explains why speech tokenization is essential for large audio models, outlines three core challenges, and compares five major tokenization paradigms: neural codecs with vector quantization, self-supervised learning with clustering, continuous embeddings, ASR-derived text tokens, and hierarchical multi-codebook tokens. It closes with practical guidance for choosing an approach based on task requirements and trade-offs.

audio codec · hierarchical tokens · self-supervised learning
0 likes · 11 min read
58 Tech
May 28, 2019 · Artificial Intelligence

Implementation of Voice Call Functionality in an Intelligent Voice Robot

This article details the architecture and implementation of the voice call module of an intelligent voice robot, covering SIP signaling setup, RTP session handling, audio encoding and decoding, sampling, and packetization, which together enable automated outbound calls and multi-turn voice interactions.

AI · SIP · Telephony
0 likes · 9 min read