Tag

speech processing

Tencent Tech
Jan 20, 2023 · Artificial Intelligence

Why Tencent’s Yu Dong Became a 2022 ACM Fellow: Pioneering Deep Learning in Speech

Yu Dong, deputy director of Tencent AI Lab, was named a 2022 ACM Fellow for his work in speech processing and deep‑learning applications. He holds over 100 patents and has won multiple best‑paper awards, and his technologies are now embedded in many of Tencent’s products.

ACM Fellow · Artificial Intelligence · Deep Learning
DataFunSummit
Oct 20, 2022 · Artificial Intelligence

End-to-End Speech Relation Extraction

This paper presents an end‑to‑end approach for extracting relational triples directly from speech signals, bypassing intermediate transcription, and demonstrates its effectiveness on synthesized speech versions of the CoNLL04 and TACRED datasets, highlighting challenges such as length constraints and cross‑modal alignment.

Multimodal · Natural Language Processing · Relation Extraction
58 Tech
Nov 16, 2020 · Artificial Intelligence

Iterative Optimization of Voice Endpoint Detection for Voice Robots: From Dual‑Threshold to WebRTC VAD and VADNet

This article details the evolution of the voice activity detection (VAD) module, which performs voice endpoint detection in 58.com’s voice robot. It compares a dual‑threshold method, Google’s WebRTC VAD, and the deep‑learning‑based VADNet, and presents experimental results on accuracy, recall, F1 score, and online latency.
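
As a rough illustration of the dual‑threshold idea mentioned in this summary, here is a minimal sketch, not the article's implementation. It assumes the common energy‑based variant: a segment opens when frame energy rises above a low threshold and is kept only if it also crosses a high threshold before dropping back below the low one. The function names and threshold values are hypothetical.

```python
from typing import List, Sequence, Tuple


def frame_energies(samples: Sequence[float], frame_len: int) -> List[float]:
    """Mean squared amplitude of each non-overlapping frame."""
    return [
        sum(x * x for x in samples[i:i + frame_len]) / frame_len
        for i in range(0, len(samples) - frame_len + 1, frame_len)
    ]


def dual_threshold_segments(
    energies: Sequence[float], low: float, high: float
) -> List[Tuple[int, int]]:
    """Return (start_frame, end_frame_exclusive) pairs for detected speech.

    Hypothetical helper: a candidate segment starts at the first frame with
    energy >= low and is confirmed once any frame in it reaches energy >= high.
    """
    segments: List[Tuple[int, int]] = []
    start = None
    confirmed = False
    for i, energy in enumerate(energies):
        if start is None:
            if energy >= low:
                start = i
                confirmed = energy >= high
        else:
            if energy >= high:
                confirmed = True
            if energy < low:
                # Segment ends here; keep it only if it was confirmed.
                if confirmed:
                    segments.append((start, i))
                start = None
                confirmed = False
    if start is not None and confirmed:
        segments.append((start, len(energies)))
    return segments


energies = [0.0, 0.2, 0.9, 0.3, 0.0, 0.2, 0.0, 0.8]
segments = dual_threshold_segments(energies, low=0.1, high=0.5)
```

With these sample energies, the weak burst at frame 5 never reaches the high threshold and is discarded, which is the noise rejection the second threshold is meant to provide.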

CRNN · Deep Learning · Real-time Communication
58 Tech
Jul 31, 2020 · Artificial Intelligence

Intelligent Voice Quality Inspection System: Architecture, Core Technologies, and Business Cases

This article presents 58.com’s intelligent voice quality inspection system, detailing its overall architecture, speech separation, speaker role identification, NLP‑based tagging, model choices such as VGG, BERT, ALBERT and SPTM, and real‑world call‑center use cases that improve efficiency and reduce risk.

AI · NLP · call center
58 Tech
May 28, 2019 · Artificial Intelligence

Implementation of Voice Call Functionality in an Intelligent Voice Robot

This article details the architecture and implementation of the voice call module of an intelligent voice robot, covering SIP signaling establishment, RTP session handling, audio encoding/decoding, sampling, and packetization to enable automated outbound calls and multi‑round voice interactions.
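
To make the packetization step in this summary concrete, here is a minimal sketch of splitting encoded audio into RTP packets with the 12‑byte fixed header from RFC 3550. It is not the article's code. It assumes G.711 PCMU audio (payload type 0, 8 kHz, one byte per sample) sent in 20 ms packets of 160 samples; the function names and the SSRC value are hypothetical.

```python
import struct
from typing import List


def build_rtp_packet(payload: bytes, seq: int, timestamp: int, ssrc: int,
                     payload_type: int = 0, marker: bool = False) -> bytes:
    """Prepend a 12-byte RTP fixed header (RFC 3550) to the payload."""
    byte0 = 2 << 6  # version 2, no padding, no extension, zero CSRCs
    byte1 = (int(marker) << 7) | (payload_type & 0x7F)
    header = struct.pack("!BBHII", byte0, byte1,
                         seq & 0xFFFF, timestamp & 0xFFFFFFFF, ssrc & 0xFFFFFFFF)
    return header + payload


def packetize(audio: bytes, ssrc: int, samples_per_packet: int = 160,
              first_seq: int = 0, first_ts: int = 0) -> List[bytes]:
    """Split one-byte-per-sample audio into RTP packets.

    Assumes G.711, so the byte offset equals the sample offset and can be
    used directly as the timestamp increment.
    """
    packets = []
    for n, offset in enumerate(range(0, len(audio), samples_per_packet)):
        chunk = audio[offset:offset + samples_per_packet]
        packets.append(build_rtp_packet(chunk, first_seq + n, first_ts + offset,
                                        ssrc, marker=(n == 0)))
    return packets


# 400 bytes of placeholder audio yield two full packets and one short one.
packets = packetize(bytes(400), ssrc=0x1234)
```

A real sender would normally start the sequence number and timestamp at random values and pace packets at 20 ms intervals; both are omitted here for brevity.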

AI · SIP · Voice Bot