Tagged articles
10 articles
Weekly Large Model Application
May 5, 2026 · Artificial Intelligence

How Audio Waveforms Are Turned Into Model‑Readable Tokens

The article explains why raw audio cannot be fed directly to language models, outlines the two essential compression steps, compares three common tokenization approaches—neural codecs, self‑supervised clustering, and continuous vectors—and warns of typical pitfalls for newcomers.

audio tokenization, large language models, neural codecs
6 min read
AI Frontier Lectures
Mar 30, 2025 · Artificial Intelligence

Do Large Language Models Mirror Human Brain Language Processing? Google’s Groundbreaking Findings

Google researchers discovered a linear relationship between brain activity recorded during natural conversation and the internal embeddings of a speech‑to‑text large language model, revealing that acoustic and lexical representations from the model can accurately predict neural responses in both language comprehension and production.

AI research, Google, brain imaging
8 min read
DataFunSummit
Oct 20, 2022 · Artificial Intelligence

End-to-End Speech Relation Extraction

This paper presents an end‑to‑end approach for extracting relational triples directly from speech signals, bypassing intermediate transcription, and demonstrates its effectiveness on synthesized speech versions of the CoNLL04 and TACRED datasets, highlighting challenges such as length constraints and cross‑modal alignment.

End-to-End, multimodal, natural language processing
17 min read
NetEase Smart Enterprise Tech+
Feb 23, 2021 · Artificial Intelligence

How Deep Learning Detects Pornographic and ASMR Audio

This article explains a deep‑learning pipeline that preprocesses audio, extracts FBank features, applies SpecAugment, and uses a CNN‑BI‑LSTM‑Attention model to automatically identify pornographic and ASMR speech for content moderation.

ASMR detection, Audio Classification, SpecAugment
8 min read
58 Tech
Nov 16, 2020 · Artificial Intelligence

Iterative Optimization of Voice Endpoint Detection for Voice Robots: From Dual‑Threshold to WebRTC VAD and VADNet

This article details the evolution of the voice endpoint detection (VAD) module in 58.com’s voice robot, comparing a dual‑threshold method, Google’s WebRTC VAD, and the deep‑learning based VADNet, and presents experimental results on accuracy, recall, F1 score and online latency.

VAD, Voice Activity Detection, real-time communication
14 min read
JD Cloud Developers
Oct 27, 2020 · Artificial Intelligence

How JD AI’s Four Interspeech 2020 Papers Advance Speech Processing

JD AI Research Institute presented four accepted Interspeech 2020 papers—covering sound event localization, speech dereverberation, speaker verification, and an efficient WaveGlow vocoder—demonstrating significant advances in audio AI despite the conference’s shift to an online format due to COVID‑19.

Audio AI, neural vocoder, sound event detection
8 min read
58 Tech
May 28, 2019 · Artificial Intelligence

Implementation of Voice Call Functionality in an Intelligent Voice Robot

This article details the architecture and implementation of the voice call module of an intelligent voice robot, covering SIP signaling establishment, RTP session handling, audio encoding/decoding, sampling, and packetization to enable automated outbound calls and multi‑round voice interactions.

AI, SIP, Telephony
9 min read