Tag

CTC

1 view collected around this technical thread.

TAL Education Technology
Feb 28, 2020 · Artificial Intelligence

TPNN Multi‑GPU Training and Mobile Optimization for Children's Acoustic Speech Recognition Models

This article describes the TPNN deep‑learning platform’s multi‑GPU acceleration, data‑parallel BMUF training, LSTM‑CTC acoustic modeling, and a suite of mobile‑side optimizations—including model pruning, 8‑bit quantization, low‑precision matrix multiplication and mixed‑precision computation—that together achieve over 92% recognition accuracy for children’s English speech on both server and mobile devices.

BMUF · CTC · Deep Learning
15 min read
DataFunTalk
Feb 3, 2020 · Artificial Intelligence

Advances in Speech Recognition: Concepts, Deep Learning Methods, and Didi’s Applications

This article presents a comprehensive overview of modern speech recognition technology, covering basic ASR concepts, classic acoustic and language models, deep‑learning approaches such as DNN‑HMM, CTC, attention‑based and transformer models, multimodal fusion, signal‑processing pipelines, and practical deployment considerations at Didi.

ASR · Attention · CTC
15 min read
Hulu Beijing
Apr 22, 2019 · Artificial Intelligence

How Has Speech Recognition Evolved from Traditional Methods to Modern Deep Learning?

This article reviews the fundamentals of automatic speech recognition, compares traditional MFCC‑GMM‑HMM pipelines with modern deep neural network approaches such as DNN‑HMM, LSTM‑CTC, and attention‑based models, and illustrates each evolution step with flowchart diagrams and key references.

ASR · CTC · DNN
11 min read
Liulishuo Tech Team
Oct 28, 2016 · Artificial Intelligence

Open‑sourcing kaldi‑ctc: Fast GPU‑Accelerated CTC End‑to‑End Speech Recognition

The article announces the open‑source release of kaldi‑ctc, a GPU‑accelerated CTC‑based end‑to‑end speech recognition toolkit built on Kaldi, warp‑ctc and cuDNN, highlighting its 5‑6× training speedup, a decoding real‑time factor of 0.02, and performance comparisons on the LibriSpeech corpus.

ASR · CTC · Deep Learning
4 min read