Tagged articles
121 articles
Tencent Cloud Developer
Feb 26, 2019 · Artificial Intelligence

Tencent Cloud Intelligent Speech Technology: Development, Challenges and Practical Applications

Tencent Cloud's intelligent speech platform combines high‑accuracy ASR, advanced WaveNet‑based TTS, and solutions for noise, far‑field, and dialect challenges, enabling voice input, transcription, and customer‑service bots, with real‑world deployments in finance, museums, hotels, and other industry scenarios.

ASR · Human-Computer Interaction · Speech synthesis
0 likes · 8 min read
Ctrip Technology
Feb 21, 2019 · Artificial Intelligence

Speech Recognition and Synthesis: Principles, Challenges, Optimizations, and Tencent Cloud Use Cases

This article reviews the development roadmap, current industry status, challenges, typical deployment scenarios, and optimization methods for speech recognition (ASR) and speech synthesis (TTS), and shares several Tencent Cloud intelligent voice case studies to illustrate practical applications.

AI · Speech synthesis · cloud computing
0 likes · 9 min read
Alibaba Cloud Developer
Feb 12, 2019 · Artificial Intelligence

Essential AI Research Highlights to Jump‑Start Your Post‑Holiday Learning

After the Chinese New Year break, this curated collection of key AI articles—spanning computer vision, speech recognition, natural language processing, recommendation systems, and more—helps technical readers quickly regain momentum in work and study by revisiting core technologies with real‑world case studies.

AI · Computer Vision · speech recognition
0 likes · 6 min read
MaGe Linux Operations
Feb 1, 2019 · Artificial Intelligence

Master Python Speech Recognition: Install, Process Audio Files, and Capture Live Voice

This guide walks you through the fundamentals of speech recognition, explains how modern systems work, shows how to choose and install the Python SpeechRecognition package, and demonstrates processing audio files, handling noise, using offsets, and capturing live microphone input with practical code examples.

audio-processing · machine-learning · microphone
0 likes · 16 min read
JD Tech
Jan 16, 2019 · Artificial Intelligence

Technical Deep Dive of JD’s Intelligent Customer Service 2.0: AI‑Driven Intent Recognition, Emotion Analysis, and Smart Scheduling

This article presents a comprehensive technical analysis of JD’s Intelligent Customer Service 2.0, detailing AI‑based intent recognition with the ABSQ framework, hierarchical attention networks, emotion analysis via CNN, speech navigation using ASR/NLP, and machine‑learning‑driven smart dispatch that together boost accuracy and user experience.

AI · customer-service · emotion analysis
0 likes · 10 min read
Tencent Cloud Developer
Dec 27, 2018 · Artificial Intelligence

Overview of Speech and Semantic Recognition Technologies Presented at the Tencent Cloud+ Community Developer Conference

At the inaugural Tencent Cloud+ Community Developer Conference, experts detailed the evolution of speech and semantic recognition—from early MFCC/HMM‑GMM models to modern end‑to‑end deep‑learning architectures—and showcased WeChat Zhiling’s full‑stack platform, multilingual models, high‑accuracy cloud services, translation solutions, legal applications, and integration into smart devices.

AI · Tencent Cloud · natural language processing
0 likes · 9 min read
Tencent Cloud Developer
Oct 10, 2018 · Artificial Intelligence

What Are the Real Challenges and Future Trends in Intelligent Voice Technology?

This article examines the current landscape of intelligent voice technology—including speech recognition, synthesis, voiceprint identification, and acoustic event detection—highlighting technical hurdles, evaluation metrics, recent advances such as WaveNet, and a wide range of practical applications from mobile devices to smart hardware and enterprise solutions.

Audio Processing · Speech synthesis · Tencent Cloud
0 likes · 16 min read
Tencent Cloud Developer
Sep 30, 2018 · Artificial Intelligence

Smart Speaker Voice Interaction Technology: Recent Advances and Tencent's Research Progress

The article surveys Tencent’s recent advances in smart‑speaker voice interaction, detailing a full technology chain—from front‑end capture, wake‑up and enhancement, through speaker verification and short‑speech voiceprint, to TDNN/LSTM speech recognition, target speaker extraction, and end‑to‑end attention modeling for robust, personalized performance.

Attention Mechanism · TTS · microphone array
0 likes · 18 min read
Tencent Cloud Developer
Sep 26, 2018 · Artificial Intelligence

Breakthroughs in AI: Deep Learning Applications in Speech Recognition

The talk reviews how massive speech data, faster GPUs/CPUs, and deep‑learning models such as DNN, LSTM, CNN, and end‑to‑end CTC have dramatically boosted speech‑recognition accuracy, while outlining remaining challenges like noise, accents, far‑field and multi‑speaker scenarios and describing Tencent Cloud’s related services.

AI · Neural Networks · acoustic modeling
0 likes · 16 min read
iQIYI Technical Product Team
Sep 14, 2018 · Artificial Intelligence

Limitations of Language Models in Voice Interaction and HomeAI Solutions

iQIYI HomeAI tackles the bottleneck of static language models in voice assistants by separating phonetic and semantic processing, correcting ASR errors at the intent‑recognition layer with pinyin‑enhanced entity correction, thereby reducing error amplification in video‑on‑demand interactions and paving the way for adaptive, personalized voice experiences.

AI · Language Model · intent recognition
0 likes · 7 min read
Alibaba Cloud Developer
Jul 18, 2018 · Artificial Intelligence

Inside Alibaba’s Postdoc Labs: Real‑World AI Research and Innovation

Alibaba’s post‑doctoral program connects PhDs with massive industry data and real‑world projects, showcasing how researchers like Xue Shaofei and Pei Changhua develop cutting‑edge speech‑recognition, scheduling and recommendation technologies that directly impact millions of users.

AI · Industry-Academia Collaboration · Postdoctoral Research
0 likes · 9 min read
Didi Tech
Jun 1, 2018 · Artificial Intelligence

Didi's Attention-Based End-to-End Mandarin Speech Recognition: A Detailed Review

Didi’s attention‑based end‑to‑end Mandarin speech recognizer, built on the Listen‑Attend‑Spell architecture and modeling roughly 5,000 common characters, delivers 15‑25% relative accuracy gains over its prior LSTM‑CTC system while cutting model size, latency and server requirements and simplifying training by eliminating separate acoustic, pronunciation and language components.

End-to-End · LAS · Mandarin
0 likes · 14 min read
High Availability Architecture
May 28, 2018 · Artificial Intelligence

Interview with GIAC AI Forum Lecturer Long Mingkang on Building AI Platforms, Speech Recognition Challenges, and Future AI Trends

In this interview, Long Mingkang, Vice President of iFlytek's Cloud Computing Institute, shares his experience building large‑scale speech cloud services, discusses the technical hurdles of speech recognition and AI platform development, compares TensorFlow and MXNet, and offers insights on AutoML, industry trends, and how engineers can master AI.

AI · AI Platforms · AutoML
0 likes · 13 min read
Liulishuo Tech Team
Sep 3, 2017 · Artificial Intelligence

Report on Interspeech 2017 and SLaTE 2017: Highlights in Speech Recognition, Synthesis, and Speaker Technologies

The article reports on Liulishuo’s participation in Interspeech 2017 and the SLaTE 2017 workshop, summarizes key research papers on noise‑robust ASR, attention‑based models, TensorFlow training, modern TTS systems, and speaker identification datasets, and includes a hiring announcement for AI engineers.

AI · Interspeech · Speech synthesis
0 likes · 7 min read
Alibaba Cloud Developer
Mar 17, 2017 · Artificial Intelligence

How Improved Latency‑Controlled BLSTM Models Boost Online Speech Recognition Efficiency

This article explains how latency‑controlled BLSTM acoustic models were refined to accelerate online speech recognition while preserving accuracy, detailing the training strategy, computational trade‑offs, and two model enhancements that achieve up to 60% faster decoding with modest resource savings.

Deep Learning · LC-BLSTM · acoustic modeling
0 likes · 6 min read
Alibaba Cloud Developer
Oct 11, 2016 · Artificial Intelligence

What Were the Key Speech AI Breakthroughs at Interspeech 2016?

The Interspeech 2016 conference in San Francisco showcased major advances in speech recognition, synthesis, far‑field processing, and language modeling, highlighting CTC extensions, deep CNN innovations, WaveNet’s generative audio, and new techniques for multi‑microphone acoustic modeling.

CTC · Deep Learning · Interspeech 2016
0 likes · 7 min read
Ctrip Technology
Aug 12, 2016 · Mobile Development

Design and Development of a Siri‑Like Voice‑Controlled Music iOS App

This article walks through the design and implementation of a voice‑controlled music iOS application using Siri SDK, Sketch and Principle for UI prototyping, and Xcode with Objective‑C and SpeechKit for speech recognition, culminating in a functional prototype that searches iTunes and plays song previews.

Mobile Development · Objective‑C · Siri SDK
0 likes · 8 min read
21CTO
Dec 9, 2015 · Artificial Intelligence

iFLY Mobile Speech Platform: Enabling Voice Recognition and Synthesis

iFLY’s Mobile Speech Platform (MSP) integrates cloud‑based speech recognition and text‑to‑speech technologies to deliver high‑quality, multi‑channel voice services for Android, iOS and other devices. The article details its four‑layer architecture, core functionalities, and the role of ASR and TTS in modern human‑machine interaction.

Mobile Development · artificial intelligence · cloud architecture
0 likes · 5 min read