Tagged articles
52 articles
JavaGuide
May 11, 2026 · Artificial Intelligence

Running Code Review and Voice Agents with Step Plan and Claude Code

The article walks through using Step Plan’s unified API to integrate Claude Code for automated code review and to build a voice‑agent pipeline that transcribes meeting recordings, generates structured summaries, and produces audio briefs, while discussing setup, costs, model selection, practical demos, and observed limitations.

AI Agent, ASR, Claude Code
0 likes · 24 min read
Weekly Large Model Application
Mar 30, 2026 · Artificial Intelligence

Inside Kimi-Audio: A Unified Large Audio Model Covering ASR, AQA, TTS and More

Kimi-Audio, a general‑purpose audio foundation model from Moonshot AI, integrates ASR, audio QA, automatic audio captioning, emotion classification and end‑to‑end speech dialogue within a single framework. The article details its mixed‑audio input, MiMo‑Transformer core, efficient synthesis pipeline, architectural strengths, limitations, and suitable application scenarios.

ASR, Audio LLM, BigVGAN
0 likes · 9 min read
Weekly Large Model Application
Mar 17, 2026 · Artificial Intelligence

Essential Features Every Voice Interaction System Must Support

The article provides a comprehensive analysis of core voice interaction system capabilities—including barge‑in, turn‑taking, multi‑turn dialogue, intent recognition, speaker identification, streaming latency, noise robustness, multilingual support, emotion handling, personalization, security, and deployment considerations—highlighting typical scenarios such as smart speakers, in‑car assistants, call centers, and meeting transcription.

ASR, Latency, TTS
0 likes · 11 min read
Weekly Large Model Application
Mar 13, 2026 · Artificial Intelligence

Speech Large Models: Why End-to-End Architecture Beats Traditional ASR‑LLM‑TTS Pipelines

The article defines true speech large models as native end‑to‑end systems that directly map audio to audio, compares them with traditional cascade ASR‑LLM‑TTS pipelines across architecture, error control, latency, paralinguistic perception, long‑context handling and deployment, and surveys the leading open‑source and commercial speech LLMs released in March 2026 with a quick selection guide.

AI, ASR, End-to-End
0 likes · 11 min read
Old Zhang's AI Learning
Mar 7, 2026 · Artificial Intelligence

vLLM 0.17.0 Release: Full Qwen 3.5 Support and Anthropic API Compatibility

The vLLM 0.17.0 release brings FlashAttention 4 integration, a mature Model Runner V2, complete Qwen 3.5 series support, a one‑click performance‑mode flag, Anthropic API compatibility, advanced weight‑offloading, broader hardware support beyond NVIDIA, ASR model integration, and detailed upgrade and installation guidance.

ASR, Anthropic API, FlashAttention
0 likes · 12 min read
Weekly Large Model Application
Mar 4, 2026 · Artificial Intelligence

Qwen3‑ASR vs FunASR: In‑Depth Technical Comparison

This article provides a detailed side‑by‑side analysis of the open‑source ASR tools FunASR and Qwen3‑ASR, covering team origins, model architectures, language coverage, speed, deployment requirements, and ideal use‑cases so readers can decide which solution fits their projects best.

ASR, FunASR, Paraformer
0 likes · 10 min read
Xiaohongshu Tech REDtech
Mar 4, 2026 · Mobile Development

How Xiaohongshu Delivered Billion‑User Voice & Fireworks Effects with Adaptive Rendering

During the 2026 Chinese New Year, Xiaohongshu built a real‑time dynamic interaction system that combined adaptive scheduling, high‑performance particle rendering, and industrial‑grade ASR to deliver synchronized voice greetings and emoji fireworks to over a billion daily active users across heterogeneous mobile devices.

ASR, adaptive scheduling, cross-platform
0 likes · 13 min read
Weekly Large Model Application
Feb 22, 2026 · Artificial Intelligence

2026 Guide to Running Open‑Source ASR on Pure CPU

The 2026 overview details lightweight, heavily quantized open‑source speech‑recognition models and CPU‑specific inference engines, offering concrete tips, model comparisons, and a concise selection guide that enable real‑time, GPU‑free ASR deployment with low latency and high stability.

ASR, CPU inference, Model Selection
0 likes · 4 min read
Old Zhang's AI Learning
Feb 1, 2026 · Artificial Intelligence

Microsoft VibeVoice‑ASR Open‑Source: One‑Shot 60‑Minute Transcription with Speaker ID and Timestamps

Microsoft’s newly open‑sourced VibeVoice‑ASR model can transcribe up to 60‑minute audio in a single pass, preserving global context while providing built‑in speaker diarization and timestamps, supports 50+ languages, offers custom hot‑word injection, and can be deployed via Docker, Gradio, or vLLM for high‑throughput API services.

ASR, Docker, LoRA
0 likes · 9 min read
Zhuanzhuan Tech
Dec 24, 2025 · Artificial Intelligence

Building an ASR+LLM+Vector Knowledge Base for Precise Video Ad Category Detection

This article presents a layered ASR‑LLM‑vector‑knowledge‑base pipeline that cleans speech transcripts, semantically repairs text, performs hierarchical exact and fuzzy matching, and iteratively refines mappings to accurately identify product categories in video advertisements, while detailing module functions, technical choices, and LLM parameter tuning.

ASR, Knowledge Base, LLM
0 likes · 11 min read
Huolala Tech
Sep 10, 2025 · Artificial Intelligence

How AI Voice Humanization Cuts Call‑Center Costs: ASR, Smart Interrupt & TTS Deep Dive

This article examines how AI‑driven voice humanization—covering advanced ASR, intelligent interruption, and expressive TTS—addresses high labor costs, efficiency bottlenecks, and inconsistent service quality in inbound and outbound call‑center operations, presenting technical evaluations, optimization strategies, and future research directions.

AI voice, ASR, Humanization
0 likes · 13 min read
Alibaba Cloud Big Data AI Platform
Jul 8, 2025 · Artificial Intelligence

How Video Retrieval‑Augmented Generation Transforms Multimodal AI Search

This article explains the end‑to‑end implementation of Video RAG in OpenSearch LLM, covering offline parsing, key‑frame extraction, audio transcription, slice creation, multimodal vectorization, hybrid indexing, and online query processing while addressing challenges like recall performance and long‑video efficiency.

ASR, Key Frame Extraction, LLM
0 likes · 10 min read
Efficient Ops
Oct 28, 2024 · Artificial Intelligence

How AI Powers Real-Time Business Hot‑Word Monitoring in Remote Banking

ICBC's remote‑banking hotline system uses AI, speech recognition and Python keyword extraction to rank inbound business volumes and surface hot‑word trends, delivering early alerts that help prevent risks, resolve customer issues, and support data‑driven decision making across millions of daily transactions.

AI, ASR, Hot Word Detection
0 likes · 4 min read
DataFunTalk
Jun 3, 2024 · Artificial Intelligence

Deploying Speech AI Services Quickly with NVIDIA Riva

This article explains how to use NVIDIA Riva to rapidly deploy speech AI services, covering Riva's overview, Chinese ASR model updates, TTS capabilities, customization options, the Quickstart tool, and a Q&A session that clarifies deployment, model fine‑tuning, and integration with NeMo and Triton.

ASR, GPU Acceleration, NVIDIA Riva
0 likes · 13 min read
DataFunTalk
Feb 13, 2024 · Artificial Intelligence

An Overview of NVIDIA NeMo: Open‑Source Framework for Speech AI, ASR, TTS, NLP and Large Language Model Training

This article introduces NVIDIA’s open‑source NeMo framework, detailing its PyTorch‑based architecture for Speech AI, ASR and TTS training, NLP and LLM support, GPU‑optimized parallelism, pre‑trained model resources, fine‑tuning techniques, and the accompanying NeMo Aligner and Framework tools.

ASR, NVIDIA NeMo, PyTorch
0 likes · 18 min read
DataFunTalk
Jan 26, 2024 · Artificial Intelligence

Efficient Deployment of Speech AI Models on GPUs

This article explains how to efficiently deploy speech AI models—including ASR and TTS—on GPUs using NVIDIA's Triton Inference Server and TensorRT, covering background challenges, GPU‑based solutions, decoding optimizations, Whisper acceleration with TensorRT‑LLM, streaming TTS improvements, voice‑cloning pipelines, future plans, and a Q&A session.

ASR, GPU, Inference
0 likes · 20 min read
Ctrip Technology
Dec 21, 2023 · Backend Development

Load Balancing ASR Services in Ctrip Call Center: Architecture and Implementation with FreeSWITCH and OpenSIPS

This article details the design, evolution, and best‑practice implementation of load‑balancing for ASR (speech‑recognition) services in Ctrip's massive call‑center, covering component architecture, MRCP integration, challenges with traditional balancers, and two practical solutions using FreeSWITCH distributor and OpenSIPS.

ASR, FreeSWITCH, MRCP
0 likes · 27 min read
Ximalaya Technology Team
Dec 19, 2023 · Cloud Computing

Text-Based Audio Editing in Cloud Editing: Architecture, Features, and Performance Optimizations

The article discusses the architecture of a cloud‑based audio editing tool, focusing on text‑based editing enabled by ASR, a hierarchical DOM (Word, Sentence, Paragraph), performance challenges with massive character nodes, and optimizations such as viewport‑based rendering and efficient drag‑select that achieve large speed gains for long recordings.

ASR, Performance Optimization, Text Editing
0 likes · 14 min read
Huolala Tech
Nov 23, 2023 · Artificial Intelligence

How HuoLaLa Built a Custom ASR System to Boost Accuracy and Cut Costs

This article details HuoLaLa's development of an in‑house Automatic Speech Recognition system, covering its architecture, VAD optimization, language‑model and hot‑word enhancements, punctuation restoration, task and resource scheduling, and the resulting improvements in accuracy and cost efficiency.

ASR, Language Model, VAD
0 likes · 18 min read
Bilibili Tech
Oct 13, 2023 · Artificial Intelligence

Multimodal Video High‑Energy Segment Extraction for Dynamic Video Covers

The authors present a multimodal system that automatically extracts high‑energy video segments for dynamic covers by analyzing subtitles, audio, visual frames, and danmu, employing LLM prompt‑tuning, scene‑cut detection, and aesthetic scoring to reduce manual effort and boost click‑through rates.

ASR, Multimodal AI, OCR
0 likes · 14 min read
DataFunTalk
Sep 23, 2023 · Artificial Intelligence

Paraformer: An Industrial Non‑Autoregressive End‑to‑End Speech Recognition Model and Its Deployment on ModelScope

This article introduces the Paraformer non‑autoregressive end‑to‑end speech recognition model released by Alibaba DAMO Academy, details its architecture, training strategies, large‑scale performance, and provides step‑by‑step guidance for using and fine‑tuning the model on the ModelScope platform with the FunASR toolkit.

ASR, ModelScope, Paraformer
0 likes · 13 min read
DataFunTalk
Sep 19, 2023 · Artificial Intelligence

Simultaneous Speech Translation: Technical Background, System Architecture, and Key Challenges

This article reviews the technical background of simultaneous speech translation, compares offline and real‑time scenarios, details ASR and MT technologies, describes the system architecture and design strategies, and discusses the major challenges and solutions for deploying robust, low‑latency translation services.

ASR, Huawei, Real-Time
0 likes · 16 min read
58 Tech
Jun 21, 2023 · Artificial Intelligence

GPU Hotword Enhancement for WeNet End-to-End Speech Recognition

This article explains the design, implementation, and experimental evaluation of hot‑word augmentation in WeNet's GPU runtime, detailing how character‑ and word‑based language model scoring are extended to boost recognition of rare proper nouns in both streaming and non‑streaming ASR services.

ASR, CTC decoder, GPU
0 likes · 12 min read
DataFunSummit
Apr 18, 2023 · Artificial Intelligence

Best Practices for Deploying Speech AI on GPUs with Triton and TensorRT

This article presents comprehensive best‑practice guidelines for deploying conversational speech AI—including ASR and TTS pipelines—on GPU servers using NVIDIA Triton Inference Server and TensorRT, covering workflow overview, performance optimizations, streaming inference, and real‑world deployment tips.

ASR, Conversational AI, GPU deployment
0 likes · 14 min read
Meituan Technology Team
Mar 9, 2023 · Artificial Intelligence

Implementation and Practice of MRCP in Meituan Voice Interaction

This article details Meituan’s adoption of the Media Resource Control Protocol (MRCP) to standardize ASR and TTS integration, describing its architecture, key components, high‑availability deployment, and measured performance gains such as up to 55% latency reduction and a 15% increase in outbound call success rates.

ASR, MRCP, Meituan
0 likes · 24 min read
Bilibili Tech
Feb 28, 2023 · Artificial Intelligence

High‑Quality Automatic Speech Recognition (ASR) Solutions at Bilibili: Data, Model, and Deployment Optimizations

Bilibili’s high‑quality ASR system combines large‑scale filtered business data, semi‑supervised Noisy‑Student training, an end‑to‑end CTC model with lattice‑free MMI decoding, and FP16‑optimized FasterTransformer inference on Triton, delivering top‑ranked accuracy, low latency, and scalable deployment for diverse Chinese‑English video content.

ASR, Bilibili, End-to-End
0 likes · 18 min read
58 Tech
Jan 12, 2023 · Artificial Intelligence

Efficient Conformer for End‑to‑End Speech Recognition: Model, Implementation, Streaming Inference, and Experimental Results

This article presents a comprehensive overview of the Efficient Conformer model for large‑scale end‑to‑end speech recognition, detailing its architectural improvements such as progressive downsampling and grouped multi‑head self‑attention, the PyTorch implementation in WeNet, streaming inference handling, experimental CER gains on AISHELL‑1 and production data, and future development plans.

ASR, Efficient Conformer, Model Optimization
0 likes · 16 min read
DataFunTalk
Jul 30, 2022 · Artificial Intelligence

Technical Analysis of Huawei’s Offline Speech‑to‑Text and Length‑Constrained Speech Translation Systems in IWSLT 2022

This article reviews the IWSLT 2022 competition tasks, explains Huawei’s cascade offline speech‑to‑text translation pipeline, details four major technical innovations—including ensemble‑based ASR de‑noise, context‑aware re‑ranking, domain‑controlled training, and length‑control strategies—and presents experimental results that demonstrate Huawei’s leading performance across multiple language directions.

ASR, Huawei, IWSLT
0 likes · 18 min read
DataFunTalk
Jul 7, 2022 · Artificial Intelligence

Huawei Translation’s Achievements and Technical Solutions in IWSLT 2022 Speech Translation Tasks

This article reviews Huawei Translation’s top-ranking results in the IWSLT 2022 speech translation competition across speech‑to‑speech, offline speech‑to‑text, and length‑controlled translation tasks, and details their cascade and end‑to‑end technical approaches, including domain‑controlled ASR, context‑aware MT re‑ranking, and VITS‑based TTS.

ASR, End-to-End, Huawei
0 likes · 13 min read
DataFunSummit
Dec 3, 2021 · Artificial Intelligence

Real‑Time Voice Dialogue: Practices, Challenges, and Duplex Conversation

This article presents an in‑depth overview of Alibaba's real‑time voice dialogue system, covering the Hotline XiaoMi robot, the unique challenges of spoken interactions such as colloquialism, multimodality and duplex communication, and the research advances in ASR‑robust SLU, emotion detection, colloquial processing, and duplex conversation modeling.

ASR, SLU, Speech AI
0 likes · 22 min read
Sohu Tech Products
May 12, 2021 · Artificial Intelligence

Zero‑Basis Food Sound Recognition with ASR: Theory, Workflow, and Complete Python Code

This article introduces the fundamentals of automatic speech recognition (ASR) for food‑sound classification, explains key audio representations and modeling approaches, and provides a fully runnable Python implementation using librosa, TensorFlow/Keras, and classic machine‑learning tools to train and predict on the Tianchi competition dataset.

ASR, Audio Classification, CNN
0 likes · 11 min read
58 Tech
Dec 11, 2020 · Artificial Intelligence

Weighted Finite State Transducers (WFST) in Traditional Speech Recognition: Principles and Optimization

This article explains the role of Weighted Finite State Transducers in conventional HMM‑based speech recognition, covering language models, pronunciation dictionaries, WFST definitions, semiring theory, composition and determinization operations, decoding graph construction (HCLG), lattice rescoring, and practical optimization techniques for real‑world scenarios.

ASR, Language Model, WFST
0 likes · 23 min read
Sohu Tech Products
Aug 19, 2020 · Artificial Intelligence

ASR Error Correction with BERT, ELECTRA and a Fuzzy‑Phoneme Generator: Techniques from Xiaomi AI

This article describes how Xiaomi's AI team tackles Automatic Speech Recognition (ASR) query errors by analyzing error patterns, employing BERT, ELECTRA and a soft‑masked BERT model, generating synthetic noisy data with a fuzzy‑phoneme generator, and presenting experimental results and future research directions.

ASR, BERT, Deep Learning
0 likes · 18 min read
58 Tech
Aug 19, 2020 · Artificial Intelligence

Speech Recognition in 58.com: Application Scenarios, Data Collection, Kaldi Chain Model Practice, and End‑to‑End Exploration

This article presents a comprehensive overview of how 58.com leverages large‑scale voice data from call‑center, private phone, and micro‑chat platforms, detailing data collection, annotation, Kaldi‑based chain model training, lattice‑free techniques, and end‑to‑end Transformer‑CTC models to improve Chinese speech recognition performance.

ASR, Chinese, Deep Learning
0 likes · 16 min read
DataFunTalk
Jul 15, 2020 · Artificial Intelligence

ASR Error Correction with BERT, ELECTRA, and a Fuzzy‑Phoneme Generator: Methods, Experiments, and Future Directions

This article presents a comprehensive overview of automatic speech recognition (ASR) error correction techniques employed by Xiaomi's Xiao‑Ai team, detailing problem definition, related work on BERT and ELECTRA, a custom generator‑discriminator architecture with a fuzzy‑phoneme simulator, experimental results, and prospective research directions.

ASR, BERT, ELECTRA
0 likes · 19 min read
Didi Tech
May 25, 2020 · Artificial Intelligence

How Didi Harnesses Cutting‑Edge Speech Recognition: From ASR Basics to Transformer Models

This article provides a comprehensive technical overview of modern speech recognition, covering Didi’s driver‑assistant and smart‑customer‑service applications, fundamental ASR concepts, classic GMM‑HMM methods, deep‑learning breakthroughs such as DNN‑HMM, CTC, attention‑based and transformer models, practical training tricks, signal‑processing steps, and multimodal fusion techniques.

ASR, CTC, Deep Learning
0 likes · 16 min read
Alibaba Cloud Developer
Apr 7, 2020 · Artificial Intelligence

How Does Alibaba’s Tmall Genie Achieve Full‑Duplex Natural Dialogue?

This article explains the concept of full‑duplex natural dialogue for Alibaba’s Tmall Genie, illustrates interaction scenarios, and details the technical solution covering device‑side management, speech recognition, language understanding, synthesis, dialogue control, duration handling, and conversation flow.

ASR, Human-Computer Interaction, NLU
0 likes · 8 min read
DataFunTalk
Feb 3, 2020 · Artificial Intelligence

Advances in Speech Recognition: Concepts, Deep Learning Methods, and Didi’s Applications

This article presents a comprehensive overview of modern speech recognition technology, covering basic ASR concepts, classic acoustic and language models, deep‑learning approaches such as DNN‑HMM, CTC, attention‑based and transformer models, multimodal fusion, signal‑processing pipelines, and practical deployment considerations at Didi.

ASR, CTC, Deep Learning
0 likes · 15 min read
DataFunTalk
Jan 16, 2020 · Artificial Intelligence

Voice Conversion: Fundamentals, Methods, and iQIYI Applications

This article provides a comprehensive overview of voice conversion technology, covering its definition, parallel and non‑parallel data approaches, classic and deep‑learning methods such as DTW, GMM, seq2seq, PPG, VAE, Flow, GAN, and practical applications and challenges in iQIYI’s products.

ASR, Deep Learning, GAN
0 likes · 8 min read
Tencent Cloud Developer
Feb 26, 2019 · Artificial Intelligence

Tencent Cloud Intelligent Speech Technology: Development, Challenges and Practical Applications

Tencent Cloud's intelligent speech platform combines high‑accuracy ASR, advanced WaveNet‑based TTS, and solutions for noise, far‑field, and dialect challenges, enabling voice input, transcription, and customer‑service bots, with real‑world deployments in finance, museums, hotels, and other industry scenarios.

ASR, Human-Computer Interaction, Speech synthesis
0 likes · 8 min read