Tagged articles
17 articles
AI Explorer
Apr 14, 2026 · Artificial Intelligence

Voicebox: Open-Source Offline Voice Cloning and Synthesis Studio

Voicebox is an open‑source TTS platform, quickly gaining popularity, that runs entirely on a local machine. It offers multi‑engine support, fast voice cloning, rich audio effects, a timeline‑based story editor, and an API‑first design aimed at developers, creators, and privacy‑sensitive applications.

API-first, Tauri, TypeScript
0 likes · 6 min read
AI Explorer
Apr 11, 2026 · Artificial Intelligence

VoxCPM2: Tokenizer‑Free Multilingual TTS that Creates New Voices from Text

VoxCPM2, an open‑source 2‑billion‑parameter TTS model from OpenBMB, eliminates tokenizers and uses a diffusion‑autoregressive architecture to generate high‑fidelity, controllable speech in 30 languages, supporting voice design from natural‑language prompts and high‑quality voice cloning with just a short reference clip.

AudioVAE, TTS, VoxCPM2
0 likes · 8 min read
Old Zhang's AI Learning
Jan 24, 2026 · Artificial Intelligence

Open-Source Qwen3‑TTS: Sub‑100 ms Latency, Runs on 8 GB GPU, and ComfyUI Integration

Qwen3‑TTS, an open‑source text‑to‑speech model from Alibaba, offers sub‑100 ms first‑packet latency, supports voice cloning, natural‑language voice design, and ten languages, can be deployed locally on a GPU with as little as 8 GB VRAM, and integrates with ComfyUI for visual workflow building.

ComfyUI, Low latency, Qwen3-TTS
0 likes · 15 min read
HyperAI Super Neural
Jan 3, 2026 · Artificial Intelligence

Clone a Voice in 5 seconds with One‑Step Generation: Inside Chatterbox‑Turbo’s High‑Fidelity TTS

Resemble AI’s open‑source Chatterbox‑Turbo reduces TTS generation from ten steps to one, enabling high‑sample‑rate, lossless voice cloning from a 5 to 10 second reference clip. It supports emotional control, paralinguistic tags, and embedded watermarking, targeting real‑time applications in chatbots, games, podcasts, and education.

Chatterbox‑Turbo, Real-time inference, knowledge distillation
0 likes · 7 min read
Baidu Tech Salon
Dec 8, 2025 · Artificial Intelligence

How Baidu’s HuiBosheng AI Live Platform Generates Super‑Human Scripts and Real‑Time Interaction

The article details Baidu HuiBosheng's end‑to‑end AI live‑streaming platform, covering merchant workflow, multimodal product understanding, style‑aware script generation, reinforcement‑learning‑driven smart control, voice and avatar cloning, and a data‑flywheel that continuously improves model performance, illustrated with real‑world GMV results.

AI, Data Flywheel, Script Generation
0 likes · 20 min read
DataFunSummit
Sep 7, 2025 · Artificial Intelligence

How NIO Cut Radio Production Costs by 80% with AI Voice Cloning

This article details NIO's AI‑driven voice‑cloning solution for its in‑car NIO Radio, explaining the business background, pain points of traditional production, the TTS‑VC framework and modular workflow, evaluation metrics, and the resulting cost savings, efficiency gains, and scalability across dozens of cities.

AI, Cost reduction, Speech synthesis
0 likes · 10 min read
ZhongAn Tech Team
Jan 12, 2025 · Artificial Intelligence

AI Weekly Digest Issue 10: Market Insights, Industry Solutions, and Notable Technologies

This issue reviews recent AI industry developments, including Kai‑Fu Lee’s clarification of 01.AI’s strategy, Microsoft’s open‑source Phi‑4 model, the multimodal VITA‑1.5 release, and Hailuo AI’s advanced Chinese voice‑cloning technology, with technical details and market implications for each.

AI, multimodal, voice cloning
0 likes · 10 min read
System Architect Go
Nov 28, 2024 · Artificial Intelligence

An Overview of Modern AI Audio Technologies: ASR, TTS, and Voice Cloning

This article explains how modern AI advances have transformed audio processing. It covers digital audio fundamentals, automatic speech recognition (ASR), text‑to‑speech (TTS), and voice cloning techniques, and provides practical Python code examples using OpenAI Whisper and Hugging Face TTS models.

AI, Audio Processing, speech recognition
0 likes · 7 min read
Full-Stack DevOps & Kubernetes
Jul 29, 2024 · Artificial Intelligence

How to Run Real‑Time Voice Cloning with Python: A Step‑by‑Step Guide

This guide introduces the open‑source Realtime Voice Cloning project, explains its key features, and provides detailed installation and usage instructions—including environment setup, dependency installation, cloning the repository, and running the demo tool—to enable real‑time voice transformation with Python.

AI, Python, Real-time Audio
0 likes · 5 min read
58 Tech
Aug 25, 2023 · Artificial Intelligence

Voice Cloning Technology in AI Sales Assistant

This article introduces the AI sales assistant from 58.com, detailing its background, a few‑shot voice cloning approach using real dialogue data, multi‑accent naturalness optimization, deployment architecture, and future plans, while evaluating performance metrics and discussing challenges in speech synthesis quality and stability.

AI sales assistant, Few‑Shot Learning, Speech synthesis
0 likes · 19 min read
DataFunSummit
Aug 15, 2023 · Artificial Intelligence

AI Sales Assistant: Few‑Shot Voice Cloning and Multi‑Accent Naturalness Optimization

The article presents the AI sales assistant from the 58.com AI Lab, detailing its background, a few‑shot voice‑cloning pipeline built on real dialogue data, data preprocessing, FastSpeech2‑based acoustic modeling, multi‑accent style transfer, deployment architecture, controllable synthesis parameters, and future research directions.

AI sales assistant, Fastspeech2, Speech synthesis
0 likes · 20 min read
iQIYI Technical Product Team
Jun 11, 2021 · Artificial Intelligence

iQIYI M2VoC Multi‑Speaker Multi‑Style Voice Cloning Challenge at ICASSP 2021 – Overview and Results

The iQIYI M2VoC competition at ICASSP 2021 was the first low‑resource multi‑speaker, multi‑style voice‑cloning challenge. It attracted 153 academic and industry teams to its few‑shot (100 utterances) and extreme few‑shot (5 utterances) tracks, which were evaluated by professional listeners. The entries produced strong innovations and applications while confirming that single‑sample cloning remains unsolved.

AI, Audio Processing, ICASSP2021
0 likes · 7 min read
iQIYI Technical Product Team
Jan 15, 2021 · Artificial Intelligence

How AI is Transforming Video Creation and Consumption at Scale

The article examines how iQIYI leverages AI across the video ecosystem—from intelligent material search, old‑film restoration, and voice cloning to virtual idols, XR production, and AI‑driven advertising—to boost creator efficiency, enhance user experience, and accelerate industry-wide digital transformation.

AI, Computer Vision, content creation
0 likes · 14 min read
iQIYI Technical Product Team
Nov 20, 2020 · Artificial Intelligence

iQIYI M2VoC Multi‑Speaker Multi‑Style Voice Cloning Challenge (ICASSP 2021) Overview

The iQIYI M2VoC Challenge at ICASSP 2021 invites researchers to tackle low‑resource multi‑speaker, multi‑style voice cloning by providing Mandarin datasets, few‑shot and extremely few‑shot tracks with strict data rules, MOS‑based subjective evaluation, and a $9,600 prize pool for top submissions.

AI, Challenge, ICASSP
0 likes · 10 min read
Liangxu Linux
Sep 3, 2019 · Artificial Intelligence

Clone Any Voice in Seconds with the Real-Time-Voice-Cloning Open‑Source TTS

This guide explains how the Real-Time-Voice-Cloning project uses deep‑learning text‑to‑speech techniques to generate a voice clone from a short audio sample, covering the underlying principle, required dataset, setup steps, demo usage, and ethical considerations.

Deep Learning, Real-Time-Voice-Cloning, text-to-speech
0 likes · 5 min read