Tagged articles
13 articles
AI Engineering
May 13, 2026 · Artificial Intelligence

First End‑to‑End Voice Agent Benchmark Shows Grok Leads with 52% Real‑World Success Rate

Artificial Analysis released the τ‑Voice benchmark, testing speech‑to‑speech agents across 278 real‑world customer‑service scenarios, and found the top‑performing Grok Voice Think Fast 1.0 achieves only a 52.1% task‑completion rate while average dialogue lengths stay under seven minutes.

Benchmark, Grok Voice, speech-to-speech
7 min read
AI Engineering
May 8, 2026 · Artificial Intelligence

How GPT‑Realtime‑2 Leverages GPT‑5‑Level Reasoning to Redefine Voice AI Architecture

OpenAI’s GPT‑Realtime‑2 embeds GPT‑5‑class reasoning into a continuous‑audio loop, achieving 96.6% accuracy on Big Bench Audio and offering adjustable inference intensity with latency from 1.12 s to 2.33 s and a 128 K context window. The article also covers gains in real‑world call success rates and the industry debate over pricing and competitive impact.

GPT-5, GPT-Realtime-2, Latency
5 min read
Weekly Large Model Application
May 6, 2026 · Cloud Native

How OpenAI Scales Low-Latency Voice AI with WebRTC: Architecture Deep Dive

The article dissects OpenAI's engineering approach to delivering low‑latency voice AI at scale, explaining why WebRTC was chosen, how a Relay + Transceiver split solves Kubernetes integration challenges, the use of ICE ufrag for deterministic routing, and how global relay and implementation choices reduce perceived latency.

Kubernetes, Low latency, OpenAI
9 min read
ZhongAn Tech Team
Apr 13, 2026 · Industry Insights

This Week’s AI Pulse: GPT‑4o’s Exit, Full‑Duplex Voice, Open‑World AI & More

The weekly roundup analyzes OpenAI’s GPT‑4o leadership change, ByteDance’s Seeduplex full‑duplex voice breakthrough, JD.com and Meituan’s internal AI restrictions, Anthropic’s Claude Mythos leak and Glasswing response, Sam Altman’s AI‑society contract proposal, Anthropic’s token‑usage controversy, Google’s strategic outlook, AI‑driven marketing platforms, a 48 GB GPU performance comparison of Gemma and GPT‑OSS models, SentiAvatar’s 3D digital‑human innovation, and the launch of the low‑cost AI open‑world Elseland.

3D Avatar, AI, Anthropic
33 min read
Fighter's World
Jul 5, 2025 · Artificial Intelligence

Huxe Personal Audio Companion Review: Features, Roadmap, and Market Outlook

The article examines Huxe, a voice‑led AI personal audio companion created by former Google NotebookLM team members, detailing its four core features, early user experiences, product‑stage roadmap, trust and habit barriers, and the challenges it faces in commercialization and adoption.

AI, Huxe, Product Review
14 min read
ShiZhen AI
May 28, 2025 · Artificial Intelligence

Claude Finally Gets Voice: Anthropic Adds Speech to Its AI Assistant

Anthropic has introduced a voice mode for Claude, enabling English users to speak and type interchangeably with five voice personalities, while a new 3D AI startup, SpAItial, showcases photorealistic room generation and researchers present INTUITOR, a confidence‑driven training method that improves AI reasoning.

AI research, Anthropic, Claude
7 min read
DataFunSummit
Dec 25, 2024 · Artificial Intelligence

Design and Implementation of a Multimodal Real-Time Voice AI Teammate for Naraka: Bladepoint

This article explains the design, implementation, and underlying Agent‑Oriented‑Programming framework of NetEase Fuxi’s multimodal real‑time voice AI teammate for the mobile game ‘Naraka: Bladepoint’, highlighting its capabilities such as autonomous navigation, combat assistance, natural dialogue, teaching, and broader applications of voice technology in games.

Naraka Bladepoint, agent-oriented programming, game AI
12 min read
JD Cloud Developers
Jun 20, 2024 · Artificial Intelligence

How Large Language Models Boost Courier Efficiency: From Voice Commands to Smart QA

This article explains how large language models like ChatGPT can transform courier operations by automating voice‑driven tasks, enabling intelligent question answering with retrieval‑augmented generation, extracting and splitting document content, embedding it for vector search, and delivering smart prompts and agents to improve productivity and accuracy.

AI, Embedding, Logistics
15 min read
58 Tech
Jun 15, 2020 · Artificial Intelligence

Intelligent Voice Robot Architecture, Core Technologies, and Enterprise Applications

This article presents the engineering architecture of intelligent voice robots, detailing voice preprocessing, intent recognition, slot extraction, dialogue management, and showcases multiple enterprise use cases that improve efficiency and revenue across sales, customer service, and recruitment.

Enterprise Automation, dialogue management, intent classification
14 min read
Alibaba Cloud Developer
Jul 10, 2019 · Artificial Intelligence

Inside Alibaba’s Tmall Genie: How Its Dialogue Engine Powers Conversational AI

This article explores the architecture and components of Alibaba's Tmall Genie dialogue engine, detailing its bot, skill, domain, intent, and entity concepts; its NLU strategies, including deep‑learning and fuzzy approaches; skill execution methods; multi‑turn handling; screen‑based interactions; public intents; and the evolution of the platform.

Dialogue Engine, Multi-turn Conversation, NLU
20 min read
iQIYI Technical Product Team
Feb 1, 2019 · Industry Insights

Will 2019 Be the Golden Harvest for AI? Insights from iQiyi’s VP

In a recent interview, iQiyi’s Vice President Xie Danming predicts that 2019 will mark a golden harvest for artificial intelligence, driven by 5G‑enabled AR/VR expansion, mature voice AI, emerging graph neural networks, and growing application‑focused investment, while highlighting challenges such as model cost, multimodal analysis, and the need for lightweight algorithms.

5G, AI, AR/VR
15 min read
Tencent Cloud Developer
Oct 10, 2018 · Artificial Intelligence

What Are the Real Challenges and Future Trends in Intelligent Voice Technology?

This article examines the current landscape of intelligent voice technology, covering speech recognition, synthesis, voiceprint identification, and acoustic event detection. It highlights technical hurdles, evaluation metrics, recent advances such as WaveNet, and a wide range of practical applications from mobile devices to smart hardware and enterprise solutions.

Audio Processing, Speech synthesis, Tencent Cloud
16 min read