Ray's Galactic Tech
Apr 16, 2026 · Artificial Intelligence

How to Turn FunASR into a Production‑Ready Real‑Time Speech Platform: From Single‑Node Demo to Million‑Scale Architecture

This article explains how to evolve FunASR from a simple demo into a production‑grade, low‑latency, high‑concurrency streaming speech‑recognition system by addressing model inference, session state, scaling layers, Kubernetes deployment, monitoring, and common pitfalls for real‑world use cases such as call‑center quality inspection.

FunASR · Production Architecture · Real-time Speech Recognition
38 min read
Meituan Technology Team
Apr 13, 2023 · Artificial Intelligence

Peak-First Regularization for Low-Latency Streaming Speech Recognition

The paper presents a low‑latency streaming speech‑recognition solution that reframes latency reduction as a knowledge‑distillation task, using a simple peak‑first regularization term to shift CTC output probabilities leftward and achieve up to 200 ms average latency reduction without harming word error rate.

CTC · Knowledge Distillation · Latency Reduction
21 min read
ITPUB
Jul 25, 2022 · Artificial Intelligence

How an AI Interview Bot Scaled 20× Faster with Backend Architecture Optimizations

This article details the design of an AI interview robot for 58.com, covering its backend architecture, dialogue engine, resource‑management strategies, performance‑testing methodology, and the optimizations that boosted concurrent interview capacity by twenty times while improving user experience.

AI Interview Bot · Dialogue Engine · RTP
17 min read