How to Turn FunASR into a Production‑Ready Real‑Time Speech Platform: From Single‑Node Demo to Million‑Scale Architecture

This article explains how to evolve FunASR from a simple demo into a production‑grade, low‑latency, high‑concurrency streaming speech‑recognition system by addressing model inference, session state, scaling layers, Kubernetes deployment, monitoring, and common pitfalls for real‑world use cases such as call‑center quality inspection.

FunASR · Production Architecture · Real-time Speech Recognition

38 min read

Meituan Technology Team

Apr 13, 2023 · Artificial Intelligence

Peak-First Regularization for Low-Latency Streaming Speech Recognition

The paper presents a low-latency streaming speech-recognition method that reframes latency reduction as a knowledge-distillation task: a simple peak-first regularization term shifts the CTC output probability distribution leftward (earlier in time), reducing average latency by up to 200 ms without harming word error rate.

CTC · Latency Reduction · Peak-First Regularization

21 min read

ITPUB

Jul 25, 2022 · Artificial Intelligence

How an AI Interview Bot Scaled 20× Faster with Backend Architecture Optimizations

This article details the design of an AI interview robot for 58.com, covering its backend architecture, dialogue engine, resource‑management strategies, performance‑testing methodology, and the optimizations that boosted concurrent interview capacity by twenty times while improving user experience.

AI Interview Bot · Dialogue Engine · RTP

17 min read
