Jan 30, 2026 · Artificial Intelligence

Human‑like LLM Replies for Live Digital Hosts: ASR‑Based Style Transfer and Reward Modeling

This article proposes an ASR‑driven pipeline that creates high‑quality AI‑reply vs. human‑like reply pairs, trains a rewrite model and a reward model, and uses GRPO reinforcement learning to generate natural, helpful, and less AI‑sounding responses in digital‑human live streaming, achieving 92% accuracy and 97% helpfulness while improving user experience.

ASR dataLLMQwen

0 likes · 20 min read

Human‑like LLM Replies for Live Digital Hosts: ASR‑Based Style Transfer and Reward Modeling