
Key Technical Innovations in Kuaishou’s “Kuaiyi” Large Model and Its Real-World Applications

The article details Kuaishou’s development of the 175B “Kuaiyi” multimodal large model, presenting eight novel technical innovations—from Temporal Scaling Law and MiLe Loss to MoE‑enhanced reward modeling—and describes how these advances enable high‑performance AI services such as the AI Xiao Kuai chatbot across diverse real‑world scenarios.

Kuaishou Tech

In June 2024, Kuaishou presented its 175B "Kuaiyi" multimodal large model at GAITC 2024; the model surpasses GPT‑3.5 and approaches GPT‑4 on various benchmarks.

The paper outlines eight core technical innovations:

1. Temporal Scaling Law for low‑cost hyper‑parameter search directly on large models;
2. MiLe Loss, an information‑entropy‑weighted cross‑entropy loss;
3. Scaffold‑BPE, a token‑frequency‑aware vocabulary learning method;
4. a negative‑feedback mechanism in supervised fine‑tuning;
5. a parallel token‑unit decoding strategy that speeds up inference by ~30%;
6. an MoE‑enhanced Reward Model for better alignment;
7. an iterative RLHF + RLAIF pipeline;
8. adaptive MoE routing error detection and loss optimization.
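To make the second item concrete, here is a minimal sketch of an information‑entropy‑weighted cross‑entropy loss in the spirit of MiLe Loss. The exact weighting (entropy raised to a power `gamma`) is an assumption for illustration, not Kuaishou's published formulation; the idea is simply that tokens on which the model's predictive distribution has higher entropy contribute more to the loss.

```python
import numpy as np

def mile_loss(logits, targets, gamma=1.0, eps=1e-9):
    """Entropy-weighted cross-entropy (illustrative sketch).

    logits:  (T, V) unnormalized scores over a vocabulary of size V
    targets: (T,)   gold token ids
    gamma:   exponent on the per-token entropy weight (assumed form)
    """
    # numerically stable softmax over the vocabulary axis
    z = logits - logits.max(axis=-1, keepdims=True)
    probs = np.exp(z) / np.exp(z).sum(axis=-1, keepdims=True)
    # per-token cross-entropy against the gold token
    ce = -np.log(probs[np.arange(len(targets)), targets] + eps)
    # per-token predictive entropy, used as a difficulty weight
    ent = -(probs * np.log(probs + eps)).sum(axis=-1)
    return ((ent ** gamma) * ce).mean()
```

With `gamma=0` every weight collapses to 1 and the sketch reduces to ordinary mean cross‑entropy, which makes the effect of the entropy weighting easy to isolate in experiments.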

Each innovation is described with its motivation, formulation, and experimental results showing significant performance gains.

The model has been deployed in several Kuaishou products, most notably the "AI Xiao Kuai" emotional companion chatbot, which required advances in multimodal understanding, persona fine‑tuning, long‑turn dialogue, and tool‑calling capabilities.

Challenges such as high‑quality video captioning, maintaining engaging and empathetic conversations, extending dialogue length beyond 200 turns, and integrating diverse tool functions were addressed through dense video captioning, persona models (KwaiYii‑Role), a user‑simulator (KwaiYii‑Parrot), and function‑calling/retrieval‑augmented generation.
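The function‑calling side of this can be sketched as a small dispatch loop: the model emits a structured (here, JSON) tool call, and the serving layer parses it and invokes the named function. The tool names and JSON schema below are illustrative assumptions, not Kuaishou's actual interface.

```python
import json

# Hypothetical tool registry; real deployments would register
# production services (search, retrieval, etc.) here.
TOOLS = {
    "get_weather": lambda city: f"Sunny in {city}",
    "search_videos": lambda query: [f"video about {query}"],
}

def dispatch(model_output: str):
    """Parse a model-emitted JSON tool call and run the named tool.

    Expected shape (assumed): {"name": "...", "arguments": {...}}
    """
    call = json.loads(model_output)
    fn = TOOLS.get(call["name"])
    if fn is None:
        raise ValueError(f"unknown tool: {call['name']}")
    return fn(**call["arguments"])

# Example: a model output requesting the (hypothetical) weather tool
# dispatch('{"name": "get_weather", "arguments": {"city": "Beijing"}}')
```

In a retrieval‑augmented setup, one of the registered tools would be a retriever whose results are fed back into the model's context before the final response is generated.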

Overall, the rapid scaling from 13B to 175B and the successful real‑world applications demonstrate the effectiveness of Kuaishou’s research and its potential for future large‑model innovations.

Tags: multimodal AI, model optimization, AI applications, large language model, scaling law, reinforcement learning
Written by

Kuaishou Tech

Official Kuaishou tech account, providing real-time updates on the latest Kuaishou technology practices.
