
Key Technical Innovations in Kuaishou’s “Kuaiyi” Large Model and Its Real-World Applications

The article details Kuaishou’s development of the 175B “Kuaiyi” multimodal large model, presenting eight novel technical innovations—from Temporal Scaling Law and MiLe Loss to MoE‑enhanced reward modeling—and describes how these advances enable high‑performance AI services such as the AI Xiao Kuai chatbot across diverse real‑world scenarios.

Kuaishou Tech

In June 2024, Kuaishou presented its 175B "Kuaiyi" multimodal large model at GAITC 2024; the model surpasses GPT‑3.5 and approaches GPT‑4 on various benchmarks.

The paper outlines eight core technical innovations:

1. Temporal Scaling Law for low‑cost hyper‑parameter search directly on large models;
2. MiLe Loss, an information‑entropy‑weighted cross‑entropy loss;
3. Scaffold‑BPE, a token‑frequency‑aware vocabulary learning method;
4. a negative‑feedback mechanism in supervised fine‑tuning;
5. a parallel token‑unit decoding strategy that speeds up inference by ~30%;
6. an MoE‑enhanced Reward Model for better alignment;
7. an iterative RLHF + RLAIF pipeline;
8. adaptive MoE routing error detection and loss optimization.
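To make the second item concrete, here is a minimal sketch of an information‑entropy‑weighted cross‑entropy loss in the spirit of MiLe Loss. The exact weighting (entropy raised to a power `gamma`) is an assumption for illustration, not Kuaishou's published formulation; the idea is simply that tokens on which the model's predictive distribution has higher entropy contribute more to the loss.

```python
import numpy as np

def mile_loss(logits, targets, gamma=1.0, eps=1e-9):
    """Entropy-weighted cross-entropy (illustrative sketch).

    logits:  (T, V) unnormalized scores over a vocabulary of size V
    targets: (T,)   gold token ids
    gamma:   exponent on the per-token entropy weight (assumed form)
    """
    # numerically stable softmax over the vocabulary axis
    z = logits - logits.max(axis=-1, keepdims=True)
    probs = np.exp(z) / np.exp(z).sum(axis=-1, keepdims=True)
    # per-token cross-entropy against the gold token
    ce = -np.log(probs[np.arange(len(targets)), targets] + eps)
    # per-token predictive entropy, used as a difficulty weight
    ent = -(probs * np.log(probs + eps)).sum(axis=-1)
    return ((ent ** gamma) * ce).mean()
```

With `gamma=0` every weight collapses to 1 and the sketch reduces to ordinary mean cross‑entropy, which makes the effect of the entropy weighting easy to isolate in experiments.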

Each innovation is described with its motivation, formulation, and experimental results showing significant performance gains.

The model has been deployed in several Kuaishou products, most notably the "AI Xiao Kuai" emotional companion chatbot, which required advances in multimodal understanding, persona fine‑tuning, long‑turn dialogue, and tool‑calling capabilities.

Challenges such as high‑quality video captioning, maintaining engaging and empathetic conversations, extending dialogue length beyond 200 turns, and integrating diverse tool functions were addressed through dense video captioning, persona models (KwaiYii‑Role), a user‑simulator (KwaiYii‑Parrot), and function‑calling/retrieval‑augmented generation.
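The function‑calling side of this can be sketched as a small dispatch loop: the model emits a structured (here, JSON) tool call, and the serving layer parses it and invokes the named function. The tool names and JSON schema below are illustrative assumptions, not Kuaishou's actual interface.

```python
import json

# Hypothetical tool registry; real deployments would register
# production services (search, retrieval, etc.) here.
TOOLS = {
    "get_weather": lambda city: f"Sunny in {city}",
    "search_videos": lambda query: [f"video about {query}"],
}

def dispatch(model_output: str):
    """Parse a model-emitted JSON tool call and run the named tool.

    Expected shape (assumed): {"name": "...", "arguments": {...}}
    """
    call = json.loads(model_output)
    fn = TOOLS.get(call["name"])
    if fn is None:
        raise ValueError(f"unknown tool: {call['name']}")
    return fn(**call["arguments"])

# Example: a model output requesting the (hypothetical) weather tool
# dispatch('{"name": "get_weather", "arguments": {"city": "Beijing"}}')
```

In a retrieval‑augmented setup, one of the registered tools would be a retriever whose results are fed back into the model's context before the final response is generated.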

Overall, the rapid scaling from 13B to 175B and the successful real‑world applications demonstrate the effectiveness of Kuaishou’s research and its potential for future large‑model innovations.

Tags: multimodal AI, model optimization, AI applications, large language model, scaling law, reinforcement learning
Written by

Kuaishou Tech

Official Kuaishou tech account, providing real-time updates on the latest Kuaishou technology practices.
