
Design and Evolution of a Scalable Danmaku Personalized Recommendation System

This article describes how Bilibili transformed its danmaku service from a simple, limited‑recall pipeline into a ten‑fold larger, KV‑store‑backed recommendation architecture that unifies the engineering and AI layers, uses dynamic sharding and Redis locks, and ultimately boosts recall pool size, exposure, and experiment speed while reducing downgrade rates.

Bilibili Tech

Background

Bullet‑screen (danmaku) services have evolved through three stages: basic capability, negative governance, and positive recommendation. After stabilizing the first two stages, Bilibili needed a personalized recommendation layer to select high‑quality danmaku for display, requiring integration of user features and a large recall pool.

Stage 1: Minimal Recommendation on the Original Architecture

The initial solution added a simple recommendation flow: danmaku senders write to a database whose binlog is streamed to the recommendation system; the recommendation system computes a list of danmaku IDs for each video, and the engineering system fetches the content for display.
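The split of responsibilities in this read path can be sketched as follows. This is a minimal illustration, not Bilibili's actual code; the function and store names are assumptions.

```python
# Hypothetical sketch of the Stage 1 read path: the recommendation
# system supplies only danmaku IDs per video, and the engineering
# layer resolves them to full content for display.

def fetch_danmaku_for_display(video_id, recommended_ids, content_store):
    """Resolve a recommended ID list into displayable danmaku,
    skipping IDs whose content is missing (e.g. deleted danmaku)."""
    result = []
    for danmaku_id in recommended_ids:
        content = content_store.get((video_id, danmaku_id))
        if content is not None:
            result.append(content)
    return result

# Example: two of the three recommended IDs resolve to content.
store = {("v1", 101): "first!", ("v1", 103): "lol"}
print(fetch_danmaku_for_display("v1", [101, 102, 103], store))
```

Keeping the recommendation output to bare IDs keeps the AI pipeline's payload small, at the cost of a second lookup on the engineering side.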

Problems of Stage 1

• Limited recall pool: a 15‑minute video was capped at 6,000 danmaku, leading to sparse screens.
• Low quality: because the pool was small and evicted in time order, valuable historic danmaku were discarded.
• Uneven distribution: long videos suffered stretches of empty screen.

Stage 2: Ten‑fold Expansion of the Recall Pool

A dedicated engineering system for personalized recommendation was built, replacing the original danmaku pool. The new system uses a KV store (Taishan) to keep per‑video, per‑minute recall data, and each danmaku is also stored by its ID. Redis distributed locks guarantee consistency. This design supports millions of QPS without additional caching.

Storage Design

The KV store holds only the data needed for recommendation (no full‑danmaku storage), allowing direct reads with high concurrency. The previous three‑level cache (interface cache → second‑level cache → third‑level cache) was removed.
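A per‑video, per‑minute key layout like the one described above can be sketched as below. The key formats are assumptions for illustration, not Taishan's actual schema.

```python
# Illustrative key layout for the KV store: per-video, per-minute
# recall keys plus a per-ID key for each danmaku's recommendation
# data. Formats are assumed, not Bilibili's real schema.

def recall_key(video_id: str, minute: int) -> str:
    """Key for the recall list covering one minute of a video."""
    return f"recall:{video_id}:{minute}"

def danmaku_key(danmaku_id: int) -> str:
    """Key for a single danmaku's recommendation-side record."""
    return f"dm:{danmaku_id}"

print(recall_key("v123", 7))   # recall:v123:7
print(danmaku_key(42))         # dm:42
```

Because every read is a direct point lookup on a small value, the store can serve high concurrency without the layered caches of the old architecture.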

Computation Optimizations

Recall pools are refreshed in 10‑second granularity, supporting up to 1,000 danmaku per 10 seconds. Full‑pool back‑fill (hundreds of billions of danmaku) takes about two days; incremental updates use message queues and Redis locks to keep consistency.
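An incremental update guarded by a per‑bucket lock, as described above, can be sketched like this. The lock is simulated in‑process; in production it would be a Redis `SET NX` lock with an expiry. All names and the cap value's placement are illustrative.

```python
# Sketch of an incremental recall-pool update guarded by a
# per-bucket distributed lock (simulated here with a set).

held_locks = set()

def try_lock(key: str) -> bool:
    """Acquire the lock if free; a stand-in for Redis SET NX."""
    if key in held_locks:
        return False
    held_locks.add(key)
    return True

def unlock(key: str) -> None:
    held_locks.discard(key)

def append_to_bucket(pool, video_id, bucket_10s, danmaku_id, cap=1000):
    """Add a danmaku to its 10-second recall bucket, honoring the
    per-bucket cap, only while holding that bucket's lock."""
    lock_key = f"lock:{video_id}:{bucket_10s}"
    if not try_lock(lock_key):
        return False  # another writer holds the bucket; retry via the queue
    try:
        bucket = pool.setdefault((video_id, bucket_10s), [])
        if len(bucket) < cap:
            bucket.append(danmaku_id)
        return True
    finally:
        unlock(lock_key)

pool = {}
append_to_bucket(pool, "v1", 0, 7)
print(pool)  # {('v1', 0): [7]}
```

A failed lock acquisition simply requeues the update on the message queue, so writers never block each other for long.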

Shard Granularity

Dynamic shard sizes balance bandwidth and QPS: early versions used 6‑minute shards, later adjusted per scenario. Storage shards are 10 seconds to align with the recall strategy.
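The two granularities can be sketched as simple offset‑to‑shard mappings, assuming 10‑second storage shards and a configurable read shard size (6 minutes in early versions). Purely illustrative.

```python
# Sketch of mapping a playback offset (in seconds) to shard keys.
# Storage shards are fixed at 10 s; the read shard size is tunable
# per scenario (360 s matches the early 6-minute shards).

STORAGE_SHARD_SECONDS = 10

def storage_shard(offset_s: int) -> int:
    """10-second storage shard, aligned with the recall strategy."""
    return offset_s // STORAGE_SHARD_SECONDS

def read_shard(offset_s: int, shard_seconds: int = 360) -> int:
    """Coarser read shard that trades bandwidth against QPS."""
    return offset_s // shard_seconds

print(storage_shard(125))  # 12
print(read_shard(125))     # 0
```

Larger read shards mean fewer requests per playback session but bigger responses; decoupling the two sizes lets each be tuned independently.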

Dual‑System Relationship

The original system (TiDB‑backed) remains the source of truth for all danmaku, while the new KV‑based pool is a subset used only for personalized recommendation. In case of failure, the system degrades gracefully to the original pipeline.
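The degradation path can be sketched as a try‑then‑fallback around the two systems. Function names are assumptions for illustration.

```python
# Sketch of graceful degradation: serve from the personalized KV
# pool when possible, and fall back to the original TiDB-backed
# pipeline (the source of truth) on failure or an empty result.

def get_danmaku(video_id, kv_pool_fetch, legacy_fetch):
    """Prefer the personalized pool; degrade to the legacy pipeline."""
    try:
        result = kv_pool_fetch(video_id)
        if result:
            return result, "personalized"
    except Exception:
        pass  # log the error, then fall through to the legacy path
    return legacy_fetch(video_id), "legacy"

# Example: the KV pool raises, so the legacy pipeline serves the request.
def failing_kv(_):
    raise RuntimeError("KV timeout")

print(get_danmaku("v1", failing_kv, lambda v: ["fallback danmaku"]))
```

Because the legacy system holds the full danmaku set, the fallback loses personalization but never loses content.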

Stage 3: Deep Integration of Engineering and Recommendation

Key issues identified:

• Data misalignment between the engineering and AI pipelines.
• Frequent AI‑side degradation caused by heavy model loading and the lack of real‑time eviction.
• Insufficient experiment speed: a full back‑fill of billions of records takes days.

Solutions include merging material and index pools, pre‑eviction in coarse ranking, and enabling hour‑level model updates.

Detailed Design Highlights

• Unified material and index pools live in the same KV database under video‑minute keys, protected by Redis shard locks.
• Three back‑fill paths (incremental updates, index back‑fill, and material back‑fill) cut data write volume by roughly 50%.
• Model versioning fields allow safe rollout and rollback of scoring models.
• Hot‑video handling moves eviction logic into coarse ranking, reducing memory pressure in fine ranking.
• Experiments now cover 90% of view volume while recomputing only 15% of the data, enabling sub‑hour strategy iteration.
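The model‑versioning field mentioned above can be sketched as a stamp on each stored score: a score is reused only if it was produced by the current model, which makes rollout and rollback a matter of changing one version number. Field names here are assumptions.

```python
# Sketch of scoring with a model-version field: each stored score
# carries the version of the model that produced it, and stale
# scores are recomputed on read. Names are illustrative.

CURRENT_MODEL_VERSION = 3

def get_score(record, rescore):
    """Reuse a stored score only if its model version is current;
    otherwise recompute with the current model."""
    if record.get("model_version") == CURRENT_MODEL_VERSION:
        return record["score"]
    return rescore(record)

fresh = {"score": 0.9, "model_version": 3}
stale = {"score": 0.4, "model_version": 2}
print(get_score(fresh, lambda r: 0.7))  # 0.9
print(get_score(stale, lambda r: 0.7))  # 0.7
```

Rolling back simply means lowering `CURRENT_MODEL_VERSION`; scores from the old model become valid again without any back‑fill.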

Benefits and Outlook

The integrated system increased the recall pool by roughly tenfold (e.g., from 6,000 to ~90,000 danmaku for a 15‑minute video), raising exposure by ~30% and improving user experience. Fine‑ranking downgrade rate dropped from 3% to 0.1%, and full‑stack experiments can finish within 10 hours, with small‑scale iterations in under an hour. Future work will focus on faster strategy iteration, real‑time feature availability, and continued stability improvements.
