How to Design a Billion‑User Real‑Time Step Leaderboard for Interviews

This article breaks down the interview‑level system design of a WeChat‑style step leaderboard that must support over a billion users, handling massive write spikes, low‑latency friend ranking queries, storage scaling, and relationship complexity with a three‑part architecture using MQ, Redis, and MySQL.


Why This Question Trips Up Candidates

The problem combines four classic challenges: massive concurrent writes ("write tsunami"), real‑time ranking queries ("query nightmare"), huge storage requirements ("storage black hole"), and complex friend relationships ("relationship maze"). A naïve MySQL solution fails on all fronts.

First Pillar – Write Path: Async Decoupling and Peak‑Shaving

Use a message queue (Kafka or RocketMQ) as a buffer. The Java service receives the step report, does no computation or DB write, and immediately pushes a {userId, steps, timestamp} message to the queue, returning success to the app.
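The write path above can be sketched in a few lines. This is a minimal, self-contained illustration: a `java.util.concurrent.BlockingQueue` stands in for the Kafka/RocketMQ topic, and the names `StepReportService` and `report` are illustrative, not a real API.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Sketch of the write path: the API handler does no computation or DB write;
// it only serializes the report, buffers it, and acks the client.
class StepReportService {
    // Stand-in for the Kafka/RocketMQ topic; unbounded here for simplicity.
    static final BlockingQueue<String> MQ = new LinkedBlockingQueue<>();

    // Returns immediately after enqueueing; the consumer drains at its own pace.
    static boolean report(String userId, int steps, long timestamp) {
        String msg = String.format(
                "{\"userId\":\"%s\",\"steps\":%d,\"timestamp\":%d}",
                userId, steps, timestamp);
        return MQ.offer(msg); // non-blocking: enqueue, then return success
    }
}
```

The point of the sketch is that nothing on the request path is proportional to load on Redis or MySQL; the queue absorbs the spike.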

A downstream consumer pulls messages at its own pace, updates a Redis ZSET leaderboard, e.g.:

# Add user A's steps to today's leaderboard
ZADD leaderboard:2025-09-12 15000 user_id_A

The ZADD command overwrites old scores with O(log N) complexity, providing fast, atomic ranking updates.
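To make the ZADD and ranking semantics concrete, here is a small in-memory stand-in for the two ZSET operations the consumer relies on; in production these would be real Redis commands issued through a client library, and the class name is illustrative.

```java
import java.util.HashMap;
import java.util.Map;

// In-memory stand-in for ZADD and ZREVRANK, the two ZSET operations the
// consumer and the ranking query depend on.
class LeaderboardZset {
    private final Map<String, Long> scores = new HashMap<>();

    // ZADD semantics: insert the member, or overwrite its previous score.
    void zadd(String member, long score) {
        scores.put(member, score);
    }

    // ZREVRANK semantics: 0-based rank, highest score ranked first.
    long zrevrank(String member) {
        long target = scores.get(member);
        return scores.values().stream().filter(s -> s > target).count();
    }
}
```

The overwrite behavior matters: a user reporting steps several times a day always carries their absolute daily total, so the latest write simply replaces the earlier one.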

Second Pillar – Query Path: Cold‑Hot Separation and In‑Memory Computation

Separate static relationship data ("cold") from dynamic step data ("hot").

Cold: Retrieve the user's friend‑ID list from a sharded MySQL cluster (reliable, low‑frequency updates).

Hot: Use Redis Pipeline or multithreaded batch reads to fetch all friends' scores in a single round‑trip, instead of issuing one call per friend (e.g., 200 sequential reads for a 200‑friend list).

After obtaining the scores, sort the small list in the service memory and enrich it with user profile info before returning to the client.
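The query path can be sketched as follows. The `scoreLookup` function stands in for one pipelined batch of ZSCORE calls (a single network round‑trip); the final sort happens in service memory because a friend list is small. The class name is illustrative.

```java
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

// Sketch of the friend-ranking query: batch-fetch scores, then sort the
// small result set in memory, highest step count first.
class FriendRankQuery {
    static List<Map.Entry<String, Long>> rank(List<String> friendIds,
                                              Function<String, Long> scoreLookup) {
        return friendIds.stream()
                .map(id -> Map.entry(id, scoreLookup.apply(id)))       // one batched lookup
                .sorted(Map.Entry.<String, Long>comparingByValue().reversed())
                .collect(Collectors.toList());                          // sorted in memory
    }
}
```

Profile enrichment (names, avatars) would happen on this sorted list just before the response is returned.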

Third Pillar – Handling Hot Users and Ensuring Reliability

For users with millions of friends, pre‑compute their rankings periodically (e.g., every minute) and cache the full sorted list in a dedicated Redis key. When the hot user queries, serve the cached result instantly.
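A minimal sketch of that pre-compute-and-cache pattern is below, using an in-memory map as a stand-in for Redis; the key prefix `hot_rank:` and the class name are illustrative.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.stream.Collectors;

// Sketch of hot-user handling: a background job pre-sorts the huge friend
// list and caches the result under a dedicated key, so the read path never
// pays the sort cost.
class HotUserCache {
    private final ConcurrentMap<String, List<String>> cache = new ConcurrentHashMap<>();

    // Invoked periodically (e.g. every minute) by a scheduler, not per request.
    void precompute(String userId, Map<String, Long> friendScores) {
        List<String> sorted = friendScores.entrySet().stream()
                .sorted(Map.Entry.<String, Long>comparingByValue().reversed())
                .map(Map.Entry::getKey)
                .collect(Collectors.toList());
        cache.put("hot_rank:" + userId, sorted);
    }

    // Query path: a cache hit with no sorting at request time.
    List<String> query(String userId) {
        return cache.getOrDefault("hot_rank:" + userId, List.of());
    }
}
```

The trade-off is staleness bounded by the refresh interval, which is acceptable for a social leaderboard.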

Reliability is achieved with dual insurance:

Redis runs in master‑slave + Sentinel mode for automatic failover.

All step messages remain in the MQ for days, allowing a recovery job to replay data and rebuild the leaderboard if the entire Redis cluster fails.
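The replay job is conceptually simple: each retained message carries the user's absolute daily total and ZADD overwrites, so re-applying the messages in their original order reproduces the final board. A minimal sketch, with a plain map standing in for the rebuilt ZSET:

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Sketch of disaster recovery: rebuild the leaderboard by replaying the
// step messages still retained in the MQ.
class LeaderboardReplay {
    // Each msg = {userId, steps}; later messages overwrite earlier ones,
    // mirroring ZADD's overwrite semantics.
    static Map<String, Long> rebuild(List<String[]> retainedMessages) {
        Map<String, Long> board = new HashMap<>();
        for (String[] msg : retainedMessages) {
            board.put(msg[0], Long.parseLong(msg[1]));
        }
        return board;
    }
}
```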

Summary – The Four‑Step Playbook

Async decoupling: use MQ to absorb write spikes.

Cold‑hot separation: MySQL for static relationships, Redis for real‑time scores.

Batched query path: fetch the friend list first, then batch‑read all scores in one round‑trip and sort in memory.

Reliability: HA Redis + MQ‑based replay for disaster recovery.

Mastering this architecture demonstrates a clear, scalable solution that impresses interviewers.

Tags: Redis, System Design, Kafka, High Concurrency, Leaderboard
Written by

ITPUB

Official ITPUB account sharing technical insights, community news, and exciting events.
