Designing a Billion‑User Social Graph for 1 Million QPS at JD
The article presents a complete backend design for JD's social‑relationship scenario at 1 million (100W) QPS, covering data modeling, MySQL sharding, Redis data structures, multi‑level caching, big‑key mitigation, write‑ and read‑path optimization, pagination, consistency guarantees, and query strategies for both "following" and "followers" lists.
Overall Architecture
Target: 1 million QPS for hundreds of millions of users. The design combines MySQL horizontal sharding (16 databases × 1024 tables), a Redis cluster (initially 16 shards, later expanded to 32), and a three‑level cache (local Caffeine, Redis, MySQL) to achieve sub‑millisecond latency.
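For orientation, a minimal routing sketch for the 16‑database × 1024‑table layout; the modulo scheme and the ShardRouter name are assumptions for illustration, not taken from the article:
// Route a userId to its database and table (assumed modulo scheme)
public final class ShardRouter {
    private static final int DB_COUNT = 16;
    private static final int TABLE_COUNT = 1024;

    public static String route(long userId) {
        long hash = userId & Long.MAX_VALUE;     // guard against negative ids
        int db = (int) (hash % DB_COUNT);        // which database
        int table = (int) (hash % TABLE_COUNT);  // which table within it
        return "db_" + db + ".user_relation_" + table;
    }
}
// e.g. ShardRouter.route(10086L) -> "db_6.user_relation_870"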
Core Data Model
Relationship semantics
Direction: following, follower (fan), bidirectional – each edge is stored as two redundant rows so either side can be read without a join.
Status: active, cancelled, blocked – soft delete keeps the history.
Time: create_time, update_time – supports temporal queries.
MySQL table
CREATE TABLE user_relation_ (
  user_id BIGINT NOT NULL COMMENT 'user A',
  target_id BIGINT NOT NULL COMMENT 'user B',
  relation_type TINYINT COMMENT '1 = A→B, 2 = B→A, 3 = bidirectional',
  state TINYINT COMMENT '0 = cancelled, 1 = active, 2 = blocked',
  create_time DATETIME,
  update_time DATETIME,
  PRIMARY KEY (user_id, target_id),
  KEY idx_target (target_id)
) ENGINE=InnoDB
PARTITION BY HASH(user_id) PARTITIONS 1024;
Redis structures
following:{userId} – Set of targetIds, never expires.
followers:{userId} – Set of fanIds, never expires.
cnt:following:{userId} – String counter, never expires, synced to MySQL asynchronously.
cnt:followers:{userId} – String counter, same policy.
Concurrent Write Path (Follow Example)
Atomic update via a Redis Lua script:
// NOTE: in Redis Cluster the four keys must share a hash slot (hash tags) for the script to run
String lua =
    "if redis.call('SISMEMBER', KEYS[1], ARGV[1]) == 1 then " +
    "  return 0 " +                               // already following: no-op
    "else " +
    "  redis.call('SADD', KEYS[1], ARGV[1]); " +  // following:{userId} += targetId
    "  redis.call('SADD', KEYS[2], ARGV[2]); " +  // followers:{targetId} += userId
    "  redis.call('INCR', KEYS[3]); " +           // bump both counters
    "  redis.call('INCR', KEYS[4]); " +
    "  return 1 " +
    "end";
Long ok = redis.execute(lua,
        Arrays.asList("following:" + userId, "followers:" + targetId,
                      "cnt:following:" + userId, "cnt:followers:" + targetId),
        targetId.toString(), userId.toString());
if (ok != null && ok == 1) {                      // null guard before unboxing
    rocketMQ.syncSend("follow-topic", new FollowEvent(userId, targetId, 1));
}
Concurrent Read Path
Cache‑penetration protection
// Bloom filter pre-loaded with all userIds (~1.2 GB for 10 billion bits)
if (!bloomFilter.mightContain(targetId)) {
    return false; // unknown user: reject before touching the cache
}
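A minimal Guava‑based sketch of building the filter, assuming roughly 700 M users at a 0.1 % false‑positive rate (which lands near the ~10 billion bits / 1.2 GB figure above); userIdSource, a stream of all user IDs loaded from MySQL, is hypothetical:
import com.google.common.hash.BloomFilter;
import com.google.common.hash.Funnels;

// ~700 M expected insertions at 0.1% FPP needs about 10^10 bits (~1.2 GB)
BloomFilter<Long> bloomFilter = BloomFilter.create(
        Funnels.longFunnel(), 700_000_000L, 0.001);
userIdSource.forEach(bloomFilter::put);  // hypothetical full preload of every userId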
Three-level cache
L1: local Caffeine, hit rate ≈ 70%, 5 s TTL for hot V users.
L2: Redis, cumulative hit rate ≈ 99%.
L3: MySQL, the 100% fallback, kept in sync via binlog replay.
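A sketch of the three-level lookup with back-fill on miss; redis (a Spring-style template) and userRelationMapper are assumed beans, and selectFollowing is an illustrative DAO method:
import com.github.benmanes.caffeine.cache.Cache;
import com.github.benmanes.caffeine.cache.Caffeine;
import java.time.Duration;
import java.util.Set;
import java.util.stream.Collectors;

// L1 (Caffeine) -> L2 (Redis) -> L3 (MySQL), back-filling Redis on a miss
private final Cache<String, Set<Long>> l1 = Caffeine.newBuilder()
        .maximumSize(100_000)
        .expireAfterWrite(Duration.ofSeconds(5))  // short TTL for hot V users
        .build();

public Set<Long> getFollowing(long userId) {
    String key = "following:" + userId;
    return l1.get(key, k -> {
        Set<String> cached = redis.opsForSet().members(k);            // L2
        if (cached != null && !cached.isEmpty()) {
            return cached.stream().map(Long::valueOf).collect(Collectors.toSet());
        }
        Set<Long> rows = userRelationMapper.selectFollowing(userId);  // L3
        rows.forEach(id -> redis.opsForSet().add(k, id.toString()));  // back-fill
        return rows;
    });
}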
Capacity & Sharding Strategy
Redis
16‑shard cluster, each shard 64 GB → total 1 TB.
When a V‑user exceeds 10 k fans, split the set into followers:{userId}:{shardId}, where shardId = fanId % 32.
MySQL
16 databases × 1024 tables = 16 384 tables.
Each table < 50 M rows, keeping B+ tree index depth ≤ 3.
Data older than 6 months migrated to TiDB/ODPS.
BigKey Risk & Mitigation
Detection
// Sample every 10 min; followers:{userId} is a Set, so use SCARD
if (redis.scard("followers:" + userId) > 10000) {
    alertBigKey(userId);
}
Mitigation steps
Hash‑based horizontal splitting into 32 buckets.
Read‑write routing per bucket (see the sketch after this list).
Asynchronous migration of hot keys.
Secondary dispersion to 256 micro‑buckets when needed.
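A hedged sketch of the per-bucket routing mentioned above; the helper name and client calls are illustrative:
// Route each fan to one of 32 buckets; reads and writes touch exactly one bucket per fan
String shardedFollowersKey(long userId, long fanId) {
    int bucket = (int) (fanId % 32);
    return "followers:" + userId + ":" + bucket;
}
// write: redis.opsForSet().add(shardedFollowersKey(userId, fanId), String.valueOf(fanId));
// membership check: one SISMEMBER against that single bucket, never the whole set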
Effect validation
Key size: one key with 10 k elements → 32 keys of ≈ 310 elements each.
Single‑call block: 50 ms → < 1 ms.
Failure rate: 5 % → 0.1 %.
High‑Concurrency Plan of Attack (1M QPS)
Read‑Write Separation
Write path: Redis master → RocketMQ (async) → MySQL master.
Read path: Local cache → Redis slave → MySQL slave (fallback).
Redis Cluster Scaling
Expand shards from 16 to 32, reducing per‑shard QPS from 62,500 to 31,250.
Isolate hot V users into 8 dedicated shards.
Configure each master with 2 slaves for read off‑loading.
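A sketch of how hot-V isolation could route keys; whether the 8 dedicated shards sit inside or beside the 32 general shards is not specified, so the sketch assumes a separate pool, and hotUsers is an assumed precomputed set of V-user ids:
// General users spread over 32 shards; hot V users pinned to 8 dedicated shards
int routeShard(long userId) {
    if (hotUsers.contains(userId)) {
        return 32 + (int) (userId % 8);   // shards 32..39 reserved for V users
    }
    return (int) (userId % 32);           // shards 0..31 for everyone else
}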
Batch Request Optimization
Batch follow‑status checks (≤20 targets) via pipelined SISMEMBER, and batch MySQL writes (≥100 events) in a consumer.
public List<Boolean> batchCheckFollow(Long userId, List<Long> targetIds) {
    // One round trip for up to 20 SISMEMBER calls
    List<Object> results = redis.pipelined(p -> {
        targetIds.forEach(t -> p.sIsMember("following:" + userId, t.toString()));
    }).get();
    return results.stream()
            .map(res -> Boolean.TRUE.equals(res))  // null-safe mapping
            .collect(Collectors.toList());
}

@RabbitListener(queues = "follow-event-queue")
public void batchProcessFollowEvent(List<FollowEvent> events) {
    Map<Long, List<FollowEvent>> groups = events.stream()
            .collect(Collectors.groupingBy(FollowEvent::getUserId));
    groups.forEach((uid, list) -> {
        List<FollowDO> dos = list.stream()
                .map(e -> new FollowDO(e.getUserId(), e.getTargetId(), e.getRelationType()))
                .collect(Collectors.toList());
        // Chunk inserts at 100 rows; the final partial chunk is flushed too, so no events are lost
        for (int i = 0; i < dos.size(); i += 100) {
            userRelationMapper.batchInsert(dos.subList(i, Math.min(i + 100, dos.size())));
        }
    });
}
Multi‑Level Cache Upgrade (supporting 500k QPS)
L1: Caffeine (requests routed by consistent hashing so a user always hits the same node) – hot V follow status/lists, TTL 30 s, target hit rate 40%.
L2: Redis slave – regular user data, no expiry, target hit rate 29.5%.
L3: MySQL slave – degraded-scenario data, target hit rate 0.5%.
Write‑Side Throttling & Asynchrony
The application updates Redis atomically and then publishes a RocketMQ message; the consumer batches up to 100 events or flushes every 100 ms before writing to MySQL (see the batching sketch after the code below).
public boolean asyncFollow(Long userId, Long targetId) {
    String lua = "if redis.call('SISMEMBER', KEYS[1], ARGV[1]) == 1 then return 0 else " +
            "redis.call('SADD', KEYS[1], ARGV[1]); " +
            "redis.call('SADD', KEYS[2], ARGV[2]); " +
            "redis.call('INCR', KEYS[3]); " +
            "redis.call('INCR', KEYS[4]); " +
            "return 1 end";
    Long result = redis.execute(lua,
            Arrays.asList("following:" + userId, "followers:" + targetId,
                          "cnt:following:" + userId, "cnt:followers:" + targetId),
            targetId.toString(), userId.toString());
    if (result != null && result == 1) {
        FollowEvent ev = new FollowEvent(userId, targetId, 1);
        // Partition by userId so one user's events stay ordered on the consumer side
        rocketMQ.syncSendOrderly("follow-topic", ev, String.valueOf(userId % 100));
        return true;
    }
    return false;
}
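A hedged sketch of the 100-events-or-100-ms flush on the consumer side; the buffering is illustrative (a background timer would also be needed to flush an idle buffer), and FollowDO/userRelationMapper are the names used earlier:
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;

private final List<FollowEvent> buffer = new ArrayList<>();
private long lastFlush = System.currentTimeMillis();

public synchronized void onFollowEvent(FollowEvent ev) {
    buffer.add(ev);
    // Flush on size (100 events) or age (100 ms), whichever comes first
    if (buffer.size() >= 100 || System.currentTimeMillis() - lastFlush >= 100) {
        List<FollowDO> rows = buffer.stream()
                .map(e -> new FollowDO(e.getUserId(), e.getTargetId(), e.getRelationType()))
                .collect(Collectors.toList());
        userRelationMapper.batchInsert(rows);  // one multi-row INSERT
        buffer.clear();
        lastFlush = System.currentTimeMillis();
    }
}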
Querying "My Follow" and "Followers"
Core model recap
following:{userId} – Set of targetIds.
followers:{userId} – Set of fanIds.
Query design
My Follow List: read following:{userId} from Redis, fallback to MySQL.
Followers List: read followers:{userId} (or its sharded keys) and merge.
Pagination & Cursor
// Cursor fields
lastId = the largest id returned on the previous page
size = page size (≤ 50)
Convert the Redis Set to a ZSet sorted by follow timestamp; the timestamp then serves as the cursor, supporting both forward and backward paging.
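A cursor-paging sketch over the timestamp ZSet, shown here with the Jedis client; the method name and client choice are assumptions, and members are targetIds scored by follow time:
import java.util.ArrayList;
import java.util.List;
import redis.clients.jedis.Jedis;

// Page "my follow" newest-first; pass the last returned score back as the next cursor
public List<String> pageFollowing(Jedis jedis, long userId, Double cursor, int size) {
    String key = "following:" + userId;
    String max = (cursor == null) ? "+inf" : "(" + cursor;  // "(" = exclusive bound
    return new ArrayList<>(
            jedis.zrevrangeByScore(key, max, "-inf", 0, Math.min(size, 50)));
}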
Consistency & Degradation
Cache miss → async back‑fill from MySQL → write back to Redis for next hit.
Incremental binlog sync keeps Redis within <1 s of MySQL.
Followers of Big V Users (Post‑Sharding)
After splitting a big V's follower set into 32 shards, a query must broadcast SMEMBERS to all 32 shards, merge the results, and paginate.
Sharded key space
followers:{userId}:0 // fanId % 32 = 0
followers:{userId}:1 // fanId % 32 = 1
...
followers:{userId}:31
Parallel retrieval
Use CompletableFuture to issue 32 SMEMBERS calls concurrently; total latency stays close to a single round trip (<1 ms). See the sketch below.
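A hedged sketch of the fan-out and merge; the redis template and the ioExecutor thread pool are assumed beans:
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.concurrent.CompletableFuture;
import java.util.stream.Collectors;

// Fan out one SMEMBERS per shard, then flatten and sort for cursor slicing
public List<Long> fetchAllFollowers(long userId) {
    List<CompletableFuture<Set<String>>> futures = new ArrayList<>();
    for (int shard = 0; shard < 32; shard++) {
        String key = "followers:" + userId + ":" + shard;
        futures.add(CompletableFuture.supplyAsync(
                () -> redis.opsForSet().members(key), ioExecutor));
    }
    return futures.stream()
            .flatMap(f -> f.join().stream())  // join() waits for each shard
            .map(Long::valueOf)
            .sorted()                         // stable order for cursor paging
            .collect(Collectors.toList());
}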
Merge & Pagination
Stream-merge the 32 sets, sort by fanId (or timestamp), then slice by the cursor.
Hole filling
If a shard returns fewer elements than the page size, fetch the missing rows from MySQL, write them back to the corresponding Redis shard, and include them in the current response. Subsequent pages will no longer have the hole.
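A hedged sketch of the back-fill; readShardPage and selectFollowerShard are hypothetical helpers standing in for the shard read and the MySQL lookup:
// Top up a short shard page from MySQL and repair the Redis shard in passing
List<Long> page = readShardPage(userId, shard, cursor, size);
if (page.size() < size) {
    List<Long> missing = userRelationMapper.selectFollowerShard(
            userId, shard, cursor, size - page.size());
    missing.forEach(fan -> redis.opsForSet()
            .add("followers:" + userId + ":" + shard, fan.toString()));  // write back
    page.addAll(missing);
}
return page;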
Summary
The design combines redundant MySQL records, hash-based Redis sharding, three-level caching, Bloom-filter protection, and asynchronous write pipelines to achieve sub-millisecond latency for follower queries and sustain 1 million QPS. BigKey risks are contained through hash splitting, micro-bucket dispersion, and proactive hole filling, keeping performance stable even for billion-scale social graphs.