Designing and Optimizing a High‑Concurrency Counting System for Large‑Scale Social Media
This article examines the challenges of storing and serving trillion-scale count data for a massive micro-blogging platform. It walks through a backend design that evolves step by step from simple MySQL tables to sharded databases, Redis caching, message-queue write batching, and memory-efficient custom data structures, achieving high performance, low cost, and strong consistency.
In the context of a large micro‑blogging service, count metrics such as reposts, comments, likes, and views reach trillions in volume and millions of queries per second, demanding a backend that can handle massive data size, high read/write concurrency, and strict accuracy.
Business characteristics of the counts include:
Huge data volume: billions of posts and over a billion users generate count values in the trillions.
Massive access traffic: daily active users exceed 200 million, leading to read rates of millions of queries per second.
Strong consistency requirements: users are highly sensitive to count fluctuations.
Initial design used a single MySQL table with columns for each metric, e.g.:
select repost_count, comment_count, praise_count, view_count from t_weibo_count where weibo_id = ?

While simple, this approach does not scale once the table grows to tens of millions of rows.
Scaling with sharding introduced two strategies:
Hash‑based sharding on weibo_id.
Time‑based sharding using the timestamp embedded in the ID.
Hash sharding was chosen because time‑based sharding concentrates reads and writes on the partitions holding the newest posts, creating hotspots; hashing spreads the load evenly across all shards.
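The hash‑routing idea can be sketched as follows (the shard count and table‑name format here are illustrative assumptions, not from the source):

```python
NUM_SHARDS = 64  # assumed shard count for illustration

def shard_for(weibo_id: int) -> str:
    """Route a counter row to a shard table by hashing the post ID,
    so load spreads evenly instead of concentrating on recent IDs."""
    return f"t_weibo_count_{weibo_id % NUM_SHARDS}"
```

Because IDs are effectively random with respect to the modulus, hot new posts and cold old posts end up mixed across every shard, avoiding the newest‑partition hotspot.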
Read‑side acceleration moved from MySQL to a Redis cluster with hash‑based partitioning. A pure cache‑plus‑DB design, however, left consistency gaps between the cache and the database, so the system eventually adopted Redis as the primary store.
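The intermediate cache‑plus‑DB stage can be sketched as a classic cache‑aside read (plain dicts stand in for Redis and MySQL; the function name is illustrative):

```python
def get_count(cache: dict, db: dict, weibo_id: int) -> int:
    """Cache-aside read: serve from the cache if present, otherwise
    read the database and write the value back to the cache."""
    if weibo_id in cache:
        return cache[weibo_id]
    value = db[weibo_id]      # fall through to the database on a miss
    cache[weibo_id] = value   # write-back; can race with concurrent updates
    return value
```

The race in the write‑back step (a concurrent increment landing between the DB read and the cache set) is exactly the kind of consistency gap that pushed the design toward Redis as the primary store.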
Write‑side pressure reduction employed a message queue to batch updates. Example SQL updates:
UPDATE t_weibo_count SET repost_count = repost_count + 1 WHERE weibo_id = 1;
UPDATE t_weibo_count SET repost_count = repost_count + 1 WHERE weibo_id = 1;
UPDATE t_weibo_count SET repost_count = repost_count + 1 WHERE weibo_id = 1;

These can be merged into a single statement:

UPDATE t_weibo_count SET repost_count = repost_count + 3 WHERE weibo_id = 1;

Storage cost reduction involved customizing Redis:
Store keys as raw 8‑byte longs instead of strings, saving ~20 bytes per key.
Eliminate unnecessary pointers, reducing a KV entry to 12 bytes (8‑byte ID + 4‑byte count).
Further memory savings were achieved by packing multiple counters for the same post into a single record, cutting the per‑post overhead from 48 bytes to 12 bytes.
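The packed‑record idea can be sketched with Python's struct module (the field order and 4‑byte counter widths are assumptions for illustration):

```python
import struct

# One record per post: an 8-byte post ID followed by four 4-byte
# counters (repost, comment, praise, view), little-endian.
RECORD = struct.Struct("<qiiii")

def pack_counts(weibo_id, repost, comment, praise, view):
    """Serialize all four counters behind one shared 8-byte key."""
    return RECORD.pack(weibo_id, repost, comment, praise, view)

def unpack_counts(buf):
    """Recover (weibo_id, repost, comment, praise, view) from a record."""
    return RECORD.unpack(buf)
```

Sharing one key and one record across all four metrics avoids repeating the 8‑byte ID and the per‑entry bookkeeping once per metric, which is where the per‑post savings come from.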
Because recent posts are hot, a hybrid approach stores recent counters in memory and dumps older data to SSD, loading cold data on demand via a separate I/O thread and a “Cold Cache”.
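The hot/cold split can be sketched as follows (class and method names are hypothetical; in the real system the SSD load happens on a separate I/O thread rather than inline):

```python
class TieredCounterStore:
    """Recent counters stay in memory; older ones are loaded from SSD
    on demand and kept in a small cold cache."""

    def __init__(self, ssd_reader):
        self.hot = {}                 # counters for recent, hot posts
        self.cold_cache = {}          # recently loaded cold entries
        self.ssd_reader = ssd_reader  # callable: weibo_id -> count from SSD

    def get(self, weibo_id):
        if weibo_id in self.hot:
            return self.hot[weibo_id]
        if weibo_id not in self.cold_cache:
            # miss on both tiers: pull the old counter up from SSD
            self.cold_cache[weibo_id] = self.ssd_reader(weibo_id)
        return self.cold_cache[weibo_id]
```

Because access skews heavily toward recent posts, most reads never leave memory, and the SSD path only pays its latency cost for the long tail of old posts.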
After applying these optimizations—sharding, Redis‑only storage, batch writes, custom data structures, and tiered caching—the counting service can sustain high concurrency, massive data volume, and the required accuracy while keeping storage costs manageable.
Overall, the case study demonstrates that understanding system pain points (e.g., storage cost) and iteratively applying targeted backend engineering techniques leads to a robust, scalable counting infrastructure.
Architecture Digest
Focusing on Java backend development, covering application architecture from top-tier internet companies (high availability, high performance, high stability), big data, machine learning, Java architecture, and other popular fields.