Hybrid Push‑Pull Timeline Architecture: Scaling Social Feeds for Billions
To serve billions of users with real‑time timelines, modern social platforms combine push‑based delivery for regular users and pull‑based retrieval for high‑profile accounts, employing hot‑cold separation, Kafka fan‑out, Redis caching, and scalable storage strategies to balance write and read loads.
1. Push vs. Pull Dilemma
Social‑media timelines must deliver new posts to billions of users quickly. A pure push model writes a post to every follower’s inbox at publish time, giving instant reads but causing massive write amplification for high‑profile users. A pure pull model builds a reader’s timeline by aggregating all followees’ outboxes at read time, keeping writes light but incurring heavy read amplification and latency.
Neither extreme scales for workloads that combine high‑profile accounts (“big V” users with millions of followers) and high concurrency; modern platforms instead adopt a hybrid push‑pull architecture with hot‑cold separation.
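The trade‑off can be made concrete with rough arithmetic. The sketch below is only illustrative: the function names are hypothetical and the follower and followee counts are assumed figures, not measurements.

```python
def push_writes_per_post(follower_count: int) -> int:
    """Pure push: one inbox write per follower at publish time."""
    return follower_count


def pull_reads_per_refresh(followee_count: int) -> int:
    """Pure pull: one outbox read per followee at refresh time."""
    return followee_count


# Assumed, illustrative figures.
ordinary_writes = push_writes_per_post(300)       # user with a few hundred followers
big_v_writes = push_writes_per_post(1_000_000)    # high-profile account
refresh_reads = pull_reads_per_refresh(500)       # reader following 500 accounts

print(ordinary_writes, big_v_writes, refresh_reads)
```

Under these assumptions a single big‑V post costs as many inbox writes as several thousand ordinary posts, while pure pull charges every refresh one read per followee.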
2. Hybrid Push‑Pull Architecture
Normal Users – Push Mode
Publish: User posts → system asynchronously pushes the post to each follower’s inbox.
Read: Followers read directly from their inboxes with millisecond latency.
Reasoning: Most users have a limited fan‑out (hundreds), so write pressure remains manageable.
High‑Profile Users (Big V) – Pull Mode
Publish: The post is stored only in the author’s outbox.
Read: When a follower refreshes, the system:
Loads normal‑user pushes from the follower’s inbox.
Pulls recent posts from the outboxes of followed big‑V accounts.
Merges the two streams and sorts by timestamp.
Benefit: Avoids write storms caused by millions of followers.
Challenge: Requires efficient aggregation and caching at read time.
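The two modes above can be sketched with in‑memory dictionaries standing in for the real stores. Everything here is an assumption made for illustration: the `BIG_V_THRESHOLD` cutoff, the function names, and the synchronous push (the article describes the push as asynchronous).

```python
from collections import defaultdict

BIG_V_THRESHOLD = 1_000_000  # hypothetical cutoff, not from the article

followers = defaultdict(set)  # author -> readers who follow the author
following = defaultdict(set)  # reader -> authors the reader follows
outbox = defaultdict(list)    # author -> [(timestamp, post_id)]
inbox = defaultdict(list)     # reader -> [(timestamp, post_id)]


def follow(reader, author):
    followers[author].add(reader)
    following[reader].add(author)


def is_big_v(author, threshold=BIG_V_THRESHOLD):
    return len(followers[author]) >= threshold


def publish(author, post_id, ts, threshold=BIG_V_THRESHOLD):
    # Every post lands in the author's outbox.
    outbox[author].append((ts, post_id))
    # Only normal users fan out to follower inboxes (push mode).
    if not is_big_v(author, threshold):
        for reader in followers[author]:
            inbox[reader].append((ts, post_id))


def read_timeline(reader, limit=10, threshold=BIG_V_THRESHOLD):
    # Start from pushed posts, then pull big-V outboxes at read time.
    # A set removes duplicates if an author crossed the threshold between posts.
    entries = set(inbox[reader])
    for author in following[reader]:
        if is_big_v(author, threshold):
            entries.update(outbox[author])
    # Newest first.
    return [post_id for _, post_id in sorted(entries, reverse=True)[:limit]]
```

Passing a small `threshold` makes the routing easy to try out locally, for example treating any author with two or more followers as a big V.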
3. Hot & Cold Timeline Separation
Hot Users (Active)
Definition: Users active within the last seven days.
Mechanism: All publish events are sent to a Kafka topic and fanned out asynchronously to the inboxes of active followers.
Result: Reads complete in milliseconds.
Cold Users (Inactive)
No persistent inbox is maintained.
On refresh, the system pulls the latest posts from followees’ outboxes, merges, and sorts in real time.
Low access frequency keeps the impact on system load negligible.
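A minimal sketch of the hot/cold split, assuming each follower has a last‑active timestamp available. The helper names and the `follower_last_active` mapping are hypothetical; only the seven‑day window comes from the text.

```python
from datetime import datetime, timedelta, timezone

ACTIVE_WINDOW = timedelta(days=7)  # "active within the last seven days"


def is_hot(last_active: datetime, now: datetime) -> bool:
    return now - last_active <= ACTIVE_WINDOW


def split_followers(follower_last_active: dict, now: datetime):
    """Return (hot, cold) follower id lists; only hot ones get inbox pushes."""
    hot, cold = [], []
    for user_id, last_active in follower_last_active.items():
        (hot if is_hot(last_active, now) else cold).append(user_id)
    return hot, cold
```

A fan‑out worker would write only to the `hot` list; followers in `cold` are served by the pull path on their next refresh.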
Special Handling for Big V
Posts are written to Kafka topic timeline_fanout.
Backend workers use delayed queues and batch processing to push updates to active followers in chunks, smoothing peak loads.
Inactive followers still retrieve content via the pull path.
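The chunked, delayed delivery for big‑V posts might be planned along these lines. This is a sketch under stated assumptions: `schedule_fanout`, the chunk size, and the delay step are hypothetical, and a real system would hand each job to a delayed queue rather than return a list.

```python
from itertools import islice


def chunked(items, size):
    """Yield consecutive lists of at most `size` items."""
    it = iter(items)
    while True:
        batch = list(islice(it, size))
        if not batch:
            return
        yield batch


def schedule_fanout(post_id, active_followers, chunk_size, delay_step_s):
    """Plan one delayed job per chunk: (delay_seconds, post_id, follower_batch)."""
    return [
        (index * delay_step_s, post_id, batch)
        for index, batch in enumerate(chunked(active_followers, chunk_size))
    ]
```

Spacing the chunks by a fixed delay spreads the inbox writes over time instead of issuing them all at the moment of publishing.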
4. Storage and Index Design
Outbox (User‑sent Posts)
Key: user_id:outbox
Structure: List or Sorted Set ordered by timestamp (descending).
Purpose: Stores IDs of posts authored by the user.
Index: Timestamp + weight.
Implementation: Redis for hot data, MySQL for cold archival.
Inbox (Received Posts)
Key: user_id:inbox
Structure: Sorted Set sorted by publish time or popularity.
Limit: Retains only the most recent N items (e.g., 5,000).
Features: High‑speed read cache with asynchronous expiration.
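The capped inbox can be illustrated with a small in‑memory stand‑in for a sorted set. The `CappedInbox` class is hypothetical; with Redis, the same behaviour would typically be built from `ZADD` followed by `ZREMRANGEBYRANK` to trim the lowest‑scored members, which is an assumption about the implementation rather than something the article states.

```python
class CappedInbox:
    """In-memory sketch of a sorted-set inbox that keeps the newest N post IDs."""

    def __init__(self, max_items=5000):
        self.max_items = max_items
        self._scores = {}  # post_id -> score (e.g. publish timestamp)

    def add(self, post_id, score):
        # Re-adding an existing post only updates its score, like a sorted set.
        self._scores[post_id] = score
        if len(self._scores) > self.max_items:
            # Evict the lowest-scored entry to stay within the cap.
            # O(n) here for brevity; a real sorted set does this efficiently.
            lowest = min(self._scores, key=self._scores.get)
            del self._scores[lowest]

    def latest(self, count):
        ranked = sorted(self._scores.items(), key=lambda kv: kv[1], reverse=True)
        return [post_id for post_id, _ in ranked[:count]]
```

Because adding the same post twice leaves a single entry, repeated deliveries of one post do not create duplicates in the inbox.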
5. Timeline Merging and Sorting Algorithm
The user’s feed is built from two sources: the inbox (push data) and the outboxes of followed big‑V accounts (pull data).
1. Retrieve the latest N post IDs from the inbox.
2. For each followed big‑V, fetch M recent post IDs from its outbox.
3. Insert all IDs into a heap ordered newest first, for example a min‑heap keyed by negated timestamp or weight.
4. Pop the top K entries from the heap to produce the homepage feed.
This read‑time merge is often referred to as “fan‑out‑on‑read” and can be parallelized as a lightweight streaming computation.
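A minimal sketch of the four merge steps using Python's `heapq`. Timestamps are negated so that the min‑heap pops the newest post first; the function name and the `(timestamp, post_id)` tuple layout are assumptions made for illustration.

```python
import heapq


def merge_timeline(inbox_entries, big_v_outboxes, k):
    """Merge pushed and pulled (timestamp, post_id) entries, newest first."""
    heap = []
    seen = set()
    # Inbox entries first, then each followed big V's recent outbox entries.
    for source in [inbox_entries, *big_v_outboxes]:
        for ts, post_id in source:
            if post_id not in seen:      # skip duplicates across sources
                seen.add(post_id)
                heap.append((-ts, post_id))
    heapq.heapify(heap)
    # Pop the K newest entries.
    return [heapq.heappop(heap)[1] for _ in range(min(k, len(heap)))]


feed = merge_timeline(
    inbox_entries=[(5, "a"), (1, "b")],
    big_v_outboxes=[[(4, "c"), (2, "d")], [(3, "e")]],
    k=3,
)
print(feed)  # ['a', 'c', 'e']
```

Since each source is already sorted by time in the storage design above, `heapq.merge` over the sources would be a streaming alternative that avoids materializing every candidate.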
6. Relationship Cache and Asynchronous Distribution
Relationship Cache: user_id → follow_list stored in Redis or a graph database.
Asynchronous Distribution Pipeline:
Publisher → Kafka → Consumer workers.
Each worker batches follower retrieval and pushes posts to their inboxes.
Writes are throttled, batched, and retried to guarantee stability.
When a big‑V posts, the system does not write to millions of inboxes immediately; it performs batch asynchronous pushes with delayed back‑fill to smooth spikes.
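The batching, throttling, and retry behaviour of a consumer worker could look roughly like this. It is a sketch, not the article's implementation: `deliver_with_retry`, the `write_batch` callback, and the default batch size, retry count, and pause are all hypothetical, and the Kafka consumer loop that would call it is omitted.

```python
import time


def deliver_with_retry(write_batch, post_id, follower_ids,
                       batch_size=500, max_retries=3, pause_s=0.0):
    """Push post_id to follower inboxes in batches; return batches that failed."""
    failed = []
    for start in range(0, len(follower_ids), batch_size):
        batch = follower_ids[start:start + batch_size]
        for attempt in range(1, max_retries + 1):
            try:
                write_batch(post_id, batch)  # e.g. a pipelined inbox write
                break
            except ConnectionError:
                if attempt == max_retries:
                    failed.append(batch)     # hand off for later back-fill
        if pause_s:
            time.sleep(pause_s)              # simple throttle between batches
    return failed
```

Retries mean a batch may be written more than once, so the inbox write should be idempotent; adding the same post ID to a sorted set twice satisfies that.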
7. Representative Performance Metrics
Daily Active Users: >100 M
Number of Big V users: ~10 k (each with >1 M followers)
Peak publish QPS: ~50 k requests/second
Timeline query latency for hot users: ~80 ms
Big V fan‑out propagation delay: <2 s (Kafka async)
Redis cache hit rate: >95 %
8. Design Principles and Optimization Ideas
Asynchronous Decoupling: Separate publishing from fan‑out; Kafka smooths traffic spikes.
Layered Caching: Redis for hot storage, MySQL/Elasticsearch for cold archiving.
Eventual Consistency: Accept a few seconds of staleness in exchange for overall system stability.
Dynamic Sharding: Partition data based on follower count to increase write concurrency.
Intelligent Ranking: Combine time, popularity, and interaction weight for ordering.
Gray‑scale Push: Phase‑wise delivery for big‑V posts to avoid avalanche effects.
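As one way to picture the intelligent‑ranking principle, the sketch below combines recency, popularity, and interaction weight into a single score. The formula, the half‑life, and the weights are entirely hypothetical; the article does not specify how the signals are combined.

```python
import math


def rank_score(age_seconds, likes, interactions_with_author,
               half_life_s=6 * 3600, w_popularity=1.0, w_interaction=2.0):
    """Hypothetical score: time decay multiplied by popularity and affinity."""
    recency = 0.5 ** (age_seconds / half_life_s)   # halves every half_life_s
    popularity = math.log1p(likes)                 # dampen very large counts
    boost = 1.0 + w_popularity * popularity + w_interaction * interactions_with_author
    return recency * boost
```

A score like this could replace the raw timestamp as the sorted‑set score or heap key, so the storage and merge steps stay unchanged.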
Ray's Galactic Tech