Backend Storage Design for Timeline Feed Streams: Read Diffusion, Write Diffusion, and Hybrid Approaches
Backend storage for timeline feeds can use read‑diffusion (pull), write‑diffusion (push), or a hybrid of the two, each with trade‑offs in read/write load, scalability, and storage. Pagination should rely on last_id rather than page numbers, with snapshots added for feeds whose ordering depends on state, such as live‑streaming feeds.
Feed streams are ubiquitous in mobile apps (e.g., WeChat Moments, Weibo, Toutiao). A timeline‑based feed is usually ordered by the posting time of the users one follows. This article summarizes common backend storage designs for such feeds and discusses how to choose a flexible solution based on concrete business scenarios.
1. Background
Two typical feed types exist: algorithm‑recommended and follow‑based. The focus here is on the follow‑based feed and its backend implementation.
2. Feed‑flow implementation schemes
Scheme 1 – Read Diffusion (Pull Model)
Each content creator has an outbox. When a reader requests the feed, the system fetches the list of followees, iterates over their outboxes, merges the posts, and sorts them by timestamp. This results in one read request for the reader but N read operations on the storage (N = number of followees). Advantages: simple storage, no data duplication. Disadvantages: heavy read load, poor scalability when followee count is large, and pagination is difficult.
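A minimal sketch of the pull model, assuming in-memory dictionaries in place of real storage. The names FOLLOWEES, OUTBOXES, and pull_feed are hypothetical and only illustrate the N-reads-then-merge pattern described above.

```python
import heapq
from itertools import islice

# Hypothetical stand-ins for the follow graph and the per-author outboxes.
# Each outbox holds (timestamp, post_id) tuples sorted newest first.
FOLLOWEES = {"reader": ["alice", "bob"]}
OUTBOXES = {
    "alice": [(105, "a2"), (101, "a1")],
    "bob": [(103, "b1")],
}

def pull_feed(reader, limit):
    """Merge the outboxes of everyone `reader` follows, newest first."""
    # One storage read per followee: N reads for N followees.
    outboxes = [OUTBOXES.get(author, []) for author in FOLLOWEES.get(reader, [])]
    # k-way merge of lists that are already sorted in descending order.
    merged = heapq.merge(*outboxes, reverse=True)
    return [post_id for _, post_id in islice(merged, limit)]

print(pull_feed("reader", 10))  # ['a2', 'b1', 'a1']
```

The merge cost grows with the number of followees, which is the scalability problem noted above.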
Scheme 2 – Write Diffusion (Push Model)
When a user posts, the system writes the post to the author’s outbox and also pushes a copy to the inbox of each follower. Readers then read directly from their own inboxes. This turns one post into M inbox writes (M = number of followers) in exchange for a single read per feed request. Advantages: fast reads, simple pagination. Disadvantages: high write amplification, storage waste, and infeasible for users with massive follower counts (e.g., celebrities).
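The push model can be sketched the same way. This is an illustration under assumed in-memory structures; FOLLOWERS, publish, and read_inbox are hypothetical names, not part of the article.

```python
from collections import defaultdict

FOLLOWERS = {"alice": ["r1", "r2", "r3"]}  # hypothetical graph: author -> followers
outboxes = defaultdict(list)               # author -> [(timestamp, post_id)]
inboxes = defaultdict(list)                # reader -> [(timestamp, post_id)]

def publish(author, post_id, timestamp):
    """Store the post once for the author, then copy it to every follower's inbox."""
    outboxes[author].append((timestamp, post_id))
    fanout = FOLLOWERS.get(author, [])
    for follower in fanout:
        inboxes[follower].append((timestamp, post_id))
    return 1 + len(fanout)  # 1 outbox write + M inbox writes

def read_inbox(reader, limit):
    """A read touches only the reader's own inbox."""
    newest_first = sorted(inboxes[reader], reverse=True)
    return [post_id for _, post_id in newest_first[:limit]]
```

With three followers, publish("alice", "p1", 100) performs four writes, and each follower then reads the post with a single inbox lookup.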
Scheme 3 – Hybrid (Read‑Write Mix)
Combines the strengths of both models: read diffusion for scenarios with few followees, write diffusion for active users with many followers. The article provides a comparison table of pros, cons, and suitable scenarios.
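The summary does not spell out the exact split rule, so the following sketch shows only one possible reading: posts are pushed to followers flagged as active, while other readers pull from outboxes at read time. The ACTIVE set and all function names are assumptions made for illustration.

```python
from collections import defaultdict

FOLLOWERS = {"star": ["ann", "ben", "cat"]}                      # author -> followers
FOLLOWEES = {"ann": ["star"], "ben": ["star"], "cat": ["star"]}  # reader -> followees
ACTIVE = {"ann"}                        # hypothetical flag: readers worth pushing to

outboxes = defaultdict(list)            # author -> [(timestamp, post_id)]
inboxes = defaultdict(list)             # reader -> [(timestamp, post_id)]

def publish(author, post_id, timestamp):
    """Write to the outbox, then fan out to active followers only."""
    outboxes[author].append((timestamp, post_id))
    pushed = [f for f in FOLLOWERS.get(author, []) if f in ACTIVE]
    for follower in pushed:
        inboxes[follower].append((timestamp, post_id))
    return len(pushed)                  # number of inbox writes performed

def read_feed(reader, limit):
    """Active readers read their inbox; everyone else pulls from outboxes."""
    if reader in ACTIVE:
        posts = inboxes[reader]
    else:
        posts = [p for author in FOLLOWEES.get(reader, []) for p in outboxes[author]]
    return [post_id for _, post_id in sorted(posts, reverse=True)[:limit]]
```

Here a post from "star" costs one inbox write instead of three, and inactive readers pay the merge cost only when they actually open the feed.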
4. Pagination Issue
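Traditional page_number/page_size pagination suffers from data drift: when new posts arrive between page requests, every item shifts down and the next page repeats items already shown. The recommended solution is to pass last_id (the ID of the last item on the previous page) instead of a page number, so each page resumes from a fixed position in the feed rather than from a shifting offset. The article also discusses the need for soft deletes, so that a last_id pointing at a deleted post can still be resolved.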
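The difference between the two pagination styles can be shown with a small sketch. It assumes IDs grow with posting time and that the feed is sorted newest first; page_after is a hypothetical helper name.

```python
def page_after(feed_ids, last_id, page_size):
    """Return up to `page_size` IDs strictly older than `last_id`.

    `feed_ids` is assumed sorted newest first, with IDs that grow over time.
    Pass last_id=None for the first page.
    """
    if last_id is None:
        return feed_ids[:page_size]
    return [post_id for post_id in feed_ids if post_id < last_id][:page_size]

feed = [6, 5, 4, 3, 2, 1]      # post 6 arrived after page 1 ([5, 4]) was served
print(feed[2:4])               # offset paging, page 2: [4, 3] (repeats 4)
print(page_after(feed, 4, 2))  # last_id paging: [3, 2]
```

Because this variant compares IDs instead of looking up the row for last_id, it happens to tolerate a missing last_id. A design that first locates the last_id row would need the soft delete mentioned above.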
5. Real‑world Business Application – Live‑Streaming Feed
The author describes a live‑streaming product where each broadcast has three states (upcoming, live, replay). The feed ordering rules are:
Live streams first, then upcoming, then replay.
Within live streams, order by start time descending.
Within upcoming streams, order by scheduled start time ascending.
Within replay streams, order by end time descending.
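These four rules can be expressed as a single sort key. The sketch below assumes a hypothetical Broadcast record whose start_time holds the actual start for live streams and the scheduled start for upcoming ones; the field names are illustrative only.

```python
from dataclasses import dataclass

STATE_RANK = {"live": 0, "upcoming": 1, "replay": 2}

@dataclass
class Broadcast:
    id: str
    state: str           # "live", "upcoming", or "replay"
    start_time: int = 0  # actual start (live) or scheduled start (upcoming)
    end_time: int = 0    # only meaningful for replay

def sort_key(b):
    """Rank by state first, then apply the per-state time rule."""
    if b.state == "live":
        return (STATE_RANK["live"], -b.start_time)      # newest start first
    if b.state == "upcoming":
        return (STATE_RANK["upcoming"], b.start_time)   # soonest first
    return (STATE_RANK["replay"], -b.end_time)          # most recently ended first

def order_feed(broadcasts):
    return sorted(broadcasts, key=sort_key)
```

Since the key depends on the current state, any precomputed ordering becomes stale as soon as a broadcast moves from upcoming to live or from live to replay.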
Because state changes affect ordering, a pure write‑diffusion approach is unsuitable for the “upcoming” and “live” states. The solution mixes read diffusion for those states and write diffusion for the replay state.
To handle pagination across state changes, the system creates a snapshot of live and upcoming items when the first page is requested (identified by session_id). Subsequent pages read from the snapshot until exhausted, then fall back to the replay queue. The request parameters include session_id, last_id, state, and page_size.
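A rough sketch of this snapshot flow follows, under several assumptions: snapshots live in an in-memory dictionary, both queues are passed in already ordered, and state is simplified to two hypothetical values ("snapshot" and "replay") that record which queue the last returned item came from. None of these names come from the article.

```python
import uuid

SNAPSHOTS = {}  # session_id -> frozen IDs of live + upcoming items (hypothetical store)

def _after(ids, last_id):
    """Slice of `ids` following `last_id`; the whole list if last_id is None."""
    return ids if last_id is None else ids[ids.index(last_id) + 1:]

def get_page(live_upcoming_ids, replay_ids, page_size,
             session_id=None, last_id=None, state="snapshot"):
    if session_id is None:
        # First page: freeze the state-dependent part of the feed.
        session_id = uuid.uuid4().hex
        SNAPSHOTS[session_id] = list(live_upcoming_ids)

    items = []
    if state == "snapshot":
        items = _after(SNAPSHOTS[session_id], last_id)[:page_size]
        replay_cursor = None      # replay queue not started yet
    else:
        replay_cursor = last_id   # resume inside the replay queue
    if len(items) < page_size:
        # Snapshot exhausted: top up from the replay queue.
        tail = _after(replay_ids, replay_cursor)[:page_size - len(items)]
        if tail:
            state = "replay"
        items = items + tail

    return {
        "items": items,
        "session_id": session_id,
        "last_id": items[-1] if items else last_id,
        "state": state,
    }
```

The client echoes session_id, last_id, and state from each response into the next request. In this sketch a last_id that no longer exists raises ValueError, which is the situation soft deletes are meant to avoid. The sketch also does not deduplicate a broadcast that ends mid-session and therefore appears in both the snapshot and the replay queue.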
6. Summary
Read diffusion, write diffusion, and hybrid models cover most timeline‑based feed designs. Real‑world scenarios may introduce additional complexities such as state‑driven ordering, advertisements, or special‑interest feeds, which require tailored adaptations.
Tencent Cloud Developer