Optimizing Social Media Feeds: Push vs Pull and Time‑Partitioned Pull Strategy

This article examines the push and pull models used by micro‑blogging platforms such as Twitter and Sina Weibo, analyzes their scalability challenges, and proposes a time‑partitioned pull approach that reduces database load while maintaining fast feed retrieval for active users.

21CTO

Dec 14, 2015

Social networking services such as Twitter, Sina Weibo, and Renren use a feed system where each post (a “feed”) must be delivered to followers. This article discusses the traditional push and pull architectures and introduces a time‑partitioned pull model.

In the push model, when a user publishes a post, the system writes a copy of that post into the feed table of every follower. For a celebrity with millions of followers, a single post therefore produces millions of rows, which leads to heavy storage use and write amplification.
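The write cost of the push model can be illustrated with a minimal sketch. It assumes in-memory dictionaries as stand-ins for the follower graph and the per-user feed tables; the names `followers`, `inboxes`, `publish_push`, and `read_timeline_push` are hypothetical and not taken from any real platform.

```python
from collections import defaultdict

# Hypothetical in-memory stand-ins for the follower graph and per-user feed tables.
followers = defaultdict(set)   # author_id -> set of follower ids
inboxes = defaultdict(list)    # user_id -> list of (timestamp, author_id, text)


def follow(follower_id, author_id):
    """Record that follower_id follows author_id."""
    followers[author_id].add(follower_id)


def publish_push(author_id, text, timestamp):
    """Fan out on write: copy the post into every follower's inbox.

    Returns the number of rows written, which equals the follower count.
    """
    rows_written = 0
    for follower_id in followers[author_id]:
        inboxes[follower_id].append((timestamp, author_id, text))
        rows_written += 1
    return rows_written


def read_timeline_push(user_id, limit=20):
    """Reading is cheap: the inbox already holds the user's timeline."""
    return sorted(inboxes[user_id], reverse=True)[:limit]


# One author with 1000 followers publishes a single post.
for fan in range(1000):
    follow(fan, "celebrity")

rows = publish_push("celebrity", "hello", timestamp=1)
print(rows)  # number of rows written for one post
```

The row count grows linearly with the number of followers, which is the write amplification described above.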

The pull model stores each new post only once in a central feed table. When a user requests their timeline, the system queries the feed table for the IDs of posts from the users they follow, often using a cache such as Memcached. While this reduces write load, the feed table can become a bottleneck under heavy read traffic, especially for users with many followees.
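For comparison, the pull model might be sketched as follows. This is an illustration under assumptions: a Python list stands in for the central feed table, a linear scan stands in for an indexed database query, and the cache layer is omitted. The names `following`, `feed_table`, `publish_pull`, and `read_timeline_pull` are hypothetical.

```python
from collections import defaultdict

# Hypothetical in-memory stand-ins; a real system would use a database table.
following = defaultdict(set)  # user_id -> set of followee ids
feed_table = []               # central table: (timestamp, author_id, text)


def follow(user_id, author_id):
    """Record that user_id follows author_id."""
    following[user_id].add(author_id)


def publish_pull(author_id, text, timestamp):
    """Write the post exactly once, regardless of follower count."""
    feed_table.append((timestamp, author_id, text))
    return 1  # rows written


def read_timeline_pull(user_id, limit=20):
    """Fan out on read: filter the central table by the user's followees."""
    followees = following[user_id]
    matches = [row for row in feed_table if row[1] in followees]
    return sorted(matches, reverse=True)[:limit]


follow("alice", "bob")
follow("alice", "carol")
publish_pull("bob", "first", timestamp=1)
publish_pull("dave", "not followed", timestamp=2)
publish_pull("carol", "second", timestamp=3)

timeline = read_timeline_pull("alice")
print(timeline)
```

Each post costs one write, but every timeline read has to touch the shared table, which is where the read bottleneck comes from.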

To improve the pull model, the article proposes a time‑partitioned pull strategy. The feed table is divided into partitions based on time intervals (e.g., the last hour, the last day, longer periods). When a user logs in, the system first checks the most recent partition; subsequent requests only need to query the partition that covers the time since the user's last retrieval, which greatly reduces the amount of data scanned.
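A minimal sketch of the time‑partitioned idea follows. It makes two simplifying assumptions that are not in the original: partitions are fixed one-hour buckets rather than tiers of growing size, and the system remembers a per-user `last_seen` timestamp from the previous retrieval. The names `partitions`, `bucket_of`, `publish`, and `read_new_posts` are hypothetical.

```python
from collections import defaultdict

PARTITION_SECONDS = 3600  # assumed partition width: one hour

following = defaultdict(set)    # user_id -> set of followee ids
partitions = defaultdict(list)  # bucket number -> list of (timestamp, author_id, text)


def bucket_of(timestamp):
    """Map a Unix-style timestamp (seconds) to its partition number."""
    return timestamp // PARTITION_SECONDS


def publish(author_id, text, timestamp):
    """Store the post once, in the partition that covers its timestamp."""
    partitions[bucket_of(timestamp)].append((timestamp, author_id, text))


def read_new_posts(user_id, last_seen, now):
    """Return posts newer than last_seen, plus the number of partitions scanned.

    Only partitions between the user's last retrieval and now are touched.
    """
    followees = following[user_id]
    posts = []
    scanned = 0
    for bucket in range(bucket_of(last_seen), bucket_of(now) + 1):
        scanned += 1
        for row in partitions.get(bucket, []):
            if row[0] > last_seen and row[1] in followees:
                posts.append(row)
    return sorted(posts, reverse=True), scanned


following["alice"].add("bob")
publish("bob", "old post", timestamp=100)
publish("bob", "new post", timestamp=10 * 3600 + 5)
now = 10 * 3600 + 60

# An active user who last checked at the start of the current hour.
active_posts, active_scanned = read_new_posts("alice", last_seen=10 * 3600, now=now)

# A returning user who has not checked since time 0.
stale_posts, stale_scanned = read_new_posts("alice", last_seen=0, now=now)

print(active_scanned, stale_scanned)
```

In this sketch the active user scans a single partition, while the returning user scans every partition since the last visit. Tiered partitions, as the article describes, would bound that second case further.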

This approach leverages the observation that active users frequently access recent data, so most queries hit small, recent partitions, while infrequent users may need to scan larger, older partitions only occasionally. The partitioning scheme can be tuned based on data volume and access patterns.

The time‑partitioned pull model can be combined with push techniques for certain scenarios, offering a cost‑effective solution that balances write and read performance.

Original article: http://www.cnblogs.com/sunli/archive/2010/08/24/twitter_feeds_push_pull.html
