Design and Evolution of Feed Stream Architecture for High‑Throughput Applications

This article analyzes the business requirements, technical challenges, and mainstream architectural solutions for large‑scale feed streams, and proposes a step‑by‑step evolution path—from a simple push model using cloud Kafka and HBase to hybrid push‑pull and recommendation‑driven designs—suitable for startups and rapidly growing platforms.

Big Data Technology & Architecture

Sep 23, 2019

Background – Feed streams (information flows) connect content producers and consumers in scenarios such as e‑commerce homepages, social circles, micro‑blogs, and news feeds. The author, a big‑data engineer involved in Alibaba's Taobao and Weitao feed storage services, shares insights from building HBase/Lindorm‑based storage for popular media platforms.

Business Analysis – Three participants (information producers, consumers, and the platform) have distinct needs: rich content formats, low latency, reliability, high engagement, and monetization. Feed delivery can be relationship‑based or recommendation‑based; the latter relies on user and item profiling.

Technical Challenges – Real‑world feed systems handle billions of messages per day and millions of queries per second (QPS). They also require sub‑second response times, multi‑level caching, high availability, and cost‑effectiveness.

Mainstream Architecture

1. Message and Relationship Storage – Messages are semi‑structured (text, images, video, metadata). The author recommends HBase for its support of structured and semi‑structured data, high write throughput, smooth horizontal scaling, block cache for millisecond reads, TTL for data expiration, and hot‑cold data separation.
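As a rough illustration of how an inbox table might use these properties, the sketch below builds a row key that sorts a user's newest messages first and mimics TTL expiry in plain Python. The key layout (`user_id`, reversed timestamp, `msg_id`) and the helper names are assumptions made for this example, not a schema taken from the article. In HBase itself, TTL is configured on the column family rather than checked in application code.

```python
import struct

MAX_TS = 2**63 - 1  # largest signed 64-bit integer


def inbox_row_key(user_id: str, ts_ms: int, msg_id: str) -> bytes:
    """Hypothetical inbox row key: user id, reversed timestamp, message id.

    Subtracting the timestamp from MAX_TS makes newer messages sort first
    under byte-wise (lexicographic) ordering, which is how HBase orders rows.
    """
    reversed_ts = MAX_TS - ts_ms
    return (
        user_id.encode("utf-8")
        + b"|"
        + struct.pack(">q", reversed_ts)  # big-endian keeps numeric order
        + b"|"
        + msg_id.encode("utf-8")
    )


def is_expired(ts_ms: int, now_ms: int, ttl_ms: int) -> bool:
    """Mimics the effect of a TTL: cells older than ttl_ms are dropped."""
    return now_ms - ts_ms > ttl_ms


def newest_first(keys):
    """Sorting the raw keys reproduces the order a row scan would return."""
    return sorted(keys)
```

With this layout, a scan over the `user_id` prefix returns the latest feed entries first, so a feed page can be served by reading only the first few rows.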

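2. Push vs. Pull vs. Hybrid – Push (fan‑out on write) copies each message into every follower’s inbox, which gives low read latency at the price of high storage cost. Pull (fan‑out on read) assembles the feed at read time from the outboxes of the producers a user follows, which saves storage but requires sophisticated caching to keep reads fast. A hybrid approach pushes to active users and lets inactive users pull, so storage is not wasted on inboxes that are rarely read.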
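The three delivery models can be sketched with in-memory dictionaries standing in for the inbox and outbox tables. This is a minimal illustration under stated assumptions: the `FeedStore` class, its method names, and the `active` set used to decide who receives pushes are all hypothetical, and a real system would derive activity from login or read history.

```python
from collections import defaultdict


class FeedStore:
    """Toy hybrid feed: push to active followers, pull for everyone else."""

    def __init__(self):
        self.followers = defaultdict(set)  # producer -> follower ids
        self.following = defaultdict(set)  # follower -> producer ids
        self.outbox = defaultdict(list)    # producer -> [(ts, msg)]
        self.inbox = defaultdict(list)     # follower -> [(ts, msg)]
        self.active = set()                # assumed activity flag per user

    def follow(self, follower, producer):
        self.followers[producer].add(follower)
        self.following[follower].add(producer)

    def publish(self, producer, ts, msg):
        # Every message is written once to the producer's outbox.
        self.outbox[producer].append((ts, msg))
        # Push: copy only into the inboxes of active followers.
        for follower in self.followers[producer]:
            if follower in self.active:
                self.inbox[follower].append((ts, msg))

    def read_feed(self, user, limit=10):
        if user in self.active:
            # Active users read their precomputed inbox.
            items = self.inbox[user]
        else:
            # Pull: merge the outboxes of everyone the user follows.
            items = [m for p in self.following[user] for m in self.outbox[p]]
        return [msg for _, msg in sorted(items, reverse=True)[:limit]]
```

Marking every user as active turns this into the pure push model, and leaving `active` empty gives pure pull. The sketch does not backfill the inbox of a user who becomes active later, which a real hybrid design would have to handle.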

3. Recommendation‑Driven Delivery – Adds user/item profiling stored in HBase, a temporary inbox for filtering low‑quality content, and search capabilities (Solr/Elasticsearch) for both inbox and outbox.
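One way to picture the temporary inbox is as a staging list whose entries are promoted to the real inbox only if they pass a quality check. The sketch below assumes each candidate carries a `quality` score between 0.0 and 1.0 produced by item profiling. The score, the threshold, and the `promote` function are hypothetical and only illustrate the filtering step.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    msg_id: str
    quality: float  # hypothetical score from item profiling, 0.0 to 1.0


def promote(temp_inbox, threshold=0.5, limit=3):
    """Return the ids of candidates that survive filtering, best first."""
    kept = [c for c in temp_inbox if c.quality >= threshold]
    kept.sort(key=lambda c: c.quality, reverse=True)
    return [c.msg_id for c in kept[:limit]]
```

For example, `promote([Candidate("a", 0.9), Candidate("b", 0.2), Candidate("c", 0.6)])` keeps `a` and `c` and drops the low-scoring `b`.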

Iteration Path for Start‑ups

Initial stage: cloud Kafka + cloud HBase, pure push, and asynchronous processing, with each message split into a body and an index.
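The initial stage can be sketched as follows, with a `queue.Queue` standing in for the Kafka topic and dictionaries standing in for HBase tables. The sketch assumes that "body and index" means the body is stored once and each inbox holds only message ids that point to it. That reading, along with all names below, is an assumption made for illustration.

```python
import queue
import threading

bodies = {}                          # msg_id -> body, stored once
inboxes = {}                         # user -> list of msg_ids (index entries)
followers = {"shop": ["u1", "u2"]}   # hypothetical follow relationships
events = queue.Queue()               # stands in for a Kafka topic


def publish(producer, msg_id, body):
    """Write the body once, then hand fan-out to the background worker."""
    bodies[msg_id] = body
    events.put((producer, msg_id))


def fanout_worker():
    """Consume publish events and append index entries to follower inboxes."""
    while True:
        item = events.get()
        if item is None:             # sentinel: stop the worker
            events.task_done()
            break
        producer, msg_id = item
        for user in followers.get(producer, []):
            inboxes.setdefault(user, []).append(msg_id)
        events.task_done()


def read_inbox(user):
    """Resolve index entries back to message bodies."""
    return [bodies[m] for m in inboxes.get(user, [])]


def demo():
    worker = threading.Thread(target=fanout_worker, daemon=True)
    worker.start()
    publish("shop", "m1", "New arrivals this week")
    events.join()                    # wait until fan-out has been applied
    events.put(None)                 # ask the worker to stop
    worker.join()
    return read_inbox("u1"), len(bodies)
```

Because `publish` returns as soon as the event is queued, the producer's request does not wait for the fan-out, and each inbox row stays small because it carries an id rather than the full body.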

Mid‑stage: add multi‑level queues to absorb traffic spikes from “big‑V” accounts (verified users with very large followings), separate hot and cold data, and add secondary indexes for inbox search.
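One plausible reading of multi-level queues is that fan-out work from big-V accounts goes to a separate, lower-priority queue and is split into batches, so that a single post with a huge audience cannot delay delivery for ordinary users. The sketch below illustrates that reading only. The threshold, batch size, and function names are hypothetical, and real cutoffs would be far larger.

```python
from collections import deque

BIG_V_THRESHOLD = 3   # hypothetical follower count that marks a big-V account
BATCH_SIZE = 2        # hypothetical number of followers per fan-out task

normal_q = deque()    # fan-out tasks for ordinary producers
big_v_q = deque()     # batched fan-out tasks for big-V producers


def enqueue(msg_id, follower_ids):
    """Route a publish event to a queue based on audience size."""
    if len(follower_ids) >= BIG_V_THRESHOLD:
        # Split the large audience into small batches.
        for i in range(0, len(follower_ids), BATCH_SIZE):
            big_v_q.append((msg_id, follower_ids[i:i + BATCH_SIZE]))
    else:
        normal_q.append((msg_id, follower_ids))


def drain(inboxes):
    """Process tasks, always preferring the normal queue over big-V batches."""
    processed = []
    while normal_q or big_v_q:
        q = normal_q if normal_q else big_v_q
        msg_id, batch = q.popleft()
        for user in batch:
            inboxes.setdefault(user, []).append(msg_id)
        processed.append(msg_id)
    return processed
```

If a big-V post `s1` with four followers is enqueued before an ordinary post `f1`, `drain` still delivers `f1` first and then works through the two `s1` batches.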

Growth stage: adopt a push‑pull hybrid, integrate a recommendation system, run on cloud Kafka + cloud HBase + cloud Redis, and enhance outbox caching and inbox secondary indexing.
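Outbox caching in the pull path can be pictured as a read-through cache with a short expiry. In the sketch below, a dictionary stands in for Redis and the `load_outbox` callback stands in for an HBase read. The `OutboxCache` class, its TTL value, and the injectable clock are assumptions for illustration, not the article's implementation.

```python
import time


class OutboxCache:
    """Read-through cache for producer outboxes with time-based expiry."""

    def __init__(self, load_outbox, ttl_seconds=30.0, clock=time.monotonic):
        self._load = load_outbox   # fallback read, e.g. from the outbox table
        self._ttl = ttl_seconds
        self._clock = clock        # injectable so expiry can be tested
        self._entries = {}         # producer -> (expires_at, messages)
        self.misses = 0

    def get(self, producer):
        now = self._clock()
        hit = self._entries.get(producer)
        if hit is not None and hit[0] > now:
            return hit[1]          # fresh entry: no storage read needed
        self.misses += 1
        messages = self._load(producer)
        self._entries[producer] = (now + self._ttl, messages)
        return messages

    def invalidate(self, producer):
        """Drop a producer's entry, for example right after a new post."""
        self._entries.pop(producer, None)
```

Because many followers pull from the same popular outbox, even a short expiry can absorb most reads. Calling `invalidate` on publish trades some of that benefit for fresher feeds.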

Conclusion – Feed streams are ubiquitous across e‑commerce, social, and media apps. The article summarizes business scenarios, mainstream architectures, and technical bottlenecks, and provides a concrete cloud‑native evolution roadmap. Ongoing challenges include handling inactive accounts, reducing storage cost, and improving search, which will drive further optimization of cloud HBase for feed workloads.

Written by

Big Data Technology & Architecture

Wang Zhiwu, a big data expert, dedicated to sharing big data technology.
