How Lazada Scaled Real‑Time Product Selection with Flink & Hologres
Lazada transformed its e‑commerce product selection by building a unified, real‑time platform on Alibaba Cloud Flink and Hologres, overcoming data silos, freshness delays, and high‑throughput challenges to enable millisecond‑level decisions across six Southeast Asian markets.
Introduction
In the fast‑changing e‑commerce landscape, Lazada must manage billions of SKUs across six Southeast Asian markets and deliver personalized recommendations in milliseconds. To meet this challenge, Lazada built an end‑to‑end real‑time product selection platform using Alibaba Cloud Flink and Hologres.
Business and Technical Challenges
Data silos across the supply, 1P, cross‑border and marketplace teams made data integration inefficient. Freshness was also a problem: new products or attribute updates could take more than a day to propagate, making real‑time decisions impossible during flash sales. In addition, the legacy architecture lacked the throughput and flexibility needed for high TPS and heterogeneous data sources.
Architecture Solution: Flink + Hologres
The solution adopts a layered architecture: a data service layer, a selection layer that groups SKUs, and an analysis layer for real‑time insights. Flink provides stream processing and CDC, while Hologres offers a high‑performance analytical store with million‑TPS write capability. The platform integrates with existing databases, Kafka, MetaQ, MaxCompute, and supports both batch and streaming workloads.
Batch data from MaxCompute is synchronized to Hologres tables; incremental data from MySQL, Kafka and hourly MaxCompute updates are ingested via Flink into an intermediate incremental table, enabling efficient binlog replay and disaster recovery.
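The interaction between the batch tables and the intermediate incremental table can be illustrated with a small in-memory sketch. It is not Lazada's pipeline code: the `Change` record, its `version` field, and the `apply_changes` function are hypothetical names, and the sketch assumes each change carries a per-SKU version so that replaying the same log twice leaves the result unchanged.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Optional


@dataclass(frozen=True)
class Change:
    """One record in a hypothetical incremental table."""
    sku_id: int
    version: int           # assumed to increase with every update of a SKU
    row: Optional[dict]    # changed columns; None marks a delete


def apply_changes(base: Dict[int, dict], changes: Iterable[Change]) -> Dict[int, dict]:
    """Replay incremental changes over a batch snapshot without mutating it."""
    result = {sku: dict(row) for sku, row in base.items()}
    applied: Dict[int, int] = {}  # highest version applied so far, per SKU
    for change in changes:
        if change.version <= applied.get(change.sku_id, -1):
            continue  # stale or duplicate record, safe to skip
        applied[change.sku_id] = change.version
        if change.row is None:
            result.pop(change.sku_id, None)
        else:
            # Partial update: keep untouched columns from the snapshot.
            result[change.sku_id] = {**result.get(change.sku_id, {}), **change.row}
    return result


snapshot = {1: {"price": 100, "stock": 5}, 2: {"price": 50, "stock": 0}}
log = [
    Change(1, 1, {"price": 90}),               # price change
    Change(2, 1, None),                        # SKU removed
    Change(3, 1, {"price": 10, "stock": 7}),   # new SKU
]
current = apply_changes(snapshot, log)
```

Because the snapshot is left untouched and the log is replayed in full, the same function covers both the normal merge and a recovery run that starts again from the last batch load.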
Data Architecture Details
Data is split into three freshness tiers: offline batch (≈70% of metrics, T+1 day), hourly batch (T+1 hour) and real‑time stream (T+1 second). Real‑time streams include user behavior, orders, GMV, and price changes, which must be reflected instantly for accurate selection.
Key Technical Innovations
Roaring Bitmap for Tag Management
Replacing array‑based tag storage with Roaring Bitmap reduced tag storage by 40%, CPU usage by 30%, and tag‑related query latency by 90%.
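To show why a Roaring Bitmap suits tag sets, here is a deliberately simplified sketch of the idea: 32‑bit IDs are partitioned by their high 16 bits, and each partition is kept either as a small set of low 16‑bit values or, once it exceeds 4,096 entries, as a bitmap. The `TinyRoaring` class is a hypothetical teaching aid, not the Hologres implementation; it omits run containers and serialization, and a production system would use the engine's native bitmap type.

```python
from typing import Dict, Iterable, Set, Union

# Above 4,096 entries a 65,536-bit bitmap (8 KB) is no larger than
# an array of 16-bit values, so Roaring switches container type there.
ARRAY_LIMIT = 4096

Container = Union[Set[int], int]  # sparse: set of low bits; dense: int as bitmap


class TinyRoaring:
    """Simplified Roaring-style set of 32-bit integers (illustration only)."""

    def __init__(self, values: Iterable[int] = ()):
        self.containers: Dict[int, Container] = {}
        for value in values:
            self.add(value)

    def add(self, value: int) -> None:
        if not 0 <= value < 2**32:
            raise ValueError("value must fit in 32 bits")
        high, low = value >> 16, value & 0xFFFF
        container = self.containers.get(high, set())
        if isinstance(container, int):
            container |= 1 << low
        else:
            container.add(low)
            if len(container) > ARRAY_LIMIT:
                # Promote the sparse container to a dense bitmap.
                bits = 0
                for item in container:
                    bits |= 1 << item
                container = bits
        self.containers[high] = container

    def __contains__(self, value: int) -> bool:
        container = self.containers.get(value >> 16)
        if container is None:
            return False
        low = value & 0xFFFF
        if isinstance(container, int):
            return bool(container >> low & 1)
        return low in container

    def __len__(self) -> int:
        return sum(
            bin(c).count("1") if isinstance(c, int) else len(c)
            for c in self.containers.values()
        )

    def __and__(self, other: "TinyRoaring") -> "TinyRoaring":
        out = TinyRoaring()
        # Only partitions present on both sides can contribute.
        for high in self.containers.keys() & other.containers.keys():
            a, b = self.containers[high], other.containers[high]
            if isinstance(a, int) and isinstance(b, int):
                hit: Container = a & b  # kept dense; real Roaring may demote
            elif isinstance(a, set) and isinstance(b, set):
                hit = a & b
            else:
                sparse, dense = (a, b) if isinstance(a, set) else (b, a)
                hit = {low for low in sparse if dense >> low & 1}
            if hit:
                out.containers[high] = hit
        return out


on_sale = TinyRoaring(range(10_000))                     # dense tag
free_shipping = TinyRoaring([5, 9_999, 20_000, 70_000])  # sparse tag
both = on_sale & free_shipping                           # SKUs carrying both tags
```

Selecting SKUs that carry several tags becomes a per-partition intersection, and partitions missing from either side are skipped entirely, which is the property that makes tag queries cheap compared with scanning arrays.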
JSON‑B for Semi‑structured Data
Hologres JSON‑B allows flexible storage of semi‑structured fields such as social comments and complex product attributes, switching between row and columnar storage as needed.
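The trade‑off between row and columnar layouts for semi‑structured data can be sketched in a few lines. This is an illustration of the general idea only, not of how Hologres stores JSON‑B internally; the `to_columns` helper and the sample attribute names are hypothetical, and the sketch assumes every document is a flat JSON object.

```python
import json
from typing import Any, Dict, List


def to_columns(rows: List[str]) -> Dict[str, List[Any]]:
    """Shred flat JSON documents into one value list per key.

    Keys missing from a document are filled with None so that all
    columns stay aligned with the original row order.
    """
    docs = [json.loads(row) for row in rows]
    keys = sorted({key for doc in docs for key in doc})
    return {key: [doc.get(key) for doc in docs] for key in keys}


rows = [
    '{"sku": 1, "color": "red", "rating": 4.5}',
    '{"sku": 2, "rating": 3.0, "tags": ["gift"]}',
]
columns = to_columns(rows)

# Column layout: an aggregate touches a single list.
ratings = [r for r in columns["rating"] if r is not None]
average_rating = sum(ratings) / len(ratings)

# Row layout: a point lookup reads one whole document.
second_product = json.loads(rows[1])
```

Point lookups by key favor keeping each document together, while scans and aggregates over one attribute favor the shredded form, which is why the choice of layout depends on the access pattern.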
Business Impact
The platform runs more than 200 real‑time jobs that process over 20 TB of data daily, along with 100 million incremental records, enabling minute‑level decision making during promotions and cutting infrastructure costs by 50%.
Future Plans: AI Integration
Upcoming AI‑driven product selection will use vector search and an extended 64‑bit Roaring Bitmap to generate tags automatically, supporting scenario‑based campaigns such as holiday or event‑specific promotions.
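One way vector search could feed automatic tagging is sketched below, under stated assumptions: products and candidate tags are already embedded in the same vector space, and a tag is suggested when its cosine similarity to the product exceeds a threshold. The `suggest_tags` function, the sample vectors, and the 0.8 threshold are all hypothetical, and the brute‑force scan stands in for the approximate index a real vector search engine would use.

```python
import math
from typing import Dict, List, Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def suggest_tags(
    product_vec: Sequence[float],
    tag_vecs: Dict[str, Sequence[float]],
    threshold: float = 0.8,  # hypothetical cut-off, would need tuning
) -> List[str]:
    """Return tags whose embedding is close to the product, best match first."""
    scored = [(cosine(product_vec, vec), tag) for tag, vec in tag_vecs.items()]
    return [tag for score, tag in sorted(scored, reverse=True) if score >= threshold]


# Toy two-dimensional embeddings, for illustration only.
campaign_tags = {
    "holiday": [1.0, 0.0],
    "gift": [0.9, 0.1],
    "sports": [0.0, 1.0],
}
suggested = suggest_tags([1.0, 0.1], campaign_tags)
```

The suggested tags could then be written into the per‑tag bitmaps as SKU IDs, so that scenario‑based campaigns reuse the same selection path as manually maintained tags.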
Conclusion
Lazada’s migration to a unified Flink‑Hologres architecture demonstrates how modern stream processing and real‑time analytics can transform e‑commerce operations, delivering faster insights, lower costs, and a scalable foundation for future AI enhancements.
Alibaba Cloud Big Data AI Platform
The Alibaba Cloud Big Data AI Platform builds on Alibaba’s leading cloud infrastructure, big‑data and AI engineering capabilities, scenario algorithms, and extensive industry experience to offer enterprises and developers a one‑stop, cloud‑native big‑data and AI capability suite. It boosts AI development efficiency, enables large‑scale AI deployment across industries, and drives business value.
