Scaling Real‑Time Data Warehousing for Double‑11: Flink + Hologres in Action
During the 2021 Double‑11 shopping festival, logistics provider DiSiFang upgraded its real‑time data warehouse with Flink and Hologres, enabling multi‑billion‑row joins, cutting costs by 50%, and delivering stable, low‑latency analytics that powered high‑frequency dashboards and improved overall delivery speed.
1 Business Introduction
DiSiFang, founded in 2004 in Shenzhen, is one of China’s earliest international logistics and global warehousing service providers, serving cross‑border e‑commerce merchants, platforms, and consumers through its GPN (direct shipping) and GFN (overseas warehousing) networks, with over 100 branches worldwide and more than 2 billion end users.
2 Business Challenges
To handle Double‑11 order peaks reaching tens of millions per day, DiSiFang leveraged big‑data‑driven resource optimization, expanding over 40 warehouses and sorting centers covering 500,000 m². It deployed proprietary sorting systems, barcode recognition, and AI‑enhanced verification to reduce mis‑picks to 0.03% and pursued automation, digitalization, and cloud‑based solutions.
The existing real‑time data warehouse could no longer meet the demand; evaluations of HBase, ClickHouse, and Druid revealed bottlenecks for trillion‑level multi‑table joins.
3 DiSiFang Real‑Time Data Warehouse Journey
Real‑Time Data Warehouse 1.0
Version 1.0 was built on ADB (AnalyticDB), chosen for its high throughput and for easy data synchronization via DTS and OTTER. Under heavy dashboard query load, however, it suffered from limited concurrency and high latency.
Real‑Time Data Warehouse 2.0
Learning from version 1.0, DiSiFang adopted a Flink + Hologres architecture with two data paths:
(1) Binlog → DataHub → Flink → Hologres, for high‑frequency, large‑volume metrics.
(2) Direct Binlog sync to Hologres, with ODS, DWD, and DWS layers for raw, cleaned, and aggregated data.
This hybrid batch‑stream model leveraged Flink’s stream processing and Hologres’s powerful join capabilities, outperforming traditional real‑time databases.
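The streaming path can be pictured as replaying Binlog change events into a keyed table and deriving metrics from the result. The following is only a minimal sketch in plain Python, not DiSiFang's Flink job: the event shape, the warehouse codes, and the helper names `apply_events` and `orders_per_warehouse` are all hypothetical, chosen for illustration.

```python
from collections import defaultdict

# Hypothetical binlog-style change events: (operation, order_id, warehouse, status).
# Real Binlog records carry far more detail; this shape is an assumption.
events = [
    ("insert", 1, "SZX", "created"),
    ("insert", 2, "SZX", "created"),
    ("update", 1, "SZX", "shipped"),
    ("insert", 3, "LAX", "created"),
    ("delete", 2, "SZX", None),
]


def apply_events(change_events):
    """Replay change events into a table keyed by order_id (upsert semantics)."""
    table = {}
    for op, order_id, warehouse, status in change_events:
        if op == "delete":
            table.pop(order_id, None)
        else:
            # Inserts and updates both overwrite the row for this key.
            table[order_id] = (warehouse, status)
    return table


def orders_per_warehouse(table):
    """Derive a simple metric: number of live orders per warehouse."""
    counts = defaultdict(int)
    for warehouse, _status in table.values():
        counts[warehouse] += 1
    return dict(counts)


orders = apply_events(events)
metrics = orders_per_warehouse(orders)
print(metrics)
```

In the architecture described here, a stream processor such as Flink would perform this replay continuously and write the results to Hologres; the sketch only shows the upsert-then-aggregate logic on a fixed list.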
4 Hologres in DiSiFang’s Real‑Time Data Warehouse
Why Hologres?
Real‑time capability with sub‑second query response for hundred‑billion‑row tables and massive concurrent writes.
Storage‑compute separation on Alibaba Cloud Pangu, enabling rapid scaling of compute or storage as needed.
Low operational cost—approximately one‑third of ADB—while maintaining high stability.
Hologres Application Scenarios
In OLAP analysis, Hologres supports both real‑time and offline queries, handling high‑concurrency writes and complex multi‑table joins efficiently.
Scenario 1: In‑warehouse operations—Binlog data is parsed to the ODS layer, minute‑level micro‑batches generate DWS wide tables, and data is refreshed every five minutes via DataWorks.
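The micro‑batch step in Scenario 1 can be sketched as bucketing ODS rows into five‑minute windows and aggregating each bucket into one wide row. This is an illustrative Python sketch under assumed field names (`ts`, `warehouse`, `picked`); the actual job runs inside Hologres on a DataWorks schedule, and its schema is not given in the article.

```python
from datetime import datetime

WINDOW_MINUTES = 5  # refresh interval named in the scenario


def window_start(ts, minutes=WINDOW_MINUTES):
    """Floor a timestamp to the start of its window within the hour."""
    floored = ts.minute - ts.minute % minutes
    return ts.replace(minute=floored, second=0, microsecond=0)


# Hypothetical ODS rows parsed from Binlog; field names are assumptions.
ods_rows = [
    {"ts": datetime(2021, 11, 11, 0, 1, 30), "warehouse": "SZX", "picked": 10},
    {"ts": datetime(2021, 11, 11, 0, 4, 59), "warehouse": "SZX", "picked": 5},
    {"ts": datetime(2021, 11, 11, 0, 5, 0), "warehouse": "SZX", "picked": 7},
    {"ts": datetime(2021, 11, 11, 0, 7, 0), "warehouse": "LAX", "picked": 3},
]


def build_dws(rows):
    """Aggregate ODS rows into one DWS row per (window, warehouse)."""
    dws = {}
    for row in rows:
        key = (window_start(row["ts"]), row["warehouse"])
        agg = dws.setdefault(key, {"events": 0, "picked": 0})
        agg["events"] += 1
        agg["picked"] += row["picked"]
    return dws


dws = build_dws(ods_rows)
for (start, warehouse), agg in sorted(dws.items()):
    print(start.isoformat(), warehouse, agg)
```

Each scheduled run would only need to recompute the windows touched since the previous run, which is what keeps the refresh cost at the minute level.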
Scenario 2: Inter‑warehouse allocation—small tables are joined in Hologres using views, delivering millisecond‑level query performance and reducing scheduling overhead.
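The view‑based approach in Scenario 2 can be illustrated with a small, self‑contained example. The sketch below uses Python's built‑in `sqlite3` module purely as a stand‑in for Hologres; the table names, columns, and view name are hypothetical. The point is that a view computes the join at query time, so no scheduled job is needed to materialize the result.

```python
import sqlite3

# SQLite stands in for Hologres here; schema and names are illustrative only.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE warehouse (
        warehouse_id INTEGER PRIMARY KEY,
        name TEXT
    );
    CREATE TABLE transfer (
        transfer_id INTEGER PRIMARY KEY,
        from_wh INTEGER,
        to_wh INTEGER,
        qty INTEGER
    );
    INSERT INTO warehouse VALUES (1, 'SZX'), (2, 'LAX');
    INSERT INTO transfer VALUES (100, 1, 2, 40), (101, 1, 2, 60), (102, 2, 1, 5);

    -- The view joins the small tables on read instead of via a scheduled task.
    CREATE VIEW v_transfer_summary AS
    SELECT src.name AS from_name,
           dst.name AS to_name,
           SUM(t.qty) AS total_qty
    FROM transfer AS t
    JOIN warehouse AS src ON src.warehouse_id = t.from_wh
    JOIN warehouse AS dst ON dst.warehouse_id = t.to_wh
    GROUP BY src.name, dst.name;
    """
)

rows = conn.execute(
    "SELECT from_name, to_name, total_qty "
    "FROM v_transfer_summary ORDER BY from_name, to_name"
).fetchall()
print(rows)
```

Because the view always reads the current base tables, new transfers appear in query results immediately, which is the scheduling overhead the scenario avoids.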
Current Limitations
Hologres lacks indexing on non‑null columns, which can slow joins on massive tables. It also supports only a subset of PostgreSQL functions, which complicates some development work.
5 Business Value
During Double‑11, the Flink + Hologres real‑time data warehouse powered high‑frequency dashboards, ran without failures, and improved delivery timeliness. Dynamic scaling absorbed traffic spikes thousands of times higher than normal, which in turn reduced operational costs.
Alibaba Cloud Developer
Alibaba's official tech channel, featuring all of its technology innovations.
