
Best Practices of Cloud‑Native OLAP Architecture and Logistics Warning at Jushuitan

This article presents Jushuitan's cloud‑native OLAP architecture, detailing its evolution, current big‑data stack—including DataWorks, MaxCompute, Flink, Hologres, and Aerospike—along with logistics warning workflows, rule‑matching mechanisms, real‑time processing challenges, and future scalability plans.

DataFunTalk

Feb 27, 2024


Jushuitan, a SaaS ERP provider for e‑commerce, introduced its data‑driven products that embed analytics into business processes to reduce loss, improve compliance, and enable multi‑role collaboration across the entire order‑to‑delivery lifecycle.

The data‑warehouse architecture has evolved through five stages: early online databases, migration to Greenplum, large‑scale cluster management, integration with Alibaba Cloud services (ADB for Postgres/MySQL), and finally a cloud‑native stack based on DataWorks + MaxCompute for offline processing and Flink + Hologres for real‑time analytics.

The current technical stack includes Kafka for data ingestion, a self‑developed synchronization middleware, Flink for rule matching and stateful stream processing, Aerospike for external state storage, and Hologres for both high‑QPS point queries and OLAP workloads. On top of this stack runs a logistics warning system that monitors order timeliness, triggers alerts, and stores the results in Hologres tables for downstream analysis.

Key components of the logistics warning pipeline are: rule tables in Hologres, real‑time rule evaluation in Flink, timer registration in Aerospike, and result persistence via Binlog. The system processes roughly 100 billion events daily, with timers and external state reaching tens of billions.
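The interplay between rule matching and timers can be illustrated with a small in-memory sketch. It is not Jushuitan's implementation: the `Rule` fields, the status names `PAID` and `SHIPPED`, and the `WarningEngine` class are hypothetical, a plain dictionary stands in for the external timer store (Aerospike in the article), and a list stands in for the Hologres result table. The idea shown is only that a trigger event registers a deadline, the expected follow-up event cancels it, and a periodic tick turns expired deadlines into alerts.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Rule:
    """One row of a hypothetical rule table: alert if `expect` has not
    followed `trigger` within `timeout_s` seconds."""
    rule_id: str
    trigger: str
    expect: str
    timeout_s: int


@dataclass
class WarningEngine:
    rules: List[Rule]
    # Stand-in for the external state store: (order_id, rule_id) -> deadline.
    timers: Dict[Tuple[str, str], int] = field(default_factory=dict)
    # Stand-in for the result table: (order_id, rule_id, deadline).
    alerts: List[Tuple[str, str, int]] = field(default_factory=list)

    def on_event(self, order_id: str, status: str, ts: int) -> None:
        """Match one incoming event against every rule."""
        for rule in self.rules:
            key = (order_id, rule.rule_id)
            if status == rule.trigger:
                # Trigger seen: register a timer for the expected follow-up.
                self.timers[key] = ts + rule.timeout_s
            elif status == rule.expect:
                # Follow-up seen: the pending timer is no longer needed.
                self.timers.pop(key, None)

    def on_tick(self, now: int) -> None:
        """Fire every timer whose deadline has passed and record an alert."""
        expired = [k for k, deadline in self.timers.items() if deadline <= now]
        for key in expired:
            deadline = self.timers.pop(key)
            self.alerts.append((key[0], key[1], deadline))


engine = WarningEngine(rules=[Rule("ship_in_24h", "PAID", "SHIPPED", 24 * 3600)])
engine.on_event("order-1", "PAID", ts=0)
engine.on_event("order-2", "PAID", ts=0)
engine.on_event("order-1", "SHIPPED", ts=3600)  # cancels order-1's timer
engine.on_tick(now=24 * 3600)                   # order-2 is now overdue
print(engine.alerts)  # [('order-2', 'ship_in_24h', 86400)]
```

Keeping timers outside the stream processor's own state, as the article describes, lets the number of pending deadlines grow independently of the job's memory. The dictionary here only mimics that separation.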

Future directions focus on elastic resource scaling, stronger multi‑tenant isolation, intelligent operations, longer‑cycle replay capabilities, and tighter integration with cloud services such as Lindorm, with the goal of a unified stream‑batch computation model.

The Q&A section clarifies that rule matching is implemented with custom Flink functions rather than CEP, discusses handling of late data via external state, and explains challenges of long‑cycle replay under massive data volumes.
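One way external state can absorb late or out-of-order data is sketched below. This is an assumption about the general approach, not the method described in the talk: the per-order history `seen`, the `TIMEOUT_S` rule, the status names, and the `on_event` function are all hypothetical, and dictionaries again stand in for the external store. The point is that when a trigger event arrives late, the stored history shows whether the follow-up has already been observed, so no stale timer is registered.

```python
from typing import Dict, List, Tuple

TIMEOUT_S = 24 * 3600  # hypothetical "ship within 24 hours" rule

# Stand-ins for the external state store, keyed by order id.
seen: Dict[str, Dict[str, int]] = {}   # order_id -> {status: event_time}
timers: Dict[str, int] = {}            # order_id -> deadline
alerts: List[Tuple[str, int]] = []     # (order_id, deadline)


def on_event(order_id: str, status: str, event_ts: int, now: int) -> None:
    """Handle one event; `event_ts` is event time, `now` is processing time."""
    history = seen.setdefault(order_id, {})
    history[status] = event_ts

    if status == "SHIPPED":
        # Follow-up observed: drop any pending timer for this order.
        timers.pop(order_id, None)
        return

    if status == "PAID":
        deadline = event_ts + TIMEOUT_S
        shipped_ts = history.get("SHIPPED")
        if shipped_ts is not None and shipped_ts <= deadline:
            return  # shipped on time; the PAID event merely arrived late
        if shipped_ts is not None or deadline <= now:
            alerts.append((order_id, deadline))  # already overdue on arrival
        else:
            timers[order_id] = deadline  # normal case: wait for SHIPPED


on_event("A", "SHIPPED", event_ts=3600, now=4000)
on_event("A", "PAID", event_ts=0, now=4100)     # late PAID, no timer needed
on_event("B", "PAID", event_ts=0, now=90000)    # arrives after its deadline
on_event("C", "PAID", event_ts=1000, now=1200)  # in order, timer registered
print(timers)  # {'C': 87400}
print(alerts)  # [('B', 86400)]
```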

