Best Practices of Jushuitan Cloud‑Native OLAP Architecture and Logistics Warning
This article presents Jushuitan's cloud‑native OLAP architecture, covering business background, data‑warehouse evolution, real‑time processing with Flink, Hologres, and Aerospike, and detailed logistics‑warning use cases, followed by technical challenges, future outlook, and a Q&A on implementation details.
Jushuitan is an e‑commerce‑focused SaaS and ERP provider that serves over one million merchants. It offers end‑to‑end business solutions, and its data team supports scenarios across the full commerce chain.
The company's core products include order management and warehouse management, which integrate multi‑platform orders, enable supplier push, and support efficient logistics and after‑sale processes.
Jushuitan's data products aim to embed data into business workflows, reducing loss, ensuring compliance, and improving multi‑role collaboration across the entire commerce chain.
The data‑warehouse architecture has evolved through five stages: no warehouse at all; MySQL/SQL Server; Greenplum, adopted for multi‑tenant isolation; a mature stage on Alibaba Cloud AnalyticDB (ADB) for PostgreSQL and for MySQL; and finally a large‑scale setup capable of handling petabyte‑level workloads.
The current technical stack combines DataWorks and MaxCompute for offline processing, Flink for real‑time computation, Hologres as the cloud‑native analytical store, and Aerospike for external state management, supporting both batch and streaming pipelines.
The logistics‑warning solution uses Hologres to store rule tables, Flink to match rules and set timers, and Aerospike to maintain external state, processing around 100 billion records daily with billions of timers and tens of terabytes of state.
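The rule‑match, timer, and external‑state flow described above can be illustrated with a minimal, framework‑free sketch. It is not Jushuitan's implementation and uses no Flink, Hologres, or Aerospike API: the `Rule` fields, the `WarningEngine` class, and the status names are hypothetical, a plain list stands in for the rule table, and a dict stands in for the external state store.

```python
import heapq
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    """Hypothetical rule: after `from_status`, expect `to_status` within `timeout_s` seconds."""
    rule_id: str
    from_status: str
    to_status: str
    timeout_s: int


class WarningEngine:
    def __init__(self, rules):
        self.rules = list(rules)  # stand-in for a rule table
        self.state = {}           # stand-in for an external KV store: (order, rule) -> deadline
        self.timers = []          # min-heap of (deadline, order_id, rule_id)

    def on_event(self, order_id, status, ts):
        """Match one logistics event against every rule; arm or clear a timer."""
        for rule in self.rules:
            key = (order_id, rule.rule_id)
            if status == rule.from_status:
                deadline = ts + rule.timeout_s
                self.state[key] = deadline
                heapq.heappush(self.timers, (deadline, order_id, rule.rule_id))
            elif status == rule.to_status:
                # Expected follow-up arrived: drop the state so the timer goes stale.
                self.state.pop(key, None)

    def advance_to(self, now):
        """Fire all timers due at or before `now`; return (order_id, rule_id) warnings."""
        warnings = []
        while self.timers and self.timers[0][0] <= now:
            deadline, order_id, rule_id = heapq.heappop(self.timers)
            # Fire only if the stored deadline still matches this timer.
            if self.state.get((order_id, rule_id)) == deadline:
                del self.state[(order_id, rule_id)]
                warnings.append((order_id, rule_id))
        return warnings
```

With a single hypothetical rule such as `Rule("pickup_overdue", "SHIPPED", "PICKED_UP", 86400)`, an order that is shipped but never picked up produces a warning once `advance_to` passes its deadline, while an order whose pickup event arrives in time produces none. Keeping the deadline in the state store, and checking it when a timer fires, is one simple way to ignore timers that were superseded or cancelled.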
Key challenges include reducing latency for long‑period replay, lowering the operational cost of external state stores, and achieving seamless stream‑batch integration for on‑demand analytics.
Future directions focus on elastic resource scaling, stronger multi‑tenant isolation, intelligent operations, and tighter integration of Hologres with MaxCompute to complement each other's capabilities.
The Q&A section clarifies that rule matching is implemented with custom Flink functions rather than CEP, explains the use of Aerospike for late‑arriving streams, and discusses the difficulties of long‑period replay and SaaS multi‑tenant deployment on Alibaba Cloud.
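The summary does not say exactly how Aerospike is used for late‑arriving streams. One plausible reading, sketched here purely as an assumption, is that whichever side of a two‑stream join arrives first is parked in an external key‑value store until its counterpart shows up. The `LateJoin` class and the `"left"`/`"right"` side names are hypothetical, and a dict stands in for the external store.

```python
class LateJoin:
    """Join two keyed streams whose records may arrive in either order."""

    def __init__(self, store=None):
        # Stand-in for an external key-value store (assumption: a plain dict).
        self.store = {} if store is None else store

    def on_record(self, side, key, payload):
        """Return (left_payload, right_payload) once both sides are present, else None."""
        if side not in ("left", "right"):
            raise ValueError("side must be 'left' or 'right'")
        other = "right" if side == "left" else "left"
        parked = self.store.pop((other, key), None)
        if parked is None:
            # Counterpart has not arrived yet: park this record and wait.
            self.store[(side, key)] = payload
            return None
        return (payload, parked) if side == "left" else (parked, payload)
```

Because the parked record lives outside the stream job in this sketch, the wait for a late counterpart is not bounded by the job's own state size; a real deployment would also need an expiry policy for records whose counterpart never arrives.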
DataFunSummit