Manbang Group's Real-Time Computing, Data Architecture, and Product Practices
Manbang Group shares its practical experience with real‑time computing: a multi‑cloud platform architecture, a real‑time data warehouse built on Flink and Holo (Hologres), real‑time decision and feature platforms, and plans for scaling these systems to support logistics and recommendation algorithms.
Manbang Group presents a comprehensive overview of its real‑time computing platform, covering platform architecture, data architecture, and product implementations.
Platform Architecture : The platform combines cloud‑native real‑time computing with an OLAP platform, developed in collaboration with the Alibaba Cloud, Flink, and Holo teams. It runs across multiple clouds (Alibaba Cloud and other providers) and uses Flink and Holo as its core compute engines. Offline data is stored on Huawei Cloud.
The real‑time data warehouse (ODS, DWD, DWS layers) supports user, cargo, traffic, payment, transaction, and marketing domains, and provides a real‑time supply‑demand report akin to an ADS layer. Real‑time data also powers algorithmic features and operational strategies, such as driver intent detection.
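The layering described above can be illustrated with a minimal in‑memory sketch. It is not Manbang's implementation: the event types, field names, 60‑second window, and the mapping from event type to domain are all assumptions made for illustration, and a production version would run as Flink jobs writing to Holo rather than as plain Python.

```python
from collections import defaultdict

# Hypothetical raw events as they might land in the ODS layer.
ODS_EVENTS = [
    {"type": "cargo_posted", "city": "Nanjing", "ts": 5},
    {"type": "driver_search", "city": "Nanjing", "ts": 20},
    {"type": "driver_search", "city": "Nanjing", "ts": 50},
    {"type": "cargo_posted", "city": "Hefei", "ts": 70},
    {"type": "heartbeat", "city": "Hefei", "ts": 75},  # not modeled, dropped in DWD
]

# Assumed mapping from event type to business domain.
DOMAIN_BY_TYPE = {"cargo_posted": "cargo", "driver_search": "traffic"}


def to_dwd(events):
    """DWD: keep known event types and tag each row with its domain."""
    return [
        dict(e, domain=DOMAIN_BY_TYPE[e["type"]])
        for e in events
        if e["type"] in DOMAIN_BY_TYPE
    ]


def to_dws(dwd_rows, window_s=60):
    """DWS: count rows per (window start, city, domain)."""
    counts = defaultdict(int)
    for row in dwd_rows:
        window_start = row["ts"] - row["ts"] % window_s
        counts[(window_start, row["city"], row["domain"])] += 1
    return dict(counts)


def supply_demand_report(dws):
    """ADS-like view: cargo postings vs. driver searches per window and city."""
    report = defaultdict(lambda: {"cargo": 0, "traffic": 0})
    for (window_start, city, domain), n in dws.items():
        report[(window_start, city)][domain] = n
    return dict(report)


REPORT = supply_demand_report(to_dws(to_dwd(ODS_EVENTS)))
```

Each layer only reads the one below it, so the report can be rebuilt from the DWS counts without touching raw events again.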
Real‑time Decision Platform : Built on the real‑time warehouse, the platform offers data insights dashboards, intelligent attribution, real‑time alerts, and crowd profiling. It integrates with Flink/Holo to develop real‑time strategies and feeds a rule engine for A/B testing, covering core business functions like search ranking, driver recall, push attribution, fatigue control, and cargo pricing.
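Two of the mechanisms named above, A/B assignment in a rule engine and fatigue control for pushes, can be sketched as follows. The source does not describe how either is implemented; hash‑based bucketing and a sliding‑window frequency cap are common choices swapped in here, and every name (`ab_bucket`, `FatigueControl`, the parameters) is hypothetical.

```python
import hashlib
from collections import defaultdict, deque


def ab_bucket(user_id: str, experiment: str, treatment_share: float = 0.5) -> str:
    """Deterministically assign a user to 'treatment' or 'control'."""
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode()).digest()
    # Map the first 8 bytes of the hash to a fraction in [0, 1).
    fraction = int.from_bytes(digest[:8], "big") / 2**64
    return "treatment" if fraction < treatment_share else "control"


class FatigueControl:
    """Allow at most `max_pushes` pushes per user in any `window_s` seconds."""

    def __init__(self, max_pushes: int, window_s: int):
        self.max_pushes = max_pushes
        self.window_s = window_s
        self._sent = defaultdict(deque)  # user_id -> timestamps of sent pushes

    def allow(self, user_id: str, now: float) -> bool:
        sent = self._sent[user_id]
        # Forget pushes that have fallen out of the window.
        while sent and sent[0] <= now - self.window_s:
            sent.popleft()
        if len(sent) >= self.max_pushes:
            return False
        sent.append(now)
        return True
```

Hashing the experiment name together with the user ID keeps a user's bucket stable within one experiment while decorrelating assignments across experiments.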
Real‑time Computing Platform Architecture : The architecture consolidates all backend and frontend event logs, as well as MySQL binlog, into a Kafka‑based data source. Flink + Holo handle computation and storage, while an API layer (Oneservice) exposes data directly to algorithms and products.
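Consolidating heterogeneous sources into one Kafka‑based stream usually means agreeing on a common record shape first. The sketch below shows one way to do that; the envelope fields and the raw record layouts (`table`, `op`, `after`, `event`, `props`) are assumptions for illustration, not the formats Manbang uses, and the function only serializes the record rather than talking to a Kafka client.

```python
import json


def normalize(source: str, raw: dict) -> dict:
    """Wrap a raw record from any source in one common envelope."""
    if source == "binlog":
        # Hypothetical binlog change record: table name, operation, row image.
        return {
            "source": source,
            "entity": raw["table"],
            "action": raw["op"],
            "ts": raw["ts"],
            "payload": raw["after"],
        }
    if source in ("backend", "frontend"):
        # Hypothetical event-tracking log: event name plus free-form properties.
        return {
            "source": source,
            "entity": raw["event"],
            "action": "log",
            "ts": raw["ts"],
            "payload": raw.get("props", {}),
        }
    raise ValueError(f"unknown source: {source}")


def to_kafka_value(envelope: dict) -> bytes:
    """Serialize the envelope as UTF-8 JSON, the value a producer would send."""
    return json.dumps(envelope, sort_keys=True).encode("utf-8")
```

With a shared envelope, downstream Flink jobs can filter on `source` and `entity` instead of parsing three different formats.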
Flink Migration to Cloud‑Native : In Q3 2022, Manbang migrated 560 Flink jobs from a Hadoop‑based platform to Alibaba Cloud's cloud‑native Flink service. The migration raised the SLA from 99.5% to 99.9%, reduced the operations team from three people to one, saved about 600 person‑days, and cut costs by 35% through SQL optimizations.
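To put the SLA improvement in concrete terms, the following sketch converts an availability percentage into the downtime it permits. It assumes the SLA is measured as availability over a 365‑day year, which the source does not state; the function name is illustrative.

```python
HOURS_PER_YEAR = 365 * 24


def allowed_downtime_hours(sla_percent: float, period_hours: float = HOURS_PER_YEAR) -> float:
    """Hours of downtime permitted per period at the given availability SLA."""
    return period_hours * (100 - sla_percent) / 100


BEFORE = allowed_downtime_hours(99.5)
AFTER = allowed_downtime_hours(99.9)
```

Under that assumption, 99.5% allows about 43.8 hours of downtime per year and 99.9% allows about 8.8 hours, a fivefold reduction.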
Data Architecture : Real‑time data supports the recommendation pipeline (recall, coarse ranking, truncation, fine ranking) and real‑time risk controls. Three data needs are identified: real‑time queries, real‑time reporting, and strictly real‑time data for algorithms. All three are served by Flink + Holo with view‑based optimizations.
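The four recommendation stages named above form a funnel in which each stage narrows the candidate set for a more expensive step. The sketch below is a toy version under stated assumptions: the cargo fields, the same‑city recall rule, and both scoring functions are hypothetical stand‑ins, not Manbang's models.

```python
def recommend(cargos, driver, coarse_score, fine_score, keep_n, top_k):
    """Run recall -> coarse ranking -> truncation -> fine ranking."""
    # Recall: cheap filter down to plausible candidates (here: same city).
    recalled = [c for c in cargos if c["city"] == driver["city"]]
    # Coarse ranking: inexpensive score over all recalled candidates.
    coarse = sorted(recalled, key=coarse_score, reverse=True)
    # Truncation: keep only the best keep_n for the expensive stage.
    truncated = coarse[:keep_n]
    # Fine ranking: costlier score over the short list.
    fine = sorted(truncated, key=fine_score, reverse=True)
    return fine[:top_k]


CARGOS = [
    {"id": "A", "city": "Nanjing", "price": 900, "distance_km": 300},
    {"id": "B", "city": "Nanjing", "price": 800, "distance_km": 100},
    {"id": "C", "city": "Nanjing", "price": 500, "distance_km": 50},
    {"id": "D", "city": "Hefei", "price": 2000, "distance_km": 100},
]

RESULT = recommend(
    CARGOS,
    driver={"city": "Nanjing"},
    coarse_score=lambda c: c["price"],                   # cheap proxy
    fine_score=lambda c: c["price"] / c["distance_km"],  # price per km
    keep_n=2,
    top_k=1,
)
```

In this data, cargo C has the best price per kilometer but is cut at truncation because its coarse score is lowest, which shows why the coarse score needs to track the fine score reasonably well.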
Real‑time Feature Development : A framework that generates features in batches on Flink + Holo shortens the development cycle from 3 days to 2, consolidates 120 tasks into 16, and produces over 1,000 features per minute, which improves QPS and resource efficiency.
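One way to read the consolidation of many tasks into few is that a single pass over the event stream computes a whole set of features from declarative specs, instead of one job per feature. The sketch below illustrates that idea only; the spec format, event fields, and feature names are assumptions, and the source does not say this is how the framework works internally.

```python
from collections import defaultdict

# Hypothetical feature specs: name -> contribution of a single event.
FEATURE_SPECS = {
    "search_cnt": lambda e: 1 if e["type"] == "search" else 0,
    "call_cnt": lambda e: 1 if e["type"] == "call" else 0,
    "view_secs": lambda e: e.get("secs", 0) if e["type"] == "view" else 0,
}


def batch_features(events, specs):
    """Compute every feature in `specs` per driver in one pass over `events`."""
    features = defaultdict(lambda: dict.fromkeys(specs, 0))
    for event in events:
        row = features[event["driver_id"]]
        for name, contribution in specs.items():
            row[name] += contribution(event)
    return dict(features)


EVENTS = [
    {"driver_id": "d1", "type": "search"},
    {"driver_id": "d1", "type": "view", "secs": 12},
    {"driver_id": "d1", "type": "call"},
    {"driver_id": "d2", "type": "search"},
    {"driver_id": "d1", "type": "search"},
]

FEATURES = batch_features(EVENTS, FEATURE_SPECS)
```

Adding a feature then means adding one entry to the spec table rather than deploying another job.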
Real‑time Product : The "Fire‑tower" (烽火台) product visualizes real‑time metrics, supports alert configuration, and pushes notifications to DingTalk. A real‑time data service platform (Myservice) exposes metrics via APIs backed by HBase, Redis, or Holo, with usage‑based charging.
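Alert configuration of this kind typically reduces to a threshold rule evaluated against the latest metric value, followed by a notification. The sketch below evaluates a rule and builds a message body without sending it. The rule schema and metric name are hypothetical, and the payload follows the text‑message shape commonly used by DingTalk custom robot webhooks, which should be checked against the DingTalk documentation before use.

```python
import json
import operator

# Comparison operators an alert rule may use.
OPS = {">": operator.gt, "<": operator.lt, ">=": operator.ge, "<=": operator.le}


def evaluate(rule: dict, value: float) -> bool:
    """Return True when the metric value breaches the rule's threshold."""
    return OPS[rule["op"]](value, rule["threshold"])


def build_dingtalk_text(rule: dict, value: float) -> dict:
    """Build a text-message body for a DingTalk robot webhook (assumed shape)."""
    content = f"[alert] {rule['metric']} = {value} ({rule['op']} {rule['threshold']})"
    return {"msgtype": "text", "text": {"content": content}}


# Hypothetical rule: alert when orders per minute drop below 100.
RULE = {"metric": "order_cnt_1m", "op": "<", "threshold": 100}

if evaluate(RULE, 80):
    BODY = json.dumps(build_dingtalk_text(RULE, 80))
```

Keeping rule evaluation separate from message construction makes it straightforward to add other notification channels later.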
Future Plans : Manbang aims to launch Real‑time Decision Platform 2.0 and Cloud‑Native OLAP 1.0, integrate Holo with other cloud providers, and continue multi‑cloud strategies to optimize OLAP engine selection for logistics scenarios.
DataFunSummit
Official account of the DataFun community, dedicated to sharing big data and AI industry summit news and speaker talks, with regular downloadable resource packs.