How HuoLaLa Scaled Real‑Time Data Capture with Flink CDC: Architecture, Challenges, and Results

This article details HuoLaLa's logistics platform challenges with petabyte‑scale data, the selection of Apache Flink CDC for stable, compatible, and low‑latency data ingestion, the construction of a multi‑layer CDC capability, migration strategies, measurable performance gains, and future open‑source contributions.

Huolala Tech
Business Introduction

HuoLaLa is a logistics platform founded in 2013. It operates in 11 markets and more than 363 cities, with 900k active drivers and 12 million active users. The platform generates petabyte-scale order, driver, and IoT data every day, which requires stable and efficient data collection.

Challenges

Rapid business growth caused unstable collection, resource contention, and data accuracy problems, which led to interrupted data pipelines and potential reputational risk. The team identified four main challenges: functionality, stability, compatibility, and data consistency.

Technology Selection

After evaluating open‑source solutions, the team chose Apache Flink CDC for its stability, timeliness, accuracy, and compatibility, enabling full‑database synchronization via Flink SQL and leveraging a rich connector ecosystem.

Overall Capability Construction

The CDC capability was built across four layers: stability, upper-level applications, platform adaptation, and data architecture. Applications such as real-time dashboards, alerts, and analytics consume CDC data. Platform adapters provide configuration, perception, protocol, and SDK support. The data architecture layer plans for future data lake integration.

Compatibility work included supporting Canal features, key‑based routing to Kafka partitions, and enabling full‑database sync via Flink SQL.
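
As a minimal sketch of key-based routing, the snippet below maps each change event to a partition by hashing its table name and primary key, so that all changes to one row land in the same partition and keep their order. The `partition_for_key` helper, the event fields, and the CRC32 hash are assumptions made for illustration; they are not HuoLaLa's implementation, and Kafka's default partitioner uses a different hash.

```python
import zlib


def partition_for_key(routing_key: str, num_partitions: int) -> int:
    """Map a routing key to a partition index in [0, num_partitions).

    CRC32 is used only because it is deterministic across processes;
    it is an illustrative choice, not the hash Kafka itself uses.
    """
    if num_partitions <= 0:
        raise ValueError("num_partitions must be positive")
    return zlib.crc32(routing_key.encode("utf-8")) % num_partitions


# Hypothetical change events: two rows of one table, with row 1001 changed twice.
events = [
    {"table": "orders", "pk": "1001", "op": "c"},
    {"table": "orders", "pk": "1002", "op": "c"},
    {"table": "orders", "pk": "1001", "op": "u"},
]

NUM_PARTITIONS = 8
routed = [
    (e["pk"], e["op"], partition_for_key(f'{e["table"]}:{e["pk"]}', NUM_PARTITIONS))
    for e in events
]

for pk, op, partition in routed:
    print(f"pk={pk} op={op} -> partition {partition}")
```

Because the partition depends only on the key, the create and the later update for row 1001 always share a partition, which is what preserves per-row ordering for downstream consumers.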

Stability improvements added custom monitoring, Debezium metrics, and a CDC dashboard covering disconnects, queue sizes, and event rates.
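
The kind of check a CDC dashboard might run over those three signals can be sketched as follows. The `CdcSnapshot` fields, the thresholds, and the alert names are all hypothetical; a real setup would read the values from the connector's exported metrics rather than construct them by hand.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class CdcSnapshot:
    """One sample of hypothetical CDC metrics (field names are illustrative)."""
    disconnects: int        # cumulative number of source disconnects
    queue_capacity: int     # total capacity of the internal event queue
    queue_remaining: int    # free slots left in the queue
    events_seen: int        # cumulative number of change events observed
    interval_seconds: float # time elapsed since the previous sample


def evaluate(prev: CdcSnapshot, curr: CdcSnapshot,
             max_queue_fill: float = 0.8,
             min_events_per_sec: float = 1.0) -> List[str]:
    """Return the alerts raised by comparing two consecutive samples."""
    alerts = []
    if curr.disconnects > prev.disconnects:
        alerts.append("disconnect")
    queue_fill = 1 - curr.queue_remaining / curr.queue_capacity
    if queue_fill > max_queue_fill:
        alerts.append("queue_backlog")
    rate = (curr.events_seen - prev.events_seen) / curr.interval_seconds
    if rate < min_events_per_sec:
        alerts.append("low_event_rate")
    return alerts


prev = CdcSnapshot(disconnects=0, queue_capacity=8192, queue_remaining=8000,
                   events_seen=1000, interval_seconds=60)
curr = CdcSnapshot(disconnects=1, queue_capacity=8192, queue_remaining=1000,
                   events_seen=1030, interval_seconds=60)
alerts = evaluate(prev, curr)
print(alerts)
```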

Business Scenarios

Various business lines (e.g., small‑ride, moving, errands) generate TB‑PB data, feeding real‑time dashboards, BI reports, and transaction systems. Use cases include real‑time data processing, full‑database sync, and downstream consumption via Kafka.

In real‑time processing, Flink CDC captures MySQL changes to Kafka, which are then transformed and stored in OLAP databases, Hive, or data lakes for downstream use.
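
To make the transformation step concrete, the sketch below applies Debezium-style change events to an in-memory table keyed by primary key. The `op`, `before`, and `after` fields follow the Debezium event envelope; the `apply_change` helper, the `id` key column, and the sample rows are assumptions for illustration, and a production job would do this inside Flink rather than in plain Python.

```python
from typing import Any, Dict


def apply_change(table: Dict[Any, dict], event: dict, key: str = "id") -> None:
    """Apply one Debezium-style change event to an in-memory table.

    op codes: "c" = create, "r" = snapshot read, "u" = update, "d" = delete.
    """
    op = event["op"]
    if op in ("c", "r", "u"):
        row = event["after"]
        table[row[key]] = row                    # upsert the new row image
    elif op == "d":
        table.pop(event["before"][key], None)    # remove by the old row image
    else:
        raise ValueError(f"unknown op: {op!r}")


# Hypothetical change stream for an orders table.
change_events = [
    {"op": "c", "before": None, "after": {"id": 1, "status": "new"}},
    {"op": "c", "before": None, "after": {"id": 2, "status": "new"}},
    {"op": "u", "before": {"id": 1, "status": "new"},
     "after": {"id": 1, "status": "delivered"}},
    {"op": "d", "before": {"id": 2, "status": "new"}, "after": None},
]

orders: Dict[Any, dict] = {}
for event in change_events:
    apply_change(orders, event)

print(orders)
```

Replaying the events in order leaves only the latest image of each surviving row, which is the state a downstream OLAP table or Hive view would be expected to reflect.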

For full‑database sync, Kafka acts as an intermediate layer to avoid multiple MySQL connections, enabling unified view construction in Hive.

Link Switching

Switching to Flink CDC is designed to be seamless, using either one-click configuration or a drag-and-drop UI. Simple switching runs both collectors in parallel (dual collection) and merges their output with a Flink SQL UNION. Complex switching relies on multi-layer log analysis and statistical methods to ensure data consistency.
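
The dual-collection idea can be illustrated in plain Python: during the overlap window both pipelines deliver the same changes, and a distinct union keeps each one once. The sketch assumes every event carries a comparable source `position` and a primary key `pk`; those field names and the `union_distinct` helper are hypothetical, and this is an analogy to a SQL UNION rather than the team's actual job.

```python
from typing import Iterable, List


def union_distinct(old_stream: Iterable[dict], new_stream: Iterable[dict]) -> List[dict]:
    """Merge two event streams, keeping one copy of each (pk, position) pair.

    Events are ordered by source position; the sort is stable, so when both
    pipelines emit the same event the copy from old_stream is the one kept.
    """
    seen = set()
    merged = []
    for event in sorted([*old_stream, *new_stream], key=lambda e: e["position"]):
        identity = (event["pk"], event["position"])
        if identity not in seen:
            seen.add(identity)
            merged.append(event)
    return merged


# Old pipeline covers positions 1-3, new pipeline covers 3-5 (overlap at 3).
old_events = [{"pk": "A", "position": p, "source": "old"} for p in (1, 2, 3)]
new_events = [{"pk": "A", "position": p, "source": "new"} for p in (3, 4, 5)]

merged = union_distinct(old_events, new_events)
print([(e["position"], e["source"]) for e in merged])
```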

Statistical analysis compares data volume, record counts, rate distributions, and max differences to validate consistency between old and new pipelines.
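
A minimal sketch of such a comparison is shown below: events from each pipeline are bucketed by time, and the totals, per-bucket differences, and maximum difference are reported. The `compare_pipelines` helper, the 60-second bucket size, and the sample timestamps are assumptions for illustration; the checks the team actually ran are not described in more detail than the paragraph above.

```python
from collections import Counter
from typing import Dict, List


def compare_pipelines(old_ts: List[int], new_ts: List[int],
                      bucket_seconds: int = 60) -> Dict[str, object]:
    """Compare event timestamps (in seconds) from the old and new pipelines."""
    old_counts = Counter(ts // bucket_seconds for ts in old_ts)
    new_counts = Counter(ts // bucket_seconds for ts in new_ts)
    buckets = sorted(set(old_counts) | set(new_counts))
    # Positive diff: the new pipeline saw more events in that bucket.
    diffs = {b: new_counts[b] - old_counts[b] for b in buckets}
    return {
        "old_total": len(old_ts),
        "new_total": len(new_ts),
        "max_abs_diff": max((abs(d) for d in diffs.values()), default=0),
        "mismatched_buckets": [b for b, d in diffs.items() if d != 0],
    }


# Hypothetical timestamps: the new pipeline is missing one event in minute 1.
old_timestamps = [0, 10, 61, 62, 130]
new_timestamps = [0, 10, 61, 130]

report = compare_pipelines(old_timestamps, new_timestamps)
print(report)
```

A report with equal totals, no mismatched buckets, and a maximum difference of zero would suggest the two pipelines agree over the sampled window; any non-empty `mismatched_buckets` list points to the time ranges worth inspecting in the logs.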

Overall Benefits

Collection throughput, latency, and data size all improved. Users gave positive feedback on the stability of the switchover, and the number of failures and smoke-test incidents went down.

Latency dropped by up to 80% as reported; the quoted example of 30 s → 3 s would correspond to a 90% reduction. Flink CDC's checkpointing and GTID support also improved recovery during high-load archival tasks.

Storage consumption decreased by 20-60% compared with Canal, which the team attributes to a custom protocol that remains compatible with Debezium.

Open‑Source Participation

The team contributed code to the Apache community, presented the production case at Apache Asia Community Over Code 2024, and collaborates with related projects such as Amoro, Paimon, Iceberg, and Dinky.

Future Outlook

Plans include supporting data lake ingestion with Flink CDC and YAML job configs, enhancing CDC alerting systems, and expanding real‑time synchronization to further improve data freshness and stability.
