
How Leapmotor Scaled to 1M Cars with a Real‑Time Flink Data Platform

Leapmotor’s rapid growth to one million production cars drove a shift from daily batch data to minute‑level real‑time analytics, prompting the adoption of Flink as the core engine of a multi‑layered big‑data platform that handles massive IoT signals, supports fault diagnosis, and integrates batch and streaming workloads on the cloud.

Alibaba Cloud Big Data AI Platform

This article is based on a 2025 Cloud Expo presentation. Leapmotor (零跑科技), founded in Hangzhou in December 2015, positions itself as the only Chinese new‑energy car maker with fully self‑developed core hardware and the highest degree of vertical integration, covering vehicle design, R&D, manufacturing, and intelligent driving.

Shortly before the talk, Leapmotor celebrated the rollout of its one‑millionth production vehicle, reaching that milestone just 343 days after hitting 500,000 units — an unprecedented growth rate.

The surge in sales and expanding model lineup transformed data needs from simple T+1 offline reports to minute‑ and second‑level real‑time data, driving the construction of a real‑time computing system.

Why Flink?

Before Flink, the industry used Storm ("at‑least‑once" semantics) and Spark Streaming (micro‑batch, minute‑level latency). Flink, released in 2014, provides true low‑latency, high‑throughput stream processing with exactly‑once guarantees, unifying batch and stream as "batch is a special case of stream".
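The practical difference between these guarantees is easy to see with a toy aggregation. The sketch below is illustrative only (not Flink code): under at‑least‑once semantics a redelivered record is counted twice, while deduplicating by a stable record identity — here an assumed `(partition, offset)` pair — restores an exactly‑once result.

```python
# Hypothetical sketch: why delivery semantics matter for aggregates.
# Records are (partition, offset, value) tuples; a retry redelivers one record.

def at_least_once_sum(records):
    """Naively accumulate values; a redelivered record inflates the total."""
    total = 0
    for _, _, value in records:
        total += value
    return total

def exactly_once_sum(records):
    """Deduplicate by (partition, offset) before accumulating."""
    seen = set()
    total = 0
    for partition, offset, value in records:
        if (partition, offset) not in seen:
            seen.add((partition, offset))
            total += value
    return total

# The record at (0, 2) is delivered twice by a retry:
records = [(0, 1, 10), (0, 2, 5), (0, 2, 5), (0, 3, 7)]
print(at_least_once_sum(records))   # 27 (double-counted)
print(exactly_once_sum(records))    # 22 (correct)
```

Flink achieves the same end‑to‑end effect not with a dedup set but with checkpointed state and transactional or idempotent sinks; the toy version only shows why the guarantee matters.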

Flink’s core advantages — low latency, high throughput, exactly‑once semantics, strong state management, and flexible time semantics — made it the engine of choice for Leapmotor.

Leapmotor’s Big‑Data Platform Architecture

The platform consists of five layers:

Data source layer: relational business data (ERP, MES, etc.), IoT sensor data from vehicle T‑Box (semi‑structured), and unstructured files, videos, images.

Infrastructure layer: compute and storage (OSS, MaxCompute, Hologres, HBase, Doris, HDFS, Paimon) and compute engines (MaxCompute, Hologres, Flink, Hive, Spark) with GPU/CPU resources and development platforms (DataWorks, AiWorks).

Data asset layer: data warehouse modeling, algorithm training, and inference, with layers for source, cleaning, common dimensions, and data marts, plus model marketplace.

Data service layer: BI reports, ad‑hoc queries, APIs, and data governance (user, metadata, quality, scheduling, monitoring).

Data application layer: BI dashboards, app services, marketing screens, battery fault alerts, quality defect detection, etc.

Vehicle Signal Real‑Time Analysis

Sensor data from the CAN bus is sent via the T‑Box to cloud Kafka, parsed and cleaned by Flink, then written to Hologres for real‑time serving and to MaxCompute for offline processing, feeding downstream applications.

Challenges include petabyte‑scale data volumes, the need for real‑time slicing, thousands of signal types (over 8,000 in premium models), diverse use cases, and complex data structures — all requiring high‑throughput, low‑latency processing with strong accuracy.
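The parse‑and‑clean step in this pipeline can be sketched as follows. The field names (`vin`, `ts`, `signals`) and the validity ranges are illustrative assumptions, not Leapmotor's actual T‑Box schema; in production this logic runs inside a Flink job consuming from Kafka.

```python
import json

# Hypothetical sketch of parsing one semi-structured T-Box message into
# per-signal rows, dropping values outside plausible physical ranges.
# Schema and ranges are assumptions for illustration.

VALID_RANGES = {
    "battery_temp_c": (-40.0, 90.0),
    "speed_kmh": (0.0, 300.0),
}

def parse_and_clean(raw: bytes):
    """Flatten one message into per-signal rows, filtering implausible values."""
    msg = json.loads(raw)
    rows = []
    for name, value in msg["signals"].items():
        lo, hi = VALID_RANGES.get(name, (float("-inf"), float("inf")))
        if lo <= value <= hi:
            rows.append({"vin": msg["vin"], "ts": msg["ts"],
                         "signal": name, "value": value})
    return rows

raw = json.dumps({"vin": "LP000001", "ts": 1700000000,
                  "signals": {"battery_temp_c": 31.5,
                              "speed_kmh": 999.0}}).encode()
print(parse_and_clean(raw))  # keeps battery_temp_c, drops implausible speed
```

Flattening each message into narrow per‑signal rows is one common way to cope with thousands of heterogeneous signal types without maintaining an 8,000‑column table.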

Real‑Time Fault Diagnosis

Flink writes processed data to Hologres, leveraging UPSERT for efficient state updates. Quality rule monitoring reads rules via Flink CDC, joins with signal data, and writes results back to Hologres for traceability. AI models consume real‑time features from Flink to predict fault probabilities, triggering proactive maintenance.
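UPSERT semantics can be sketched with a plain dictionary keyed on the table's primary key. The key choice `(vin, fault_code)` and the row fields are illustrative assumptions: each write replaces the row for its key, so readers always see the latest fault state rather than an append‑only log.

```python
# Hypothetical sketch of UPSERT semantics on a primary-key table.
# Key and fields are assumptions for illustration, not Leapmotor's schema.

def upsert(table: dict, row: dict):
    """Insert the row, or replace the existing row with the same key."""
    table[(row["vin"], row["fault_code"])] = row

table = {}
upsert(table, {"vin": "LP000001", "fault_code": "BMS_07",
               "probability": 0.42, "ts": 100})
upsert(table, {"vin": "LP000001", "fault_code": "BMS_07",
               "probability": 0.91, "ts": 160})  # replaces the earlier row

print(len(table))                                    # 1
print(table[("LP000001", "BMS_07")]["probability"])  # 0.91
```

Keeping one row per key is what makes the diagnosis state cheap to query; Hologres does this natively on primary‑key tables, so Flink can emit updates without a separate compaction step.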

Hologres provides immediate visibility of writes, unlike ClickHouse or Doris which depend on Flink checkpoint intervals, making it crucial for latency‑sensitive fault diagnosis.

Integrated Real‑Time Computing Platform

Previously, Flink jobs were deployed on Kubernetes or YARN via CLI, causing resource contention between batch and streaming tasks, state loss, and fragmented monitoring.

The new Alibaba Cloud‑hosted Flink platform offers a unified UI for Flink SQL and JAR submissions, visual resource configuration, elastic scaling, consistent state management, comprehensive metrics, and simplified developer operations.

Performance Validation

POC tests showed Flink outperforming the alternatives: roughly 60% faster than the open‑source baseline for Kafka ingestion, 200% faster than Hive for wide‑table processing in MaxCompute, and up to 400% faster when writing to Hologres compared with ClickHouse.

Overall gains include a 3‑5× increase in job performance, five‑fold storage compression, and reduced costs.

Future Plans

Leapmotor plans to deepen Flink’s integration with data lakes (e.g., Paimon) for unified batch‑stream warehousing and with AI systems to serve real‑time features to models, to explore Flink Agents for multimodal data, and to optimize long‑window feature computation.

These efforts aim to further lower costs, improve efficiency, and accelerate decision‑making in the intelligent automotive domain.

Through Flink and Hologres, Leapmotor has solved massive real‑time data challenges in connected vehicles and offers valuable industry insights for digital transformation.

Tags: big data, Flink, data platform, real-time data, cloud, automotive
Written by

Alibaba Cloud Big Data AI Platform

The Alibaba Cloud Big Data AI Platform builds on Alibaba’s leading cloud infrastructure, big‑data and AI engineering capabilities, scenario algorithms, and extensive industry experience to offer enterprises and developers a one‑stop, cloud‑native big‑data and AI capability suite. It boosts AI development efficiency, enables large‑scale AI deployment across industries, and drives business value.
