
How Haier’s AIoT Platform Scaled to Billions of Messages with Kafka Serverless on Alibaba Cloud

This article describes how Haier Smart Home’s AIoT platform met massive device‑messaging demand by migrating its self‑built Kafka clusters to Alibaba Cloud’s Kafka Serverless. It covers the technical challenges, the phased migration plan, custom performance tuning, joint risk governance, and the resulting gains in stability, throughput, and operational efficiency.

Alibaba Cloud Native

Haier Smart Home’s AIoT platform serves as the central hub for millions of connected appliances such as refrigerators, washing machines, and air conditioners, handling tens of billions of data reports and user commands daily. Rapid business growth pushed the platform’s Kafka‑based messaging backbone to the limits of stability, latency, and scalability.

To meet these demands, Haier partnered with Alibaba Cloud to migrate its large‑scale, self‑built Kafka clusters (tens of nodes) to the cloud‑native Kafka Serverless service. The migration focused on preserving data consistency, minimizing downtime, and maintaining the platform’s high‑availability guarantees.

Migration Strategy

Precise migration granularity: non‑critical log topics were moved first, so that data consistency could be verified on low‑risk traffic.

Phased rollout: core command topics followed, with dual‑write to both clusters and consumption validation before cut‑over.

Full migration: after a week of stability monitoring, the migration was completed and the on‑premise clusters were decommissioned.
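The dual‑write and validation step can be illustrated with a minimal sketch. The article does not describe Haier's actual tooling, so everything here is an assumption: `InMemoryCluster` is a hypothetical stand‑in for a real producer client, and comparing record digests as a multiset is just one plausible way to check that both clusters received the same data.

```python
import hashlib
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class InMemoryCluster:
    """Hypothetical stand-in for a Kafka cluster; a real setup would wrap a producer client."""
    name: str
    log: dict = field(default_factory=dict)  # topic -> list of (key, value)

    def send(self, topic: str, key: bytes, value: bytes) -> None:
        self.log.setdefault(topic, []).append((key, value))


class DualWriter:
    """Writes every record to both the legacy and the target cluster."""

    def __init__(self, legacy: InMemoryCluster, target: InMemoryCluster):
        self.legacy = legacy
        self.target = target

    def send(self, topic: str, key: bytes, value: bytes) -> None:
        # Assumption: legacy stays the source of truth until cut-over, so it is written first.
        self.legacy.send(topic, key, value)
        self.target.send(topic, key, value)


def digest_counts(cluster: InMemoryCluster, topic: str) -> Counter:
    """Multiset of record digests, so ordering differences between clusters do not matter."""
    return Counter(
        hashlib.sha256(key + b"\x00" + value).hexdigest()
        for key, value in cluster.log.get(topic, [])
    )


def validate(legacy: InMemoryCluster, target: InMemoryCluster, topic: str) -> tuple:
    """Return (records missing on target, records extra on target)."""
    old, new = digest_counts(legacy, topic), digest_counts(target, topic)
    return sum((old - new).values()), sum((new - old).values())


legacy = InMemoryCluster("self-built")
target = InMemoryCluster("serverless")
writer = DualWriter(legacy, target)
for i in range(3):
    writer.send("device-commands", f"dev-{i}".encode(), b"power_on")
print(validate(legacy, target, "device-commands"))  # (0, 0)
```

A result of `(0, 0)` over a representative window is the kind of signal that would justify cutting consumers over; any non‑zero count points to records that need investigation first.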

Technical Optimizations

Parameter tuning: the producer settings batch.size and linger.ms were customized to match Haier’s typical message size and send frequency, which significantly increased throughput.
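As a rough illustration of how message size and send rate might inform these two settings, the sketch below sizes a batch to hold the records expected to arrive within one linger window. `batch.size` and `linger.ms` are standard Kafka producer settings, but the heuristic, the function name, and the bounds are assumptions for illustration; the article does not publish the values Haier used.

```python
def suggest_producer_config(avg_record_bytes: int,
                            records_per_sec_per_partition: float,
                            linger_ms: int,
                            min_batch: int = 16_384,
                            max_batch: int = 1_048_576) -> dict:
    """Hypothetical heuristic: make batch.size large enough for one linger window of records."""
    # Records expected to accumulate for a single partition while the producer lingers.
    expected_records = records_per_sec_per_partition * linger_ms / 1000
    wanted_bytes = int(expected_records * avg_record_bytes)
    # Clamp to illustrative bounds so a bad estimate cannot produce an extreme value.
    batch_size = max(min_batch, min(max_batch, wanted_bytes))
    return {"batch.size": batch_size, "linger.ms": linger_ms}


# 512-byte device reports at 2,000 records/s per partition with a 20 ms linger.
print(suggest_producer_config(512, 2000, 20))  # {'batch.size': 20480, 'linger.ms': 20}
```

The trade‑off is the usual one: a longer linger and larger batches raise throughput at the cost of added latency, so latency‑sensitive command topics and bulk log topics would likely end up with different values.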

Elastic resource scaling: auto‑scaling thresholds were pre‑configured to match Haier’s tidal traffic patterns, so that capacity is in place before each peak and cold‑start delays are avoided.
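One way to picture threshold pre‑configuration for tidal traffic is to reserve, for each hour, enough capacity for the busiest hour inside the scaling lead time. This is only a sketch under stated assumptions: the function, the hourly forecast, the lead time, and the headroom factor are all hypothetical, and the article does not describe how the thresholds were actually derived or configured.

```python
def prescale_plan(hourly_peak_mbps: list, lead_hours: int = 1, headroom: float = 1.3) -> list:
    """Capacity (MB/s) to have ready at each hour, looking `lead_hours` ahead.

    hourly_peak_mbps is a repeating daily forecast, so the look-ahead wraps around.
    """
    n = len(hourly_peak_mbps)
    plan = []
    for hour in range(n):
        # Peak traffic from now until the end of the lead window.
        window = [hourly_peak_mbps[(hour + k) % n] for k in range(lead_hours + 1)]
        plan.append(round(max(window) * headroom, 1))
    return plan


# Shortened five-slot "day": traffic ramps up to an evening peak, then falls.
print(prescale_plan([10, 10, 40, 80, 30], lead_hours=1, headroom=1.5))
# [15.0, 60.0, 120.0, 120.0, 45.0]
```

Capacity rises one slot ahead of the ramp (60.0 while traffic is still 10), which is the behavior that avoids scaling only after the peak has already arrived.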

Joint risk governance: the two teams co‑developed detection rules for hidden risks in batch size, leader distribution, and partition imbalance, and integrated them into Haier’s observability platform.
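The three rule families named above could look something like the following. The rule definitions, thresholds, and input shapes are assumptions made for illustration; the actual rules developed by Haier and Alibaba Cloud are not published in the article.

```python
from collections import Counter
from statistics import mean


def detect_risks(leaders: dict, brokers: list, partition_bytes: dict, avg_batch_bytes: float,
                 max_leader_ratio: float = 1.5, max_partition_ratio: float = 2.0,
                 min_batch_bytes: int = 1024) -> list:
    """Return human-readable findings; an empty list means no rule fired.

    leaders:         {partition_id: broker_id currently leading it}
    brokers:         all broker ids, including ones that lead nothing
    partition_bytes: {partition_id: bytes written in the sampling window}
    avg_batch_bytes: average producer batch size observed in the window
    """
    findings = []

    # Rule 1: batches so small that per-request overhead dominates.
    if avg_batch_bytes < min_batch_bytes:
        findings.append(f"small batches: avg {avg_batch_bytes} B < {min_batch_bytes} B")

    # Rule 2: a broker leads far more partitions than an even spread would give it.
    ideal = len(leaders) / len(brokers)
    for broker, count in sorted(Counter(leaders.values()).items()):
        if count > ideal * max_leader_ratio:
            findings.append(f"leader skew: broker {broker} leads {count} (ideal {ideal:.1f})")

    # Rule 3: a partition receives far more data than the average partition.
    avg = mean(partition_bytes.values())
    for pid, size in sorted(partition_bytes.items()):
        if avg > 0 and size > avg * max_partition_ratio:
            findings.append(f"hot partition: {pid} wrote {size} B (mean {avg:.0f} B)")

    return findings


findings = detect_risks(
    leaders={0: 1, 1: 1, 2: 1, 3: 1, 4: 2, 5: 3},
    brokers=[1, 2, 3],
    partition_bytes={0: 100, 1: 100, 2: 100, 3: 100, 4: 100, 5: 900},
    avg_batch_bytes=512,
)
for line in findings:
    print(line)
```

In this sample all three rules fire: batches are under the threshold, broker 1 leads four of six partitions, and partition 5 carries most of the bytes. Findings like these would typically be exported as alerts or dashboard panels rather than printed.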

Results

After six months of stable operation on Kafka Serverless, the platform achieved 99.99% annual availability and processed more than a trillion messages, with automatic scaling completing in under 10 minutes. Command delivery held a 99.995% success rate, with no latency or message‑loss incidents. Operational tasks such as scaling, failover, and version upgrades shifted from manual processes that took days to automated operations that take minutes.
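To put the two percentages in concrete terms, the short calculation below converts them into a yearly downtime budget and a failure rate per million commands. It is plain arithmetic on the figures quoted above, assuming a 365‑day year, and adds no new measurements.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600, ignoring leap years


def downtime_budget_minutes(availability: float) -> float:
    """Maximum unavailable minutes per year implied by an availability figure."""
    return (1.0 - availability) * MINUTES_PER_YEAR


def failures_per_million(success_rate: float) -> float:
    """Failed deliveries per million attempts implied by a success rate."""
    return (1.0 - success_rate) * 1_000_000


print(round(downtime_budget_minutes(0.9999), 2))  # 52.56
print(round(failures_per_million(0.99995)))       # 50
```

In other words, 99.99% availability allows roughly 53 minutes of downtime per year, and a 99.995% success rate corresponds to about 50 undelivered commands per million.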

The migration freed Haier’s infrastructure team to focus on AIoT application architecture and global‑scale, multi‑cluster disaster‑recovery designs, while Alibaba Cloud gained valuable real‑world insights for enhancing its Kafka Serverless product.

Overall, the joint effort demonstrates a successful, risk‑aware, performance‑driven cloud migration that benefits both the customer and the cloud provider.
