How Haier’s AIoT Platform Scaled to Billions of Messages with Kafka Serverless on Alibaba Cloud
This article describes how Haier Smart Home’s AIoT platform met massive device‑messaging demands by migrating its self‑built Kafka clusters to Alibaba Cloud’s Kafka Serverless. It covers the technical challenges, the phased migration plan, custom performance tuning, joint risk governance, and the resulting improvements in stability, throughput, and operational efficiency.
Haier Smart Home’s AIoT platform serves as the central hub for millions of connected appliances such as refrigerators, washing machines, and air conditioners, handling tens of billions of data reports and user commands daily. Rapid business growth pushed the platform’s Kafka‑based messaging backbone to the limits of stability, latency, and scalability.
To meet these demands, Haier partnered with Alibaba Cloud to migrate its large‑scale, self‑built Kafka clusters (tens of nodes) to the cloud‑native Kafka Serverless service. The migration focused on preserving data consistency, minimizing downtime, and maintaining the platform’s high‑availability guarantees.
Migration Strategy
Define precise migration granularity: migrate by topic, moving non‑critical log topics first to verify data consistency.
Execute a phased rollout: migrate core command topics next, using dual writes to both clusters and consumption validation before cut‑over.
Complete the migration: after a week of stability monitoring, decommission the on‑premises clusters.
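The dual‑write and consumption‑validation step can be illustrated with a minimal Python sketch. It is not Haier’s or Alibaba Cloud’s implementation: the DualWriter class, the validate function, and the send callbacks are hypothetical stand‑ins for real Kafka producers and consumers, and the sketch assumes the old cluster remains the source of truth until cut‑over.

```python
import hashlib
from typing import Callable, Dict, Iterable, List, Tuple

Record = Tuple[str, bytes]                  # (key, value)
SendFn = Callable[[str, str, bytes], None]  # (topic, key, value)


class DualWriter:
    """Sends every record to both clusters during the migration window (hypothetical helper)."""

    def __init__(self, send_old: SendFn, send_new: SendFn) -> None:
        self.send_old = send_old
        self.send_new = send_new
        self.new_failures: List[Record] = []

    def send(self, topic: str, key: str, value: bytes) -> None:
        # Assumption: the old cluster is authoritative, so its errors propagate to the caller.
        self.send_old(topic, key, value)
        try:
            self.send_new(topic, key, value)
        except Exception:
            # A failure on the new side must not break production traffic;
            # keep the record so it can be replayed before validation.
            self.new_failures.append((key, value))


def fingerprint_counts(records: Iterable[Record]) -> Dict[str, int]:
    """Order-independent multiset of record fingerprints."""
    counts: Dict[str, int] = {}
    for key, value in records:
        h = hashlib.sha256(key.encode() + b"\x00" + value).hexdigest()
        counts[h] = counts.get(h, 0) + 1
    return counts


def validate(consumed_old: Iterable[Record], consumed_new: Iterable[Record]) -> Dict[str, int]:
    """Compares what consumers read from each cluster for the same topic."""
    old = fingerprint_counts(consumed_old)
    new = fingerprint_counts(consumed_new)
    missing = sum(max(c - new.get(h, 0), 0) for h, c in old.items())
    extra = sum(max(c - old.get(h, 0), 0) for h, c in new.items())
    return {"missing_in_new": missing, "extra_in_new": extra}
```

In a real setup the two callbacks would wrap producers pointed at each cluster, and validate would run over records consumed from the same topic on both sides. Under this sketch’s assumptions, cut‑over would proceed only when both counts are zero.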
Technical Optimizations
Parameter tuning: batch.size and linger.ms were tuned to Haier’s message sizes and producer send frequency, significantly boosting throughput.
Elastic resource scaling: auto‑scaling thresholds were pre‑configured to match Haier’s tidal traffic patterns, avoiding cold‑start delays.
Risk co‑governance: the two teams jointly developed detection rules for hidden risks in batch size, leader distribution, and partition imbalance, and integrated them into Haier’s observability platform.
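The parameter‑tuning and risk‑rule items above can be sketched together in Python. The article does not publish Haier’s actual values or rules, so everything here is an assumption for illustration: the sizing heuristic (make batch.size hold roughly what one partition receives during linger.ms), the 20 ms linger, the 1.5 skew limit, and all function names. Only batch.size and linger.ms are real Kafka producer settings, and 16,384 bytes is the documented default for batch.size.

```python
from collections import Counter
from typing import Dict, List

DEFAULT_BATCH_BYTES = 16_384  # Kafka producer default for batch.size


def suggest_batch_settings(avg_message_bytes: int, messages_per_second: float,
                           linger_ms: int = 20, max_batch_bytes: int = 1_048_576) -> Dict[str, int]:
    """Heuristic: size a batch to hold the bytes that arrive per partition during linger.ms."""
    expected = avg_message_bytes * messages_per_second * linger_ms / 1000
    batch_size = int(min(max(expected, DEFAULT_BATCH_BYTES), max_batch_bytes))
    return {"batch.size": batch_size, "linger.ms": linger_ms}


def leader_skew(partition_leaders: Dict[int, int], broker_ids: List[int]) -> float:
    """Most leaders held by one broker, divided by the even share (1.0 means balanced)."""
    counts = Counter(partition_leaders.values())
    ideal = len(partition_leaders) / len(broker_ids)
    return max(counts.get(b, 0) for b in broker_ids) / ideal


def partition_skew(partition_bytes: Dict[int, int]) -> float:
    """Largest partition divided by the mean partition size (1.0 means balanced)."""
    mean = sum(partition_bytes.values()) / len(partition_bytes)
    return 1.0 if mean == 0 else max(partition_bytes.values()) / mean


def detect_risks(configured_batch_size: int, avg_message_bytes: int, messages_per_second: float,
                 partition_leaders: Dict[int, int], broker_ids: List[int],
                 partition_bytes: Dict[int, int], skew_limit: float = 1.5) -> List[str]:
    """Returns human-readable findings; inputs are assumed non-empty."""
    findings: List[str] = []
    suggested = suggest_batch_settings(avg_message_bytes, messages_per_second)["batch.size"]
    if configured_batch_size < suggested // 2:
        findings.append(f"batch.size {configured_batch_size} is well below suggested {suggested}")
    if leader_skew(partition_leaders, broker_ids) > skew_limit:
        findings.append("partition leaders are concentrated on too few brokers")
    if partition_skew(partition_bytes) > skew_limit:
        findings.append("partition sizes are imbalanced")
    return findings
```

For example, with 512‑byte messages arriving at 5,000 per second on a partition and a 20 ms linger, the heuristic suggests a batch.size of 51,200 bytes, so a producer still running the 16,384‑byte default would be flagged. Real rules would read these inputs from cluster metadata and producer metrics rather than from function arguments.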
Results
After six months of stable operation on Kafka Serverless, the platform achieved 99.99% annual availability and processed over a trillion messages, with automatic scaling completing in under 10 minutes. It maintained a 99.995% command delivery success rate with no latency or message‑loss issues. Operational tasks such as scaling, failover, and version upgrades shifted from manual processes that took days to automated operations that take minutes.
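To put the percentages in concrete terms, the short Python sketch below converts them into a yearly downtime budget and a failure rate per million commands. The function names are illustrative, and the calculation assumes a 365‑day year; it is plain arithmetic on the figures quoted above, not additional measured data.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes, ignoring leap years


def downtime_budget_minutes(availability_pct: float) -> float:
    """Minutes per year a service may be unavailable at the given availability."""
    return MINUTES_PER_YEAR * (1 - availability_pct / 100)


def failures_per_million(success_pct: float) -> float:
    """Failed deliveries per million attempts at the given success rate."""
    return 1_000_000 * (1 - success_pct / 100)


if __name__ == "__main__":
    print(round(downtime_budget_minutes(99.99), 2))
    print(round(failures_per_million(99.995)))
```

At 99.99% availability the budget works out to roughly 52.6 minutes of downtime per year, and a 99.995% success rate corresponds to about 50 undelivered commands per million.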
The migration freed Haier’s infrastructure team to focus on AIoT application architecture and global‑scale, multi‑cluster disaster‑recovery designs, while Alibaba Cloud gained valuable real‑world insights for enhancing its Kafka Serverless product.
Overall, the joint effort demonstrates a successful, risk‑aware, performance‑driven cloud migration that benefits both the customer and the cloud provider.
Alibaba Cloud Native
