How Xiaomi’s Talos Redefined Distributed Messaging for Massive Scale

Talos is Xiaomi’s self‑developed distributed message queue, built to overcome the limitations the company hit with Kafka. It separates compute from storage by keeping messages on HDFS, scales through stateless servers, and adds a single‑writer consistency model, delayed partition allocation, and a series of performance and resource optimizations to carry trillions of messages per day across multi‑tenant workloads.

MaGe Linux Operations

Business Background

Before 2015, Xiaomi used Kafka 0.8, which suffered from coupled storage and compute, unbalanced data distribution, difficult scaling and recovery, and consumer rebalance problems.

The team needed a stateless, fault‑tolerant queue that scales quickly, keeps the impact on consumers minimal, and offers strong multi‑tenancy and cross‑datacenter replication.

Thus Xiaomi built Talos, a self‑developed distributed message queue targeting internal business needs and external ecosystem partners, comparable to AWS Kinesis and Apache Kafka.

Architecture and Key Issues

Talos separates storage and compute: messages are stored as files on HDFS, while stateless TalosServer nodes handle request routing using consistent hashing for partition scheduling and load balancing.
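
As a rough illustration of how consistent hashing can map partitions to stateless servers, the sketch below builds a hash ring with virtual nodes. The class name HashRing, the vnodes parameter, the key format, and the choice of MD5 are assumptions made for the example; the article does not describe the hashing scheme Talos actually uses.

```python
import bisect
import hashlib


def _hash(key: str) -> int:
    # Stable 64-bit hash so that placement does not change between runs.
    return int.from_bytes(hashlib.md5(key.encode()).digest()[:8], "big")


class HashRing:
    """Hypothetical consistent-hash ring: partition -> serving node."""

    def __init__(self, servers, vnodes=64):
        # Each server contributes `vnodes` points on the ring.
        self._ring = sorted(
            (_hash(f"{server}#{i}"), server)
            for server in servers
            for i in range(vnodes)
        )
        self._keys = [point for point, _ in self._ring]

    def owner(self, topic: str, partition: int) -> str:
        # Walk clockwise to the first virtual node at or after the key's hash.
        h = _hash(f"{topic}:{partition}")
        i = bisect.bisect_right(self._keys, h) % len(self._ring)
        return self._ring[i][1]


ring = HashRing(["server-1", "server-2", "server-3"])
print(ring.owner("orders", 0))
```

Because the servers hold no message data, removing one from the ring only changes the owner of the partitions it was serving; every other partition keeps its server.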

Metadata and control‑flow state are stored in ZooKeeper, which also handles server registration and the broadcast of topic DDL changes.

Key challenges include:

DFS client tailing read – modified HDFS client to allow reading the latest block while it is being written.

Consistency model – ensuring only one TalosServer writes to a partition at a time using HDFS RecoverLease and a fencing mechanism.

Partition delayed allocation – reducing unnecessary partition migrations during node restarts or rolling upgrades by delaying reassignment.
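
The delayed allocation idea in the last item can be sketched as a grace period: a server that disappears is only treated as lost once it has been away longer than a configured delay. The class name DelayedAllocator, its method names, and the 30-second value in the usage lines are hypothetical; the article does not state how Talos implements or tunes the delay.

```python
class DelayedAllocator:
    """Hypothetical grace-period tracker for partition reassignment."""

    def __init__(self, grace_seconds: float):
        self.grace_seconds = grace_seconds
        self._down_since = {}  # server -> time it was first seen down

    def server_down(self, server: str, now: float) -> None:
        # Remember only the first time the server was reported down.
        self._down_since.setdefault(server, now)

    def server_up(self, server: str) -> None:
        # The server returned in time, so its partitions stay where they are.
        self._down_since.pop(server, None)

    def servers_to_reassign(self, now: float) -> list:
        # Only servers that stayed down past the grace period lose partitions.
        expired = [
            server
            for server, since in self._down_since.items()
            if now - since >= self.grace_seconds
        ]
        for server in expired:
            del self._down_since[server]
        return expired


allocator = DelayedAllocator(grace_seconds=30)
allocator.server_down("server-1", now=0)
allocator.server_up("server-1")  # rolling restart finished within the window
print(allocator.servers_to_reassign(now=60))
```

With a scheme like this, a quick restart during a rolling upgrade causes no migration at all, at the cost of a short window in which the restarting server's partitions are unavailable.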

Performance and Resource Optimization

Talos processes over 2 trillion messages per day, peaks at 40 million msgs/s, and stores 1.3 PB daily across 13 000+ topics and 15 000+ downstream jobs.

Optimizations include:

Thread‑pool redesign with a “memory‑aware min‑heap” that assigns requests to the least‑loaded thread while preserving affinity for the same topic‑partition.

Write aggregation – merging multiple small I/O operations into larger ones, reducing flush frequency and increasing QPS from ~1 K to >10 K per node, while cutting P99 latency from 70 ms to 5 ms.

GC tuning – switching from CMS to G1, adjusting heap parameters and mixed‑GC intervals to keep GC pauses under 70 ms.

Client‑side addressing and traffic‑aware consistent hashing – eliminating most request forwarding, saving 40 % bandwidth and halving P95 latency.

Load‑balancing refinement – adjusting virtual node counts based on per‑node traffic to equalize daily flow, reducing node‑to‑node traffic variance by over 50 %.
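
The thread‑pool item at the top of this list can be sketched as a dispatcher that keeps workers in a min‑heap keyed by load and pins each topic‑partition to one worker while it has requests in flight. Measuring load as pending request bytes is an assumption about what "memory‑aware" means here, and the names AffinityDispatcher, submit, and complete are hypothetical.

```python
import heapq


class AffinityDispatcher:
    """Hypothetical least-loaded dispatcher with per-partition affinity."""

    def __init__(self, num_workers: int):
        self.load = [0] * num_workers  # pending bytes per worker
        self.heap = [(0, i) for i in range(num_workers)]
        heapq.heapify(self.heap)
        self.owner = {}     # partition -> worker index
        self.inflight = {}  # partition -> number of unfinished requests

    def _least_loaded(self) -> int:
        # Lazy deletion: drop heap entries whose recorded load is stale.
        while True:
            load, idx = self.heap[0]
            if load == self.load[idx]:
                return idx
            heapq.heappop(self.heap)

    def submit(self, partition: str, size: int) -> int:
        idx = self.owner.get(partition)
        if idx is None:
            # No request in flight for this partition: pick the idlest worker.
            idx = self._least_loaded()
            self.owner[partition] = idx
        self.load[idx] += size
        heapq.heappush(self.heap, (self.load[idx], idx))
        self.inflight[partition] = self.inflight.get(partition, 0) + 1
        return idx

    def complete(self, partition: str, size: int) -> None:
        idx = self.owner[partition]
        self.load[idx] -= size
        heapq.heappush(self.heap, (self.load[idx], idx))
        self.inflight[partition] -= 1
        if self.inflight[partition] == 0:
            # Release the pin so the next request can be rebalanced.
            del self.inflight[partition]
            del self.owner[partition]


dispatcher = AffinityDispatcher(num_workers=2)
print(dispatcher.submit("orders-0", 100), dispatcher.submit("orders-1", 10))
```

Pinning a partition to one worker keeps its requests in order, while the heap steers new partitions away from workers that already hold a large backlog.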

Platform Efficiency

Talos provides a unified monitoring and metering framework consisting of metric collection, aggregation, and visualization. Agents on each service push metrics to Talos topics, which are streamed into Druid for dashboards and Falcon alerts.

The platform supports multi‑dimensional, multi‑view monitoring and real‑time billing based on these metrics.

Resource management for partitions is automated: traffic‑based thresholds identify over‑requested quotas, allowing 70 % of requests to be auto‑approved, while outlier cases trigger manual review.
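
A traffic‑based approval rule of this kind might look like the sketch below. The function name, the per‑partition throughput of 5 MB/s, and the 2x headroom factor are all hypothetical values chosen for illustration; the article does not give the thresholds Talos uses.

```python
import math


def review_quota_request(requested_partitions: int,
                         observed_peak_mb_s: float,
                         per_partition_mb_s: float = 5.0,
                         headroom: float = 2.0) -> str:
    """Return 'auto-approve' or 'manual-review' for a partition request."""
    # Partitions the observed peak traffic would actually need.
    needed = max(1, math.ceil(observed_peak_mb_s / per_partition_mb_s))
    # Requests within the allowed headroom pass without a human in the loop.
    if requested_partitions <= needed * headroom:
        return "auto-approve"
    return "manual-review"


print(review_quota_request(requested_partitions=8, observed_peak_mb_s=20.0))
print(review_quota_request(requested_partitions=50, observed_peak_mb_s=20.0))
```

Tying the decision to measured traffic rather than to the requester's estimate is what lets the common case skip review while over‑requested quotas still reach an operator.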

Planning and Vision

Future goals focus on two pillars:

Business‑centric value delivery – adding features such as transactions for e‑commerce and financial workloads and replication for cloud services, with the goal of financial‑grade reliability and multi‑active deployment across cities and regions.

Continuous learning and cloud‑native evolution – integrating compute into messaging, exploring Service Mesh and Serverless architectures, and developing a Message Mesh for next‑generation decoupled communication.

Further detailed articles will dive into Talos’s consistency model, load‑balancing practice, and other key topics.
