How ZTO Express Built ZMS: A Scalable Cloud‑Native Message Middleware Platform

ZMS is ZTO Express's cloud‑native message middleware platform built on RocketMQ and Kafka that automates deployment, provides a unified SDK, supports multi‑datacenter operation, and offers comprehensive monitoring, enabling seamless scaling and fault‑tolerant messaging for billions of daily events.

Zhongtong Tech

ZTO Express processes tens of millions of parcels daily. Its many business systems rely on message middleware for decoupling, peak‑shaving, asynchronous communication, data synchronization, and redundancy.

ZTO has run message middleware at scale since 2015. As data volume and the number of clusters grew, the team built the ZMS (ZTO Message Service) platform. ZMS integrates RocketMQ and Kafka and offers automated deployment, topic and consumer approval workflows, a unified SDK, a management console, monitoring, alerting, and seamless scaling.

ZMS currently manages 17 clusters (7 Kafka, 10 RocketMQ), nearly 2,000 topics, over 3,000 consumer groups, more than 140 TB of stored messages, and processes billions of messages per day.

Automated Operations and Deployment

The platform provides a wizard‑style initialization that installs zms‑agent, Supervisor, and other base components on target hosts with a single script, eliminating the need for the portal to store host credentials.

Automated deployment diagram

Unified Client SDK

ZMS abstracts the underlying middleware, exposing a single API that hides differences between Kafka and RocketMQ, provides standard instrumentation for monitoring, and treats topics and consumer groups as cloud resources that can be requested without knowing the physical cluster.
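
The idea of a single API over two engines can be illustrated with a small facade. This is a minimal sketch in Python, not the ZMS SDK itself: the names TopicRoute and UnifiedProducer, the topic and cluster names, and the fake sender functions are all hypothetical stand-ins. It assumes the SDK looks up a topic's physical location in platform metadata and then dispatches to an engine-specific sender, counting sends as a stand-in for the standard instrumentation.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

# A sender takes (cluster, topic, payload). Real code would wrap a Kafka or
# RocketMQ client here; these signatures are assumptions for illustration.
Sender = Callable[[str, str, bytes], None]


@dataclass(frozen=True)
class TopicRoute:
    """Where a logical topic physically lives (hypothetical metadata record)."""
    cluster: str
    engine: str  # "kafka" or "rocketmq"


class UnifiedProducer:
    """Facade: callers name a topic, the facade picks the engine-specific sender."""

    def __init__(self, routes: Dict[str, TopicRoute], senders: Dict[str, Sender]):
        self._routes = routes
        self._senders = senders
        self.sent_count: Dict[str, int] = {}  # stand-in for SDK instrumentation

    def send(self, topic: str, payload: bytes) -> str:
        route = self._routes.get(topic)
        if route is None:
            raise KeyError(f"topic {topic!r} has not been provisioned")
        self._senders[route.engine](route.cluster, topic, payload)
        self.sent_count[topic] = self.sent_count.get(topic, 0) + 1
        return route.engine


# Fake senders that only record what they were asked to send.
log: List[Tuple[str, str, str, bytes]] = []


def fake_kafka_send(cluster: str, topic: str, payload: bytes) -> None:
    log.append(("kafka", cluster, topic, payload))


def fake_rocketmq_send(cluster: str, topic: str, payload: bytes) -> None:
    log.append(("rocketmq", cluster, topic, payload))


routes = {
    "order-created": TopicRoute(cluster="kafka-01", engine="kafka"),
    "parcel-scanned": TopicRoute(cluster="rocketmq-03", engine="rocketmq"),
}
producer = UnifiedProducer(
    routes, {"kafka": fake_kafka_send, "rocketmq": fake_rocketmq_send}
)
producer.send("order-created", b'{"id": 1}')
producer.send("parcel-scanned", b'{"id": 2}')
```

The caller never mentions a cluster or an engine. Moving "order-created" to another cluster in this sketch only means changing its TopicRoute entry, which is the property the paragraph above describes.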

SDK architecture diagram

Monitoring Data Collection Service

ZMSCollector periodically pulls metrics from Kafka and RocketMQ, persists them to InfluxDB, and aggregates SDK‑generated metrics, enabling persistent, visualized monitoring of message traffic.
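
One collection cycle of this kind can be sketched as follows. This is an illustrative Python sketch under stated assumptions, not ZMSCollector's code: the poller function, the measurement name topic_stats, and the tag and field names are hypothetical. The output format is InfluxDB line protocol (measurement, comma-separated tags, a space, fields, a space, a nanosecond timestamp), which is how points are typically written to InfluxDB.

```python
from typing import Callable, Dict, Iterable, List, Tuple, Union

FieldValue = Union[bool, int, float]
Metric = Tuple[str, Dict[str, str], Dict[str, FieldValue]]


def _escape(text: str) -> str:
    """Backslash-escape commas, equals signs, and spaces in keys and tag values."""
    return text.replace(",", "\\,").replace("=", "\\=").replace(" ", "\\ ")


def _format_field(value: FieldValue) -> str:
    if isinstance(value, bool):  # check bool first: bool is a subclass of int
        return "true" if value else "false"
    if isinstance(value, int):
        return f"{value}i"       # integers carry an 'i' suffix in line protocol
    return repr(float(value))


def to_line_protocol(measurement: str, tags: Dict[str, str],
                     fields: Dict[str, FieldValue], timestamp_ns: int) -> str:
    if not fields:
        raise ValueError("line protocol requires at least one field")
    tag_part = "".join(f",{_escape(k)}={_escape(v)}" for k, v in sorted(tags.items()))
    field_part = ",".join(f"{_escape(k)}={_format_field(v)}" for k, v in fields.items())
    return f"{measurement}{tag_part} {field_part} {timestamp_ns}"


def collect_once(pollers: Iterable[Callable[[], List[Metric]]],
                 timestamp_ns: int) -> List[str]:
    """Run every poller once and turn the results into line-protocol strings."""
    lines: List[str] = []
    for poll in pollers:
        for measurement, tags, fields in poll():
            lines.append(to_line_protocol(measurement, tags, fields, timestamp_ns))
    return lines


def poll_kafka_stub() -> List[Metric]:
    # Hypothetical stand-in for a query against a Kafka cluster.
    return [("topic_stats",
             {"cluster": "kafka-01", "topic": "order-created"},
             {"tps": 1250.5, "lag": 42})]


lines = collect_once([poll_kafka_stub], timestamp_ns=1590451200000000000)
```

In a real collector the resulting lines would be sent to InfluxDB's write endpoint on a timer; the sketch stops at producing the strings so that it has no network dependency.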

Collector architecture diagram

Multi‑Data‑Center Solution

ZMS supports cold‑standby data‑center deployment. Each data‑center is treated as an environment; ZMSBackupCluster synchronizes ZooKeeper metadata between primary and backup sites, allowing seamless failover without application changes.
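
Why metadata synchronization lets applications fail over unchanged can be shown with a toy model. The sketch below is an assumption-laden Python illustration, not ZMSBackupCluster's implementation: plain dicts stand in for each site's ZooKeeper tree, the /zms/topic/ path layout is hypothetical, and it assumes that topic metadata names a logical cluster while each site keeps its own, unsynchronized mapping from logical cluster to broker addresses.

```python
from typing import Dict, List

Store = Dict[str, str]  # path -> data; a stand-in for a ZooKeeper tree


def sync_metadata(primary: Store, backup: Store, prefix: str) -> List[str]:
    """Make backup's nodes under prefix mirror primary; return the changed paths."""
    changed: List[str] = []
    # Create or update nodes that differ from the primary.
    for path, data in primary.items():
        if path.startswith(prefix) and backup.get(path) != data:
            backup[path] = data
            changed.append(path)
    # Delete nodes that no longer exist on the primary.
    stale = [p for p in backup if p.startswith(prefix) and p not in primary]
    for path in stale:
        del backup[path]
        changed.append(path)
    return sorted(changed)


def resolve_brokers(metadata: Store, local_endpoints: Dict[str, str], topic: str) -> str:
    """Look up a topic's logical cluster, then that cluster's address at this site."""
    cluster = metadata[f"/zms/topic/{topic}"]
    return local_endpoints[cluster]


# Primary site: the source of truth for topic metadata.
primary_meta: Store = {
    "/zms/topic/order-created": "kafka-orders",
    "/zms/topic/parcel-scanned": "rocketmq-scan",
}
primary_endpoints = {"kafka-orders": "10.0.0.1:9092", "rocketmq-scan": "10.0.0.2:9876"}

# Backup site: starts out of date, with its own broker addresses.
backup_meta: Store = {"/zms/topic/stale-topic": "kafka-orders"}
backup_endpoints = {"kafka-orders": "10.1.0.1:9092", "rocketmq-scan": "10.1.0.2:9876"}

changed = sync_metadata(primary_meta, backup_meta, "/zms/topic/")
after_failover = resolve_brokers(backup_meta, backup_endpoints, "order-created")
```

After the sync, an application that asks for "order-created" at the backup site gets that site's brokers without any change to the topic name it uses. All addresses and names here are made up for the example.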

Backup cluster architecture

Open‑Source Release

On May 26, 2020, ZMS was open‑sourced on GitHub (https://github.com/ZTO-Express/zms), including usage guides, architecture documentation, and a community channel, inviting contributors to help build an integrated intelligent message‑operations platform.
