Comparing Kafka and RocketMQ: Architecture, Availability, and Reliability Insights

This article examines the architectures of Kafka and RocketMQ, analyzes their availability and reliability mechanisms, evaluates their strengths and weaknesses, and proposes a hybrid MQ design that combines the benefits of both systems while simplifying dependencies and improving fault tolerance.

21CTO

Kafka

The Kafka ecosystem consists of Producers, Consumers, the Kafka broker cluster, and ZooKeeper. ZooKeeper plays a role similar to a NameServer: it stores cluster metadata and provides leader election and coordination.

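Kafka's availability has two parts: the availability of its external dependencies and the availability of the Kafka cluster itself. The only external dependency is ZooKeeper, which is deployed as an ensemble of 2N+1 nodes and stays available as long as a majority of them is up, so it tolerates N node failures.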
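
The 2N+1 sizing rule can be checked with a few lines of arithmetic. This is a minimal sketch of majority-quorum math only; the function names are illustrative and are not part of ZooKeeper or Kafka.

```python
def quorum_size(ensemble_size: int) -> int:
    """Smallest number of nodes that forms a strict majority."""
    if ensemble_size < 1:
        raise ValueError("ensemble_size must be positive")
    return ensemble_size // 2 + 1


def tolerated_failures(ensemble_size: int) -> int:
    """How many nodes may fail while a majority is still reachable."""
    return ensemble_size - quorum_size(ensemble_size)


def has_quorum(ensemble_size: int, alive: int) -> bool:
    """True if the surviving nodes still form a majority."""
    return alive >= quorum_size(ensemble_size)


for size in (3, 4, 5):
    print(size, quorum_size(size), tolerated_failures(size))
# 3 2 1
# 4 3 1
# 5 3 2
```

A four-node ensemble tolerates no more failures than a three-node one, which is why odd sizes (2N+1) are the usual choice.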

Kafka's own availability is achieved through replication. Each partition has several replicas placed on different brokers, and one replica acts as the leader. Messages are written to the leader and replicated to the followers.

The number of replicas per partition (the replication factor) is configurable, and it determines how many broker failures a partition can survive.
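
To make the replica layout concrete, here is a simplified sketch of spreading partition replicas over brokers and picking a new leader after a failure. The round-robin placement and the "first live replica wins" rule are assumptions chosen for illustration; they are not Kafka's actual assignment or election logic, and `assign_replicas` and `elect_leader` are hypothetical helpers.

```python
from typing import Dict, List, Optional, Set


def assign_replicas(
    brokers: List[int], num_partitions: int, replication_factor: int
) -> Dict[int, List[int]]:
    """Place each partition's replicas on consecutive brokers, round-robin.

    The first broker in each list is treated as the initial leader.
    """
    if replication_factor > len(brokers):
        raise ValueError("replication factor cannot exceed the number of brokers")
    return {
        p: [brokers[(p + i) % len(brokers)] for i in range(replication_factor)]
        for p in range(num_partitions)
    }


def elect_leader(replicas: List[int], alive: Set[int]) -> Optional[int]:
    """Return the first replica whose broker is still alive, or None."""
    for broker in replicas:
        if broker in alive:
            return broker
    return None


assignment = assign_replicas(brokers=[0, 1, 2], num_partitions=3, replication_factor=2)
print(assignment)  # {0: [0, 1], 1: [1, 2], 2: [2, 0]}

# Broker 0 fails: partition 0 fails over to broker 1, partition 2 keeps broker 2.
alive = {1, 2}
print({p: elect_leader(r, alive) for p, r in assignment.items()})  # {0: 1, 1: 1, 2: 2}
```

In this layout every broker leads one partition and follows another, which is the mutual-backup arrangement the evaluation below credits with high machine utilization.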

Reliability

Reliability means that a message, once written, is eventually consumed and never lost. Kafka ensures this by persisting messages to disk (synchronously or asynchronously) and replicating them to other nodes. As long as at least one replica that holds the message survives, the data is not lost.
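
The effect of replication on message loss can be shown with a toy in-memory model. This is a sketch under stated assumptions, not Kafka's replication protocol: `Replica`, `Partition`, and the `wait_for_followers` flag are hypothetical, and a write that does not wait for followers is modeled as existing only on the leader.

```python
from typing import List


class Replica:
    """One copy of a partition's log on a (hypothetical) broker."""

    def __init__(self, broker_id: int) -> None:
        self.broker_id = broker_id
        self.log: List[str] = []
        self.alive = True


class Partition:
    def __init__(self, broker_ids: List[int]) -> None:
        self.replicas = [Replica(b) for b in broker_ids]

    def leader(self) -> Replica:
        # Toy rule: the first live replica in the list is the leader.
        for replica in self.replicas:
            if replica.alive:
                return replica
        raise RuntimeError("all replicas have failed")

    def produce(self, message: str, wait_for_followers: bool = True) -> None:
        leader = self.leader()
        leader.log.append(message)
        if wait_for_followers:
            # Acknowledge only after every live follower has a copy.
            for replica in self.replicas:
                if replica is not leader and replica.alive:
                    replica.log.append(message)
        # Otherwise the message lives only on the leader (replication lag).

    def fail(self, broker_id: int) -> None:
        for replica in self.replicas:
            if replica.broker_id == broker_id:
                replica.alive = False

    def read_all(self) -> List[str]:
        return list(self.leader().log)


partition = Partition([0, 1, 2])
partition.produce("order-1")                             # copied to all three brokers
partition.fail(0)                                        # leader dies, broker 1 takes over
print(partition.read_all())                              # ['order-1']
partition.produce("order-2", wait_for_followers=False)   # only on broker 1
partition.fail(1)                                        # broker 1 dies before replicating
print(partition.read_all())                              # ['order-1']
```

The first message survives two broker failures because a copy exists on every replica; the second is lost because its only copy was on the broker that failed.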

Evaluation

Advantages

Some functions are delegated to ZooKeeper, simplifying the broker.

High machine utilization due to mutual backup among brokers.

Disadvantages

Introducing ZooKeeper adds external dependency and operational complexity.

Implementing mutual backup is more complex than a simple master‑slave model.

RocketMQ

RocketMQ consists of Producer, Consumer, NameServer, and Broker. Unlike Kafka, RocketMQ implements a cluster‑mode NameServer that is essentially stateless.

RocketMQ's availability has two parts. The NameServer is stateless, so it can be deployed as a cluster of independent nodes and is not an availability concern. On the broker side, a topic can be spread across multiple master brokers; if one master fails, the others continue serving writes. Master-slave replication keeps the messages of a failed master available through its slave.
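
The multi-master write path can be sketched as a producer that tries the next master when one is down. This is an illustrative model only: `MasterBroker` and `send` are hypothetical names and do not reflect the RocketMQ client API.

```python
from typing import List


class MasterBroker:
    """In-memory stand-in for a master broker (hypothetical)."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.alive = True
        self.messages: List[str] = []

    def store(self, message: str) -> None:
        if not self.alive:
            raise ConnectionError(f"{self.name} is down")
        self.messages.append(message)


def send(masters: List[MasterBroker], start: int, message: str) -> str:
    """Try masters in round-robin order from `start`; return the one that accepted."""
    for i in range(len(masters)):
        broker = masters[(start + i) % len(masters)]
        try:
            broker.store(message)
            return broker.name
        except ConnectionError:
            continue  # this master is unavailable, try the next one
    raise RuntimeError("no master broker available")


masters = [MasterBroker("broker-a"), MasterBroker("broker-b")]
print(send(masters, 0, "m1"))  # broker-a
masters[0].alive = False       # broker-a fails
print(send(masters, 0, "m2"))  # broker-b
```

Writes continue as long as one master is alive. Messages already stored on the failed master are not handled by this sketch; in the design described above they remain available through that master's slave.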

Reliability

RocketMQ uses synchronous disk flushing for message persistence, providing higher reliability than asynchronous flushing because the producer receives acknowledgment only after the message is safely stored.

The synchronous flush process has three steps:

Write to page cache and notify the flush thread.

The flush thread writes to disk and wakes waiting threads.

Front‑end threads return the write result to the user.
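
The three steps above can be sketched with a condition variable and a background flush thread. This is a minimal illustration of the pattern, not RocketMQ's implementation; `SyncFlushLog` and its fields are hypothetical, and the page cache is approximated by an unbuffered `os.write` followed by `os.fsync`.

```python
import os
import tempfile
import threading


class SyncFlushLog:
    def __init__(self, path: str) -> None:
        self._fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND)
        self._cond = threading.Condition()
        self._written = 0   # bytes handed to the OS (page cache)
        self._flushed = 0   # bytes known to be fsync'ed
        self._closed = False
        self._flusher = threading.Thread(target=self._flush_loop, daemon=True)
        self._flusher.start()

    def append(self, payload: bytes) -> int:
        with self._cond:
            # Step 1: write to the page cache and notify the flush thread.
            os.write(self._fd, payload)
            self._written += len(payload)
            my_offset = self._written
            self._cond.notify_all()
            # Block until the flush thread has covered this write.
            while self._flushed < my_offset:
                self._cond.wait()
        # Step 3: only now does the caller get its result.
        return my_offset

    def _flush_loop(self) -> None:
        while True:
            with self._cond:
                while self._written == self._flushed and not self._closed:
                    self._cond.wait()
                if self._written == self._flushed:
                    return  # closed and nothing left to flush
                target = self._written
            # Step 2: force data to disk, then wake the waiting writers.
            os.fsync(self._fd)
            with self._cond:
                self._flushed = target
                self._cond.notify_all()

    def close(self) -> None:
        with self._cond:
            self._closed = True
            self._cond.notify_all()
        self._flusher.join()
        os.close(self._fd)


with tempfile.TemporaryDirectory() as tmp:
    log = SyncFlushLog(os.path.join(tmp, "commitlog"))
    print(log.append(b"hello"))  # 5, returned only after fsync
    log.close()
```

Because the flush thread records the write position before calling `fsync`, one `fsync` can acknowledge several writers that arrived together, which is how this pattern keeps synchronous flushing affordable under concurrent load.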

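RocketMQ also offers synchronous double-write: the master acknowledges the producer only after the slave has received the message as well, which closes the window in which asynchronous master-slave replication could lose data.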

Evaluation

Advantages

No external dependencies, simplifying operations.

Disadvantages

Master‑Slave structure can lead to low machine utilization, as slaves may be idle.

Typical deployments use one‑master‑one‑slave, limiting reliability for high‑throughput scenarios.

Other MQ Architectures

The author explores hybrid designs that combine Kafka’s replication strategy with RocketMQ’s lightweight NameServer, possibly removing ZooKeeper and using gossip protocols or consensus algorithms (Raft, Paxos) for leader election and metadata replication.

Proposed Simplified Architecture

No NameServer.

Broker cluster with mutual backup for availability and reliability.

Gossip protocol for metadata replication.

Consensus protocol for leader election.
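
As a rough illustration of the gossip item in this list, the sketch below spreads versioned metadata between brokers by random pairwise exchange. It is a toy model under several assumptions: `Node`, the metadata key format, and the "higher version wins" rule are all hypothetical, concurrent updates with equal versions are not resolved, and leader election by consensus is out of scope here.

```python
import random
from typing import Dict, List, Tuple

Metadata = Dict[str, Tuple[int, str]]  # key -> (version, value)


class Node:
    def __init__(self, name: str) -> None:
        self.name = name
        self.metadata: Metadata = {}

    def update(self, key: str, value: str) -> None:
        """Local change: bump the version so peers accept it as newer."""
        version = self.metadata.get(key, (0, ""))[0] + 1
        self.metadata[key] = (version, value)

    def merge_from(self, other: "Node") -> None:
        """Adopt every entry for which the peer holds a higher version."""
        for key, (version, value) in other.metadata.items():
            if key not in self.metadata or self.metadata[key][0] < version:
                self.metadata[key] = (version, value)


def gossip_round(nodes: List[Node], rng: random.Random) -> None:
    """Each node exchanges state with one random peer (push-pull)."""
    for node in nodes:
        peer = rng.choice([n for n in nodes if n is not node])
        node.merge_from(peer)
        peer.merge_from(node)


def converged(nodes: List[Node]) -> bool:
    return all(n.metadata == nodes[0].metadata for n in nodes)


def run_until_converged(nodes: List[Node], rng: random.Random, max_rounds: int = 100) -> int:
    rounds = 0
    while not converged(nodes) and rounds < max_rounds:
        gossip_round(nodes, rng)
        rounds += 1
    return rounds


nodes = [Node(f"broker-{i}") for i in range(8)]
nodes[0].update("topic-a/partition-0/leader", "broker-3")
rounds = run_until_converged(nodes, random.Random(42))
print(f"converged={converged(nodes)} after {rounds} round(s)")
```

Gossip gives eventual agreement on metadata without a central NameServer, but it does not by itself guarantee a single leader at any moment, which is why the proposal pairs it with a consensus protocol for election.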

Conclusion

The article introduced Kafka and RocketMQ architectures, discussed their availability and reliability, evaluated their pros and cons, and presented the author's thoughts on designing a simple yet robust message queue system.
