Why Kafka Outperforms RocketMQ: A Deep Dive into Zero‑Copy and Architecture Trade‑offs

The article compares Kafka and RocketMQ, explaining how Kafka’s use of sendfile zero‑copy reduces system calls and data copies, while RocketMQ relies on mmap, leading to lower throughput but richer features, and offers guidance on choosing between them based on scenario.

dbaplus Community

Nov 19, 2024

Performance comparison

Alibaba middleware tests show that, under the same conditions, Kafka can achieve roughly 50% higher throughput than RocketMQ, while RocketMQ still processes about 100,000 messages per second.

Zero‑copy concept

Message queues persist data on disk. When a consumer fetches messages, the traditional path from disk to network is:

read() – the disk controller copies data into a kernel buffer (the page cache) by DMA.

The CPU copies the kernel buffer into user‑space memory.

write() – the CPU copies data from user space into the socket send buffer.

The socket buffer is copied to the network card by DMA.

This requires two system calls, four user‑kernel switches (one entering and one leaving each call), and four data copies, two performed by DMA and two by the CPU.
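The traditional path can be sketched in C as follows. This is a minimal illustration rather than code from either broker; the function name send_with_read_write and the 4096-byte buffer size are assumptions made for the example.

```c
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: send the file at `path` to out_fd through a
 * user-space buffer. Every chunk costs one read() and at least one
 * write(), so the payload crosses the user/kernel boundary twice.
 * Returns the number of bytes written, or -1 on error. */
ssize_t send_with_read_write(const char *path, int out_fd)
{
    char buf[4096];                 /* user-space staging buffer */
    ssize_t total = 0;

    int in_fd = open(path, O_RDONLY);
    if (in_fd < 0)
        return -1;

    for (;;) {
        ssize_t n = read(in_fd, buf, sizeof buf);   /* kernel -> user copy */
        if (n == 0)
            break;                  /* end of file */
        if (n < 0) {
            close(in_fd);
            return -1;
        }

        ssize_t off = 0;
        while (off < n) {           /* write() may accept fewer bytes than asked */
            ssize_t w = write(out_fd, buf + off, (size_t)(n - off)); /* user -> kernel copy */
            if (w < 0) {
                close(in_fd);
                return -1;
            }
            off += w;
        }
        total += n;
    }

    close(in_fd);
    return total;
}
```

In a broker, out_fd would be the consumer's socket; any writable descriptor works for experimenting.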

Zero‑copy mechanisms

mmap

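mmap maps the kernel buffer (page cache) for a file into the process's address space, so user space and the kernel share the same pages and the kernel‑to‑user copy disappears. The program replaces read() with mmap() and then calls write() on the mapped region. That still costs two system calls and four user‑kernel switches, but only three copies: disk → kernel buffer (DMA), kernel buffer → socket buffer (CPU), and socket buffer → network card (DMA).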

void *mmap(void *addr, size_t length, int prot, int flags, int fd, off_t offset);
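A minimal sketch of the mmap plus write() path, under the assumption that the whole file is mapped read-only and sent in one pass. The function name send_with_mmap is hypothetical, and RocketMQ itself is written in Java, so this C version only illustrates the underlying system calls.

```c
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: send the file at `path` to out_fd by mapping it
 * and calling write() directly on the mapping. There is no user-space
 * staging buffer; write() copies from the shared pages into the
 * destination's kernel buffer.
 * Returns the number of bytes written, or -1 on error. */
ssize_t send_with_mmap(const char *path, int out_fd)
{
    int in_fd = open(path, O_RDONLY);
    if (in_fd < 0)
        return -1;

    struct stat st;
    if (fstat(in_fd, &st) < 0) {
        close(in_fd);
        return -1;
    }
    if (st.st_size == 0) {          /* mmap() rejects a zero length */
        close(in_fd);
        return 0;
    }

    size_t len = (size_t)st.st_size;
    char *map = mmap(NULL, len, PROT_READ, MAP_SHARED, in_fd, 0);
    close(in_fd);                   /* the mapping stays valid after close() */
    if (map == MAP_FAILED)
        return -1;

    size_t off = 0;
    while (off < len) {             /* write() may accept fewer bytes than asked */
        ssize_t w = write(out_fd, map + off, len - off);
        if (w < 0) {
            munmap(map, len);
            return -1;
        }
        off += (size_t)w;
    }

    munmap(map, len);
    return (ssize_t)off;
}
```

Because the mapped bytes are ordinary memory to the process, the program can also inspect them before sending, which the sendfile path does not allow.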

sendfile

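sendfile transfers data from a file descriptor to a socket descriptor entirely within the kernel, so the payload never enters user space. This reduces the path to a single system call and two user‑kernel switches. When the network card supports scatter‑gather DMA, only two copies remain (disk → kernel buffer → network card), both performed by DMA, so the CPU copies no payload bytes. Without scatter‑gather support, the CPU still performs a third copy from the kernel buffer to the socket buffer.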

ssize_t sendfile(int out_fd, int in_fd, off_t *offset, size_t count);
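A minimal sketch of the sendfile path, assuming Linux (the header sys/sendfile.h and the signature above are Linux-specific). The function name send_with_sendfile is hypothetical. Kafka is written in Java and reaches this system call through FileChannel.transferTo, so the C code below shows only what happens underneath.

```c
#include <fcntl.h>
#include <sys/sendfile.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: send the file at `path` to out_fd with
 * sendfile(). The payload stays in the kernel; the process only
 * learns how many bytes were transferred.
 * Returns the number of bytes sent, or -1 on error. */
ssize_t send_with_sendfile(const char *path, int out_fd)
{
    int in_fd = open(path, O_RDONLY);
    if (in_fd < 0)
        return -1;

    struct stat st;
    if (fstat(in_fd, &st) < 0) {
        close(in_fd);
        return -1;
    }

    off_t offset = 0;               /* sendfile() advances this as it sends */
    while (offset < st.st_size) {
        ssize_t n = sendfile(out_fd, in_fd, &offset,
                             (size_t)(st.st_size - offset));
        if (n < 0) {
            close(in_fd);
            return -1;
        }
        if (n == 0)
            break;                  /* file shrank while sending */
    }

    close(in_fd);
    return (ssize_t)offset;
}
```

Compared with the read()/write() loop, there is no buffer in the process at all, which is also why the caller cannot examine or modify the bytes on their way out.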

Why Kafka is faster

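Kafka sends stored messages to consumers with sendfile, while RocketMQ relies on mmap. Each Kafka transfer therefore needs one system call instead of two and two user‑kernel switches instead of four, and it avoids the CPU copy into the socket buffer when the network card supports scatter‑gather DMA. Fewer copies and switches per transfer translate into higher throughput. Additionally, Kafka's design focuses on a minimal feature set to maximize raw performance.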

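RocketMQ provides richer functionality, including message filtering, delayed queues, dead‑letter queues, and transactional support. These features require the broker to read the message payload in user space before deciding what to send. sendfile returns only the number of bytes transferred and never exposes the data to the caller, so it cannot serve this path. RocketMQ uses mmap instead, which keeps the payload readable at the cost of an extra copy, an extra system call, and lower throughput.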
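To make the constraint concrete, here is a sketch of a broker-side filter that has to read record contents through a mapping. The record layout (one tag byte, one length byte, then the payload) and the name count_tagged are invented for this example and are not RocketMQ's actual commit log format.

```c
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical layout: [tag:1 byte][payload_len:1 byte][payload].
 * Maps the log at `path` and counts records whose tag equals `tag`.
 * The scan reads the mapped bytes directly, which is exactly the
 * user-space access that sendfile() cannot provide.
 * Returns the match count, or -1 on error. */
long count_tagged(const char *path, unsigned char tag)
{
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    struct stat st;
    if (fstat(fd, &st) < 0) {
        close(fd);
        return -1;
    }
    if (st.st_size == 0) {          /* nothing to map or scan */
        close(fd);
        return 0;
    }

    size_t len = (size_t)st.st_size;
    const unsigned char *log = mmap(NULL, len, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd);                      /* the mapping stays valid after close() */
    if (log == MAP_FAILED)
        return -1;

    long hits = 0;
    size_t off = 0;
    while (off + 2 <= len) {        /* need at least the 2-byte header */
        unsigned char rec_tag = log[off];
        size_t payload_len = log[off + 1];
        if (off + 2 + payload_len > len)
            break;                  /* truncated trailing record */
        if (rec_tag == tag)
            hits++;
        off += 2 + payload_len;
    }

    munmap((void *)log, len);
    return hits;
}
```

A real broker would forward the matching records rather than count them, but the access pattern is the same: the decision depends on bytes that must be visible to the process.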

Choosing between Kafka and RocketMQ

For big‑data pipelines that involve frameworks such as Spark or Flink, Kafka is typically preferred due to its superior throughput. Where advanced messaging features (e.g., retry handling, dead‑letter queues, delayed delivery, transactional messages) are required and the infrastructure supports it, RocketMQ may be the better choice.

Summary

Kafka achieves higher performance by using the sendfile zero‑copy mechanism and keeping its feature set minimal. RocketMQ sacrifices some throughput by using mmap to retain richer messaging capabilities. The trade‑off between raw speed and functional richness should guide the selection of the appropriate system for a given workload.
