Why Kafka Outperforms RocketMQ: A Deep Dive into Zero‑Copy and Architecture Trade‑offs
This article compares Kafka and RocketMQ. It explains how Kafka’s use of sendfile zero‑copy reduces system calls and data copies, why RocketMQ’s reliance on mmap yields lower throughput but allows richer features, and how to choose between them for a given scenario.
Performance comparison
Alibaba middleware tests show that, under the same conditions, Kafka can achieve roughly 50% higher throughput than RocketMQ, while RocketMQ still processes about 100,000 messages per second.
Zero‑copy concept
Message queues persist data on disk. When a consumer reads data, the traditional path involves:
read() – the disk controller copies data from the disk device into a kernel buffer via DMA.
read() – the CPU copies the kernel buffer into user‑space memory.
write() – the CPU copies data from user space into the socket send buffer.
write() – the socket buffer is copied to the network card via DMA.
This requires two system calls, four user‑kernel switches, and four data copies.
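The traditional path can be sketched in C roughly as follows. This is a minimal illustration, not code from Kafka or RocketMQ: the function name `copy_read_write` and the 64 KiB buffer size are assumptions, and both descriptors are assumed to be blocking.

```c
#include <assert.h>
#include <errno.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: copy everything from in_fd to out_fd through a
 * user-space buffer. Every chunk costs one read() and at least one write(),
 * and the payload crosses the user/kernel boundary twice.
 * Returns the number of bytes copied, or -1 on error. */
ssize_t copy_read_write(int in_fd, int out_fd) {
    assert(in_fd >= 0 && out_fd >= 0);
    char buf[64 * 1024];              /* user-space staging buffer */
    ssize_t total = 0;

    for (;;) {
        ssize_t n = read(in_fd, buf, sizeof buf);   /* kernel -> user copy */
        if (n == 0) break;                          /* end of input */
        if (n < 0) {
            if (errno == EINTR) continue;
            return -1;
        }
        ssize_t off = 0;
        while (off < n) {                           /* handle short writes */
            ssize_t w = write(out_fd, buf + off, (size_t)(n - off)); /* user -> kernel copy */
            if (w < 0) {
                if (errno == EINTR) continue;
                return -1;
            }
            off += w;
        }
        total += n;
    }
    return total;
}
```

The user-space buffer `buf` is the part that the zero-copy mechanisms try to remove from the data path.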
Zero‑copy mechanisms
mmap
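mmap maps the file’s pages in the kernel buffer directly into the process’s address space, eliminating the kernel‑to‑user copy. The program replaces read() with mmap() but still calls write() to send the data, so the path still needs two system calls and four user‑kernel switches; only the number of copies drops, from four to three (disk → kernel buffer → socket buffer → network card).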
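A minimal sketch of an mmap-based send path, assuming a POSIX system and a blocking output descriptor. The function name `send_file_mmap` is hypothetical and this is not RocketMQ's actual implementation; it only shows that the mapped pages are handed to write() without a private read buffer.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: send a whole file to out_fd by mapping it and
 * calling write() on the mapping. The mapping is readable in user space,
 * so the payload can be inspected before it is sent.
 * Returns the number of bytes sent, or -1 on error. */
ssize_t send_file_mmap(const char *path, int out_fd) {
    assert(path != NULL);
    int fd = open(path, O_RDONLY);
    if (fd < 0) return -1;

    struct stat st;
    if (fstat(fd, &st) < 0) { close(fd); return -1; }
    if (st.st_size == 0) { close(fd); return 0; }   /* mmap rejects length 0 */

    size_t len = (size_t)st.st_size;
    char *p = mmap(NULL, len, PROT_READ, MAP_SHARED, fd, 0);
    close(fd);                       /* the mapping stays valid after close */
    if (p == MAP_FAILED) return -1;

    size_t off = 0;
    int failed = 0;
    while (off < len) {
        /* write() still copies from the mapped pages into the socket buffer */
        ssize_t n = write(out_fd, p + off, len - off);
        if (n < 0) {
            if (errno == EINTR) continue;
            failed = 1;
            break;
        }
        off += (size_t)n;
    }
    munmap(p, len);
    return failed ? -1 : (ssize_t)off;
}
```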
void *mmap(void *addr, size_t length, int prot, int flags, int fd, off_t offset);
sendfile
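sendfile transfers data from a file descriptor to a socket descriptor entirely within the kernel, avoiding any copy into user space. This reduces the data path to a single system call and two user‑kernel switches. When the network card supports scatter‑gather DMA, only two copies remain (disk → kernel buffer → network card), both performed by DMA, so the CPU never copies the payload; without that support, one CPU copy from the kernel buffer to the socket buffer remains.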
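A minimal sketch of the sendfile path, assuming Linux (`<sys/sendfile.h>`) and a blocking output descriptor. The function name `send_file_zero_copy` is hypothetical and this is not Kafka's code; it only illustrates that the payload never enters a user-space buffer.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/sendfile.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: send a whole file to sock_fd with sendfile().
 * No user-space buffer is involved; the kernel moves the data itself.
 * Returns the number of bytes sent, or -1 on error. */
ssize_t send_file_zero_copy(const char *path, int sock_fd) {
    assert(path != NULL);
    int fd = open(path, O_RDONLY);
    if (fd < 0) return -1;

    struct stat st;
    if (fstat(fd, &st) < 0) { close(fd); return -1; }

    off_t offset = 0;                /* advanced by sendfile() on each call */
    while (offset < st.st_size) {
        ssize_t n = sendfile(sock_fd, fd, &offset,
                             (size_t)(st.st_size - offset));
        if (n < 0) {
            if (errno == EINTR) continue;
            close(fd);
            return -1;
        }
        if (n == 0) break;           /* file shrank while we were sending */
    }
    close(fd);
    return (ssize_t)offset;
}
```

Because the process never sees the bytes, a path like this fits when the data can be forwarded unchanged and does not fit when the sender must inspect or rewrite the payload first.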
ssize_t sendfile(int out_fd, int in_fd, off_t *offset, size_t count);
Why Kafka is faster
Kafka uses sendfile for data transmission, while RocketMQ relies on mmap. Consequently, Kafka performs fewer copies and context switches, yielding higher throughput. Additionally, Kafka’s design focuses on a minimal feature set to maximize raw performance.
RocketMQ provides richer functionality (message filtering, delayed queues, dead‑letter queues, transactional support, etc.), which requires the broker to access the message payload in user space. This prevents the use of sendfile, leading to additional copies and lower throughput.
Choosing between Kafka and RocketMQ
For big‑data pipelines that involve frameworks such as Spark or Flink, Kafka is typically preferred due to its superior throughput. Where advanced messaging features (e.g., retry handling, dead‑letter queues, transactional messages) are required and the infrastructure supports it, RocketMQ may be the better choice.
Summary
Kafka achieves higher performance by using the sendfile zero‑copy mechanism and keeping its feature set minimal. RocketMQ sacrifices some throughput by using mmap to retain richer messaging capabilities. The trade‑off between raw speed and functional richness should guide the selection of the appropriate system for a given workload.
dbaplus Community
Enterprise-level professional community for Database, BigData, and AIOps. Daily original articles, weekly online tech talks, monthly offline salons, and quarterly XCOPS&DAMS conferences—delivered by industry experts.
