Understanding Apache RocketMQ: Architecture, Features, and Design Considerations

This article introduces the challenges that message middleware must address, examines how Apache RocketMQ tackles these issues through its architecture, deployment models, and feature set, and provides a concise overview of its core components, reliability mechanisms, and operational characteristics.

Architecture Digest

The article first outlines the typical problems that message middleware faces and why they are hard to solve, then evaluates whether Apache RocketMQ, a high‑performance, high‑throughput distributed message middleware open‑sourced by Alibaba, can address them.

What problems does message middleware need to solve?

Publish/Subscribe

Basic publish/subscribe functionality, the core of any messaging system.

Message Priority

Priorities are usually expressed as integers; higher‑priority messages should be delivered first. RocketMQ persists all messages, making strict priority sorting expensive, so it does not provide native priority support but allows a workaround by using separate high‑priority and normal queues.

Priority requirements fall into two categories, only the first of which is practical:

Use multiple topics to represent coarse priority levels (high, medium, low), which satisfies most use‑cases with minimal impact.

Strict integer priorities (e.g., 0‑65535) would require extensive sorting and degrade performance, so they are generally discouraged.
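
The coarse, topic-per-level approach can be sketched as follows. This is a minimal in-memory illustration, not RocketMQ client code: the topic names and the `CoarsePriorityQueues` class are hypothetical, and a deque stands in for each topic.

```python
from collections import deque

# Hypothetical topic names, listed from highest to lowest priority.
PRIORITY_TOPICS = ["orders_high", "orders_medium", "orders_low"]


class CoarsePriorityQueues:
    """One FIFO queue per priority level; no sorting inside a level."""

    def __init__(self):
        self.queues = {topic: deque() for topic in PRIORITY_TOPICS}

    def send(self, topic, body):
        self.queues[topic].append(body)

    def poll(self):
        """Return the next message, draining higher-priority topics first."""
        for topic in PRIORITY_TOPICS:
            if self.queues[topic]:
                return self.queues[topic].popleft()
        return None  # nothing pending on any topic
```

Because each level is a plain FIFO queue, no per-message sorting is needed, which is why this approach has little performance impact compared with strict integer priorities.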

Message Order

Ordered consumption means processing messages in the same sequence in which they were sent (e.g., order creation → payment → completion). RocketMQ can guarantee strict ordering for the messages of a single order while still processing different orders in parallel.
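
One common way to get per-order ordering with cross-order parallelism is to route every message of the same order to the same queue. The sketch below illustrates the idea only; the queue count, the `select_queue` helper, and the use of CRC32 as the hash are assumptions made for the example, not RocketMQ's actual selector.

```python
import zlib
from collections import defaultdict

QUEUE_COUNT = 4  # assumed number of queues for the topic


def select_queue(order_id, queue_count=QUEUE_COUNT):
    """Map an order ID to a queue index; the same ID always maps to the same queue."""
    # crc32 is stable across processes, unlike Python's built-in hash() for strings.
    return zlib.crc32(order_id.encode("utf-8")) % queue_count


def route(events):
    """Append each (order_id, step) event to the queue chosen for its order."""
    queues = defaultdict(list)
    for order_id, step in events:
        queues[select_queue(order_id)].append((order_id, step))
    return queues
```

Each queue is consumed sequentially, so the steps of one order stay in order, while orders that land on different queues can be consumed concurrently.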

Message Filter

Broker‑side filtering

Filters messages on the broker according to consumer requirements, reducing network traffic but increasing broker load and complexity.

Taobao Notify supports type‑based, expression‑based, and tag filtering.

Taobao RocketMQ supports simple tag filtering as well as header and body filtering.

CORBA Notification also provides expression‑based filtering.

Consumer‑side filtering

Consumers implement custom filters, which may cause unnecessary traffic of irrelevant messages to the consumer.
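
The trade-off between the two filtering locations can be illustrated with a small sketch. The `Message` type and both functions are hypothetical; they only model how many messages cross the network in each case.

```python
from dataclasses import dataclass


@dataclass
class Message:
    tag: str
    body: str


def broker_side_filter(stored, subscribed_tags):
    """The broker evaluates the subscription; only matching messages are sent."""
    return [m for m in stored if m.tag in subscribed_tags]


def consumer_side_filter(stored, subscribed_tags):
    """The broker ships everything; the consumer discards what it does not need."""
    transferred = list(stored)  # every message crosses the network
    kept = [m for m in transferred if m.tag in subscribed_tags]
    return transferred, kept
```

Both paths leave the consumer with the same messages; the difference is that the consumer-side path transfers the irrelevant ones as well, while the broker-side path spends broker CPU on the match.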

Message Persistence

Common persistence methods include database storage (e.g., MySQL), KV stores (e.g., LevelDB, Berkeley DB), file‑based logs (e.g., Kafka, RocketMQ), and memory‑image snapshots. RocketMQ leverages the Linux file‑system cache to achieve high performance.

Message Reliability

Reliability is judged against six failure scenarios: normal broker shutdown, broker crash, OS crash, power loss with immediate recovery, unrecoverable hardware failure, and disk failure. In the first four, the hardware recovers immediately, and RocketMQ guarantees no message loss with synchronous flush or only minimal loss with asynchronous flush. The last two are single points of failure: with asynchronous replication roughly 99% of messages survive, while synchronous double‑write eliminates the loss at the cost of performance.

Low‑Latency Messaging

When the broker is not back‑logged, RocketMQ uses long‑polling Pull to deliver messages in near‑real‑time, achieving latency comparable to Push.
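
Long polling can be sketched with a condition variable: a pull request that finds no messages is held open until a message arrives or a timeout expires. The `LongPollingQueue` class below is a hypothetical single-process model of that behaviour, not the broker's implementation.

```python
import threading
from collections import deque


class LongPollingQueue:
    """A pull on an empty queue waits for new messages instead of returning at once."""

    def __init__(self):
        self._messages = deque()
        self._cond = threading.Condition()

    def put(self, message):
        with self._cond:
            self._messages.append(message)
            self._cond.notify_all()  # wake any pull request that is being held

    def pull(self, max_count=32, timeout=5.0):
        """Return up to max_count messages, waiting at most `timeout` seconds."""
        with self._cond:
            self._cond.wait_for(lambda: len(self._messages) > 0, timeout=timeout)
            batch = []
            while self._messages and len(batch) < max_count:
                batch.append(self._messages.popleft())
            return batch  # empty list if the timeout expired
```

Because the held request is answered as soon as a message is stored, the consumer sees the message without waiting for its next polling interval, which is what brings Pull latency close to Push.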

At Least Once

Every message is delivered at least once. The consumer acknowledges a message only after processing it successfully, so a message that was not processed is never treated as consumed.

Exactly Once

Requires no duplicate sending and no duplicate consumption. Implementing this in a distributed system incurs high overhead, so RocketMQ does not enforce it but expects applications to achieve idempotency.
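
Under at-least-once delivery, an application can approximate exactly-once processing by making its consumer idempotent. The sketch below deduplicates on a business key; the `IdempotentConsumer` class is hypothetical, and the in-memory set is an assumption that a real system would replace with a durable store.

```python
class IdempotentConsumer:
    """Runs the handler at most once per message key, even if a message is redelivered."""

    def __init__(self, handler):
        self.handler = handler
        self.processed_keys = set()  # assumption: would be a durable store in practice

    def on_message(self, key, body):
        """Return True to acknowledge the message, False to request redelivery."""
        if key in self.processed_keys:
            return True  # duplicate delivery: acknowledge without re-running the handler
        try:
            self.handler(body)
        except Exception:
            return False  # not acknowledged, so the message will be delivered again
        self.processed_keys.add(key)
        return True
```

The key is recorded only after the handler succeeds, so a failed attempt can be retried, while a repeated delivery of an already processed message has no further effect.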

Broker Buffer Full

Unlike some specifications that reject new events or discard existing ones, RocketMQ does not maintain a bounded in‑memory buffer; queues are persisted to disk and old data are periodically trimmed, effectively providing an “unlimited” logical buffer.

Back‑track Consumption

Consumers can re‑consume messages based on time; RocketMQ supports millisecond‑precision back‑tracking both forward and backward.
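
Time-based back-tracking amounts to translating a timestamp into a queue offset. Assuming that store timestamps increase with the offset, a binary search is enough; the `offset_for_timestamp` helper below is a hypothetical illustration of that lookup.

```python
import bisect


def offset_for_timestamp(store_timestamps_ms, target_ms):
    """Return the offset of the first message stored at or after target_ms.

    store_timestamps_ms must be sorted ascending; index i is the store time
    (in milliseconds) of the message at offset i.
    """
    return bisect.bisect_left(store_timestamps_ms, target_ms)
```

Resetting a consumer to the returned offset replays everything stored from that moment on; a result equal to the list length means no message is that recent.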

Message Accumulation

Accumulation can occur in memory buffers or persistent storage. Evaluation criteria include capacity, impact on throughput, effect on consumers, and disk‑read performance.

Distributed Transactions

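RocketMQ implements two‑phase commit without relying on a KV store: in the second phase it locates the message by its offset in order to update the message state. The drawback is that these in‑place state updates may produce a large number of dirty pages.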

Scheduled Messages

RocketMQ supports delayed delivery at predefined levels (e.g., 5 s, 10 s, 1 min) but not arbitrary timestamps, as fine‑grained scheduling would hurt performance.
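
Working with fixed delay levels means mapping a desired delay to the closest available level. The sketch below assumes a level table matching the commonly documented default of the broker's `messageDelayLevel` setting; that table is an assumption and should be checked against the actual broker configuration. The helper names are hypothetical.

```python
# Assumed level table; verify against the broker's messageDelayLevel setting.
DELAY_LEVELS = "1s 5s 10s 30s 1m 2m 3m 4m 5m 6m 7m 8m 9m 10m 20m 30m 1h 2h".split()
UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600}


def level_to_seconds(level):
    """Convert a 1-based delay level to its delay in seconds."""
    if not 1 <= level <= len(DELAY_LEVELS):
        raise ValueError(f"delay level must be in 1..{len(DELAY_LEVELS)}")
    text = DELAY_LEVELS[level - 1]
    return int(text[:-1]) * UNIT_SECONDS[text[-1]]


def nearest_level(desired_seconds):
    """Pick the smallest level whose delay is at least desired_seconds.

    Falls back to the largest level when the request exceeds every level.
    """
    for level in range(1, len(DELAY_LEVELS) + 1):
        if level_to_seconds(level) >= desired_seconds:
            return level
    return len(DELAY_LEVELS)
```

With this table, a requested delay of 45 seconds rounds up to the 1-minute level, which shows the loss of precision that the level scheme accepts in exchange for performance.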

Message Retry

Retry strategies differ based on failure cause: data‑related failures may use a short (e.g., 10 s) retry interval, while downstream service outages may use a longer (e.g., 30 s) pause before retrying.
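
A cause-dependent retry policy can be sketched as follows. The exception types and function names are hypothetical, and the 10 s and 30 s values are simply the example intervals mentioned above.

```python
import time


class BadDataError(Exception):
    """The message itself could not be processed (e.g., malformed payload)."""


class DownstreamUnavailableError(Exception):
    """A service the consumer depends on is currently down."""


def retry_delay_seconds(error):
    """Choose the pause before the next attempt based on the failure cause."""
    if isinstance(error, DownstreamUnavailableError):
        return 30  # give the downstream service time to recover
    return 10      # data-related failure: retry sooner


def consume_with_retry(handler, message, max_attempts=3, sleep=time.sleep):
    """Run handler(message), retrying on known failures; return True on success."""
    for attempt in range(1, max_attempts + 1):
        try:
            handler(message)
            return True
        except (BadDataError, DownstreamUnavailableError) as error:
            if attempt == max_attempts:
                return False  # attempts exhausted; the caller decides what happens next
            sleep(retry_delay_seconds(error))
    return False
```

The `sleep` parameter is injectable so that the policy can be exercised without actually waiting.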

RocketMQ Overview

RocketMQ addresses the aforementioned challenges with a high‑performance, reliable, real‑time, and horizontally scalable design.

What is RocketMQ?

Figure: RocketMQ model

Key characteristics:

Queue‑model middleware with high performance, reliability, real‑time delivery, and distributed nature.

Producers, consumers, and queues can all be distributed.

Producers send to topics; consumers can use broadcast or clustering consumption.

Strict message ordering support.

Rich pull models and efficient subscriber scaling.

Billion‑level message accumulation capability.

Minimal external dependencies.

Physical Deployment Architecture

Figure: Physical deployment

Components:

NameServer: stateless, can be clustered, no state synchronization.

Broker: Master‑Slave architecture. A Master (BrokerId 0) may have multiple Slaves (non‑zero BrokerId); each Broker registers its topics with all NameServers.

Producer: stateless, connects to a random NameServer for routing, then to the Master serving the target topic.

Consumer: connects similarly, can subscribe from Master or Slave based on configuration.

Logical Deployment Architecture

Figure: Logical deployment

Key concepts:

Producer Group

A logical grouping of one or more producer instances (across machines or processes). It identifies a class of producers, aids operational monitoring, and lets the broker call back another producer in the same group to resolve a transaction if the producer that started it crashes.

Consumer Group

A logical grouping of consumer instances. Instances share load evenly unless broadcast mode is selected, in which case each instance receives the full data set.
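
The difference between clustering and broadcast consumption can be sketched as a queue-assignment function. The even split below is one plausible strategy written for illustration; the function names and the `mode` strings are hypothetical.

```python
def allocate_evenly(queue_ids, consumer_ids, current_consumer):
    """Return the contiguous slice of queues assigned to current_consumer.

    Queues are divided as evenly as possible; the first consumers (in sorted
    order) take one extra queue when the division is not exact.
    """
    consumers = sorted(consumer_ids)
    queues = sorted(queue_ids)
    index = consumers.index(current_consumer)
    base, extra = divmod(len(queues), len(consumers))
    start = index * base + min(index, extra)
    size = base + (1 if index < extra else 0)
    return queues[start:start + size]


def assigned_queues(mode, queue_ids, consumer_ids, current_consumer):
    """Clustering mode shares queues across the group; broadcast gives each instance all."""
    if mode == "broadcast":
        return sorted(queue_ids)
    return allocate_evenly(queue_ids, consumer_ids, current_consumer)
```

Every instance computes its own share from the same sorted inputs, so the instances agree on the split without coordinating with each other.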

Data Storage Structure

Figure: Storage structure

RocketMQ separates data and index files, reducing file, I/O, and memory consumption, which enables low latency and strong horizontal scalability even under massive data and high concurrency.
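
The separation of data and index can be sketched as one shared append-only log plus a small per-queue index of positions into it. The `MiniStore` class is a hypothetical in-memory model; the attribute names echo RocketMQ's CommitLog and ConsumeQueue terminology, but the layout shown here is a simplification, not the on-disk format.

```python
class MiniStore:
    """One append-only data log for all topics, plus a lightweight index per queue."""

    def __init__(self):
        self.commit_log = bytearray()  # message bodies of every topic, in arrival order
        self.consume_queues = {}       # (topic, queue_id) -> list of (offset, size)

    def append(self, topic, queue_id, body):
        """Write the body to the shared log and record its position in the queue index."""
        offset = len(self.commit_log)
        self.commit_log += body
        index = self.consume_queues.setdefault((topic, queue_id), [])
        index.append((offset, len(body)))
        return offset

    def read(self, topic, queue_id, position):
        """Look up the index entry, then read the body from the shared log."""
        offset, size = self.consume_queues[(topic, queue_id)][position]
        return bytes(self.commit_log[offset:offset + size])
```

All writes are sequential appends to a single log regardless of how many topics exist, and each index entry is small and fixed in shape, which is the property that keeps file, I/O, and memory overhead low.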

Source: Alibaba Middleware Team Blog
Link: http://jm.taobao.org/2017/01/12/rocketmq-quick-start-in-10-minutes/
