An Overview of Apache RocketMQ: Origin, Concept Model, Storage, Deployment, and Best Practices

This article introduces Apache RocketMQ by covering its origin, core concepts such as topics, producers and consumers, storage architecture with CommitLog and ConsumeQueue, deployment components like brokers and name servers, and practical best‑practice guidance for handling duplicates, ordering, and message replay.

Architects' Tech Alliance

Guest introduction: Liu Zhendong, Alibaba middleware technology expert, 2016 Middleware Performance Challenge runner‑up, with extensive experience in distributed system design and optimization, currently leading exploration and innovation for Apache RocketMQ.

The presentation covers RocketMQ’s origin, concept model, storage model, deployment model, and a summary of best practices.

1. Origin of RocketMQ

Like many products, RocketMQ was created to solve a specific problem; its early prototype was a monolithic “big stone” containing all required functions. As the business grew and thousands of developers contributed, performance bottlenecks emerged, prompting a decomposition into a distributed architecture.

The distributed design decouples services: communication becomes asynchronous, so changes in the lower layers do not affect upper‑layer applications. A message queue also shaves traffic peaks by buffering bursts, and its queue semantics give requests a natural order, which prevents “collisions” when multiple applications issue requests at the same time.

2. Concept Model

In RocketMQ, a Topic represents a logical address, a Producer sends messages, and a Consumer receives them. In production environments, topics are often partitioned, and a single producer may have many subscribers, while a single consumer group may contain multiple consumers, forming one‑to‑many and many‑to‑one relationships.

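The extended model shows two producers, two distributed topics, each topic backed by two physical Message Queues on a broker, and two consumers. In RocketMQ's default clustering mode, consumers that share the same group ID split a topic's messages among themselves, while each distinct consumer group independently receives every message. Broadcasting mode, in which every consumer in a group receives every message, has to be enabled explicitly.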
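The relationship between queues, consumers, and consumer groups can be sketched as follows. This is a minimal illustration that assumes clustering-style consumption, where the queues of a topic are divided among the consumers of one group and every group gets its own full assignment. The function name `allocate_queues`, the queue names, and the round-robin split are hypothetical; this is not RocketMQ's own allocation strategy.

```python
def allocate_queues(queues, consumers):
    """Split the queues of a topic among the consumers of ONE group.

    Plain round-robin over sorted names; a stand-in for a real
    allocation strategy, used here only to show the relationships.
    """
    ordered = sorted(consumers)
    assignment = {consumer: [] for consumer in ordered}
    for i, queue in enumerate(sorted(queues)):
        assignment[ordered[i % len(ordered)]].append(queue)
    return assignment


# Hypothetical topic with two physical message queues.
queues = ["TopicA-q0", "TopicA-q1"]

# Two consumers in the same group share the queues between them.
group_1 = allocate_queues(queues, ["c1", "c2"])

# A different group is assigned all queues again, independently.
group_2 = allocate_queues(queues, ["c3"])
```

In this sketch each queue has exactly one owner inside a group, which is also what makes per-queue ordering possible on the consumer side.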

3. Storage Model

RocketMQ stores messages using a combination of CommitLog and ConsumeQueue. The CommitLog holds the full message body and metadata; each ConsumeQueue corresponds to a MessageQueue and acts as an index, storing only the offset, size, and tag hash of each message in the CommitLog. Because a ConsumeQueue contains nothing that is not derivable from the CommitLog, a lost ConsumeQueue can be rebuilt as long as the CommitLog remains intact.
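A toy model of this split between log and index might look like the following. It is a sketch under stated assumptions, not RocketMQ's on-disk format: the record header, the class `MiniStore`, and the use of CRC32 as the tag hash are all invented for illustration. The fixed 20-byte index entry (8-byte offset, 4-byte size, 8-byte tag hash) follows the three fields named above, and the byte widths are an assumption.

```python
import struct
import zlib

# Assumed index entry: 8-byte CommitLog offset, 4-byte size, 8-byte tag hash.
INDEX_ENTRY = struct.Struct(">qiq")
# Hypothetical record header: total record size, queue id, tag length.
RECORD_HEADER = struct.Struct(">iiH")


def tag_hash(tag):
    # CRC32 is a stand-in hash chosen only because it is deterministic.
    return zlib.crc32(tag.encode("utf-8"))


class MiniStore:
    def __init__(self):
        self.commit_log = bytearray()  # one append-only log for all queues
        self.consume_queues = {}       # queue id -> bytearray of index entries

    def _index(self, queue_id, offset, size, tag):
        entry = INDEX_ENTRY.pack(offset, size, tag_hash(tag))
        self.consume_queues.setdefault(queue_id, bytearray()).extend(entry)

    def append(self, queue_id, tag, body):
        tag_bytes = tag.encode("utf-8")
        size = RECORD_HEADER.size + len(tag_bytes) + len(body)
        offset = len(self.commit_log)
        header = RECORD_HEADER.pack(size, queue_id, len(tag_bytes))
        self.commit_log += header + tag_bytes + body
        self._index(queue_id, offset, size, tag)

    def read(self, queue_id, index):
        """Follow the index entry at position `index` into the CommitLog."""
        entry_pos = index * INDEX_ENTRY.size
        offset, size, _ = INDEX_ENTRY.unpack_from(
            self.consume_queues[queue_id], entry_pos)
        _, _, tag_len = RECORD_HEADER.unpack_from(self.commit_log, offset)
        body_start = offset + RECORD_HEADER.size + tag_len
        return bytes(self.commit_log[body_start:offset + size])

    def rebuild_consume_queues(self):
        """Recreate every index by scanning the CommitLog from the start."""
        self.consume_queues = {}
        offset = 0
        while offset < len(self.commit_log):
            size, queue_id, tag_len = RECORD_HEADER.unpack_from(
                self.commit_log, offset)
            tag_start = offset + RECORD_HEADER.size
            tag = self.commit_log[tag_start:tag_start + tag_len].decode("utf-8")
            self._index(queue_id, offset, size, tag)
            offset += size
```

Because every index entry has the same size, the n-th message of a queue is found by simple multiplication, and the tag hash lets a reader skip messages with non-matching tags without touching the CommitLog.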

4. Deployment Model

In a real deployment, a Broker is the data node that stores messages, while a Nameserver provides service discovery. A producer first queries the Nameserver for the routing information of a target Topic (which brokers host the topic and which queues exist), then sends the message to the appropriate broker. Consumers follow the same lookup process before pulling messages.
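The lookup-then-send flow can be modeled in a few lines. The sketch below is an in-memory simulation, not the RocketMQ client API: the classes `NameServer` and `Producer`, the broker addresses, and the round-robin queue selection are assumptions made for illustration.

```python
class NameServer:
    """In-memory stand-in for service discovery: topic -> brokers and queues."""

    def __init__(self):
        self._routes = {}  # topic -> {broker address: number of queues}

    def register(self, topic, broker_addr, queue_count):
        self._routes.setdefault(topic, {})[broker_addr] = queue_count

    def lookup(self, topic):
        """Return every (broker address, queue id) pair that hosts the topic."""
        if topic not in self._routes:
            raise KeyError(f"no route for topic {topic!r}")
        return [
            (broker, queue_id)
            for broker, count in sorted(self._routes[topic].items())
            for queue_id in range(count)
        ]


class Producer:
    def __init__(self, name_server):
        self.name_server = name_server
        self._sent = 0

    def send(self, topic, body, brokers):
        # Step 1: ask the name server where the topic lives.
        queues = self.name_server.lookup(topic)
        # Step 2: pick a queue (round-robin here) and write to its broker.
        broker, queue_id = queues[self._sent % len(queues)]
        self._sent += 1
        brokers[broker].setdefault((topic, queue_id), []).append(body)
        return broker, queue_id


# Hypothetical two-broker deployment, two queues per broker.
name_server = NameServer()
name_server.register("orders", "broker-a:10911", 2)
name_server.register("orders", "broker-b:10911", 2)
brokers = {"broker-a:10911": {}, "broker-b:10911": {}}

producer = Producer(name_server)
placements = [producer.send("orders", f"m{i}", brokers) for i in range(4)]
```

A real client would cache the routing information and refresh it periodically rather than querying on every send; the per-send lookup here only keeps the two steps visible.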

5. Best‑Practice Summary

The following practical guidelines were distilled from real‑world experience and are also used as interview questions for Alibaba middleware positions:

Q1: How to avoid duplicate messages in a distributed messaging system? The root cause is unreliable networks. Ensure idempotent business logic on the consumer side and assign a unique identifier to each message, recording successful processing in a deduplication log. If a message ID already exists in the log, skip processing.
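The deduplication check described in Q1 can be sketched as below. The class name `IdempotentConsumer` and the in-memory set are assumptions; in practice the deduplication log would live in durable storage, ideally updated in the same transaction as the business change.

```python
class IdempotentConsumer:
    def __init__(self, handler):
        self.handler = handler
        self.processed_ids = set()  # stand-in for a durable deduplication log

    def on_message(self, message_id, payload):
        """Process a message at most once; return True if it was processed."""
        if message_id in self.processed_ids:
            return False  # duplicate delivery: skip the business logic
        self.handler(payload)
        # Record success only after the handler finished without raising,
        # so a failed attempt can still be retried on redelivery.
        self.processed_ids.add(message_id)
        return True


# Usage: the same message delivered twice changes the balance only once.
balance = []
consumer = IdempotentConsumer(balance.append)
first = consumer.on_message("msg-001", 100)
second = consumer.on_message("msg-001", 100)
```

Recording the ID after the handler runs means a crash between the two steps can still cause one repeat, which is why the business logic itself should also be idempotent.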

Q2: How to maintain message order during scaling without stopping writes? 1) Scale by doubling the queue count (N to 2N), so that a key hashed modulo the queue count either stays in its old queue i or moves to exactly one new queue i + N; 2) Record the maximum offset of each old queue at the moment of scaling; 3) For each consumer group, finish consuming the old queue up to that offset before reading from the corresponding new queue (keep reads on the new queue disabled until the old data is drained).
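The three steps can be illustrated with a short sketch. It assumes that scaling means doubling the queue count and that keys are routed by hash modulo the queue count; the function names, the CRC32 hash, and the offset convention (an offset counts messages, and a queue is drained once the consumed offset reaches the recorded maximum) are all hypothetical.

```python
import zlib


def queue_for(key, queue_count):
    """Route a key to a queue by hash modulo the queue count."""
    return zlib.crc32(key.encode("utf-8")) % queue_count


def read_allowed(queue_id, old_count, consumed_offsets, max_old_offsets):
    """Decide whether a consumer group may read from `queue_id`.

    Old queues are always readable. New queue i + N stays closed until
    the group has consumed old queue i up to the offset recorded when
    scaling happened.
    """
    if queue_id < old_count:
        return True
    parent = queue_id - old_count
    return consumed_offsets.get(parent, 0) >= max_old_offsets[parent]


OLD_COUNT = 4
NEW_COUNT = OLD_COUNT * 2  # step 1: double the queue count

# After doubling, a key lands in its old queue or in old queue + N only.
moves = {}
for key in ["order-1", "order-2", "order-3", "user-42", "user-43"]:
    moves[key] = (queue_for(key, OLD_COUNT), queue_for(key, NEW_COUNT))

# Step 2: offsets recorded for the old queues at scaling time.
max_old_offsets = {0: 3, 1: 10, 2: 0, 3: 0}
```

Since h mod 2N is always either h mod N or (h mod N) + N, each new queue has exactly one parent queue to wait for, which keeps the drain check in step 3 local to a pair of queues.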

Q3: How to replay messages in a distributed messaging system? Adjust the consumer offset to an earlier position; the system will re‑deliver messages from that offset.
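Replay by offset adjustment works because consuming a message does not delete it; the consumer group only advances a pointer. A minimal in-memory sketch, with the hypothetical class `QueueLog` standing in for one message queue:

```python
class QueueLog:
    """One queue: messages stay in place, each group tracks its own offset."""

    def __init__(self):
        self.messages = []
        self.offsets = {}  # consumer group -> next offset to read

    def append(self, body):
        self.messages.append(body)

    def poll(self, group, max_messages=10):
        start = self.offsets.get(group, 0)
        batch = self.messages[start:start + max_messages]
        self.offsets[group] = start + len(batch)
        return batch

    def reset_offset(self, group, offset):
        """Move the group's pointer; later polls re-deliver from there."""
        if not 0 <= offset <= len(self.messages):
            raise ValueError("offset out of range")
        self.offsets[group] = offset


log = QueueLog()
for body in ["a", "b", "c", "d"]:
    log.append(body)

first_pass = log.poll("billing")  # consumes everything
log.reset_offset("billing", 1)    # rewind to an earlier position
replayed = log.poll("billing")    # messages from offset 1 are delivered again
```

Replayed messages reach the consumer a second time, so replay relies on the same idempotent processing discussed in Q1.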

Source: Alibaba Middleware

Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contact admin@besthub.dev and we will review it promptly.
