From ActiveMQ to Pulsar: The Evolution of Message Queues Explained
This article traces the development of message queues from early decoupling solutions like ActiveMQ and RabbitMQ, through high‑throughput designs such as Kafka and RocketMQ, to modern platform‑centric systems like Pulsar, while detailing core concepts, architecture diagrams, storage mechanisms and trade‑offs.
Message Queue Development History
Since 2003 many influential message queues have emerged, including Kafka, RocketMQ and Pulsar. Each generation addressed specific challenges: early queues focused on decoupling, the big‑data era demanded higher throughput and consistency, and the rise of cloud and container technologies pushed for platformization.
Stage 1: Decoupling
From 2003 to 2010, ActiveMQ and RabbitMQ were the main solutions, aiming to break strong coupling between systems and provide asynchronous processing.
Stage 2: Throughput and Consistency
Between 2010 and 2012, the big-data wave created demand for real-time processing of massive data volumes. Kafka was created to handle high-throughput log collection. Later, the requirements of Alibaba's e-commerce business (reliability, ordering, transactions) led to RocketMQ, which borrowed many of Kafka's ideas but dropped the dependency on ZooKeeper.
Stage 3: Platformization
After 2012, cloud computing, Kubernetes and containerization drove the need to run messaging as a shared platform. Pulsar was born to address three problems: teams repeatedly reinventing the wheel, weak tenant isolation, and high operational costs. It does so by separating compute (the Broker) from storage (BookKeeper) and by introducing a layered, sharded architecture.
Common Architecture and Basic Concepts
Topic, Producer, Consumer
In a cafeteria analogy, a topic is a type of food (rice, noodles, hot pot), a producer is someone who joins the line for that food (produces a message), and a consumer is whoever takes the food at the counter (consumes the message).
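The three roles can be sketched with a toy in-memory broker. This is only an illustration of the concepts: the class name MiniBroker and its methods are hypothetical and do not correspond to any real message queue's API.

```python
from collections import defaultdict, deque


class MiniBroker:
    """Toy in-memory broker (hypothetical): one FIFO line per topic."""

    def __init__(self):
        # topic name -> queue of pending messages
        self._topics = defaultdict(deque)

    def produce(self, topic, message):
        """Producer side: join the back of the line for this topic."""
        self._topics[topic].append(message)

    def consume(self, topic):
        """Consumer side: take the message at the front, or None if empty."""
        queue = self._topics[topic]
        return queue.popleft() if queue else None


broker = MiniBroker()
broker.produce("noodles", "order-1")
broker.produce("noodles", "order-2")
broker.produce("rice", "order-3")

first = broker.consume("noodles")  # messages leave in arrival order per topic
```

Each topic keeps its own order, so a slow line for noodles does not reorder or block the line for rice.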
Partition
Partitions enable horizontal scaling. When many users arrive, the cafeteria opens more stalls (partitions) for the same food, so customers are served in parallel and write throughput rises. This is one of the main reasons Kafka can achieve high throughput.
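A minimal sketch of how messages might be spread over partitions follows. It assumes key-based routing with a CRC32 hash, chosen here only because it is stable and in the standard library; real clients use their own partitioners (Kafka's default, for example, hashes keys with murmur2). The names pick_partition and PartitionedTopic are hypothetical.

```python
import zlib


def pick_partition(key: str, num_partitions: int) -> int:
    """Map a message key to a partition with a stable hash (assumption: CRC32)."""
    return zlib.crc32(key.encode("utf-8")) % num_partitions


class PartitionedTopic:
    """Toy topic made of several independent append-only partitions."""

    def __init__(self, num_partitions: int):
        self.partitions = [[] for _ in range(num_partitions)]

    def append(self, key: str, value: str):
        """Append to the partition chosen by the key; return (partition, offset)."""
        p = pick_partition(key, len(self.partitions))
        self.partitions[p].append((key, value))
        return p, len(self.partitions[p]) - 1


topic = PartitionedTopic(num_partitions=3)
a = topic.append("user-42", "login")
b = topic.append("user-42", "checkout")
c = topic.append("user-7", "login")
```

Messages with the same key always land in the same partition, which preserves their relative order, while different keys can be written in parallel to different partitions.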
Mainstream Message Queue Storage Analysis
Kafka
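Kafka's brokers are peers rather than a master-slave cluster; the leader/follower (master-slave) relationship exists per partition instead. Data is stored in partitions that are spread across brokers. An example setup has 1 producer, 1 consumer, 2 partitions, 3 replicas and 3 broker nodes. Kafka appends to each partition's log sequentially and leans on the OS page cache, which is where its high performance comes from. However, every partition has its own log files, so with a very large number of topics or partitions the writes are scattered across many files, the I/O pattern becomes effectively random (on spinning disks the head must seek frequently), and performance degrades.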
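The example layout above (2 partitions, 3 replicas, 3 brokers) can be sketched as follows. This uses a simplified round-robin placement and is not Kafka's actual assignment algorithm; the function name assign_replicas and the broker names are hypothetical.

```python
def assign_replicas(num_partitions, replication_factor, brokers):
    """Place each partition's replicas on distinct brokers, round-robin.

    The first replica of a partition acts as its leader; the rest follow.
    Simplified sketch only, not Kafka's real placement logic.
    """
    if replication_factor > len(brokers):
        raise ValueError("replication factor cannot exceed broker count")
    assignment = {}
    for p in range(num_partitions):
        replicas = [
            brokers[(p + i) % len(brokers)] for i in range(replication_factor)
        ]
        assignment[p] = {"leader": replicas[0], "followers": replicas[1:]}
    return assignment


layout = assign_replicas(
    num_partitions=2,
    replication_factor=3,
    brokers=["broker-0", "broker-1", "broker-2"],
)
```

In this layout broker-0 leads partition 0 and follows partition 1, while broker-1 does the opposite, which illustrates why no single node is "the master" of the cluster.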
RocketMQ
RocketMQ is commonly deployed in a dual-master, dual-slave layout and replaces ZooKeeper with a lightweight NameServer (Namesrv) service. On each broker, all topics append to a single shared CommitLog, so writes stay sequential and extremely fast no matter how many topics exist. To locate messages efficiently, it also maintains a ConsumeQueue (an index of CommitLog offsets) and an IndexFile (a hash index).
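The relationship between the shared log and its indexes can be sketched in memory as below. This is a toy model under stated assumptions: the class MiniCommitLogStore is hypothetical, the consume queue stores only (offset, size) pairs, and the IndexFile is reduced to a plain dictionary keyed by message key. RocketMQ's real on-disk formats are more involved.

```python
class MiniCommitLogStore:
    """Toy model: one shared append-only log plus lightweight indexes."""

    def __init__(self):
        self.commit_log = bytearray()  # shared by every topic
        self.consume_queues = {}       # topic -> list of (offset, size)
        self.key_index = {}            # message key -> (offset, size)

    def put(self, topic: str, key: str, body: bytes) -> int:
        """Append the body to the shared log and record where it landed."""
        offset = len(self.commit_log)
        self.commit_log += body  # always a sequential append
        entry = (offset, len(body))
        self.consume_queues.setdefault(topic, []).append(entry)
        self.key_index[key] = entry
        return offset

    def _read(self, entry) -> bytes:
        offset, size = entry
        return bytes(self.commit_log[offset:offset + size])

    def get(self, topic: str, index: int) -> bytes:
        """Read the index-th message of a topic via its consume queue."""
        return self._read(self.consume_queues[topic][index])

    def get_by_key(self, key: str) -> bytes:
        """Read a message by key via the hash index."""
        return self._read(self.key_index[key])


store = MiniCommitLogStore()
store.put("orders", "o-1", b"o1")
store.put("payments", "p-1", b"p-100")
store.put("orders", "o-2", b"o2")
```

Messages from different topics are interleaved in one log, so adding topics adds index entries rather than new write streams. The trade-off is that reads must go through an index to find a topic's messages.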
Pulsar
Pulsar separates the broker (a stateless compute layer) from BookKeeper (the storage layer). It employs a layered, sharded design: brokers handle publish and consume traffic, while BookKeeper stores each topic's data as segments spread across multiple storage nodes. This gives better scalability and fault tolerance, and the cluster can be expanded dynamically because newly added storage nodes simply receive new segments; existing data does not have to be moved.
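The "expansion without moving data" property can be sketched with a segmented log. This is a simplified illustration: the class SegmentedLog, the rotating placement rule and the node names are hypothetical, and real BookKeeper placement (ensemble size, write quorum, ack quorum, placement policies) is richer than shown here.

```python
class SegmentedLog:
    """Toy topic log split into segments, each stored on a set of nodes."""

    def __init__(self, bookies, ensemble_size=2, segment_capacity=2):
        self.bookies = list(bookies)
        self.ensemble_size = ensemble_size
        self.segment_capacity = segment_capacity
        self.segments = []  # each: {"ensemble": [...], "entries": [...]}

    def add_bookie(self, name):
        """New storage node: only segments opened later may use it."""
        self.bookies.append(name)

    def _open_segment(self):
        # Rotate the starting node so segments spread over the cluster.
        n = len(self.bookies)
        start = len(self.segments) % n
        ensemble = [
            self.bookies[(start + i) % n] for i in range(self.ensemble_size)
        ]
        self.segments.append({"ensemble": ensemble, "entries": []})

    def append(self, entry):
        """Append to the current segment, rolling over when it is full."""
        current_full = (
            not self.segments
            or len(self.segments[-1]["entries"]) >= self.segment_capacity
        )
        if current_full:
            self._open_segment()
        self.segments[-1]["entries"].append(entry)


log = SegmentedLog(["bookie-1", "bookie-2"])
log.append("m1")
log.append("m2")
log.add_bookie("bookie-3")  # scale out while the topic is live
log.append("m3")            # rolls over to a new segment
```

The first segment stays where it was written, and the new node starts taking load as soon as the next segment opens. In a partition-centric design, by contrast, adding a node typically means copying whole partitions to it.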
Summary
Message queue technology has continuously balanced trade-offs between performance, reliability, scalability and operational complexity. The right choice depends on the scenario: Kafka suits log-driven pipelines, RocketMQ suits Alibaba-style e-commerce workloads that need reliability, ordering and transactions, and Pulsar suits cloud-native, multi-tenant environments.
21CTO
