Why Apache Pulsar’s Architecture Beats Traditional Message Queues

This article explains Apache Pulsar’s fast‑growing adoption, its compute‑storage separation architecture, BookKeeper‑based persistence, multi‑tenant support, flexible subscription models, and fault‑tolerant design, providing a comprehensive overview for developers interested in modern distributed messaging middleware.

Su San Talks Tech

In recent years Apache Pulsar has seen rapid adoption by many large companies, prompting a surge of documentation and courses.

History of message middleware (image omitted for brevity).

1 Architecture

Pulsar’s architecture separates computation from storage. Brokers handle message routing and load balancing, while Apache BookKeeper provides durable storage.

BookKeeper is a distributed write‑ahead log offering several conveniences:

Multiple ledgers can be created per topic.

Brokers can create, close, delete, and append to ledgers.

Once a ledger is closed, either explicitly or because its writer crashed, it can only be opened in read-only mode.

Each ledger has a single writer, which avoids write conflicts and keeps writes efficient; if the writer fails, a recovery process determines the final state of the ledger.

In addition to message data, BookKeeper persists cursors, which record the consumption position of each subscription; once every cursor has moved past a ledger, that ledger can be deleted.

A ledger is an append‑only data structure with a single writer that replicates entries across multiple bookies.
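The ledger rules listed above can be illustrated with a small in-memory model. This is only a sketch: the `Ledger` class, its method names, and the `broker-a` writer are hypothetical and are not BookKeeper's client API, and replication across bookies is left out.

```python
class LedgerClosedError(Exception):
    """Raised when something tries to append to a closed ledger."""


class Ledger:
    """Toy ledger: append-only, single writer, read-only once closed."""

    def __init__(self, ledger_id, writer):
        self.ledger_id = ledger_id
        self.writer = writer      # the only client allowed to append
        self.entries = []
        self.closed = False

    def append(self, writer, payload):
        if self.closed:
            raise LedgerClosedError(f"ledger {self.ledger_id} is closed")
        if writer != self.writer:
            raise PermissionError("only the owning writer may append")
        self.entries.append(payload)
        return len(self.entries) - 1   # entry id = position in the log

    def close(self):
        self.closed = True

    def read(self, entry_id):
        # Reads stay possible after the ledger has been closed.
        return self.entries[entry_id]


ledger = Ledger(ledger_id=1, writer="broker-a")
first = ledger.append("broker-a", b"hello")
second = ledger.append("broker-a", b"world")
ledger.close()
```

After `close()`, `ledger.read(1)` still returns the stored payload, while any further `append` raises `LedgerClosedError`.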

1.2 Peer‑to‑Peer Nodes

Brokers do not store message data, so all broker nodes are peers; if a broker fails, no data is lost and its topics are reassigned to another broker.
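Because only topic ownership moves when a broker fails, the reassignment can be pictured as a small bookkeeping step. The sketch below is an assumption-laden toy: the `reassign` function, the least-loaded rule, and the broker and topic names are all hypothetical, and Pulsar's real load manager is considerably more involved.

```python
def reassign(ownership, failed, live_brokers):
    """Move topics owned by the failed broker to the least-loaded live broker.

    Assumes every owner in `ownership` is either `failed` or in `live_brokers`.
    """
    result = {t: b for t, b in ownership.items() if b != failed}
    load = {b: 0 for b in live_brokers}
    for broker in result.values():
        load[broker] += 1
    for topic in sorted(t for t, b in ownership.items() if b == failed):
        target = min(sorted(load), key=load.get)   # ties go to the first name
        result[topic] = target
        load[target] += 1
    return result


ownership = {
    "orders": "broker-1",
    "payments": "broker-2",
    "shipments": "broker-2",
    "audit": "broker-3",
}
after_failure = reassign(ownership, "broker-2", ["broker-1", "broker-3"])
```

No message data is copied in this step: the entries stay where they are in BookKeeper, and only the topic-to-broker mapping changes.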

1.3 Scaling and Expansion

Because brokers are stateless, adding brokers ahead of high-traffic events (e.g., the Double-Eleven shopping festival) is straightforward. BookKeeper storage scales independently: bookies can be added and the replica count raised without moving existing data, because both changes apply to newly created ledgers.

1.4 Fault Tolerance

If a broker crashes, clients simply reconnect to another broker. BookKeeper keeps multiple replicas of each entry; when a bookie fails, a background recovery process re-replicates the affected data to other bookies, so no immediate operator intervention is needed.
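The storage-side recovery can be sketched as restoring the replica count for every ledger that had a copy on the failed bookie. The `re_replicate` function, the placement map, and the bookie names below are hypothetical; the sketch only tracks which bookies hold a ledger and does not model BookKeeper's actual recovery protocol.

```python
def re_replicate(placement, failed, live_bookies, replicas):
    """Return a placement in which each ledger again has `replicas` live copies."""
    repaired = {}
    for ledger_id, bookies in placement.items():
        survivors = [b for b in bookies if b != failed]
        # Candidate targets: live bookies that do not already hold this ledger.
        candidates = [b for b in sorted(live_bookies) if b not in survivors]
        missing = max(replicas - len(survivors), 0)
        repaired[ledger_id] = survivors + candidates[:missing]
    return repaired


placement = {
    1: ["bk-1", "bk-2", "bk-3"],
    2: ["bk-2", "bk-3", "bk-4"],
}
repaired = re_replicate(placement, "bk-2", {"bk-1", "bk-3", "bk-4"}, replicas=3)
```

Readers are unaffected while this runs, since the surviving replicas can still serve every entry.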

2 BookKeeper Overview

Apache BookKeeper is an extensible, highly available, easy‑to‑operate distributed storage system.

2.1 Client Flexibility

Unlike Kafka, where consumers by default read only from the partition leader, BookKeeper clients can read from any bookie that holds a replica. This improves read availability, spreads client traffic across bookies, and lets read throughput grow as clients are added.
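Reading from any replica can be sketched as trying the replicas in turn until one answers. The `read_entry` helper and the in-memory stores below are hypothetical stand-ins, not the BookKeeper client; a store set to `None` represents an unreachable bookie.

```python
def read_entry(replicas, entry_id, start=0):
    """Try each replica in turn, beginning at `start`; return (bookie, payload)."""
    for offset in range(len(replicas)):
        name, store = replicas[(start + offset) % len(replicas)]
        if store is not None and entry_id in store:
            return name, store[entry_id]
    raise LookupError(f"entry {entry_id} not found on any live replica")


replicas = [
    ("bk-1", None),                  # None marks an unreachable bookie
    ("bk-2", {0: b"a", 1: b"b"}),
    ("bk-3", {0: b"a", 1: b"b"}),
]
served_by, payload = read_entry(replicas, entry_id=1, start=0)
```

Varying `start` per client is one simple way to spread reads across the replicas.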

Client‑server communication uses Netty for asynchronous I/O, achieving high throughput with minimal resources.

2.3 I/O Isolation

BookKeeper separates write, tail‑read, and catch‑up read paths into three independent I/O pipelines, reducing latency jitter and improving throughput for mixed workloads.

Illustration of the three I/O paths (image omitted).

3 Multi‑Tenant Support

Pulsar manages large clusters with tenants, each having independent authentication, authorization, storage quotas, message TTL, and isolation policies.

Topic name format: persistent://tenant/namespace/topic

Example: three departments as separate tenants (image omitted).
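To make the naming scheme concrete, the following sketch splits a topic name into its parts. The `parse_topic_name` helper and the example tenant, namespace, and topic are hypothetical; the `non-persistent` domain is accepted here because Pulsar also supports non-persistent topics, which the text above does not cover.

```python
from typing import NamedTuple


class TopicName(NamedTuple):
    domain: str
    tenant: str
    namespace: str
    topic: str


def parse_topic_name(name):
    """Split 'domain://tenant/namespace/topic' into its four components."""
    domain, sep, rest = name.partition("://")
    if not sep or domain not in ("persistent", "non-persistent"):
        raise ValueError(f"unsupported topic domain in {name!r}")
    parts = rest.split("/", 2)
    if len(parts) != 3 or not all(parts):
        raise ValueError(f"expected tenant/namespace/topic in {name!r}")
    return TopicName(domain, *parts)


parsed = parse_topic_name("persistent://marketing/campaigns/clicks")
```

Here `parsed.tenant` is `marketing`, so policies set on that tenant apply to every namespace and topic beneath it.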

4 Message Model

4.1 Message Structure

Messages are stored in segments (ledgers), each containing entries, which in turn contain individual messages. A message ID consists of ledger‑id, entry‑id, batch‑index, and partition‑index.

Segments and entries are BookKeeper concepts. When Pulsar acts as a stream platform, it batches multiple messages into a single entry for higher throughput.
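The four-part message ID can be modeled as a small value type. This is a sketch only: the `MessageId` class and its defaults are hypothetical and do not mirror the Pulsar client's classes, and the ordering it defines is meaningful only among messages of the same partition.

```python
from dataclasses import dataclass


@dataclass(frozen=True, order=True)
class MessageId:
    """Toy message ID; fields compare in declaration order."""
    ledger_id: int
    entry_id: int
    batch_index: int = -1       # -1: message was not part of a batch
    partition_index: int = -1   # -1: topic is not partitioned


ids = [
    MessageId(7, 3, batch_index=1),
    MessageId(7, 3, batch_index=0),
    MessageId(6, 9, batch_index=0),
]
ordered = sorted(ids)
```

Two messages batched into the same entry share `ledger_id` and `entry_id` and differ only in `batch_index`, which is why the batch index is part of the ID.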

4.2 Creation Process

Message creation steps (image omitted):

Select a partition.

Send to the broker managing that partition.

The broker writes the message concurrently to N bookies (the write quorum, configurable, e.g., N=3).

Bookies acknowledge; the broker treats the write as successful once it has received the configured number of acknowledgments (the ack quorum), which lets operators trade consistency against latency.
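The last two steps above can be sketched with a thread pool: the write goes to all bookies at once and is reported successful once enough acknowledgments arrive. The `write_entry` function and the `healthy` and `failing` bookie stand-ins are hypothetical, and this is a simplified model rather than the BookKeeper client's write path.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def healthy(payload):
    """Stand-in for a bookie that stores the entry and acknowledges."""
    return True


def failing(payload):
    """Stand-in for a bookie that does not acknowledge."""
    return False


def write_entry(bookies, payload, ack_quorum):
    """Write to every bookie concurrently; succeed after `ack_quorum` acks."""
    if not 0 < ack_quorum <= len(bookies):
        raise ValueError("ack_quorum must be between 1 and the number of bookies")
    with ThreadPoolExecutor(max_workers=len(bookies)) as pool:
        futures = [pool.submit(bookie, payload) for bookie in bookies]
        acks = 0
        for future in as_completed(futures):
            if future.result():
                acks += 1
            if acks >= ack_quorum:
                # Note: leaving the `with` block still waits for stragglers.
                return True
    return False


ok = write_entry([healthy, healthy, failing], b"msg", ack_quorum=2)
not_ok = write_entry([healthy, failing, failing], b"msg", ack_quorum=2)
```

With three bookies and an ack quorum of 2, one slow or failed bookie does not block the write; requiring all three acknowledgments would strengthen durability at the cost of latency.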

5 Consumption Model

5.1 Overview

Producers send messages to topics, which are partitioned across brokers. Brokers receive messages, store them in BookKeeper, and forward them to consumers via subscriptions.

Brokers also enforce rate limiting on producers.

5.2 Subscription Types

Exclusive: only one consumer may attach to the subscription; other consumers that try to use the same subscription are rejected.

Failover: multiple consumers, but only one active; others act as standby.

Shared: multiple consumers share messages via round‑robin; each message is delivered to only one consumer.

Key_Shared: messages with the same key are routed to the same consumer, preserving order while allowing concurrency.

Key_Shared combines the concurrency of Shared with per‑key ordering guarantees.
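The difference between Shared and Key_Shared dispatch can be sketched in a few lines. Both functions below are hypothetical illustrations: the Key_Shared version uses `zlib.crc32` modulo the consumer count as a stand-in for a stable key hash, which is an assumption of this sketch and not the hashing scheme Pulsar itself uses.

```python
import zlib
from itertools import cycle


def dispatch_shared(messages, consumers):
    """Shared: hand messages to consumers in round-robin order."""
    assignment = {c: [] for c in consumers}
    for consumer, message in zip(cycle(consumers), messages):
        assignment[consumer].append(message)
    return assignment


def dispatch_key_shared(messages, consumers):
    """Key_Shared: a stable hash of the key picks the consumer."""
    assignment = {c: [] for c in consumers}
    for key, value in messages:
        index = zlib.crc32(key.encode()) % len(consumers)
        assignment[consumers[index]].append((key, value))
    return assignment


shared = dispatch_shared(["m0", "m1", "m2", "m3", "m4"], ["c1", "c2"])
keyed = dispatch_key_shared(
    [("user-a", 1), ("user-b", 1), ("user-a", 2), ("user-b", 2), ("user-a", 3)],
    ["c1", "c2", "c3"],
)
```

In `shared`, consecutive messages alternate between consumers, so per-key order is not guaranteed; in `keyed`, every message for `user-a` lands on one consumer in its original order.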

5.3 Cursors

Each subscription maintains a cursor that tracks its read position. Cursors are persisted in BookKeeper, which enables acknowledgment handling, message redelivery, and optional reset to an earlier position. Non-durable exclusive subscriptions, by contrast, read without a persisted cursor.
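A cursor can be pictured as a single acknowledged position per subscription, with ledger cleanup driven by the slowest cursor. The `Cursor` class and `deletable_ledgers` function below are hypothetical and model cumulative acknowledgment only; positions are `(ledger_id, entry_id)` tuples.

```python
class Cursor:
    """Toy subscription cursor holding the last cumulatively acked position."""

    def __init__(self):
        self.mark_delete = None   # (ledger_id, entry_id) or None if nothing acked

    def ack(self, position):
        # Cumulative ack: the cursor only ever moves forward.
        if self.mark_delete is None or position > self.mark_delete:
            self.mark_delete = position

    def reset(self, position):
        # Rewind (or fast-forward) the subscription to an explicit position.
        self.mark_delete = position


def deletable_ledgers(ledger_ids, cursors):
    """Ledgers that every cursor has fully moved past.

    With no cursors, or with a cursor that has acked nothing, this sketch
    deletes nothing; real retention rules are out of scope here.
    """
    positions = [c.mark_delete for c in cursors]
    if not positions or any(p is None for p in positions):
        return []
    slowest_ledger = min(p[0] for p in positions)
    return [lid for lid in ledger_ids if lid < slowest_ledger]


fast, slow = Cursor(), Cursor()
fast.ack((3, 5))
slow.ack((2, 1))
before_catch_up = deletable_ledgers([1, 2, 3], [fast, slow])
slow.ack((3, 0))
after_catch_up = deletable_ledgers([1, 2, 3], [fast, slow])
```

Resetting a cursor to an earlier position, for example `slow.reset((1, 0))`, makes the older ledgers ineligible for deletion again in this model.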

6 Broker Proxy

Clients do not talk to BookKeeper directly; they connect to the broker layer, which acts as a proxy and hides the underlying BookKeeper details.

7 ZooKeeper

Pulsar stores metadata such as policies and configuration in system topics to minimize ZooKeeper usage, though ZooKeeper still holds some metadata for service discovery and configuration.

8 Summary

Pulsar is a powerful middleware that separates compute from storage, supports multi‑tenant isolation, and offers easy scaling and robust fault tolerance.

Written by

Su San Talks Tech

Su San, former staff at several leading tech companies, is a top creator on Juejin and a premium creator on CSDN, and runs the free coding practice site www.susan.net.cn.
