
Comparing Apache Pulsar and Apache Kafka: Message Models, Consumption, Acknowledgment, Retention, and Architecture

This article provides a detailed comparison between Apache Pulsar and Apache Kafka, covering their message consumption models (queue vs. stream), subscription types, acknowledgment mechanisms, retention policies, and underlying layered architecture, highlighting Pulsar's unified API and segment‑based storage advantages.

Qunar Tech Salon

In the annual Bossie Awards, Pulsar won in the Best Open-Source Database & Data-Analytics Platform category, ahead of Kafka. This article compares the messaging models of Pulsar and Kafka in detail.

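Message Consumption Model : Real-time messaging systems generally follow one of two consumption models, Queue or Stream. In the Queue model, many consumers share a topic and each message is delivered to exactly one of them, with no ordering guarantee across consumers. In the Stream model, messages are consumed strictly in order, which requires a single consumer per partition.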
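The difference between the two models can be sketched with a small, self-contained Python simulation. This is a toy illustration, not Pulsar or Kafka client code: the function names `queue_dispatch` and `stream_dispatch`, the round-robin dispatch policy, and the CRC32 key hashing are all assumptions chosen for the example.

```python
import zlib
from collections import defaultdict
from itertools import cycle


def queue_dispatch(messages, consumers):
    """Queue model: each message is handed to exactly one of many consumers.

    Round-robin is an assumed dispatch policy; ordering across consumers
    is not preserved.
    """
    received = {consumer: [] for consumer in consumers}
    for message, consumer in zip(messages, cycle(consumers)):
        received[consumer].append(message)
    return received


def stream_dispatch(messages, num_partitions):
    """Stream model: messages are split into partitions by key.

    Each partition is an ordered log meant to be read by a single consumer,
    so messages that share a key keep their relative order.
    """
    partitions = defaultdict(list)
    for key, value in messages:
        # A deterministic hash keeps every message of one key in one partition.
        index = zlib.crc32(key.encode("utf-8")) % num_partitions
        partitions[index].append((key, value))
    return dict(partitions)
```

Calling `queue_dispatch(messages, ["c1", "c2"])` spreads the load across both consumers, while `stream_dispatch(messages, 3)` keeps all messages for one key in a single ordered partition.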

Pulsar's Consumption Model : Pulsar exposes a single producer-topic-subscription-consumer model that supports both Queue and Stream semantics. Each topic is backed by a distributed log stored in Apache BookKeeper, and each subscription on a topic is one of three types: Exclusive, Failover, or Shared.

Subscription Types :

Exclusive (Stream): only one consumer in the subscription can receive messages.

Failover (Stream): multiple consumers, but only one is active; others take over on failure.

Shared (Queue): many consumers share the load, and each message is delivered to only one of them.

Exclusive and Failover subscriptions guarantee strict ordering, while Shared subscriptions provide high parallelism without ordering guarantees.
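The three subscription types can be illustrated with a minimal simulation of who receives what. This is a sketch under simplifying assumptions, not the Pulsar client API: the `dispatch` function and its `failed` parameter are hypothetical, failures are fixed before dispatch starts, and Shared delivery is modelled as plain round-robin.

```python
from itertools import cycle


def dispatch(messages, consumers, mode, failed=frozenset()):
    """Return {consumer: [messages]} for one subscription in a toy model.

    `consumers` is listed in attachment order; `failed` names the consumers
    that are down before any message is dispatched.
    """
    alive = [c for c in consumers if c not in failed]
    received = {c: [] for c in consumers}
    if mode == "exclusive":
        # Only the first attached consumer may receive; nobody takes over.
        owner = consumers[0]
        if owner not in failed:
            received[owner] = list(messages)
    elif mode == "failover":
        # The first consumer that is still alive is the active one.
        if alive:
            received[alive[0]] = list(messages)
    elif mode == "shared":
        # Each message goes to exactly one live consumer (assumed round-robin).
        for message, consumer in zip(messages, cycle(alive)):
            received[consumer].append(message)
    else:
        raise ValueError(f"unknown subscription mode: {mode}")
    return received
```

In the Exclusive and Failover cases one consumer sees the whole sequence in order; in the Shared case each consumer sees only an interleaved subset, which is why ordering cannot be guaranteed there.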

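Acknowledgment (ACK) : Pulsar keeps one cursor per subscription to track which messages have been acknowledged. Consumers can acknowledge messages individually (Individual Ack, which selectively acknowledges a single message) or cumulatively (Cumulative Ack, which acknowledges everything up to a given message, much like a Kafka offset commit). Shared subscriptions support only Individual Ack.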
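A cursor that handles both acknowledgment styles can be sketched as follows. This is an illustrative model only: the `Cursor` class and its method names are hypothetical, and positions are plain integers here, whereas a real Pulsar position identifies an entry within a BookKeeper ledger.

```python
class Cursor:
    """Toy per-subscription cursor tracking acknowledged positions."""

    def __init__(self):
        # Every position <= mark_delete has been acknowledged.
        self.mark_delete = -1
        # Positions acknowledged out of order, beyond mark_delete.
        self.individually_acked = set()

    def ack_cumulative(self, position):
        """Acknowledge everything up to and including `position`."""
        if position > self.mark_delete:
            self.mark_delete = position
            self._advance()

    def ack_individual(self, position):
        """Acknowledge a single message, possibly leaving gaps before it."""
        if position > self.mark_delete:
            self.individually_acked.add(position)
            self._advance()

    def _advance(self):
        # Forget individual acks already covered by mark_delete, then move
        # mark_delete forward across any contiguous acknowledged run.
        self.individually_acked = {
            p for p in self.individually_acked if p > self.mark_delete
        }
        while self.mark_delete + 1 in self.individually_acked:
            self.mark_delete += 1
            self.individually_acked.remove(self.mark_delete)

    def is_acked(self, position):
        return position <= self.mark_delete or position in self.individually_acked
```

Acknowledging position 2 alone leaves positions 0 and 1 outstanding, so they can still be redelivered. A cumulative acknowledgment has no way to express such a gap, which is consistent with Shared subscriptions being limited to Individual Ack.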

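Retention : By default, Pulsar deletes a message once every subscription on the topic has acknowledged it. Two settings adjust this behavior: a retention policy keeps messages for a configured period even after they are acknowledged, and a time-to-live (TTL) expires messages that remain unacknowledged for too long. Kafka, by contrast, retains messages according to a time or size limit, regardless of whether they have been consumed.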
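The interaction between acknowledgments, retention, and TTL can be sketched in a few lines. This is a simplified model under stated assumptions: the functions `apply_ttl` and `deletable_positions` are hypothetical, messages are `(position, publish_time)` pairs in publish order, the topic has at least one subscription, each subscription is reduced to a single acknowledged-up-to position, and both retention and TTL are measured from publish time. A real deployment reclaims storage in larger units than single messages.

```python
def apply_ttl(messages, mark_delete, now, ttl_s):
    """Advance one subscription past messages older than the TTL.

    Expired messages are treated as if they had been acknowledged.
    """
    for position, published_at in messages:
        if position <= mark_delete:
            continue  # already acknowledged
        if now - published_at <= ttl_s:
            break  # first message that is still within its TTL
        mark_delete = position
    return mark_delete


def deletable_positions(messages, mark_deletes, now, retention_s=0, ttl_s=None):
    """Return positions that may be deleted in this toy model.

    A message is deletable once every subscription has acknowledged it
    (or the TTL has expired it) and the retention period has elapsed.
    """
    if ttl_s is not None:
        mark_deletes = [apply_ttl(messages, md, now, ttl_s) for md in mark_deletes]
    slowest = min(mark_deletes)  # the slowest subscription holds data back
    return [
        position
        for position, published_at in messages
        if position <= slowest and now - published_at >= retention_s
    ]
```

With no retention and no TTL, the slowest subscription alone decides what can be deleted. Retention delays deletion of acknowledged messages, and TTL stops a stalled subscription from holding data indefinitely.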

Layered Architecture :

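Pulsar separates a stateless broker layer, which handles message routing, from a persistent storage layer built on Apache BookKeeper. Each topic partition is split into Segments, and each Segment is stored as a BookKeeper Ledger spread across multiple Bookies. As a result, a partition's size is not limited by the capacity of a single node, new capacity can be used immediately without moving existing data, and the failure of a broker or Bookie can be recovered from without interrupting service.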

Because the two layers are independent, each can be scaled on its own: add brokers to serve more producers and consumers, or add Bookies to retain messages for longer. This keeps expansion cost-effective.
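Why segment-based storage scales without moving data can be seen in a small placement sketch. The `place_segments` function is hypothetical and uses a simple rotation through the Bookie pool; BookKeeper's actual placement policies are more elaborate, and its separate ensemble and quorum settings are collapsed here into a single `replicas` number.

```python
def place_segments(num_segments, bookies, replicas, start=0):
    """Assign each segment to `replicas` distinct Bookies by rotation.

    Segment ids run from `start` to `start + num_segments - 1`.
    """
    if replicas > len(bookies):
        raise ValueError("not enough Bookies for the requested replica count")
    placement = {}
    for segment in range(start, start + num_segments):
        placement[segment] = [
            bookies[(segment + offset) % len(bookies)]
            for offset in range(replicas)
        ]
    return placement


# Segments written while the cluster has three Bookies.
old_segments = place_segments(4, ["b1", "b2", "b3"], replicas=2)

# A fourth Bookie joins: only segments created afterwards use it.
# The placement of the existing segments is left untouched.
new_segments = place_segments(4, ["b1", "b2", "b3", "b4"], replicas=2, start=4)
```

The existing segments stay where they were written, and the new Bookie starts taking load as soon as new segments are created. This is the sense in which the storage layer can expand without a rebalancing step.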

Comparison with Kafka :

Model: Kafka uses Producer‑Topic‑Consumer‑Group; Pulsar uses Producer‑Topic‑Subscription‑Consumer.

Consumption: Kafka focuses on Stream (exclusive) mode; Pulsar supports both Stream (Exclusive/Failover) and Queue (Shared) modes.

ACK: Kafka commits offsets; Pulsar uses cursors with cumulative and individual ACKs.

Retention: Kafka deletes based on time/size regardless of consumption; Pulsar deletes only after all subscriptions ACK, and also supports TTL.

Because Pulsar stores data as Segments in BookKeeper rather than partition‑centric logs, it avoids costly data re‑balancing during scaling, offering instant expansion and fine‑grained replica repair.
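Fine-grained replica repair can be sketched in the same spirit. Assuming a hypothetical `repair_plan` helper and a placement map from segment id to the Bookies holding its replicas, only the segments that had a replica on the failed Bookie need to be copied, and each copy can use a different source and target. The target selection rule below is an arbitrary choice for the example, not BookKeeper's recovery algorithm.

```python
def repair_plan(placement, failed, bookies):
    """Return {segment: (source, target)} for segments that lost a replica.

    `placement` maps a segment id to the list of Bookies holding it.
    Segments with no surviving replica or no spare Bookie are skipped.
    """
    plan = {}
    for segment, replicas in sorted(placement.items()):
        if failed not in replicas:
            continue  # this segment is unaffected by the failure
        survivors = [b for b in replicas if b != failed]
        candidates = [b for b in bookies if b != failed and b not in replicas]
        if not survivors or not candidates:
            continue
        # Spread repair targets across the remaining Bookies.
        target = candidates[segment % len(candidates)]
        plan[segment] = (survivors[0], target)
    return plan
```

In a partition-centric design the equivalent recovery copies a whole partition log from one replica, whereas here the work is split per segment across several Bookies.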

Overall, Apache Pulsar combines high‑performance streaming (as pursued by Kafka) with flexible queuing (as offered by RabbitMQ) through a unified API, segment‑based storage, and a layered, cloud‑native design.
