Understanding Message Queue Architectures: Redis List, Kafka, and Pulsar
This article compares the fundamentals and design trade‑offs of popular message‑queue middleware—Redis list, Kafka, and Pulsar—explaining their data structures, partitioning, persistence, consumer models, high‑availability mechanisms, and scalability challenges for developers and architects.
The article begins by introducing the concept of a basic queue implemented with a doubly‑linked list and shows how Redis list provides push_front (LPUSH) and pop_tail (RPOP) operations, making it a simple in‑memory message queue.
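The push-at-head, pop-at-tail behaviour can be sketched without a Redis server. The following is a minimal illustration only: `ListQueue` is a hypothetical in-process stand-in built on Python's `collections.deque`, not a Redis client, and its `lpush`/`rpop` methods merely mirror the names of the Redis commands.

```python
from collections import deque


class ListQueue:
    """Toy stand-in for a Redis list used as a queue (not a Redis client)."""

    def __init__(self):
        self._items = deque()  # double-ended, like a doubly-linked list

    def lpush(self, message):
        # Producer side: insert at the head (push_front).
        self._items.appendleft(message)

    def rpop(self):
        # Consumer side: remove from the tail (pop_tail); None when empty.
        return self._items.pop() if self._items else None


queue = ListQueue()
for msg in ("m1", "m2", "m3"):
    queue.lpush(msg)

# Messages come out in the order they were pushed (first in, first out).
consumed = [queue.rpop(), queue.rpop(), queue.rpop(), queue.rpop()]
print(consumed)  # ['m1', 'm2', 'm3', None]
```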
It then discusses the limitations of using Redis as a message broker, such as unreliable persistence, hot-key performance bottlenecks (a whole queue lives under a single key), lack of acknowledgments, and no way for several independent consumers to each receive every message or to replay messages that have already been popped.
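Two of these limitations follow directly from the fact that a pop is destructive. The sketch below is a toy model using a plain `deque` in place of a Redis list; the consumer names and the simulated crash are hypothetical and only serve to show the effect.

```python
from collections import deque

pending = deque()
for i in range(4):
    pending.appendleft(f"order-{i}")  # equivalent of LPUSH


def pop_tail(q):
    # Equivalent of RPOP: the message is removed from the list for good.
    return q.pop() if q else None


# Two consumers share one list, so each message reaches only one of them.
seen_by_a = [pop_tail(pending)]
seen_by_b = [pop_tail(pending)]
seen_by_a.append(pop_tail(pending))

# Consumer B pops a message and then "crashes" before processing it.
# With no acknowledgment step, nothing triggers a redelivery.
lost = pop_tail(pending)

print(seen_by_a)           # ['order-0', 'order-2']
print(seen_by_b)           # ['order-1']
print(lost, len(pending))  # order-3 0
```

Neither consumer saw the full stream, and the message held by the crashed consumer cannot be recovered from the list.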
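Kafka is presented as a purpose-built messaging system that solves these problems by introducing topics, partitions, and a cursor-based consumption model: reading does not delete a message, and each consumer group simply records the offset it has reached in each partition. Partitions are stored as sequential log segments, and each segment has a sparse index mapping a subset of offsets to file positions, so a lookup finds the nearest indexed offset and scans forward a short distance, which keeps access fast while the index stays small.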
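The cursor idea can be illustrated with a small in-memory model. This is a sketch under simplifying assumptions, not Kafka's API: the `Topic` and `Partition` classes and their `poll`/`seek` methods are hypothetical names, and cursors advance automatically on every read, whereas a real client decides when to commit.

```python
class Partition:
    """Append-only log; messages are addressed by offset and survive reads."""

    def __init__(self):
        self.log = []

    def append(self, message):
        self.log.append(message)
        return len(self.log) - 1  # offset assigned to the new message


class Topic:
    def __init__(self, num_partitions):
        self.partitions = [Partition() for _ in range(num_partitions)]
        self.cursors = {}  # (group, partition index) -> next offset to read

    def poll(self, group, partition, max_messages=10):
        start = self.cursors.get((group, partition), 0)
        batch = self.partitions[partition].log[start:start + max_messages]
        self.cursors[(group, partition)] = start + len(batch)
        return batch

    def seek(self, group, partition, offset):
        # Moving the cursor back is all it takes to replay messages.
        self.cursors[(group, partition)] = offset


topic = Topic(num_partitions=2)
for i in range(3):
    topic.partitions[0].append(f"event-{i}")

first_a = topic.poll("groupA", 0, max_messages=2)  # groupA reads two messages
first_b = topic.poll("groupB", 0)                  # groupB still sees all three
topic.seek("groupA", 0, 0)
replay_a = topic.poll("groupA", 0)                 # groupA replays from offset 0
```

Because each group owns its cursor, several groups can consume the same partition independently, which is exactly what the Redis list could not offer.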
Kafka’s high‑availability is achieved through leader‑follower replication, where each partition has a leader handling reads/writes and followers replicating the data. Acknowledgment strategies can be tuned for latency or durability.
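The latency-versus-durability trade-off can be shown with a toy model. The class below is hypothetical and heavily simplified: followers are copied synchronously inside `replicate`, whereas real followers fetch from the leader on their own schedule. The `acks` values `"1"` and `"all"` borrow the names of Kafka's producer setting, where `all` refers to the in-sync replicas.

```python
class ReplicatedPartition:
    """Toy leader/follower model of one partition."""

    def __init__(self, num_followers):
        self.leader = []
        self.followers = [[] for _ in range(num_followers)]

    def replicate(self):
        # Each follower copies whatever it is missing from the leader.
        for follower in self.followers:
            follower.extend(self.leader[len(follower):])

    def produce(self, message, acks):
        self.leader.append(message)
        if acks == "all":
            self.replicate()  # acknowledge only after followers have the data
        # acks == "1": acknowledge right after the leader write (lower latency)
        return len(self.leader) - 1

    def fail_leader(self):
        # Promote the first follower; anything it never copied is lost.
        new_leader = self.followers.pop(0)
        lost = self.leader[len(new_leader):]
        self.leader = new_leader
        return lost


fast = ReplicatedPartition(num_followers=2)
fast.produce("a", acks="1")
lost_fast = fast.fail_leader()  # leader dies before replication happened

safe = ReplicatedPartition(num_followers=2)
safe.produce("a", acks="all")
lost_safe = safe.fail_leader()

print(lost_fast, lost_safe)  # ['a'] []
```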
The article also covers Kafka's drawbacks: because partition data lives on specific brokers, capacity cannot be scaled elastically, and spreading load onto newly added brokers requires manual rebalancing.
Next, Pulsar is introduced as a cloud‑native, compute‑storage separated messaging platform that leverages Apache BookKeeper for durable storage. Pulsar brokers are stateless and write messages to BookKeeper ledgers (segments), which are replicated across bookies, allowing seamless scaling and high availability.
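Why segment-level storage helps scaling can be sketched with a toy placement model. Everything here is an assumption made for illustration: the `LedgerPlacement` class, the "least-loaded bookie" rule, and the bookie names are hypothetical and do not reflect BookKeeper's actual placement policy.

```python
class LedgerPlacement:
    """Toy model: every ledger (segment) gets its own set of bookies."""

    def __init__(self, bookies, replicas=2):
        self.bookies = list(bookies)
        self.replicas = replicas
        self.ledgers = []  # list of (ledger_id, [bookie names])

    def open_ledger(self):
        # Count how many ledgers each bookie already stores.
        load = {b: 0 for b in self.bookies}
        for _, placed in self.ledgers:
            for b in placed:
                load[b] += 1
        # Assumed rule: place the new ledger on the least-loaded bookies.
        chosen = sorted(self.bookies, key=lambda b: load[b])[:self.replicas]
        ledger_id = len(self.ledgers)
        self.ledgers.append((ledger_id, chosen))
        return ledger_id, chosen

    def add_bookie(self, name):
        # Existing ledgers are not moved; only new ledgers use the new node.
        self.bookies.append(name)


placement = LedgerPlacement(["bookie-1", "bookie-2", "bookie-3"])
placement.open_ledger()
placement.open_ledger()
before = list(placement.ledgers)

placement.add_bookie("bookie-4")
_, newest = placement.open_ledger()

print(before)
print(newest)
```

In this model the new bookie starts taking writes with the next ledger, and no existing data has to be copied, which is the property the compute-storage separation is meant to provide.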
Pulsar’s subscription model abstracts consumer groups and supports four consumption modes: exclusive, failover, shared, and key‑shared, providing flexible delivery semantics.
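The difference between the four modes comes down to which consumer receives a given message. The function below is a simplified sketch, not Pulsar's dispatcher: `pick_consumer` is a hypothetical helper, and the modulo-of-hash rule for key-shared is an assumption chosen for brevity.

```python
import zlib


def pick_consumer(mode, consumers, message_key, sequence):
    """Return the consumer that receives a message under a toy model."""
    if mode == "exclusive":
        if len(consumers) != 1:
            raise ValueError("exclusive allows exactly one consumer")
        return consumers[0]
    if mode == "failover":
        # The first consumer is active; the others stand by as backups.
        return consumers[0]
    if mode == "shared":
        # Messages are spread round-robin; ordering across consumers is lost.
        return consumers[sequence % len(consumers)]
    if mode == "key_shared":
        # The same key always maps to the same consumer, preserving per-key order.
        digest = zlib.crc32(message_key.encode("utf-8"))
        return consumers[digest % len(consumers)]
    raise ValueError(f"unknown mode: {mode}")


consumers = ["c1", "c2", "c3"]
shared = [pick_consumer("shared", consumers, "k", i) for i in range(4)]
failover = [pick_consumer("failover", consumers, "k", i) for i in range(4)]
by_key = [pick_consumer("key_shared", consumers, "user-42", i) for i in range(4)]

print(shared)    # ['c1', 'c2', 'c3', 'c1']
print(failover)  # ['c1', 'c1', 'c1', 'c1']
print(len(set(by_key)))  # 1
```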
Finally, the article summarizes the architectural differences, highlighting how Kafka focuses on partitioned logs with on‑broker storage, while Pulsar separates compute and storage, distributing segments across BookKeeper nodes for easier scaling and fault tolerance.
Key examples are shown. First, the consumption offsets that each consumer group tracks per partition of a topic:
{
  "topic-foo": {
    "groupA": {
      "partition-0": 0,
      "partition-1": 123,
      "partition-2": 78
    },
    "groupB": {
      "partition-0": 85,
      "partition-1": 9991,
      "partition-2": 772
    }
  }
}
And a segment file layout example:
- /kafka/topic/order_create/partition-0
  - 0.log
  - 18234.log  # segment file
  - 39712.log
  - 54101.log
These examples illustrate how offsets map to segment files and how sparse indexing reduces storage overhead.
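The two-step lookup can be sketched with binary search. The base offsets below are taken from the file names in the listing; the `sparse_index` entries and their byte positions are hypothetical values invented for illustration, and both function names are assumptions rather than Kafka internals.

```python
import bisect

# Base offsets taken from the segment file names in the listing.
segment_base_offsets = [0, 18234, 39712, 54101]


def find_segment(offset):
    """Return the segment file whose offset range contains `offset`."""
    i = bisect.bisect_right(segment_base_offsets, offset) - 1
    if i < 0:
        raise ValueError("offset is below the first segment")
    return f"{segment_base_offsets[i]}.log"


# Hypothetical sparse index for 18234.log: only some offsets are recorded,
# each with the byte position where that message starts in the file.
sparse_index = [(18234, 0), (18300, 4096), (18377, 8192)]


def find_scan_start(offset):
    """Return the nearest indexed (offset, position) at or before `offset`."""
    keys = [entry[0] for entry in sparse_index]
    i = bisect.bisect_right(keys, offset) - 1
    if i < 0:
        raise ValueError("offset is below the first index entry")
    return sparse_index[i]


print(find_segment(20000))     # 18234.log
print(find_scan_start(18350))  # (18300, 4096)
```

A reader looking for offset 18350 would open `18234.log`, jump to byte 4096, and scan forward from offset 18300. Indexing only a subset of offsets is what keeps the index small at the cost of that short scan.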
Architect
Professional architect sharing high‑quality architecture insights. Topics include high‑availability, high‑performance, high‑stability architectures, big data, machine learning, Java, system and distributed architecture, AI, and practical large‑scale architecture case studies. Open to ideas‑driven architects who enjoy sharing and learning.
