Why Kafka Beats Redis List: A Deep Dive into Modern Messaging Middleware
This article compares Redis list, Kafka, and Pulsar as messaging middleware, explaining their architectures, strengths, and weaknesses—including queue fundamentals, partitioning, cursor management, consumer groups, high‑availability mechanisms, storage strategies, and consumption models—to help readers choose the right solution for large‑scale systems.
The most basic queue is a double-ended queue implemented with a doubly linked list, with two operations: push_front and pop_tail. A producer adds messages to the front, while a consumer removes them from the tail, forming a simple in-memory message queue.
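As a minimal sketch of that basic queue, the Python snippet below uses collections.deque as a stand-in for the doubly linked list. The class name SimpleQueue is hypothetical; only push_front and pop_tail come from the description above.

```python
from collections import deque


class SimpleQueue:
    """In-memory queue: producers push at the front, consumers pop from the tail."""

    def __init__(self):
        self._items = deque()

    def push_front(self, message):
        self._items.appendleft(message)

    def pop_tail(self):
        # Return None when the queue is empty instead of raising.
        return self._items.pop() if self._items else None


queue = SimpleQueue()
for msg in ("m1", "m2", "m3"):
    queue.push_front(msg)

first = queue.pop_tail()  # the oldest message comes out first (FIFO)
```

Pushing at one end and popping at the other is what gives first-in, first-out ordering.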
1. Redis as a Queue
Redis provides the list data structure, which maps directly onto the abstract queue operations (lpush = push_front, rpop = pop_tail). Because Redis is optimized for high concurrency, using its list is usually more efficient than implementing a custom list.
However, Redis lists have several drawbacks:
Persistence: Redis is primarily in-memory; AOF and RDB are auxiliary and, depending on how they are configured, can lose recent writes on a crash.
Hot-key performance: All reads and writes for a given list hit a single Redis instance, making scaling difficult.
No acknowledgment: rpop permanently deletes a message, so failed consumption cannot be recovered.
Single consumer: Each message is delivered to only one consumer, so multiple services cannot each read the same stream.
No re-consumption: Once a message has been popped it cannot be read again, so if a consumer crashes after reading, the message is lost.
Some of these issues can be mitigated by using RocksDB/LevelDB‑based stores that speak the Redis protocol, but they still cannot solve the fundamental limitations of the list model.
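The acknowledgment gap can be illustrated without a Redis server. The sketch below simulates lpush and rpop with a Python deque; consume_once and failing_handler are hypothetical names used only for illustration.

```python
from collections import deque


def consume_once(queue, handler):
    """Pop one message and process it; nothing puts it back if the handler fails."""
    message = queue.pop()  # plays the role of rpop: removal is immediate
    try:
        handler(message)
        return True
    except RuntimeError:
        return False  # the message is already gone from the queue


def failing_handler(message):
    raise RuntimeError("consumer crashed while processing " + message)


pending = deque()
pending.appendleft("order-1")  # plays the role of lpush
ok = consume_once(pending, failing_handler)
```

After the failed attempt the queue is empty, so there is nothing left to retry.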
2. Kafka Architecture
Kafka introduces the concepts of topic and partition. A topic is a logical stream; each topic is split into multiple partitions that can be distributed across different brokers, solving the hot-key problem.
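One way to picture how messages spread across partitions is a key-based partitioner. The sketch below is an illustration under assumptions: choose_partition is a hypothetical helper, and CRC32 is used only because it is deterministic and in the standard library, not because it is the hash Kafka uses.

```python
import zlib


def choose_partition(key: str, num_partitions: int) -> int:
    # CRC32 is a stand-in hash chosen for determinism;
    # it is not the hash used by Kafka's default partitioner.
    return zlib.crc32(key.encode("utf-8")) % num_partitions


partitions = {p: [] for p in range(3)}
for key, value in [("user-1", "a"), ("user-2", "b"), ("user-1", "c")]:
    partitions[choose_partition(key, 3)].append((key, value))
```

Messages with the same key land in the same partition and keep their relative order there, while different keys can be served by different brokers.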
Messages are appended to the end of a partition log. Instead of deleting messages, Kafka maintains a cursor (offset) for each consumer group. Consumers acknowledge by advancing the cursor, enabling replay and reliable processing.
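The cursor idea can be sketched as an append-only list plus one committed offset per consumer group. PartitionLog and its method names are hypothetical and are not part of any Kafka client API.

```python
class PartitionLog:
    """Append-only log with one committed offset per consumer group."""

    def __init__(self):
        self._messages = []
        self._committed = {}  # group name -> next offset to read

    def append(self, message):
        self._messages.append(message)
        return len(self._messages) - 1  # offset of the new message

    def poll(self, group, max_messages=1):
        start = self._committed.get(group, 0)
        return list(enumerate(self._messages[start:start + max_messages], start))

    def commit(self, group, offset):
        # Acknowledge everything up to and including `offset`.
        self._committed[group] = offset + 1

    def seek(self, group, offset):
        # Move the cursor to replay from an arbitrary offset.
        self._committed[group] = offset


log = PartitionLog()
for m in ("a", "b", "c"):
    log.append(m)

batch = log.poll("groupA", max_messages=2)
uncommitted_retry = log.poll("groupA", max_messages=2)  # same batch: nothing committed yet
log.commit("groupA", batch[-1][0])
next_batch = log.poll("groupA", max_messages=2)
```

Because reading never deletes anything, a consumer that crashes before committing sees the same batch again, and a group can rewind with seek to replay older messages.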
Consumer groups allow multiple independent groups to read the same topic, each with its own cursor, while guaranteeing that a single partition is consumed by only one consumer within a group. One topic can therefore fan out to N groups (a 1-to-N broadcast model).
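The one-owner-per-partition rule inside a group can be sketched with a simple round-robin assignment. assign_partitions is a hypothetical helper; Kafka offers several assignment strategies, and this sketch does not claim to reproduce any of them.

```python
def assign_partitions(partitions, consumers):
    """Spread partitions over consumers; each partition gets exactly one owner."""
    assignment = {c: [] for c in consumers}
    ordered = sorted(consumers)
    for i, partition in enumerate(sorted(partitions)):
        assignment[ordered[i % len(ordered)]].append(partition)
    return assignment


# Two independent groups read the same three partitions.
group_a = assign_partitions([0, 1, 2], ["a1", "a2"])
group_b = assign_partitions([0, 1, 2], ["b1"])
```

Each group covers every partition exactly once, so both groups see the full topic, while inside a group no partition is shared between two consumers.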
Retention is handled by segmenting each partition log into fixed‑size files. When a segment expires, the whole file can be deleted, avoiding costly per‑message deletions.
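Segment-level retention can be sketched as follows. Two simplifications are assumed: segments roll after a fixed number of messages rather than a fixed byte size, and a segment expires based on the time of its last append. Segment and SegmentedLog are hypothetical names.

```python
from dataclasses import dataclass, field


@dataclass
class Segment:
    base_offset: int
    messages: list = field(default_factory=list)
    last_append_time: float = 0.0


class SegmentedLog:
    def __init__(self, max_segment_messages, retention_seconds):
        self.max_segment_messages = max_segment_messages
        self.retention_seconds = retention_seconds
        self.segments = [Segment(base_offset=0)]
        self.next_offset = 0

    def append(self, message, now):
        active = self.segments[-1]
        if len(active.messages) >= self.max_segment_messages:
            active = Segment(base_offset=self.next_offset)  # roll a new segment
            self.segments.append(active)
        active.messages.append(message)
        active.last_append_time = now
        self.next_offset += 1

    def enforce_retention(self, now):
        # Drop whole closed segments; the active (last) segment is always kept.
        closed, active = self.segments[:-1], self.segments[-1]
        kept = [s for s in closed if now - s.last_append_time <= self.retention_seconds]
        self.segments = kept + [active]


log = SegmentedLog(max_segment_messages=2, retention_seconds=10)
for t, m in enumerate(["a", "b", "c", "d", "e"]):
    log.append(m, now=float(t))
log.enforce_retention(now=12.0)
```

Expiry removes an entire segment in one step, which is the cheap operation the retention design relies on.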
Kafka stores an index for each segment (offset → file position). The index is sparse (e.g., one entry every 10 messages) to reduce space while still enabling fast binary search to locate a target offset.
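The sparse-index lookup can be sketched with the standard bisect module: binary-search the index for the nearest entry at or below the target offset, then scan forward a few records. build_segment and find_position are hypothetical helpers, and records are kept in a list rather than a file.

```python
import bisect

INDEX_INTERVAL = 10  # one index entry every 10 messages, as in the example above


def build_segment(base_offset, payloads):
    """Return (records, index): records carry byte positions; the index is sparse."""
    records, index_offsets, index_positions = [], [], []
    position = 0
    for i, payload in enumerate(payloads):
        offset = base_offset + i
        if i % INDEX_INTERVAL == 0:
            index_offsets.append(offset)
            index_positions.append(position)
        records.append((offset, position, payload))
        position += len(payload)
    return records, (index_offsets, index_positions)


def find_position(records, index, target_offset):
    index_offsets, _ = index
    # Binary search for the last indexed offset <= target.
    slot = bisect.bisect_right(index_offsets, target_offset) - 1
    if slot < 0:
        return None
    # Short linear scan starting at the indexed record.
    first = index_offsets[slot] - records[0][0]
    for offset, position, _ in records[first:first + INDEX_INTERVAL]:
        if offset == target_offset:
            return position
    return None


payloads = [b"x" * (i % 3 + 1) for i in range(25)]
records, index = build_segment(100, payloads)
position = find_position(records, index, 117)
```

The index holds only 3 entries for 25 messages, and any lookup scans at most INDEX_INTERVAL records after the binary search.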
High availability is achieved with a leader‑follower replication model: each partition has one leader and multiple followers. Producers write to the leader; followers replicate asynchronously. Acknowledgment policies can be tuned (leader‑only vs. all‑followers) to balance latency and durability.
3. Pulsar Architecture
Pulsar separates compute and storage. The stateless broker handles client requests, while persistence is delegated to an Apache BookKeeper cluster.
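Each partition is stored as a series of segments (called ledgers in BookKeeper). Segments are replicated across multiple BookKeeper nodes (bookies), with three configurable parameters: the number of bookies a segment is spread across (n), the number of replicas written for each entry (m), and the number of acknowledgments required before a write counts as successful (t).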
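The interplay of the three parameters can be sketched as below. The reading of them is an assumption: n is taken as the number of bookies a segment is spread over, m as the number of copies of each entry, and t as the number of acknowledgments awaited. The rotating placement and the names bookies_for_entry and entry_is_durable are illustrative, not BookKeeper's API.

```python
def bookies_for_entry(entry_id, ensemble, write_quorum):
    """Pick `write_quorum` bookies from the ensemble, rotating by entry id."""
    n = len(ensemble)
    return [ensemble[(entry_id + i) % n] for i in range(write_quorum)]


def entry_is_durable(acks_received, ack_quorum):
    # The write succeeds once enough bookies have confirmed it.
    return acks_received >= ack_quorum


ensemble = ["bookie-1", "bookie-2", "bookie-3"]  # n = 3
placements = {e: bookies_for_entry(e, ensemble, write_quorum=2) for e in range(3)}  # m = 2
```

With n = 3 and m = 2, consecutive entries rotate across the bookies, so load is spread while every entry still has two copies. With t = 1 a write would be confirmed after the first acknowledgment, and with t = 2 only after both.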
Because storage is decoupled, adding new brokers does not require data migration: brokers hold no data, so a topic can be served by any broker, and new segments can be written to any available bookies. The underlying BookKeeper cluster can be scaled by adding bookies.
Pulsar also introduces a richer subscription model that abstracts consumption patterns:
exclusive : only one consumer can attach.
failover : one active consumer, others standby.
shared : round‑robin load‑balancing across consumers.
key‑shared : messages with the same key go to the same consumer.
These modes support both queue‑style and stream‑style processing.
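The dispatch rules behind these subscription modes can be sketched as plain functions over (key, value) messages. The function names are hypothetical, CRC32 stands in for whatever key hash a real broker would use, and exclusive mode is omitted because it behaves like failover with a single attached consumer.

```python
import zlib
from itertools import cycle


def dispatch_failover(messages, consumers):
    """Only the first (active) consumer receives messages; the others stand by."""
    delivered = {c: [] for c in consumers}
    delivered[consumers[0]].extend(value for _, value in messages)
    return delivered


def dispatch_shared(messages, consumers):
    """Round-robin: each message goes to the next consumer in turn."""
    delivered = {c: [] for c in consumers}
    for consumer, (_, value) in zip(cycle(consumers), messages):
        delivered[consumer].append(value)
    return delivered


def dispatch_key_shared(messages, consumers):
    """Messages with the same key always map to the same consumer."""
    delivered = {c: [] for c in consumers}
    for key, value in messages:
        owner = consumers[zlib.crc32(key.encode("utf-8")) % len(consumers)]
        delivered[owner].append(value)
    return delivered


messages = [("k1", "a"), ("k2", "b"), ("k1", "c"), ("k3", "d")]
consumers = ["c1", "c2"]
shared = dispatch_shared(messages, consumers)
key_shared = dispatch_key_shared(messages, consumers)
failover = dispatch_failover(messages, consumers)
```

Shared mode maximizes parallelism but gives up ordering across consumers, while key-shared keeps per-key ordering because one consumer sees every message for a given key.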
4. Comparative Strengths and Weaknesses
Kafka excels in performance (up to 1 M TPS), low latency, mature tooling, and a rich ecosystem (for example, Kafka Streams). Its drawbacks include difficulty scaling partitions, costly rebalancing, and limited elasticity when a broker becomes a hotspot.
Pulsar offers easier horizontal scaling thanks to stateless brokers and segment-level distribution across BookKeeper nodes, as well as flexible consumption models. However, the added BookKeeper layer introduces its own complexity.
5. Summary
Both Kafka and Pulsar solve the fundamental problems of Redis-based queues (persistence, acknowledgment, multi-consumer support, and hot-key mitigation) by introducing partitions, cursor-based consumption, and replicated storage. Kafka's monolithic broker-plus-log design provides high throughput but can be hard to scale, while Pulsar's compute-storage separation offers better elasticity at the cost of added architectural complexity.
For reference, the per-group cursors described in the Kafka section can be pictured as a nested map from topic to consumer group to partition offset:
<code>
{
  "topic-foo": {
    "groupA": { "partition-0": 0, "partition-1": 123, "partition-2": 78 },
    "groupB": { "partition-0": 85, "partition-1": 9991, "partition-2": 772 }
  }
}
</code>
Code Ape Tech Column
Former Ant Group P8 engineer, pure technologist, sharing full‑stack Java, job interview and career advice through a column. Site: java-family.cn
