Why Kafka Beats Redis List: A Deep Dive into Message Queue Architecture

This article compares popular message middleware such as Redis, Kafka, and Pulsar, explaining their underlying data structures, strengths and weaknesses, and how concepts like partitions, replication, cursors, and storage segmentation enable high performance, scalability, and reliability in modern distributed messaging systems.

ITFLY8 Architecture Home

Jun 10, 2021

1. The Most Basic Queue

The simplest message queue can be implemented as a double‑ended queue backed by a doubly linked list, with two operations: push_front (add to the head) and pop_tail (remove from the tail). Producers add messages; consumers remove them.
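As a rough illustration of the two operations, here is a minimal Python sketch. It uses `collections.deque` in place of a hand-written doubly linked list, and the class name `BasicQueue` is made up for this example.

```python
from collections import deque


class BasicQueue:
    """Minimal FIFO queue: producers push at the head, consumers pop from the tail."""

    def __init__(self):
        self._items = deque()

    def push_front(self, message):
        # Producer side: add the message at the head.
        self._items.appendleft(message)

    def pop_tail(self):
        # Consumer side: remove the oldest message from the tail.
        # Returns None when the queue is empty instead of raising.
        return self._items.pop() if self._items else None


q = BasicQueue()
for msg in ("m1", "m2", "m3"):
    q.push_front(msg)

# Messages come out in the order they were produced.
consumed = [q.pop_tail() for _ in range(3)]
```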

2. Redis Queue

Redis provides a list data type that supports lpush (push left) and rpop (pop right), directly mapping to the abstract queue operations. Redis lists are fast and well‑optimized, but they have drawbacks:

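Persistence: Neither RDB nor AOF guarantees durability. RDB snapshots lose every write made since the last snapshot, and AOF can lose recent writes on a crash unless it syncs to disk on every write, which costs throughput.

Hot‑key performance: A list lives under a single key, so high write/read rates on one list create a hot key that cannot be scaled by adding machines.

No acknowledgment: Once rpop removes a message, it is gone from the list; if the consumer fails before finishing, the message cannot be recovered.

No multi‑subscriber support: Each message is delivered to only one consumer, so broadcasting the same message to multiple services is impossible.

No re‑consumption: Popped messages are deleted and cannot be replayed.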
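The missing acknowledgment is the easiest of these drawbacks to show in code. The sketch below is an assumption-heavy simulation: an in-memory `deque` stands in for the Redis list, the `lpush` and `rpop` functions only mimic the commands of the same name, and the `handle` function with its failing message is hypothetical. No Redis server or client library is involved.

```python
from collections import deque

# In-memory stand-in for a Redis list; this is not a Redis client.
redis_list = deque()


def lpush(value):
    redis_list.appendleft(value)


def rpop():
    return redis_list.pop() if redis_list else None


def handle(message, processed):
    # Hypothetical handler that crashes on one particular message.
    if message == "order-2":
        raise RuntimeError("consumer crashed")
    processed.append(message)


for m in ("order-1", "order-2", "order-3"):
    lpush(m)

processed, lost = [], []
while True:
    message = rpop()
    if message is None:
        break
    try:
        handle(message, processed)
    except RuntimeError:
        # rpop already removed the message, so there is nothing left to retry.
        lost.append(message)

remaining = list(redis_list)
```

After the loop, the failed message exists only in the `lost` list kept by this script; the queue itself has no record of it.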

Redis 5.0 introduced the stream type, a more advanced structure inspired by Kafka, but it still has limitations.

3. Kafka

Kafka was designed as a dedicated message‑middleware system. It solves two core problems of Redis lists: hot‑key bottlenecks and the deletion of messages on consumption. Kafka introduces partitions, splitting a logical topic into multiple partitions that can be distributed across different brokers, thus spreading load.
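One way to picture how a topic is spread over partitions is a hash of the message key modulo the partition count. The sketch below assumes three partitions and uses CRC32 as a stand-in hash; Kafka's own default partitioner for keyed records uses a different hash (murmur2), and the key names here are invented.

```python
import zlib

NUM_PARTITIONS = 3  # assumed partition count for this example


def partition_for(key: str, num_partitions: int = NUM_PARTITIONS) -> int:
    # CRC32 is a stand-in hash chosen because it is in the standard library.
    return zlib.crc32(key.encode("utf-8")) % num_partitions


# Each partition is an independent list that could live on a different broker.
partitions = {p: [] for p in range(NUM_PARTITIONS)}
for i in range(12):
    key = f"user-{i % 4}"
    partitions[partition_for(key)].append((key, f"event-{i}"))
```

Because the partition depends only on the key, all events for one key land in the same partition, while different keys can be served by different brokers.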

Kafka stores each partition as an append‑only log divided into segment files. A cursor (offset) tracks consumption without deleting data, enabling ACKs, replay, and multiple consumer groups.
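The difference from a pop-based queue can be sketched with a small in-memory log. This is an illustration only: the `PartitionLog` class is hypothetical, it keeps records in a Python list rather than in segment files, and "ACK" is modeled as simply advancing a cursor.

```python
class PartitionLog:
    """Append-only log: reads never delete, and each reader tracks its own offset."""

    def __init__(self):
        self._records = []

    def append(self, record):
        self._records.append(record)
        return len(self._records) - 1  # offset assigned to the new record

    def read(self, offset, max_records=10):
        # Reading is a slice; the underlying records are left untouched.
        return self._records[offset:offset + max_records]


log = PartitionLog()
for r in ("a", "b", "c", "d"):
    log.append(r)

cursor = 0
batch = log.read(cursor, max_records=2)
cursor += len(batch)  # "ACK": advance the cursor only after processing succeeds

replayed = log.read(0, max_records=2)  # reset to offset 0 and read the same data again
rest = log.read(cursor)
```

If the consumer crashes before advancing the cursor, the next read starts from the old offset, so the batch is delivered again instead of being lost.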

Consumers belong to a consumer group; each group has its own cursor, allowing independent consumption of the same topic. Within a group, a given partition is read by only one consumer, which preserves message order within that partition.
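A small sketch can make the two rules concrete: one consumer per partition inside a group, and one cursor per group and partition. The round-robin assignment below is a simplification chosen for this example (Kafka ships several assignment strategies), and the helper names `assign_partitions` and `commit` are hypothetical. The offset values match the article's topic-foo example.

```python
from collections import defaultdict


def assign_partitions(partitions, consumers):
    """Round-robin assignment: each partition gets exactly one consumer in the group."""
    assignment = defaultdict(list)
    for i, partition in enumerate(sorted(partitions)):
        assignment[consumers[i % len(consumers)]].append(partition)
    return dict(assignment)


# Committed offsets keyed by group, then partition; groups never share a cursor.
offsets = {
    "groupA": {0: 0, 1: 123, 2: 78},
    "groupB": {0: 85, 1: 9991, 2: 772},
}


def commit(group, partition, new_offset):
    offsets[group][partition] = new_offset


assignment = assign_partitions([0, 1, 2], ["consumer-1", "consumer-2"])
commit("groupA", 0, 10)  # groupA moves forward; groupB is unaffected
```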

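When a consumer resets its cursor, Kafka first uses the segment file names (each name is the offset of the first message in that segment) to find the segment that contains the target offset, then uses that segment's sparse index to locate the desired message efficiently.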
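The two-step lookup can be sketched as two binary searches. The segment base offsets below come from the article's example listing; the sparse index entries and byte positions are invented for illustration, and the sketch stores absolute offsets for simplicity, which may differ from Kafka's on-disk index format.

```python
import bisect

# Base offsets taken from the segment file names in the article's example.
SEGMENT_BASE_OFFSETS = [0, 18234, 39712, 54101]

# Hypothetical sparse index for segment 18234: (offset, byte position) pairs.
SPARSE_INDEX_18234 = [(18234, 0), (18300, 4096), (18377, 8192), (18450, 12288)]


def find_segment(target_offset):
    """Return the base offset of the segment that contains target_offset."""
    i = bisect.bisect_right(SEGMENT_BASE_OFFSETS, target_offset) - 1
    if i < 0:
        raise ValueError("offset is before the first segment")
    return SEGMENT_BASE_OFFSETS[i]


def find_scan_start(sparse_index, target_offset):
    """Return the closest indexed entry at or before target_offset."""
    keys = [offset for offset, _ in sparse_index]
    i = bisect.bisect_right(keys, target_offset) - 1
    return sparse_index[max(i, 0)]


segment = find_segment(18400)
entry = find_scan_start(SPARSE_INDEX_18234, 18400)
```

Here offset 18400 resolves to segment 18234, and the reader would start scanning that file at the byte position of the nearest indexed offset (18377) rather than from the beginning.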

4. Kafka High Availability

Each partition has a leader and one or more followers. Producers write to the leader, and the followers replicate its log. Acknowledgment strategies trade off latency against durability: acknowledging after the leader write is fast, but the message is lost if the leader fails before a follower has copied it; acknowledging only after all in‑sync replicas have the message is safe but slower.
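The trade-off can be simulated in a few lines. This is a simplified model, not Kafka's replication protocol: the `Replica` class and `produce` function are hypothetical, replication is modeled as the leader pushing to followers, and the labels "leader" and "all" loosely correspond to the producer settings acks=1 and acks=all.

```python
class Replica:
    def __init__(self, name):
        self.name = name
        self.log = []


def produce(record, leader, followers, acks):
    """Return the names of the replicas holding the record when the ack is sent."""
    leader.log.append(record)
    if acks == "leader":
        # Ack immediately; followers have not copied the record yet.
        return [leader.name]
    if acks == "all":
        # Ack only after every follower in this model has the record.
        for follower in followers:
            follower.log.append(record)
        return [leader.name] + [f.name for f in followers]
    raise ValueError(f"unknown acks mode: {acks}")


leader = Replica("broker-1")
followers = [Replica("broker-2"), Replica("broker-3")]

fast = produce("m1", leader, followers, acks="leader")
safe = produce("m2", leader, followers, acks="all")

# If broker-1 fails at this point, a promoted follower has m2 but not m1.
survivor_log = followers[0].log
```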

5. Kafka Advantages and Disadvantages

Advantages include high performance (up to 1 M TPS), low latency, strong availability, and mature tooling and ecosystem.

Drawbacks include limited elastic scaling (a single broker can become a bottleneck, because a partition's data is stored on the brokers that host it), costly rebalancing, and performance degradation with many partitions.

6. Pulsar

Pulsar separates compute and storage: stateless brokers handle API requests, while Apache BookKeeper provides durable segment storage with configurable replication. Partitions are split into segments stored across multiple BookKeeper nodes, making storage scaling easy.

Because brokers are stateless, they can be scaled horizontally without moving data. BookKeeper’s ledger abstraction stores each segment with multiple replicas; if a BookKeeper node fails, other replicas serve the data.
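To illustrate why segment-level storage tolerates node failure, the sketch below spreads segments over storage nodes with two replicas each. The rotation scheme, the function names, and the bookie names are assumptions made for this example; BookKeeper's real placement policy is configurable and is not reproduced here.

```python
def place_segments(num_segments, bookies, replicas):
    """Assign each segment to `replicas` distinct bookies, rotating the start node."""
    if replicas > len(bookies):
        raise ValueError("not enough bookies for the requested replica count")
    placement = {}
    for seg in range(num_segments):
        placement[seg] = [bookies[(seg + r) % len(bookies)] for r in range(replicas)]
    return placement


def readable_from(placement, segment, failed):
    """Return the bookies that can still serve a segment after some nodes fail."""
    return [b for b in placement[segment] if b not in failed]


bookies = ["bookie-1", "bookie-2", "bookie-3", "bookie-4"]
placement = place_segments(4, bookies, replicas=2)

# Segment 0 lives on bookie-1 and bookie-2; losing bookie-1 leaves one copy.
available = readable_from(placement, 0, failed={"bookie-1"})
```

Since placement is decided per segment, a newly added storage node can start receiving new segments without any existing segment being moved.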

Pulsar introduces subscriptions (exclusive, failover, shared, key‑shared) that abstract consumer groups and support both queue and stream consumption models.
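The four subscription types differ mainly in which consumer receives each message. The dispatcher below is a toy model under stated assumptions: `make_dispatcher` is a hypothetical helper, failover is reduced to "the first consumer is active", and key‑shared routing uses a CRC32 hash modulo the consumer count rather than Pulsar's actual key-to-consumer mapping.

```python
import zlib
from itertools import count


def make_dispatcher(mode, consumers):
    """Return a function that picks the consumer for a (key, message) pair."""
    if not consumers:
        raise ValueError("at least one consumer is required")
    if mode == "exclusive" and len(consumers) != 1:
        raise ValueError("exclusive allows only one consumer")
    if mode not in ("exclusive", "failover", "shared", "key_shared"):
        raise ValueError(f"unknown mode: {mode}")
    round_robin = count()

    def dispatch(key, message):
        if mode in ("exclusive", "failover"):
            # One active consumer; in failover the others are standbys.
            return consumers[0]
        if mode == "shared":
            # Queue model: messages are spread across consumers.
            return consumers[next(round_robin) % len(consumers)]
        # key_shared: the same key always goes to the same consumer.
        return consumers[zlib.crc32(key.encode("utf-8")) % len(consumers)]

    return dispatch


shared = make_dispatcher("shared", ["c1", "c2"])
shared_targets = [shared("k", f"m{i}") for i in range(3)]

key_shared = make_dispatcher("key_shared", ["c1", "c2"])
same_key_targets = {key_shared("user-7", f"m{i}") for i in range(3)}

failover = make_dispatcher("failover", ["c1", "c2"])
failover_target = failover("k", "m0")
```

Exclusive and failover give stream-style ordered consumption by a single active consumer, while shared behaves like a work queue.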

7. Storage‑Compute Separation

The evolution from monolithic storage to distributed systems (NAS → HDFS → BookKeeper) reflects the need for scalable, reliable, low‑latency storage. Pulsar’s architecture exemplifies this trend, offering flexible consumption models while delegating durability and replication to a dedicated storage layer.

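Example of the per‑group cursors (offsets) described in section 3, for one topic with three partitions and two consumer groups:

{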
    "topic-foo": {
        "groupA": {
            "partition-0": 0,
            "partition-1": 123,
            "partition-2": 78
        },
        "groupB": {
            "partition-0": 85,
            "partition-1": 9991,
            "partition-2": 772
        }
    }
}

Example segment files for one partition, each named after the first offset it contains:

- /kafka/topic/order_create/partition-0
    - 0.log
    - 18234.log # segment file
    - 39712.log
    - 54101.log

The same partition with a sparse index file alongside each segment:

- /kafka/topic/order_create/partition-0
    - 0.log
    - 0.index
    - 18234.log # segment file
    - 18234.index # index file
    - 39712.log
    - 39712.index
    - 54101.log
    - 54101.index
