Understanding Kafka’s Architecture: Topics, Partitions, and Reliability
This article explains Kafka’s core architecture—including brokers, topics, partitions, offsets, producer and consumer mechanics, replication, availability, consistency, persistence, performance optimizations, and Zookeeper integration—providing a comprehensive guide for building reliable distributed messaging systems.
Kafka provides a producer‑broker‑consumer model for distributed messaging.
Broker : a Kafka server that stores messages; a Kafka cluster is composed of multiple brokers.
Topic : logical categorization of messages; each broker stores data per topic.
Producer : writes data to a specific topic.
Consumer : reads data from a specific topic.
Topic and Messages
Messages are organized into topics, each split into partitions; each partition contains an ordered sequence of messages identified by an incrementing offset.
Producer selects a topic; the message is appended to a partition according to the partitioning strategy.
Consumer selects a topic and an offset to start consuming; after processing, it records the offset for the next read.
Offset is the identifier used by Kafka to track message positions.
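The relationship between a partition, its messages, and offsets can be sketched with a small in-memory model. This is only an illustration: the `PartitionLog` class and its method names are hypothetical and are not part of any Kafka client API.

```python
class PartitionLog:
    """In-memory stand-in for one partition: an append-only list of messages."""

    def __init__(self):
        self._messages = []

    def append(self, message):
        """Append a message and return the offset it was assigned."""
        self._messages.append(message)
        return len(self._messages) - 1

    def read(self, offset, max_messages=10):
        """Return (offset, message) pairs starting at the given offset."""
        end = min(offset + max_messages, len(self._messages))
        return [(i, self._messages[i]) for i in range(offset, end)]


log = PartitionLog()
for payload in ["a", "b", "c"]:
    log.append(payload)

# A consumer tracks its own position and resumes from it on the next read.
position = 1
batch = log.read(position)
position = batch[-1][0] + 1
```

Because the log is append-only, an offset never changes once assigned, so a consumer only needs to remember a single integer per partition.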
Backup
Messages are replicated per partition using a leader‑and‑N‑followers model; the leader handles reads/writes while followers replicate the leader’s data, ensuring high availability.
Producer Parameters
topic
partition
key (used for partitioning)
message
The following pseudo‑code shows how Kafka determines the target partition:
if topic is None:
    throw Error
p = None
if partition is not None:
    if partition < 0 or partition >= numPartitions:
        throw Error
    p = partition
elif key is not None:
    p = hash(key) % numPartitions
else:
    p = round_robin() % numPartitions
send message to partition p
Round‑robin simply cycles through the partitions in turn; the hash function is MurmurHash.
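A runnable version of the same decision logic might look like the following. It is a sketch under stated assumptions: the `Partitioner` class is hypothetical, and `zlib.crc32` stands in for MurmurHash, so the partition numbers it produces will not match those of a real Kafka client.

```python
import itertools
import zlib


class Partitioner:
    """Illustrative partition selector; not the real Kafka client class."""

    def __init__(self, num_partitions):
        if num_partitions <= 0:
            raise ValueError("num_partitions must be positive")
        self.num_partitions = num_partitions
        self._round_robin = itertools.cycle(range(num_partitions))

    def select(self, topic, partition=None, key=None):
        if topic is None:
            raise ValueError("topic is required")
        if partition is not None:
            # An explicit partition wins, but must be in range.
            if not 0 <= partition < self.num_partitions:
                raise ValueError("partition out of range")
            return partition
        if key is not None:
            # Assumption: CRC32 stands in for Kafka's MurmurHash here.
            return zlib.crc32(key) % self.num_partitions
        # No partition and no key: cycle through the partitions in turn.
        return next(self._round_robin)


partitioner = Partitioner(num_partitions=3)
explicit = partitioner.select("orders", partition=2)
keyed_a = partitioner.select("orders", key=b"user-42")
keyed_b = partitioner.select("orders", key=b"user-42")
unkeyed = [partitioner.select("orders") for _ in range(4)]
```

The property that matters is that `keyed_a` and `keyed_b` are equal: messages with the same key always land in the same partition, which is what preserves per-key ordering.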
Consumer
Traditional messaging systems support two models: queue and publish/subscribe. Kafka unifies them via consumer groups.
Each consumer declares a consumer‑group name; Kafka groups consumers by this name, delivers every message to each group, and ensures that only one consumer within a group processes a given message.
If all consumers share the same group, Kafka behaves like a queue.
If each consumer has a unique group, Kafka behaves like publish/subscribe.
Concurrent consumption can cause out‑of‑order processing; synchronizing consumers guarantees order but reduces concurrency. Kafka’s partition concept ensures ordering within a partition while allowing parallelism across partitions.
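The queue and publish/subscribe behaviours can be illustrated by modelling how partitions are handed out to consumers. The sketch below assumes a simple round-robin assignment; the function names are hypothetical, and real Kafka clients offer several assignment strategies that may distribute partitions differently.

```python
from collections import defaultdict


def assign_partitions(partitions, consumers):
    """Give each partition to exactly one consumer of a group, round-robin."""
    assignment = defaultdict(list)
    for index, partition in enumerate(sorted(partitions)):
        owner = consumers[index % len(consumers)]
        assignment[owner].append(partition)
    return dict(assignment)


def deliver(partitions, groups):
    """Every group receives every partition; members of a group split them."""
    return {
        group: assign_partitions(partitions, members)
        for group, members in groups.items()
    }


partitions = [0, 1, 2, 3]

# One shared group: the consumers split the work, like a queue.
queue_like = deliver(partitions, {"billing": ["c1", "c2"]})

# One group per consumer: each consumer sees everything, like pub/sub.
pubsub_like = deliver(partitions, {"audit": ["a1"], "search": ["s1"]})
```

Since a partition has a single owner within a group, messages in that partition are processed in order, while different partitions are consumed in parallel.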
Message Delivery Semantics
Producer side
At‑most‑once: asynchronous send or synchronous send with zero retries.
At‑least‑once: synchronous send with retries on failure/timeout.
Exactly‑once: supported since Kafka 0.11 through the idempotent producer and transactions.
Consumer side
At‑most‑once: read, acknowledge position, then process.
At‑least‑once: read, process, then acknowledge.
Exactly‑once: achievable if the downstream system provides idempotent updates or a two‑phase commit.
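The difference between the two consumer-side orderings shows up only when a consumer fails between the two steps. The simulation below is a minimal sketch of that situation; the `consume` and `run_with_restart` helpers are hypothetical and do not use a real Kafka client.

```python
def consume(messages, commit_first, crash_at=None):
    """Run one consumer and return (processed, committed_offset).

    commit_first=True  -> acknowledge the position, then process.
    commit_first=False -> process, then acknowledge.
    crash_at is the offset at which the consumer dies between the two steps.
    """
    processed = []
    committed = 0  # next offset to read after a restart
    for offset, message in enumerate(messages):
        if commit_first:
            committed = offset + 1
            if offset == crash_at:
                break  # crashed after the commit, before processing
            processed.append(message)
        else:
            processed.append(message)
            if offset == crash_at:
                break  # crashed after processing, before the commit
            committed = offset + 1
    return processed, committed


def run_with_restart(messages, commit_first, crash_at):
    first, committed = consume(messages, commit_first, crash_at)
    # The restarted consumer resumes from the last committed offset.
    second, _ = consume(messages[committed:], commit_first)
    return first + second


messages = ["m0", "m1", "m2"]
at_most_once = run_with_restart(messages, commit_first=True, crash_at=1)
at_least_once = run_with_restart(messages, commit_first=False, crash_at=1)
```

With the commit first, `m1` is never processed (a loss); with the commit last, `m1` is processed twice (a duplicate), which is why at-least-once delivery needs idempotent downstream updates to approach exactly-once behaviour.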
Availability
In normal operation all broker nodes are in‑sync. A node that falls out of sync is treated as failed and triggers fault‑tolerance handling.
In‑sync means the node can communicate with Zookeeper and, if it is a follower, its replication position is close to the leader’s.
Each partition’s in‑sync replicas (ISR) form a set. Kafka ensures durability by:
Replicating data per partition (configurable replica count).
Failover: electing a new leader when the current leader goes out‑of‑sync, and removing/re‑adding followers from ISR as they fall behind or catch up.
Only when all ISR replicas acknowledge a message does the broker consider it committed, making it available for consumers.
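If all but one replica fail, the service remains available; when all replicas are down, Kafka elects the first recovered node as leader, even if it was not in the ISR, to restore service. This “dirty leader” may be missing committed messages; Kafka’s configuration calls this unclean leader election.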
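The commit rule and ISR maintenance described above can be sketched as follows. The function names are hypothetical, and measuring lag as a message count is a simplifying assumption for illustration; brokers may decide ISR membership by other criteria, such as how long a follower has been behind.

```python
def committed_offset(log_end_offsets, isr):
    """Number of messages acknowledged by every in-sync replica.

    log_end_offsets maps replica id -> number of messages it has stored.
    Messages below the returned value are committed and visible to consumers.
    """
    if not isr:
        raise ValueError("ISR is empty; nothing can be committed")
    return min(log_end_offsets[replica] for replica in isr)


def shrink_isr(log_end_offsets, isr, leader, max_lag):
    """Drop followers that trail the leader by more than max_lag messages."""
    leader_end = log_end_offsets[leader]
    return {r for r in isr if leader_end - log_end_offsets[r] <= max_lag}


ends = {"b1": 10, "b2": 9, "b3": 4}
isr = {"b1", "b2", "b3"}

before = committed_offset(ends, isr)  # held back by the slow replica b3
isr = shrink_isr(ends, isr, leader="b1", max_lag=3)
after = committed_offset(ends, isr)   # b3 no longer blocks the commit
```

Removing a lagging follower from the ISR lets the committed position advance, at the cost of having one fewer replica that is guaranteed to hold every committed message.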
Consistency
High availability may sacrifice strong consistency. To achieve stronger consistency:
Disable dirty‑leader election (unclean.leader.election.enable=false).
Set a minimum ISR size (min.insync.replicas) so that, for producers using acks=all, a message is committed only after being replicated to at least that many nodes.
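The trade-off behind these two settings can be made concrete with a small decision sketch. Both functions are hypothetical simplifications of broker behaviour, written only to show how each option exchanges availability for consistency.

```python
def elect_leader(isr, recovered, allow_unclean):
    """Pick a new partition leader, or None if the partition stays offline.

    isr: replicas currently in sync, in preference order.
    recovered: out-of-sync replicas that have come back, in recovery order.
    """
    if isr:
        return isr[0]
    if allow_unclean and recovered:
        # Dirty leader: service resumes, but committed messages may be lost.
        return recovered[0]
    return None


def accept_write(isr, min_isr):
    """Reject writes when too few in-sync replicas remain."""
    return len(isr) >= min_isr


clean = elect_leader(["b2", "b3"], [], allow_unclean=False)
dirty = elect_leader([], ["b3"], allow_unclean=True)
offline = elect_leader([], ["b3"], allow_unclean=False)
writable = accept_write(["b1"], min_isr=2)
```

With dirty-leader election disabled the partition stays unavailable until an in-sync replica returns, and with a minimum ISR of 2 a lone leader refuses writes rather than accept data that exists on only one node.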
Persistence
Kafka relies heavily on disk rather than memory because:
Disk is cheap, memory is expensive.
Sequential reads and pre‑fetching improve cache hit rates.
The OS page cache and write‑back mechanisms accelerate I/O.
Java objects have overhead, making in‑memory storage costly.
Garbage‑collection pauses increase with larger heap usage.
Queue‑based storage (append‑only) offers O(1) operations, unlike B‑tree’s O(log N).
Performance
Kafka optimizes performance by:
Converting many small I/O operations into fewer large ones.
Using sendfile to avoid data copies.
Supporting Snappy, GZIP, and LZ4 compression.
Adopting an NIO‑based reactor model (1 acceptor thread + N processor threads).
Data flow typically follows: Disk → kernel page cache → user buffer → socket buffer → NIC buffer. With sendfile, the path reduces to Disk → kernel page cache → NIC buffer, eliminating two copy steps and boosting throughput.
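The shortened data path can be tried out from Python, whose `socket.sendfile` method uses `os.sendfile` where the platform supports it and falls back to ordinary `send()` otherwise. This is a minimal sketch, not how the broker itself is implemented: the temporary file stands in for a log segment and a local socket pair stands in for the broker-to-consumer connection.

```python
import os
import socket
import tempfile

payload = b"kafka log segment bytes\n" * 4

# Write a small file that plays the role of a log segment on disk.
with tempfile.NamedTemporaryFile(delete=False) as segment:
    segment.write(payload)
    path = segment.name

# A connected socket pair stands in for the network connection.
broker_side, consumer_side = socket.socketpair()
try:
    with open(path, "rb") as f:
        # The file contents are handed to the socket without being
        # read into a Python-level buffer first.
        sent = broker_side.sendfile(f)
    broker_side.close()

    received = b""
    while chunk := consumer_side.recv(4096):
        received += chunk
finally:
    consumer_side.close()
    os.unlink(path)
```

The application never touches the bytes between the file and the socket, which is the step that the user-buffer copy in the conventional path would otherwise require.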
External Dependencies – Zookeeper
Broker nodes register a unique integer ID in Zookeeper:
/brokers/ids/[N] → host:port
Topic partition state is stored as:
/brokers/topics/[topic]/partitions/[N]/state → leader, ISR
Consumer group metadata:
/consumers/[group_id]/ids/[consumer_id] → {"topic1": #streams, ...}
Offsets are persisted at:
/consumers/[group_id]/offsets/[topic]/[N] → offset
Offset management options:
Manual management via low‑level API.
Zookeeper‑based storage (high‑level API, default in 0.8.2).
Kafka‑based storage (high‑level API, creates __consumer_offsets topic).
These Zookeeper nodes enable building monitoring tools such as KafkaOffsetMonitor.
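A monitoring tool built on these nodes would start by constructing the paths listed above. The helpers below are a hypothetical sketch that only formats those paths and computes a lag figure; they do not connect to Zookeeper, and the lag calculation assumes the partition's latest offset has been obtained separately.

```python
def broker_path(broker_id):
    return f"/brokers/ids/{broker_id}"


def partition_state_path(topic, partition):
    return f"/brokers/topics/{topic}/partitions/{partition}/state"


def offset_path(group_id, topic, partition):
    return f"/consumers/{group_id}/offsets/{topic}/{partition}"


def consumer_lag(log_end_offset, committed_offset):
    """Messages written to the partition but not yet consumed by the group."""
    return max(log_end_offset - committed_offset, 0)


path = offset_path("billing", "orders", 0)
lag = consumer_lag(log_end_offset=120, committed_offset=100)
```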
