Visual Guide to Kafka’s Architecture and Core Principles

This article explains Kafka’s architecture in depth, covering components such as producers, consumers, topics, partitions, replication, Zookeeper coordination, the Controller’s election and responsibilities, state machines, and the NIO‑based network model, using clear diagrams and analogies.

Shepherd Advanced Notes

Architecture Overview

Kafka consists of producers, consumers, consumer groups, brokers, topics, partitions, offsets, replication, and records.

Producer : creates messages and sends them to a broker.

Consumer : reads messages from a broker.

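Consumer Group : a set of consumers that share the load of a topic. Within a group, each partition is assigned to exactly one consumer, so the group does not consume a message twice (point‑to‑point); separate groups each receive every message (broadcast).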

Broker : a Kafka server node that stores data.

Topic : logical channel for messages; producers write to a topic, consumers read from it.

Partition : an ordered log within a topic; each record has a unique offset.

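Offset : a sequential number assigned to each record as it is appended to a partition. It identifies the record's position in that partition, so ordering is guaranteed within a partition but not across partitions.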

Replication : multiple copies of a partition across brokers; one replica is the leader, others are followers.

Record : the stored message, consisting of a key, a value, and a timestamp, plus optional headers.
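The relationship between topics, partitions, offsets, and consumer groups can be sketched in a few lines of Python. This is an in-memory illustration only, not Kafka's implementation: the class names, the CRC32-based partitioner, and the round-robin `assign` helper are all assumptions made for the example (Kafka's default partitioner hashes keys with murmur2, and its group assignment strategies are configurable).

```python
import zlib
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class Record:
    key: Optional[bytes]
    value: bytes
    timestamp: float


@dataclass
class Partition:
    log: List[Record] = field(default_factory=list)

    def append(self, record: Record) -> int:
        """Append to the end of the log and return the record's offset."""
        self.log.append(record)
        return len(self.log) - 1

    def read(self, offset: int) -> List[Record]:
        """Return all records from `offset` onward, in log order."""
        return self.log[offset:]


class Topic:
    def __init__(self, name: str, num_partitions: int) -> None:
        self.name = name
        self.partitions = [Partition() for _ in range(num_partitions)]

    def partition_for(self, key: bytes) -> int:
        # Hypothetical partitioner: CRC32 of the key modulo partition count.
        return zlib.crc32(key) % len(self.partitions)

    def send(self, key: bytes, value: bytes, timestamp: float = 0.0) -> Tuple[int, int]:
        """Route the record by key and return (partition, offset)."""
        p = self.partition_for(key)
        offset = self.partitions[p].append(Record(key, value, timestamp))
        return p, offset


def assign(num_partitions: int, consumers: List[str]) -> Dict[str, List[int]]:
    """Spread partitions over one group's consumers so none is shared."""
    assignment: Dict[str, List[int]] = {c: [] for c in consumers}
    for p in range(num_partitions):
        assignment[consumers[p % len(consumers)]].append(p)
    return assignment


topic = Topic("orders", num_partitions=3)
first = topic.send(b"user-1", b"created")
second = topic.send(b"user-1", b"paid")
print(first, second)             # same partition, consecutive offsets
print(assign(3, ["c1", "c2"]))   # {'c1': [0, 2], 'c2': [1]}
```

Records with the same key land in the same partition, which is why per-key ordering holds even though the topic as a whole has no global order.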

Producer‑Consumer Decoupling

Producers and consumers follow the classic producer‑consumer pattern, with a queue sitting between them. The queue buffers messages, so the two sides communicate asynchronously and can be scaled independently.
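A minimal sketch of this pattern, using Python's standard `queue.Queue` as the intermediate component. The message names and the `STOP` sentinel are made up for the example; the point is only that the producer and the consumer never call each other directly.

```python
import queue
import threading

buffer = queue.Queue(maxsize=2)   # bounded buffer between the two sides
received = []
STOP = object()                   # sentinel telling the consumer to finish


def producer() -> None:
    for i in range(5):
        buffer.put(f"msg-{i}")    # blocks only while the buffer is full
    buffer.put(STOP)


def consumer() -> None:
    while True:
        item = buffer.get()       # blocks only while the buffer is empty
        if item is STOP:
            break
        received.append(item)


threads = [threading.Thread(target=producer), threading.Thread(target=consumer)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(received)   # ['msg-0', 'msg-1', 'msg-2', 'msg-3', 'msg-4']
```

One difference worth keeping in mind: this in-memory queue discards a message once it is read, whereas a Kafka partition retains records after consumption, so several consumer groups can read the same data.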

Distributed Queue Analogy

Producers drop letters at a post office (the queue), and consumers pick them up whenever they are ready. Neither side has to know about, or wait for, the other, which is how Kafka decouples senders and receivers.

Zookeeper

In the Zookeeper‑based deployment described here, Zookeeper stores metadata for brokers, topics, and partitions. Kafka relies on it for Controller election, cluster membership tracking, topic configuration, and replica management.

Controller

The Controller is an elected broker that manages partition leaders, follower replicas, and cluster metadata.

Election Process

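On startup, each broker reads the ephemeral /controller znode in Zookeeper. If the znode holds a valid broker ID (anything other than -1), another broker is already the Controller and the current broker drops out of the election. If the znode does not exist or the ID is invalid, the broker tries to create the znode; Zookeeper lets only one creation succeed, and that broker becomes the Controller. Because the znode is ephemeral, it disappears when the Controller's Zookeeper session ends, which lets the remaining brokers run the election again.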
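The election steps can be sketched against an in-memory stand-in for Zookeeper. `FakeZooKeeper` and its methods are hypothetical and do not mirror any real client API; the sketch assumes only that creating an existing znode fails and that an ephemeral znode is removed when its owner's session ends.

```python
from typing import Dict

NO_CONTROLLER = -1


class FakeZooKeeper:
    """In-memory stand-in for Zookeeper; not a real client."""

    def __init__(self) -> None:
        self.znodes: Dict[str, int] = {}

    def get(self, path: str) -> int:
        """Return the broker ID stored at `path`, or -1 if it is absent."""
        return self.znodes.get(path, NO_CONTROLLER)

    def create_ephemeral(self, path: str, broker_id: int) -> bool:
        """Create the znode only if absent; return True on success."""
        if path in self.znodes:
            return False
        self.znodes[path] = broker_id
        return True

    def expire_session(self, broker_id: int) -> None:
        """Drop every ephemeral znode owned by a broker whose session ended."""
        self.znodes = {p: b for p, b in self.znodes.items() if b != broker_id}


def elect(zk: FakeZooKeeper, broker_id: int) -> int:
    """Run the election steps for one broker; return the Controller's ID."""
    current = zk.get("/controller")
    if current != NO_CONTROLLER:
        return current                      # another broker already won
    if zk.create_ephemeral("/controller", broker_id):
        return broker_id                    # this broker won the race
    return zk.get("/controller")            # lost the race; read the winner


zk = FakeZooKeeper()
print([elect(zk, b) for b in (1, 2, 3)])    # [1, 1, 1]
zk.expire_session(1)                        # Controller's session ends
print(elect(zk, 2))                         # 2
```

The create-if-absent call is what makes the election safe: even if several brokers see an empty znode at the same moment, only one creation can succeed.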

Implementation Details

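The Controller loads Zookeeper data into an in‑memory context and watches the relevant znodes for changes. Each change is wrapped as an event and placed on a LinkedBlockingQueue. An event‑consumer thread takes events off the queue in FIFO order and applies them, so state changes are processed in the order they arrived, and the resulting updates are then sent to the other brokers to keep the cluster in sync.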
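The queue-plus-consumer arrangement might look like the following sketch, with Python's `queue.Queue` standing in for a LinkedBlockingQueue. The event names (`BrokerJoined`, `BrokerLeft`) and the `live_brokers` set are illustrative assumptions, not Kafka's actual event types or context fields.

```python
import queue
import threading

events = queue.Queue()           # FIFO, thread-safe
live_brokers = set()             # stand-in for the Controller's context
history = []                     # snapshot of the context after each event
SHUTDOWN = ("Shutdown", None)


def apply(event) -> None:
    """Update the in-memory context for one event."""
    kind, broker_id = event
    if kind == "BrokerJoined":
        live_brokers.add(broker_id)
    elif kind == "BrokerLeft":
        live_brokers.discard(broker_id)
    history.append(sorted(live_brokers))


def event_loop() -> None:
    """Single consumer: events are applied strictly in arrival order."""
    while True:
        event = events.get()
        if event == SHUTDOWN:
            return
        apply(event)


worker = threading.Thread(target=event_loop)
worker.start()
for event in [("BrokerJoined", 1), ("BrokerJoined", 2), ("BrokerLeft", 1)]:
    events.put(event)            # in Kafka these would come from znode watchers
events.put(SHUTDOWN)
worker.join()
print(history)                   # [[1], [1, 2], [2]]
```

Funneling every change through one queue and one consumer avoids locking around the shared context, at the cost of processing events one at a time.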

Responsibilities

Handle broker join/leave events, update cluster metadata, and notify all brokers.

Create topics or expand partitions, assign replicas, and run partition leader elections.

Maintain state machines for partitions and replicas, reacting to changes in the ISR (the set of in‑sync replicas) and other events.

State Machines

The Controller maintains two state machines, one for partitions and one for replicas.

PartitionStateChange

NonExistentPartition – partition has never been created or has been deleted.

NewPartition – created but no leader elected yet.

OnlinePartition – leader elected and the partition is active.

OfflinePartition – a leader was elected, but its broker has since failed; the partition is unavailable until a new leader is elected.
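One way to enforce these states is a table of allowed previous states for each target state. The table below is an assumption modeled on the four states listed here, not a copy of Kafka's source, and the `transition` helper is hypothetical.

```python
from enum import Enum


class PartitionState(Enum):
    NON_EXISTENT = "NonExistentPartition"
    NEW = "NewPartition"
    ONLINE = "OnlinePartition"
    OFFLINE = "OfflinePartition"


S = PartitionState

# For each target state, the states a partition may come from (assumed).
VALID_PREVIOUS = {
    S.NEW: {S.NON_EXISTENT},
    S.ONLINE: {S.NEW, S.ONLINE, S.OFFLINE},
    S.OFFLINE: {S.NEW, S.ONLINE, S.OFFLINE},
    S.NON_EXISTENT: {S.OFFLINE},
}


def transition(current: PartitionState, target: PartitionState) -> PartitionState:
    """Return the new state, or raise if the move is not allowed."""
    if current not in VALID_PREVIOUS[target]:
        raise ValueError(f"illegal transition {current.value} -> {target.value}")
    return target


state = S.NON_EXISTENT
for target in (S.NEW, S.ONLINE, S.OFFLINE, S.ONLINE):
    state = transition(state, target)   # create, elect, fail, re-elect
print(state.value)                      # OnlinePartition
```

Under this table, a partition cannot jump straight from NonExistentPartition to OnlinePartition; it has to pass through NewPartition first, which matches the lifecycle described above.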

ReplicaStateChange

NewReplica – replica created after topic and partition allocation.

OnlineReplica – replica is running and can act as either the leader or a follower.

OfflineReplica – replica goes down, typically due to broker failure.

NonExistentReplica – replica has been deleted.
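To illustrate how a broker failure drives the replica states, the sketch below moves every replica hosted on a failed broker to OfflineReplica. The `(topic, partition, broker_id)` key and the `on_broker_failure` function are assumptions for the example; the real Controller tracks more states and also triggers leader elections for affected partitions.

```python
from typing import Dict, Tuple

Replica = Tuple[str, int, int]   # (topic, partition, broker_id), hypothetical key


def on_broker_failure(states: Dict[Replica, str], failed_broker: int) -> Dict[Replica, str]:
    """Return a new state map with the failed broker's replicas marked offline."""
    updated = {}
    for replica, state in states.items():
        hosted_on_failed = replica[2] == failed_broker
        if hosted_on_failed and state in ("NewReplica", "OnlineReplica"):
            updated[replica] = "OfflineReplica"
        else:
            updated[replica] = state
    return updated


states = {
    ("orders", 0, 1): "OnlineReplica",
    ("orders", 0, 2): "OnlineReplica",
    ("orders", 1, 2): "NewReplica",
}
after = on_broker_failure(states, failed_broker=2)
for replica, state in after.items():
    print(replica, state)
```

Replicas on healthy brokers keep their state, so only the partitions that lost their leader need a new election.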

Network Model

Kafka’s network communication follows an NIO‑based Reactor model with three thread types:

Acceptor threads : accept new socket connections.

Processor threads : perform select and read operations.

Handler threads : execute business logic and respond to requests.

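Because accepting connections, socket I/O, and request processing run on separate threads, a slow request does not stall network I/O, which helps Kafka sustain high throughput at low latency.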
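The division of labor among the three thread types can be simulated with plain queues instead of sockets. Everything here is an assumption made for illustration: the round-robin hand-off, the dictionary "connections", the per-processor response queues, and `request.upper()` standing in for business logic.

```python
import queue
import threading

NUM_PROCESSORS = 2
NUM_HANDLERS = 2
STOP = None   # sentinel that tells a worker thread to exit

new_connections = [queue.Queue() for _ in range(NUM_PROCESSORS)]
request_queue = queue.Queue()                      # shared by all handlers
response_queues = [queue.Queue() for _ in range(NUM_PROCESSORS)]


def acceptor(connections) -> None:
    """Hand each new connection to a processor, round-robin."""
    for i, conn in enumerate(connections):
        new_connections[i % NUM_PROCESSORS].put(conn)
    for q in new_connections:
        q.put(STOP)


def processor(pid: int) -> None:
    """Read the request on each connection and queue it for the handlers."""
    while True:
        conn = new_connections[pid].get()
        if conn is STOP:
            return
        request_queue.put((pid, conn["client"], conn["request"]))


def handler() -> None:
    """Run the request logic and return the result to the owning processor."""
    while True:
        item = request_queue.get()
        if item is STOP:
            return
        pid, client, request = item
        response_queues[pid].put((client, request.upper()))


connections = [{"client": f"c{i}", "request": f"fetch-{i}"} for i in range(4)]
processors = [threading.Thread(target=processor, args=(p,)) for p in range(NUM_PROCESSORS)]
handlers = [threading.Thread(target=handler) for _ in range(NUM_HANDLERS)]
for t in processors + handlers:
    t.start()

acceptor(connections)            # runs inline here; a thread in a real server
for t in processors:
    t.join()
for _ in handlers:
    request_queue.put(STOP)      # one stop signal per handler
for t in handlers:
    t.join()

results = []
for q in response_queues:
    while not q.empty():
        results.append(q.get())
results.sort()
print(results)
```

The shared request queue is the piece that lets the number of handler threads be tuned separately from the number of processor threads.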
