Comprehensive Guide to Kafka: Architecture, Core Concepts, and Configuration
This article provides an in‑depth overview of Apache Kafka. It covers its use cases, comparison with other message queues, versioning, performance mechanisms, core concepts such as topics, partitions, offsets, consumer groups, rebalancing, replication, leader election, idempotence, transactions, compression, interceptors, and request handling, and closes with practical configuration tips for reliable streaming applications.
Fundamentals
Version Number
Version naming example: Scala 2.11 - kafka_2.11-2.1.1
The leading part of the name (2.11) is the version of the Scala compiler used to build Kafka. The broker code is written primarily in Scala, a JVM language that supports both object‑oriented and functional programming. The actual Kafka version is 2.1.1: the first digit is the major version, the second the minor version, and the third the patch version.
Why does Kafka have high throughput and speed?
Sequential read/write: messages are appended to the end of a local file rather than written randomly.
Page Cache: Kafka leverages the OS page cache so most read/write operations are memory‑based, greatly improving speed.
Zero‑copy, batch read/write, batch compression.
Partitioning and indexing: each topic is stored in partitions; each partition is stored as a series of segments on a broker. An index file ( .index ) is automatically created for each segment.
Basic Concepts
Topic: logical container for messages, usually used to separate business domains.
Partition: an ordered, immutable sequence of messages; a topic can have multiple partitions.
Offset: a monotonically increasing position of a message within a partition.
Replica: data redundancy for a partition.
Producer: application that sends new messages to a topic.
Consumer: application that subscribes to a topic to receive messages.
Consumer Offset: the progress of a consumer within a partition.
Consumer Group: a set of consumer instances that share a Group ID and collectively consume all partitions of a topic.
Consumers
Consumer Assignment Strategies
RangeAssignor : default strategy; for each topic, partitions are split into contiguous ranges and distributed as evenly as possible among consumers.
RoundRobinAssignor : assigns partitions to consumers in a round‑robin fashion.
StickyAssignor (added in 0.11.x): balances assignments while preserving existing partition allocations as much as possible.
CooperativeStickyAssignor : inherits Sticky logic but allows incremental rebalancing.
Push vs Pull
Kafka consumers pull messages actively from the broker.
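A minimal pull loop looks like the following sketch. The broker address and the topic name `orders` are assumptions for illustration, not from the original article:

```java
import java.time.Duration;
import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;

Properties props = new Properties();
props.put("bootstrap.servers", "localhost:9092");
props.put("group.id", "demo-group");
props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");

try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
    consumer.subscribe(Collections.singletonList("orders"));
    while (true) {
        // poll() is the "pull": the consumer asks the broker for new records
        ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(500));
        for (ConsumerRecord<String, String> record : records) {
            System.out.printf("partition=%d offset=%d value=%s%n",
                    record.partition(), record.offset(), record.value());
        }
    }
}
```

Pulling lets each consumer control its own consumption rate instead of being overwhelmed by a pushing broker.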
Consumer Group
A consumer group consists of multiple consumer instances sharing a common Group ID. All instances coordinate to consume every partition of the subscribed topics.
Each group can have one or more consumer instances; the Group ID uniquely identifies the group within a Kafka cluster. Within a group, a single partition can be assigned to only one consumer instance at a time.
Point‑to‑Point vs Publish/Subscribe
If all instances belong to the same group, the model is point‑to‑point; if each instance belongs to a different group, the model is publish/subscribe.
Number of Consumer Instances
Ideally, the number of consumer instances equals the total number of partitions of the subscribed topic.
Offset Topic
Consumer offsets are stored as regular Kafka messages in the internal topic __consumer_offsets . This topic must provide high durability and support frequent writes.
The offset topic is created automatically when the first consumer starts. Its partition count is controlled by the broker parameter offsets.topic.num.partitions (default 50), and its replication factor is set by offsets.topic.replication.factor (default 3).
How Offsets Are Committed
Two methods exist: automatic commit (controlled by enable.auto.commit and auto.commit.interval.ms ) and manual commit.
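With manual commits, the offset advances only after processing succeeds, which trades a little latency for at‑least‑once safety. A sketch, assuming the consumer setup shown above and a hypothetical `process` method:

```java
import java.time.Duration;
import java.util.Collections;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;

// Turn off auto-commit so offsets move only when we say so
props.put("enable.auto.commit", "false");

try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
    consumer.subscribe(Collections.singletonList("orders"));
    while (true) {
        ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(500));
        for (ConsumerRecord<String, String> record : records) {
            process(record); // hypothetical business logic
        }
        // Synchronous commit: blocks until the broker acknowledges the offsets
        consumer.commitSync();
    }
}
```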
Rebalancing
Rebalancing is the protocol that determines how all consumers in a group agree on partition assignments.
For example, a group with 20 consumers subscribed to a topic with 100 partitions will normally receive 5 partitions each after a rebalance.
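The arithmetic behind that example is a plain even split: each consumer receives `partitions / consumers` partitions, with any remainder spread over the first few members. A small sketch:

```java
public class AssignmentMath {
    // Number of partitions consumer `index` receives when `partitions`
    // are split as evenly as possible among `consumers` group members.
    public static int assignedCount(int partitions, int consumers, int index) {
        int base = partitions / consumers;      // everyone gets at least this many
        int remainder = partitions % consumers; // first `remainder` members get one extra
        return base + (index < remainder ? 1 : 0);
    }

    public static void main(String[] args) {
        // 100 partitions over 20 consumers -> 5 each
        System.out.println(assignedCount(100, 20, 0)); // prints 5
    }
}
```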
Trigger Conditions
Change in group membership (consumer joins or leaves).
Change in subscribed topics.
Change in the number of partitions of a subscribed topic.
During rebalance, all consumers pause consumption until the process completes.
Typical rebalance scenarios include missed heartbeats, long processing times, or configuration changes.
# Heartbeat interval (ms)
heartbeat.interval.ms
# Session timeout (ms)
session.timeout.ms

Partitioning Mechanism
Default Round‑Robin Strategy : messages are assigned to partitions sequentially (0,1,2,… then wrap).
Random Strategy : messages are placed in any partition arbitrarily.
Message‑Key Strategy : all messages with the same key are sent to the same partition.
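The key strategy works because the partition is computed deterministically from the key. Kafka's default partitioner hashes the serialized key with murmur2; the sketch below uses `String.hashCode()` only to illustrate that equal keys always map to the same partition:

```java
public class KeyPartitioning {
    // Simplified illustration of key-based partitioning. Kafka's real default
    // partitioner uses a murmur2 hash of the serialized key; hashCode() is
    // used here only to keep the example self-contained.
    public static int partitionFor(String key, int numPartitions) {
        return (key.hashCode() & 0x7fffffff) % numPartitions;
    }

    public static void main(String[] args) {
        int p1 = partitionFor("user-42", 6);
        int p2 = partitionFor("user-42", 6);
        System.out.println(p1 == p2); // same key -> same partition, prints true
    }
}
```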
Compression Mechanism
Compression can be enabled on the producer side via the compression.type configuration (e.g., gzip).
Properties props = new Properties();
props.put("bootstrap.servers", "localhost:9092");
props.put("acks", "all");
props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");
// Enable GZIP compression
props.put("compression.type", "gzip");
Producer<String, String> producer = new KafkaProducer<>(props);

Consumers automatically decompress messages upon receipt.
Interceptors
Kafka supports producer and consumer interceptors, which allow custom logic before sending a message or after a successful send, and before consuming a message or after offset commit.
Interceptors are configured via the interceptor.classes property, which takes a list of fully‑qualified class names.
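A producer interceptor implements the `ProducerInterceptor` interface. The class below is a sketch of the hypothetical `AddTimestampInterceptor` referenced in the configuration that follows; the timestamp-prefixing behavior is an assumption for illustration:

```java
import java.util.Map;
import org.apache.kafka.clients.producer.ProducerInterceptor;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.clients.producer.RecordMetadata;

// Prepends a wall-clock timestamp to every record value before it is sent.
public class AddTimestampInterceptor implements ProducerInterceptor<String, String> {
    @Override
    public ProducerRecord<String, String> onSend(ProducerRecord<String, String> record) {
        // Runs before the record is serialized and assigned a partition
        return new ProducerRecord<>(record.topic(), record.partition(), record.key(),
                System.currentTimeMillis() + "," + record.value());
    }

    @Override
    public void onAcknowledgement(RecordMetadata metadata, Exception exception) {
        // Runs when the send is acknowledged or fails; keep this lightweight,
        // since it executes on the producer's IO thread
    }

    @Override public void close() {}
    @Override public void configure(Map<String, ?> configs) {}
}
```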
Properties props = new Properties();
List<String> interceptors = new ArrayList<>();
interceptors.add("com.yourcompany.kafkaproject.interceptors.AddTimestampInterceptor");
interceptors.add("com.yourcompany.kafkaproject.interceptors.UpdateCounterInterceptor");
props.put(ProducerConfig.INTERCEPTOR_CLASSES_CONFIG, interceptors);
// ... other producer configs

Replication Mechanism
Each partition has multiple replicas stored on different brokers to tolerate broker failures.
Replica Roles
Leader replica handles all client read/write requests; follower replicas only fetch data from the leader and stay in sync.
If the leader fails, the controller (which tracks broker liveness through ZooKeeper) detects the failure and elects a new leader from the in‑sync replica (ISR) set.
ISR (In‑Sync Replica) Set
ISR contains replicas that are currently synchronized with the leader. The leader is always part of ISR.
Replica lag is governed by replica.lag.time.max.ms (default 10 seconds). If a follower stays within this lag, it remains in ISR.
Unclean Leader Election
Controlled by unclean.leader.election.enable . Disabling it prevents a lagging replica from becoming leader, avoiding potential data loss.
Election Mechanism
Controller (Broker) Election
One broker becomes the controller by creating the ephemeral ZooKeeper node /controller . Other brokers watch this node for changes.
Preferred Leader Election
When a partition’s current leader fails, a new leader is elected from the ISR. The “preferred replica” is the first replica in the partition’s assigned‑replica (AR) list; Kafka can later move leadership back to it to keep leader load balanced across brokers.
Controller
The controller coordinates the entire Kafka cluster using ZooKeeper.
It manages topic creation/deletion, partition reassignment, preferred leader election, broker membership changes, and propagates metadata updates to all brokers.
Idempotence
Since version 0.11.0.0, Kafka producers can enable idempotence by setting enable.idempotence=true , guaranteeing that producer retries do not create duplicate messages within a single partition.
props.put("enable.idempotence", true);
// or
props.put(ProducerConfig.ENABLE_IDEMPOTENCE_CONFIG, true);

Idempotence works only for a single partition; it does not protect against duplicates across multiple partitions.
Transactions
Transactional producers ensure atomic writes to partitions; either all messages in a transaction are committed or none are.
Enable idempotence, set transactional.id , and wrap sends between beginTransaction() and commitTransaction() . Consumers must set isolation.level to read only committed data.
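On the consumer side, reading only committed data is a single setting. A sketch, with the broker address assumed:

```java
import java.util.Properties;
import org.apache.kafka.clients.consumer.KafkaConsumer;

Properties props = new Properties();
props.put("bootstrap.servers", "localhost:9092");
props.put("group.id", "txn-demo");
// Only records from committed transactions are returned; the default is read_uncommitted
props.put("isolation.level", "read_committed");
props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");

KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props);
```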
producer.initTransactions();
try {
    producer.beginTransaction();
    producer.send(record1);
    producer.send(record2);
    producer.commitTransaction();
} catch (KafkaException e) {
    // None of the sends in the aborted transaction become visible
    // to read_committed consumers
    producer.abortTransaction();
}

Data Storage
Kafka stores messages in partitions; each partition is an ordered, immutable log. A partition consists of multiple segments, each with a data file ( .log ) and an index file ( .index ).
Request Model
Incoming requests are accepted by the network thread pool (default 3 threads, configurable via num.network.threads ) and then handed off to the IO thread pool (default 8 threads, configurable via num.io.threads ) for asynchronous processing.
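Both pool sizes are broker settings in server.properties; the values shown below are the defaults:

```
# Network threads: accept connections, read requests, write responses
num.network.threads=3
# IO threads: execute the request logic, including disk reads/writes
num.io.threads=8
```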
Common Scenarios
Duplicate Consumption
If a consumer process is killed or crashes before committing offsets, the next restart will re‑read the same messages, causing duplicates.
Long processing times can also trigger a rebalance (controlled by max.poll.interval.ms ), leading to duplicate reads.
Message Loss
Automatic offset commits can cause loss if a message fails processing after the offset has been advanced.
Best practices: use producer.send(msg, callback) , set acks=all , increase retries , disable unclean.leader.election.enable , set replication factor ≥ 3, configure min.insync.replicas>1 , and prefer manual offset commits.
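The producer‑side practices above combine into a configuration like the following sketch (broker address and topic are assumptions; the broker‑side settings are noted as comments):

```java
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

Properties props = new Properties();
props.put("bootstrap.servers", "localhost:9092");
props.put("acks", "all");                // wait for all in-sync replicas to persist the write
props.put("retries", Integer.MAX_VALUE); // retry transient send failures instead of dropping
props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");
// Broker side: unclean.leader.election.enable=false,
// replication factor >= 3, min.insync.replicas=2

KafkaProducer<String, String> producer = new KafkaProducer<>(props);

// Send with a callback so failures are observed, not silently lost
producer.send(new ProducerRecord<>("orders", "order-1", "payload"), (metadata, exception) -> {
    if (exception != null) {
        exception.printStackTrace(); // log, alert, or re-send
    }
});
```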
Message Ordering
Kafka guarantees order only within a single partition. To preserve order, either use a single‑partition topic or ensure that related keys are routed to the same partition.
When multiple consumer threads process messages, maintain ordering by routing messages with the same key to the same internal queue.
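One way to keep per‑key ordering across worker threads is to hash the key to a fixed queue, so a single worker always drains all messages for a given key in order. A self‑contained sketch (the class and its queue layout are illustrative, not from the article):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

public class KeyedDispatcher {
    private final List<BlockingQueue<String>> queues = new ArrayList<>();

    public KeyedDispatcher(int workers) {
        for (int i = 0; i < workers; i++) {
            queues.add(new LinkedBlockingQueue<>());
        }
    }

    // Messages with the same key always land in the same queue, so the
    // worker thread draining that queue sees them in arrival order.
    public int dispatch(String key, String message) {
        int idx = (key.hashCode() & 0x7fffffff) % queues.size();
        queues.get(idx).add(message);
        return idx; // returned for illustration/testing
    }
}
```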