Top 10 Advanced Kafka Interview Questions with In‑Depth Answers
This article provides a comprehensive collection of advanced Kafka interview questions covering core architecture, storage mechanisms, replication, leader election, controller responsibilities, consumer group rebalancing, message semantics, partition assignment strategies, performance tuning, and practical tips for handling large message backlogs.
Kafka Core Architecture
A typical Kafka cluster consists of multiple producers, brokers, a ZooKeeper ensemble, and consumer groups (newer Kafka releases replace ZooKeeper with the built-in KRaft controller quorum). Producers push messages to brokers, which store them in log files; consumers pull messages from brokers. Kafka scales horizontally: adding brokers and spreading partitions across them increases cluster throughput.
Storage Mechanism
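Each partition is an append-only log. To keep individual files from growing too large, Kafka splits the log into segments, and each segment consists of three files: .log (the messages), .index (offset to file position), and .timeindex (timestamp to offset). Segment files are named after the first offset they contain and are stored in a directory named <topic-name>-<partition-id>.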
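The segment layout can be illustrated with a minimal sketch. It relies on the fact that segment files are named after their base offset, zero-padded to 20 digits; the helper names segment_file_name and find_segment are hypothetical and not part of Kafka.

```python
from bisect import bisect_right


def segment_file_name(base_offset, suffix=".log"):
    """Build a segment file name: the base offset zero-padded to 20 digits."""
    return f"{base_offset:020d}{suffix}"


def find_segment(base_offsets, target_offset):
    """Return the base offset of the segment that would hold target_offset.

    base_offsets must be sorted ascending. The right segment is the one
    with the largest base offset that is <= target_offset.
    """
    i = bisect_right(base_offsets, target_offset) - 1
    if i < 0:
        raise ValueError("offset is older than the earliest retained segment")
    return base_offsets[i]


# Three segments of one partition, identified by their base offsets.
segments = [0, 368769, 737337]
base = find_segment(segments, 500000)
print(segment_file_name(base))            # 00000000000000368769.log
print(segment_file_name(base, ".index"))  # 00000000000000368769.index
```

Once the segment is known, a broker would consult the matching .index file to narrow down the position inside the .log file; that second step is left out here.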
Replication Mechanism
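Each partition has multiple replicas: one leader and several followers. By default the leader handles all client reads and writes (since Kafka 2.4, consumers can optionally fetch from a follower), and followers replicate asynchronously by fetching from the leader. When the leader's broker fails, the controller elects a new leader from the in‑sync replica (ISR) set; ZooKeeper's role is to detect the broker failure, not to pick the leader.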
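Follower replication is pull-based: a follower asks the leader for records starting at its own log end offset. The toy model below sketches that idea under simplifying assumptions (an in-memory list stands in for the log, and the Replica class and fetch_from_leader function are hypothetical names, not Kafka APIs).

```python
class Replica:
    """A toy replica whose log is an in-memory list of records."""

    def __init__(self, broker_id):
        self.broker_id = broker_id
        self.log = []

    @property
    def log_end_offset(self):
        # Offset that the next appended record would receive.
        return len(self.log)


def fetch_from_leader(leader, follower, max_records=2):
    """Copy up to max_records from the leader, starting at the follower's
    log end offset. Returns the number of records copied."""
    start = follower.log_end_offset
    batch = leader.log[start:start + max_records]
    follower.log.extend(batch)
    return len(batch)


leader = Replica(broker_id=1)
follower = Replica(broker_id=2)
leader.log.extend(["m0", "m1", "m2"])

# The follower keeps fetching until a fetch returns nothing new.
while fetch_from_leader(leader, follower) > 0:
    pass
print(follower.log == leader.log)  # True
```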
In‑Sync Replica (ISR) Set
The ISR contains the replicas that are sufficiently up‑to‑date with the leader. A follower is removed from the ISR if it has not caught up to the leader's log end offset within replica.lag.time.max.ms (default 30 s since Kafka 2.5, 10 s before that). Only ISR members are eligible for leader election unless unclean.leader.election.enable is set to true.
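The ISR rule can be expressed as a small time-based check. This is a simplified sketch, assuming each follower is tracked by the timestamp at which it last caught up with the leader; compute_isr and its parameters are hypothetical names chosen for illustration.

```python
REPLICA_LAG_TIME_MAX_MS = 30_000  # mirrors the replica.lag.time.max.ms default


def compute_isr(leader_id, last_caught_up_ms, now_ms,
                max_lag_ms=REPLICA_LAG_TIME_MAX_MS):
    """Return the ISR as a list: the leader plus every follower that caught
    up with the leader within the last max_lag_ms milliseconds.

    last_caught_up_ms maps follower id -> timestamp (ms) of its last catch-up.
    """
    isr = [leader_id]
    for replica_id, caught_up in sorted(last_caught_up_ms.items()):
        if replica_id != leader_id and now_ms - caught_up <= max_lag_ms:
            isr.append(replica_id)
    return isr


# Follower 2 caught up 5 s ago, follower 3 caught up 40 s ago.
print(compute_isr(1, {2: 95_000, 3: 60_000}, now_ms=100_000))  # [1, 2]
```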
Leader Election
The controller (a broker elected via Zookeeper) manages leader elections. When a leader broker goes down, the controller selects the first alive replica in the AR (assigned replica) list that is also in ISR. If no ISR replica is available, the unclean.leader.election.enable flag determines whether a non‑ISR replica may become leader.
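The selection rule described above can be sketched in a few lines. This is an illustration of the rule only, not the controller's actual code; elect_leader and its arguments are hypothetical.

```python
def elect_leader(assigned_replicas, isr, live_brokers, unclean_enabled=False):
    """Pick a new partition leader.

    Preferred: the first replica in AR order that is alive and in the ISR.
    Fallback: if no such replica exists and unclean election is enabled,
    the first alive replica in AR order (this may lose data).
    Returns None when the partition has to stay offline.
    """
    for replica in assigned_replicas:
        if replica in live_brokers and replica in isr:
            return replica
    if unclean_enabled:
        for replica in assigned_replicas:
            if replica in live_brokers:
                return replica
    return None


ar = [1, 2, 3]
# Broker 1 (the old leader) is down; 2 and 3 are alive, only 3 is in the ISR.
print(elect_leader(ar, isr={1, 3}, live_brokers={2, 3}))                     # 3
# No live ISR member: offline unless unclean election is allowed.
print(elect_leader(ar, isr={1}, live_brokers={2, 3}))                        # None
print(elect_leader(ar, isr={1}, live_brokers={2, 3}, unclean_enabled=True))  # 2
```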
Controller Responsibilities
The controller maintains cluster metadata, creates/deletes topics, handles partition reassignments, performs leader elections, and monitors broker membership via Zookeeper watches.
Message Semantics
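Kafka supports three delivery semantics: at‑most‑once (the consumer commits offsets before processing, so a crash can lose messages), at‑least‑once (the default; the consumer commits after processing, so a crash can cause duplicates), and exactly‑once (available with idempotent producers and transactions). On the producer side, the acks setting controls durability: 0 does not wait for any acknowledgement, 1 waits for the leader to write the record, and -1/all waits until all in‑sync replicas have it.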
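The difference between at‑most‑once and at‑least‑once comes down to whether the offset commit happens before or after processing. The simulation below is a sketch under simple assumptions (one consumer, one partition, a single crash); the consume function is hypothetical and does not use the Kafka client.

```python
def consume(messages, commit_before_processing, crash_at):
    """Simulate a consumer that crashes once while handling
    messages[crash_at], then restarts from the last committed offset.
    Returns the list of messages that were actually processed."""
    committed = 0          # next offset to read after a restart
    processed = []
    crashed = False
    while committed < len(messages):
        offset = committed
        if commit_before_processing:
            committed = offset + 1                # commit first
            if offset == crash_at and not crashed:
                crashed = True
                continue                          # crash: message never processed
            processed.append(messages[offset])
        else:
            processed.append(messages[offset])    # process first
            if offset == crash_at and not crashed:
                crashed = True
                continue                          # crash: commit never happened
            committed = offset + 1
    return processed


msgs = ["a", "b", "c"]
print(consume(msgs, commit_before_processing=True, crash_at=1))   # ['a', 'c']
print(consume(msgs, commit_before_processing=False, crash_at=1))  # ['a', 'b', 'b', 'c']
```

Committing first loses "b"; processing first delivers "b" twice. Avoiding both requires idempotent processing or transactions.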
Partition Assignment Strategies
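Kafka clients ship with several built-in assignors: RangeAssignor (assigns contiguous partition ranges per consumer, topic by topic; the sole default before Kafka 3.0), RoundRobinAssignor (deals partitions out evenly across consumers), StickyAssignor (balances load while minimizing partition movement during rebalances), and CooperativeStickyAssignor (sticky assignment with incremental rebalancing, available since Kafka 2.4). Since Kafka 3.0 the default list is RangeAssignor followed by CooperativeStickyAssignor. The choice of partition.assignment.strategy influences consumer load distribution and rebalance overhead.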
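For a single topic, the range and round-robin strategies can be sketched as follows. These are simplified re-implementations for illustration, not the client library's code; the function names are hypothetical.

```python
def range_assign(consumers, num_partitions):
    """Range strategy for one topic: consumers are sorted, each gets a
    contiguous block, and the first (num_partitions % n) consumers get
    one extra partition."""
    members = sorted(consumers)
    per_consumer, extra = divmod(num_partitions, len(members))
    assignment, start = {}, 0
    for i, member in enumerate(members):
        count = per_consumer + (1 if i < extra else 0)
        assignment[member] = list(range(start, start + count))
        start += count
    return assignment


def round_robin_assign(consumers, num_partitions):
    """Round-robin strategy for one topic: partitions are dealt out one
    at a time over the sorted consumers."""
    members = sorted(consumers)
    assignment = {member: [] for member in members}
    for partition in range(num_partitions):
        assignment[members[partition % len(members)]].append(partition)
    return assignment


group = ["c1", "c2", "c3"]
print(range_assign(group, 7))
# {'c1': [0, 1, 2], 'c2': [3, 4], 'c3': [5, 6]}
print(round_robin_assign(group, 7))
# {'c1': [0, 3, 6], 'c2': [1, 4], 'c3': [2, 5]}
```

With one topic both strategies are equally balanced. Because the range strategy is applied per topic, the same first consumers receive the extra partition of every topic, so the skew grows with the number of subscribed topics.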
Consumer Group Rebalancing
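Rebalancing occurs when group membership or subscription changes. The group coordinator (a broker) drives the protocol through JoinGroup, SyncGroup, Heartbeat, and LeaveGroup requests, while DescribeGroups exposes the resulting group state for inspection. A group moves through the states Empty → PreparingRebalance → CompletingRebalance → Stable, and returns to PreparingRebalance whenever membership changes again. Because every rebalance interrupts consumption, assignors that keep existing assignments in place, such as StickyAssignor, make rebalances cheaper than Range or RoundRobin.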
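The state transitions can be modeled as a lookup table. This is a simplified sketch: the state names come from the text above, but the event names (member_joins, all_members_joined, and so on) are hypothetical labels for illustration, not Kafka protocol terms.

```python
# (current state, event) -> next state
TRANSITIONS = {
    ("Empty", "member_joins"): "PreparingRebalance",
    ("PreparingRebalance", "all_members_joined"): "CompletingRebalance",
    ("PreparingRebalance", "all_members_left"): "Empty",
    ("CompletingRebalance", "leader_synced_assignment"): "Stable",
    ("CompletingRebalance", "membership_changed"): "PreparingRebalance",
    ("Stable", "membership_changed"): "PreparingRebalance",
}


def next_state(state, event):
    """Return the next group state; unknown events leave the state unchanged."""
    return TRANSITIONS.get((state, event), state)


def run(events, state="Empty"):
    """Apply events in order and return the full state history."""
    history = [state]
    for event in events:
        state = next_state(state, event)
        history.append(state)
    return history


print(run(["member_joins", "all_members_joined", "leader_synced_assignment"]))
# ['Empty', 'PreparingRebalance', 'CompletingRebalance', 'Stable']
```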
Performance Tuning & Backlog Handling
To avoid message buildup, consumer throughput must keep up with or exceed the rate at which producers write. Optimize consumer logic, increase parallelism (batching, concurrent processing), and scale out consumers as load grows. When a backlog occurs, first identify whether production has spiked or consumption has slowed, then either add consumer instances or throttle producers. Within a group each partition is read by at most one consumer, so once the number of consumers equals the number of partitions, extra consumers sit idle unless the topic's partition count is increased as well.
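Backlog size and recovery time can be estimated with simple arithmetic. The sketch below assumes the log end offsets and committed offsets per partition are already known (for example from monitoring); total_lag and drain_seconds are hypothetical helper names.

```python
def total_lag(log_end_offsets, committed_offsets):
    """Sum of (log end offset - committed offset) over all partitions.
    Partitions without a committed offset are counted from offset 0."""
    return sum(
        max(end - committed_offsets.get(partition, 0), 0)
        for partition, end in log_end_offsets.items()
    )


def drain_seconds(lag, produce_rate, consume_rate):
    """Estimated seconds until the backlog is cleared, with rates in
    messages per second. Returns None when consumers are not faster
    than producers, because the backlog will then never shrink."""
    if consume_rate <= produce_rate:
        return None
    return lag / (consume_rate - produce_rate)


lag = total_lag({0: 1000, 1: 500}, {0: 400, 1: 500})
print(lag)                           # 600
print(drain_seconds(lag, 100, 300))  # 3.0
print(drain_seconds(lag, 300, 300))  # None
```

A result of None is the signal to add consumers (and partitions, if needed) or to throttle producers.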
Su San Talks Tech
Su San, former staff at several leading tech companies, is a top creator on Juejin and a premium creator on CSDN, and runs the free coding practice site www.susan.net.cn.