
Top Kafka Interview Questions: Basics, Advanced & Expert Level

This article compiles a comprehensive set of Kafka interview questions covering basic concepts, advanced mechanisms, and expert-level topics to help candidates prepare effectively for technical interviews.


Basic Level

Use cases and typical scenarios of Kafka
Kafka is used for high‑throughput real‑time data pipelines, event sourcing, log aggregation, stream processing, and decoupling microservices.

ISR and AR definitions; meaning of ISR scaling
ISR (In‑Sync Replicas) are the replicas that are fully caught up with the leader. AR (Assigned Replicas) are all replicas assigned to a partition, so the ISR is always a subset of the AR. ISR scaling (shrink/expand) means removing replicas from the ISR when they fall too far behind and adding them back once they catch up.

Message ordering guarantee
Kafka guarantees order only within a single partition. The producer sends records to a partition (by key or custom partitioner) and the broker appends them sequentially.
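The per‑partition ordering guarantee hinges on records with the same key hashing to the same partition. Kafka's default partitioner uses murmur2; the sketch below uses CRC32 purely to keep it dependency‑free, but the invariant it demonstrates is the same.

```python
import zlib

def choose_partition(key: bytes, num_partitions: int) -> int:
    # Kafka's DefaultPartitioner hashes the key with murmur2;
    # CRC32 stands in here only to keep the sketch self-contained.
    return zlib.crc32(key) % num_partitions

# Records with the same key always land in the same partition,
# so their relative order is preserved for consumers.
p1 = choose_partition(b"order-42", 6)
p2 = choose_partition(b"order-42", 6)
assert p1 == p2
```

Because the mapping is deterministic, all events for one key form a totally ordered sub‑stream, which is exactly the guarantee interviewers probe for.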

Processing order of partitioner, serializer, and interceptor
When a producer sends a record, any ProducerInterceptor instances run first (their onSend() may inspect or modify the record), then the Serializer converts the key and value to bytes, and finally the Partitioner selects the target partition. Interceptors are invoked again via onAcknowledgement() when the broker responds.
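The interceptor → serializer → partitioner order can be made concrete with a toy pipeline. This is a simulation, not Kafka client code; the three lambdas record when each stage runs.

```python
def send(record, interceptors, serializer, partitioner, num_partitions):
    """Toy mirror of the producer's client-side pipeline order:
    interceptors first, then serialization, then partition selection."""
    for icpt in interceptors:                 # ProducerInterceptor.onSend()
        record = icpt(record)
    key_bytes = serializer(record["key"])     # key/value -> bytes
    partition = partitioner(key_bytes, num_partitions)
    return partition

trace = []
part = send(
    {"key": "user-1", "value": "login"},
    interceptors=[lambda r: (trace.append("intercept") or r)],
    serializer=lambda k: (trace.append("serialize") or k.encode()),
    partitioner=lambda k, n: (trace.append("partition") or (len(k) % n)),
    num_partitions=3,
)
# trace is ["intercept", "serialize", "partition"]
```

The trace confirms the stage order the answer above describes.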

Overall structure of the Kafka producer client
The producer consists of a RecordAccumulator that batches records per partition, a Sender thread that handles network I/O, a Metadata component that caches cluster information, and an optional Interceptor chain.

Threads used by the producer client and their responsibilities
Typically two threads: (1) the application (main) thread that calls send(), running interceptors, serialization, and partitioning before appending the record to the RecordAccumulator; and (2) the Sender thread (the producer's I/O thread), which drains batches from the accumulator, manages socket connections, and transmits them to the brokers.

Design flaws of the older Scala consumer client
The old high‑level Scala consumer performed partition assignment and rebalancing on the client side via ZooKeeper watches, which caused herd‑effect and split‑brain problems, made rebalances slow and unreliable, stored offsets in ZooKeeper (a poor fit for frequent writes), and was difficult to use safely from multiple threads.

Statement about consumer count exceeding partitions and duplicate consumption
If the number of consumers in a group is greater than the number of partitions, the excess consumers remain idle (receive no data). Duplicate consumption can occur during rebalance when partitions are reassigned before offsets are committed.

Achieving multi‑threaded consumption with non‑thread‑safe KafkaConsumer
Instantiate a separate KafkaConsumer per thread, or use a single consumer in one thread and hand off fetched records to worker threads for processing.
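The second pattern (one polling thread, many workers) can be sketched with standard threading primitives. The poll loop is simulated with a plain list here; in a real application only the polling thread would ever touch the KafkaConsumer.

```python
import queue
import threading

# One poll loop feeds a thread-safe queue; worker threads process.
records = queue.Queue()
processed, lock = [], threading.Lock()

def worker():
    while True:
        rec = records.get()
        if rec is None:              # poison pill: shut down cleanly
            records.task_done()
            return
        with lock:                   # processing happens off the poll thread
            processed.append(rec.upper())
        records.task_done()

workers = [threading.Thread(target=worker) for _ in range(3)]
for w in workers:
    w.start()

for rec in ["a", "b", "c", "d"]:     # stand-in for consumer.poll()
    records.put(rec)
for _ in workers:                    # one poison pill per worker
    records.put(None)
records.join()
```

One caveat this sketch glosses over: with parallel workers, offset commits must wait until every record up to the committed offset has actually been processed.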

Relationship between a consumer and its consumer group
All consumers sharing the same group.id belong to a consumer group; the group collectively owns the partitions of subscribed topics, ensuring each partition is consumed by only one group member.

What happens behind the scenes when creating or deleting a topic with kafka-topics.sh
The command registers the new topic metadata with the cluster (historically by writing to ZooKeeper; in modern clusters via the admin API and the KRaft metadata log). The controller then computes the replica assignment and notifies the designated brokers, which create the log directories; the metadata change is propagated to the rest of the cluster.

Increasing the number of partitions for a topic
Yes. Use kafka-topics.sh --bootstrap-server <broker> --alter --topic myTopic --partitions newCount. The controller updates the metadata and the assigned brokers create empty log directories for the new partitions. Note that adding partitions changes the key‑to‑partition mapping, so per‑key ordering is not preserved across the change.

Decreasing the number of partitions for a topic
No. Kafka does not support partition reduction because it would risk data loss and break ordering guarantees.

Choosing an appropriate number of partitions when creating a topic
Consider expected throughput, consumer parallelism, and hardware limits. A common rule is to have enough partitions to allow each consumer thread to process at least one partition while keeping the per‑partition load balanced.
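The common rule of thumb can be written down as a small formula: size for whichever side of the pipeline is slower. The throughput numbers below are illustrative, not benchmarks.

```python
import math

def suggest_partitions(target_mb_s, per_producer_mb_s, per_consumer_mb_s):
    """Rule of thumb: enough partitions that neither the producer
    side nor the consumer side becomes the bottleneck."""
    return max(math.ceil(target_mb_s / per_producer_mb_s),
               math.ceil(target_mb_s / per_consumer_mb_s))

# Target 100 MB/s; assume one producer sustains 20 MB/s per partition
# and one consumer thread processes 10 MB/s.
suggest_partitions(100, 20, 10)   # -> 10
```

In practice you would measure per‑partition throughput on your own hardware and leave headroom for growth, since adding partitions later disturbs key ordering.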

Advanced Level

Internal topics maintained by Kafka and their purposes
Kafka maintains two internal topics: __consumer_offsets (stores committed consumer offsets and group metadata) and __transaction_state (stores transaction metadata for transactional producers). Both are compacted so that only the latest state per key is retained.

Preferred replica concept
The preferred replica is the first replica in a partition's assigned replica (AR) list. Under normal conditions it should be the leader; preferred‑replica election restores leadership to it after failures so that partition leaders stay evenly distributed across brokers.

Where partition assignment occurs and its principles
Assignment is coordinated by the group coordinator (a broker), but the assignment itself is computed on the client side: the coordinator elects one consumer as the group leader, which runs the configured partition.assignment.strategy (range, round‑robin, sticky, etc.) over the member list and hands the result back to the coordinator for distribution to all members.

Kafka log directory structure and index file types
Each partition has a directory <logDir>/<topic>-<partition> containing log segment files (.log) plus index files per segment: .index (offset → byte position), .timeindex (timestamp → offset), and, for transactional data, .txnindex (the aborted‑transaction index).

Locating a message by offset
The broker binary‑searches the sparse offset index (.index) for the largest entry not greater than the requested offset, jumps to that byte position in the log segment, and scans forward to the exact record.
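Because the index is sparse (one entry every few kilobytes of log, not one per record), the lookup is a binary search followed by a short sequential scan. A minimal sketch with invented index values:

```python
import bisect

# A sparse offset index: periodically, Kafka stores
# (offset, byte position in the .log segment).
index = [(0, 0), (100, 4096), (200, 8192), (300, 12288)]

def locate(target_offset):
    offsets = [o for o, _ in index]
    # Largest indexed offset <= target_offset:
    i = bisect.bisect_right(offsets, target_offset) - 1
    # Return the byte position to start a forward scan from;
    # the broker reads sequentially from here to the exact record.
    return index[i][1]

locate(150)   # -> 4096 (start scanning at the entry for offset 100)
```

The sparse design keeps index files small enough to be memory‑mapped, which is part of why lookups stay cheap even on huge logs.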

Locating a message by timestamp
The broker first selects the segment whose largest timestamp is >= the target, then binary‑searches its timestamp index (.timeindex) for the first entry with timestamp >= the target; the resulting offset is then resolved through the offset index as above.

Log retention mechanism
Retention is configured by time (log.retention.hours / minutes / ms) or size (log.retention.bytes). Deletion happens at whole‑segment granularity: a segment is removed when its newest record is older than the retention window, or when the total partition size exceeds the limit.
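Time‑based retention at segment granularity is easy to simulate. The segment dicts below are an invented stand‑in for Kafka's internal segment metadata.

```python
def expired_segments(segments, retention_ms, now_ms):
    """Whole segments become eligible for deletion once their newest
    record is older than the retention window (time-based sketch)."""
    return [s for s in segments
            if now_ms - s["largest_timestamp"] > retention_ms]

segs = [{"name": "0.log", "largest_timestamp": 1_000},
        {"name": "100.log", "largest_timestamp": 90_000}]
expired_segments(segs, retention_ms=50_000, now_ms=100_000)
# -> only the "0.log" segment is eligible for deletion
```

Note that a segment survives until its *newest* record expires, so individual records can live slightly longer than the configured retention.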

Log compaction mechanism
Compaction retains the latest record for each key. The broker periodically scans segments, discarding older records with the same key while preserving the most recent one.
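The essence of compaction is "last write per key wins", with a null value acting as a tombstone. A minimal sketch over an in‑memory log:

```python
def compact(records):
    """Keep only the newest value per key, as log compaction does.
    A None value is a tombstone that eventually deletes the key."""
    latest = {}
    for key, value in records:        # records in log (offset) order
        latest[key] = value
    return [(k, v) for k, v in latest.items() if v is not None]

log = [("user1", "v1"), ("user2", "v1"), ("user1", "v2"), ("user2", None)]
compact(log)   # -> [("user1", "v2")]
```

The real cleaner works segment by segment and retains tombstones for a grace period (delete.retention.ms) so that slow consumers still see the deletion.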

Underlying storage design of Kafka
Kafka stores data as immutable, append‑only log files. It relies on the OS page cache and zero‑copy (sendfile) reads, sequential disk writes for high throughput, and memory‑mapped index files for fast lookups.

Principles behind delayed operations
Delayed operations (e.g., DelayedProduce for acks=all, DelayedFetch for fetch.min.bytes) are held in a purgatory backed by a hierarchical timing wheel. An operation completes early as soon as its condition is satisfied (tryComplete), or is forced to complete with a timeout result when its delay expires.

Role of the Kafka controller
The controller is a broker elected via ZooKeeper (or KRaft) that manages partition leadership changes, replica assignments, broker failure handling, and topic creation/deletion.

Consumer rebalancing principle (consumer coordinator & group coordinator)
The GroupCoordinator is a broker‑side component; the ConsumerCoordinator is its client‑side counterpart inside each consumer. When membership changes, the group coordinator triggers a rebalance: all members rejoin via JoinGroup, one member is elected group leader and computes the assignment, and the coordinator distributes it via SyncGroup. Members commit offsets before giving up partitions and then resume fetching from their new assignment.

Evolution of high‑watermark (HW) and log end offset (LEO) across replicas
Each replica maintains its own LEO, the offset of the next record to be written. The leader advances the partition HW to the minimum LEO across the ISR; consumers can read only up to the HW, which marks the highest offset that is safely replicated.
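The HW computation itself is a one‑liner once you have each replica's LEO. Broker names and offsets below are invented for illustration.

```python
def high_watermark(leo_by_replica, isr):
    """The leader advances the partition HW to the minimum LEO
    across the in-sync replica set only."""
    return min(leo_by_replica[r] for r in isr)

leos = {"broker-1": 120, "broker-2": 118, "broker-3": 95}
# broker-3 lags and has dropped out of the ISR, so it does
# not hold the HW back:
high_watermark(leos, isr=["broker-1", "broker-2"])   # -> 118
```

This is why a shrinking ISR lets the HW keep advancing, and why min.insync.replicas exists to bound how small the ISR may get before writes with acks=all are rejected.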

Expert Level

Implementation of transactions in Kafka
A transactional producer registers its transactional.id with the transaction coordinator and obtains a producerId and epoch. Records are written directly to their target partitions; the coordinator tracks the transaction's state in the internal __transaction_state topic. On commit or abort, the coordinator writes control markers into each participating partition. Consumers configured with isolation.level=read_committed only see records from committed transactions.

Out‑of‑sync replicas (OSR) and mitigation measures
OSR are replicas removed from the ISR because they have lagged behind the leader for longer than replica.lag.time.max.ms. Followers keep fetching and rejoin the ISR once caught up; mitigations include fixing network or disk bottlenecks on the lagging broker, adjusting replication throttles, and tuning replica.lag.time.max.ms.

High reliability mechanisms (HW, leader epoch, etc.)
Kafka uses the high‑watermark to expose only fully replicated data to consumers, the leader epoch to prevent log divergence after leader changes (replacing unreliable HW‑based truncation), and ISR‑based replication with min.insync.replicas and acks=all to tolerate broker failures without data loss.

Why Kafka does not support read‑write separation
The leader replica serves both reads and writes; followers are not general read targets because they may lag the leader, which would break ordering and consistency guarantees and complicate HW semantics. (Since Kafka 2.4, KIP‑392 does allow consumers to fetch from a follower in the same rack, but that is rack‑aware locality rather than classic read‑write separation.)

Implementation of a delayed queue
Kafka has no native delayed‑message support. A common pattern is a set of tiered delay topics (e.g., delay‑5s, delay‑1m, delay‑30m): messages carry their due timestamp, and a forwarding consumer reads each delay topic, pauses fetching (pause()/resume()) until the head message is due, then republishes it to the real target topic.

Realization of dead‑letter queues (DLQ) and retry queues
Producers or stream processors can redirect failed records to a dedicated DLQ topic after a configurable number of retries. Retry queues are separate topics with increasing back‑off delays.
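The retry‑then‑dead‑letter control flow is straightforward to sketch. Topic names here are illustrative, not a Kafka convention, and the "publish" step is reduced to returning the destination.

```python
def process_with_retry(record, handler, max_retries=3):
    """Route a failing record through retries, then to a DLQ topic."""
    for _attempt in range(max_retries):
        try:
            return ("main-topic-done", handler(record))
        except Exception:
            continue      # a real system would publish to a retry
                          # topic with back-off before the next attempt
    return ("orders.DLQ", record)   # retries exhausted: dead-letter it

calls = {"n": 0}
def flaky(rec):
    calls["n"] += 1
    raise RuntimeError("downstream unavailable")

process_with_retry({"id": 7}, flaky)   # -> ("orders.DLQ", {"id": 7})
```

Keeping the failed payload intact in the DLQ (often with error details in record headers) is what makes later replay or inspection possible.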

Message auditing in Kafka
Auditing is typically done with producer/consumer interceptors that stamp audit headers (unique IDs, timestamps) on each record, combined with a reconciliation job that compares produced and consumed counts per time window, often publishing the audit data to dedicated audit topics.

Tracking message lineage
Message lineage is preserved by propagating correlation IDs or trace IDs in record headers, allowing downstream services to reconstruct the processing path.

Consumer lag calculation and key metrics
Lag = high watermark (latest offset) minus the consumer's committed offset, per partition. Important metrics, exposed in the consumer's fetch‑manager metrics group, include records-lag, records-lag-max, and fetch-rate.
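The calculation is simple enough to show directly. Partition names and offsets below are invented; this mirrors what tools like kafka-consumer-groups.sh report per partition.

```python
def consumer_lag(high_watermarks, committed):
    """Per-partition lag = high watermark minus committed offset;
    total group lag is the sum across partitions."""
    return {p: high_watermarks[p] - committed.get(p, 0)
            for p in high_watermarks}

hw = {"orders-0": 1000, "orders-1": 800}
committed = {"orders-0": 950, "orders-1": 800}
consumer_lag(hw, committed)   # -> {"orders-0": 50, "orders-1": 0}
```

Persistent growth in these numbers, rather than any single snapshot, is the signal that consumers cannot keep up with producers.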

Design aspects that give Kafka high performance
Key factors: sequential append‑only logs, zero‑copy I/O, batch compression, efficient memory‑mapped indexes, and a pull‑based consumer model that avoids back‑pressure on producers.

Tags: big data, streaming, Kafka, questions
Written by

Linux Cloud Computing Practice

Welcome to Linux Cloud Computing Practice. We offer high-quality articles on Linux, cloud computing, DevOps, networking and related topics. Dive in and start your Linux cloud computing journey!
