
Understanding Transactions in Apache Kafka

This article explains the design, semantics, and practical usage of Apache Kafka's transaction API, covering why transactions are needed for exactly‑once processing, the underlying atomic multi‑partition writes, zombie fencing, consumer guarantees, Java API details, performance considerations, and operational best practices.

Architects Research Society

Why Transactions?

Kafka’s transaction feature targets applications that follow a read‑process‑write pattern where both the input and output are Kafka topics, such as stream‑processing applications. Early stream processors tolerated occasional inaccuracies, but many modern use‑cases—especially in finance—require strict exactly‑once guarantees, meaning a message must be processed and its result written only if the entire operation succeeds.

Transactional Semantics

Atomic Multi‑Partition Writes

Transactions allow atomic writes to multiple topics and partitions: either all messages in the transaction are successfully written, or none are. If a transaction aborts, none of its messages become visible to consumers.

The read‑process‑write cycle is atomic only if the offset of the input record is committed as part of the same transaction as the output records. Offsets are stored in the internal __consumer_offsets topic, so writing the offset commit and the output records in one transaction makes the whole cycle succeed or fail together.

Zombie Fencing

Each transactional producer is configured with a unique transactional.id. The transaction coordinator tracks the highest epoch for that ID; when a newer instance registers with the same ID, the epoch is bumped and older producers are fenced off, preventing them from writing further messages.
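The fencing rule can be sketched as a small pure-Java model (a simplified illustration, not Kafka's actual implementation): registering an ID bumps its epoch, and writes are accepted only from the holder of the latest epoch.

```java
import java.util.HashMap;
import java.util.Map;

public class FencingModel {
    private final Map<String, Integer> epochs = new HashMap<>();

    // A producer instance registering a transactional.id bumps the epoch;
    // this is what fences any older instance still holding the previous one.
    public int register(String transactionalId) {
        return epochs.merge(transactionalId, 0, (old, ignored) -> old + 1);
    }

    // Writes are accepted only from the producer holding the latest epoch.
    public boolean acceptWrite(String transactionalId, int producerEpoch) {
        return epochs.getOrDefault(transactionalId, -1) == producerEpoch;
    }
}
```

In the real client, a fenced producer sees this as a fatal ProducerFencedException and must be closed rather than retried.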

Reading Transactional Messages

Consumers in read_committed mode receive only messages from committed transactions. They never see messages that belong to an open or aborted transaction, although they cannot determine which transaction a given message originated from.
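In the Java client this mode is selected with the isolation.level consumer property. A minimal configuration sketch (broker address, group ID, and serializer choices are placeholders):

```java
import java.util.Properties;

public class ReadCommittedConfig {
    public static Properties consumerProps(String bootstrapServers, String groupId) {
        Properties props = new Properties();
        props.put("bootstrap.servers", bootstrapServers); // placeholder broker
        props.put("group.id", groupId);                   // placeholder group
        // Only deliver records from committed transactions; records from
        // open or aborted transactions are filtered out client-side.
        props.put("isolation.level", "read_committed");
        // Offsets are committed inside the producer's transaction via
        // sendOffsetsToTransaction(), so auto-commit must be off.
        props.put("enable.auto.commit", "false");
        props.put("key.deserializer",
                  "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer",
                  "org.apache.kafka.common.serialization.StringDeserializer");
        return props;
    }
}
```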

Transaction API in Java

Kafka's transaction support is implemented in the broker through new protocol APIs and exposed to applications via the Java producer client. A typical usage pattern involves:

Configuring the producer with a transactional.id and calling initTransactions().

Creating a KafkaConsumer that reads in read_committed mode.

Within a loop: beginning a transaction, processing records, producing output records, sending the offset commits to the __consumer_offsets topic via sendOffsetsToTransaction(), and finally calling commitTransaction().
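Put together, a minimal consume-transform-produce loop might look like the sketch below. Topic names, the transform step, and the error handling are simplified placeholders; in particular, after an abort a real application should also reset the consumer to its last committed offsets before reprocessing.

```java
import java.time.Duration;
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.clients.consumer.OffsetAndMetadata;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.KafkaException;
import org.apache.kafka.common.TopicPartition;
import org.apache.kafka.common.errors.ProducerFencedException;

public class ReadProcessWrite {

    // Placeholder per-record transformation.
    public static String transform(String value) {
        return value.toUpperCase();
    }

    public static void run(KafkaConsumer<String, String> consumer,
                           KafkaProducer<String, String> producer,
                           String inputTopic, String outputTopic, String groupId) {
        producer.initTransactions(); // registers the transactional.id, fencing older instances
        consumer.subscribe(Collections.singletonList(inputTopic));
        while (true) {
            ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(100));
            if (records.isEmpty()) continue;
            producer.beginTransaction();
            try {
                Map<TopicPartition, OffsetAndMetadata> offsets = new HashMap<>();
                for (ConsumerRecord<String, String> record : records) {
                    producer.send(new ProducerRecord<>(outputTopic, record.key(),
                                                       transform(record.value())));
                    offsets.put(new TopicPartition(record.topic(), record.partition()),
                                new OffsetAndMetadata(record.offset() + 1));
                }
                // Commit the input offsets atomically with the output records.
                producer.sendOffsetsToTransaction(offsets, groupId);
                producer.commitTransaction();
            } catch (ProducerFencedException e) {
                producer.close(); // a newer instance owns this transactional.id
                throw e;
            } catch (KafkaException e) {
                producer.abortTransaction(); // none of this cycle's writes become visible
            }
        }
    }
}
```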

How Transactions Work

Two new components were introduced in Kafka 0.11.0: the transaction coordinator (a module running inside every broker) and the transaction log (an internal topic). Each coordinator owns the subset of transaction‑log partitions that the hash of the transactional.id maps to. All state changes are written to the log, which is replicated using Kafka's standard replication protocol.

Data flow can be broken into four interactions:

Producer ↔ Coordinator: Register the transactional.id, add partitions to the transaction, and initiate the two‑phase commit when commitTransaction() or abortTransaction() is called.

Coordinator ↔ Transaction Log: Persist transaction state transitions (ongoing, prepare_commit/prepare_abort, completed) to the log; the log stores only metadata, not the actual payload.

Producer ↔ Topic Partitions: After partitions are registered, the producer sends data to the target partitions, with additional broker‑side validation to ensure the transaction is still valid.

Coordinator ↔ Partitions (Commit Phase): In the first phase the coordinator durably records the prepare_commit (or prepare_abort) decision in the transaction log; in the second phase it writes commit (or abort) markers to each involved partition, making committed transactions visible to read_committed consumers.

Practical Transaction Handling

Choosing a transactional.id

The transactional.id must remain stable across producer restarts so that a restarted instance fences any zombie that preceded it. It should be uniquely tied to the set of input partitions a producer instance processes, ensuring that no two live instances write output for the same input partition under the same ID.
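One common convention (a sketch, not the only valid scheme; the names below are illustrative) derives the ID from a stable application name plus the input partition the instance owns, so restarts reuse the same ID and no two live instances share one:

```java
public class TransactionalIds {
    // Derives a stable transactional.id from an application name and the
    // input partition this producer instance is responsible for. Any scheme
    // works as long as the mapping from input partitions to IDs is stable
    // across restarts and unique across live instances.
    public static String forPartition(String appId, String inputTopic, int partition) {
        return appId + "-" + inputTopic + "-" + partition;
    }
}
```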

Transaction Performance

Producer Overhead

Transactional overhead is modest and independent of the number of messages written. The main costs are RPCs to register partitions, writing a commit marker per partition, and persisting state changes to the transaction log. Grouping many records into a single transaction maximizes throughput.

For a 1 KB record producer committing every 100 ms, throughput drops by only ~3 %. Smaller records or more frequent commits increase the penalty.

Longer transaction durations increase end‑to‑end latency because consumers must wait for the transaction to commit before seeing the data.

Consumer Overhead

Transactional consumers simply filter out aborted messages and ignore messages from open transactions. In read_committed mode they incur virtually no throughput penalty.

Further Reading

Key references include the original Kafka KIP describing transaction semantics, the design document (for deep implementation details), and the KafkaProducer Javadocs (for API usage examples).

Conclusion

The Kafka transaction API provides the building blocks for exactly‑once processing in stream‑processing applications, handling atomic multi‑partition writes, zombie fencing, and consumer guarantees. While the API ensures atomicity for Kafka‑internal state, external side‑effects still require additional coordination. Kafka Streams builds on these primitives to deliver end‑to‑end exactly‑once semantics for a wide range of use‑cases.
