What Is Kafka? Overview, Architecture, Features, Deployment, and Sample Code
Kafka is a distributed publish/subscribe messaging system developed under the Apache Software Foundation. It provides reliable, high-throughput, real-time data streaming through producers, consumers, brokers, streams, and connectors. This article explains its core concepts, architecture, advantages, deployment methods, and use cases, and includes Java examples for a producer and a consumer.
What Is Kafka
Kafka is an Apache Software Foundation project that implements a distributed, reliable publish/subscribe messaging system for real‑time and stream data. It moves data from one system to another, supporting both online and offline processing.
Kafka offers high‑throughput, low‑latency message transport, can scale across many nodes, and runs on various POSIX‑compatible operating systems.
1. Core Features
Producers/Consumers – reliable message delivery allowing applications to publish and consume messages.
Streams – processing and transformation of data streams stored in Kafka, using the Kafka Streams client library.
Connectors – integration with external systems for data flow in and out of Kafka.
2. Basic Architecture
Kafka consists of three main components: producers, consumers, and brokers.
Producer – an application that publishes messages to one or more topics.
Consumer – an application that subscribes to topics and consumes messages.
Broker – a Kafka server instance that receives, stores, and forwards messages.
Kafka provides a simple yet reliable messaging service for real‑time data transfer between systems.
3. Implementation Concepts
Kafka relies on two core concepts: the publish/subscribe model and partitioning.
Publish/Subscribe Model
Producers publish messages to topics; consumers subscribe to those topics to receive messages.
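To make the model concrete, here is a minimal in-memory sketch. It is not Kafka's API: the class `MiniTopic` and its methods `publish` and `poll` are hypothetical names invented for illustration. It assumes that each subscriber keeps its own read position, which is why two independent subscribers each receive every message.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory model of one topic; not part of the Kafka client library.
class MiniTopic {
    // Append-only message log.
    private final List<String> log = new ArrayList<>();
    // Next index to read, tracked separately for each subscriber.
    private final Map<String, Integer> positions = new HashMap<>();

    // A producer appends a message to the end of the log.
    void publish(String message) {
        log.add(message);
    }

    // A subscriber pulls every message it has not yet read.
    List<String> poll(String subscriber) {
        int from = positions.getOrDefault(subscriber, 0);
        List<String> unread = new ArrayList<>(log.subList(from, log.size()));
        positions.put(subscriber, log.size());
        return unread;
    }

    public static void main(String[] args) {
        MiniTopic topic = new MiniTopic();
        topic.publish("a");
        topic.publish("b");
        System.out.println(topic.poll("billing"));   // [a, b]
        topic.publish("c");
        System.out.println(topic.poll("billing"));   // [c]
        System.out.println(topic.poll("analytics")); // [a, b, c]
    }
}
```

The subscriber names `billing` and `analytics` are placeholders. The point of the sketch is that publishing and consuming are decoupled: the producer never needs to know who reads the messages or when.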
Partitioning
Each topic is divided into partitions, and its messages are distributed across them, enabling parallel processing and scalability.
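As a rough illustration of how a message can be routed to a partition, the sketch below hashes the message key and takes the result modulo the partition count. The class and method names are hypothetical, and `String.hashCode()` is a stand-in: Kafka's built-in partitioner uses a different hash function (murmur2), so real partition numbers will differ. What carries over is the property that messages with the same key land in the same partition.

```java
// Hypothetical helper; illustrates the idea only and is not the Kafka partitioner.
class PartitionSketch {
    // Map a key to a partition index in [0, numPartitions).
    static int partitionFor(String key, int numPartitions) {
        // floorMod keeps the result non-negative even when hashCode() is negative.
        return Math.floorMod(key.hashCode(), numPartitions);
    }

    public static void main(String[] args) {
        int partitions = 3; // assumed partition count for the example
        for (String key : new String[] {"user-1", "user-2", "user-1"}) {
            System.out.println(key + " -> partition " + partitionFor(key, partitions));
        }
    }
}
```

Because both occurrences of `user-1` map to the same partition, their relative order is preserved for whoever reads that partition.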
4. Advantages and Disadvantages
Advantages
Reliability – messages are persisted to disk and can be replicated across brokers.
Scalability – partitions and brokers can be added to increase capacity.
Performance – high throughput and low latency, even with large numbers of consumers.
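The link between partitions and capacity can be sketched as follows. Within a consumer group (the `group.id` setting in the consumer sample), each partition is read by only one consumer at a time, so the partition count bounds how many consumers can work in parallel. The class `AssignmentSketch` and its round-robin rule are assumptions chosen for illustration; Kafka's actual assignment strategies are configurable and may distribute partitions differently.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of spreading partitions over the consumers of one group.
class AssignmentSketch {
    // Assign partitions 0..numPartitions-1 to consumers in round-robin order.
    static Map<String, List<Integer>> assign(List<String> consumers, int numPartitions) {
        Map<String, List<Integer>> result = new LinkedHashMap<>();
        for (String consumer : consumers) {
            result.put(consumer, new ArrayList<>());
        }
        for (int p = 0; p < numPartitions; p++) {
            String owner = consumers.get(p % consumers.size());
            result.get(owner).add(p);
        }
        return result;
    }

    public static void main(String[] args) {
        // Four partitions shared by two consumers.
        System.out.println(assign(List.of("c1", "c2"), 4));       // {c1=[0, 2], c2=[1, 3]}
        // Two partitions, three consumers: the third one has nothing to read.
        System.out.println(assign(List.of("c1", "c2", "c3"), 2)); // {c1=[0], c2=[1], c3=[]}
    }
}
```

In the second call the extra consumer stays idle, which is why adding partitions, not just consumers, is what increases parallel capacity.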
Disadvantages
Complexity – requires solid networking and server infrastructure and technical expertise to install and configure.
Latency – may experience higher latency under heavy load.
5. Deployment Methods
Kafka can be deployed by installing the server and client applications.
Install Kafka server – via binary download or Docker container.
Install client libraries – supports Java, Scala, Python, Go, C#, C++, etc.
6. Applications
Kafka is used for real‑time data processing, batch processing, log aggregation, and monitoring.
Real‑time Data Processing
Streams data between systems and enables aggregation, statistics, and reporting.
Batch Processing
Kafka retains messages on disk, so batch jobs can read them later instead of at publish time.
Log Aggregation
Captures event logs in real time for analysis.
Monitoring
Publishes metrics for real‑time monitoring and analysis.
7. Sample Use Case: Real‑time Data Processing
Consumer Example
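import java.time.Duration;
import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;

public class ConsumerExample {
    public static void main(String[] args) {
        // Configure the consumer
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("group.id", "test");
        props.put("enable.auto.commit", "true");
        props.put("auto.commit.interval.ms", "1000");
        props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        // try-with-resources closes the consumer if the loop exits with an exception
        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            // Subscribe to topic
            consumer.subscribe(Collections.singletonList("my-topic"));
            // Consume messages until the process is stopped
            while (true) {
                ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(100));
                for (ConsumerRecord<String, String> record : records) {
                    System.out.printf("offset = %d, key = %s, value = %s%n", record.offset(), record.key(), record.value());
                }
            }
        }
    }
}
Producer Example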
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class ProducerExample {
    public static void main(String[] args) {
        // Configure the producer
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        // try-with-resources closes the producer, which waits for pending sends to complete
        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            // Publish ten messages without keys
            for (int i = 0; i < 10; i++) {
                String msg = "Message " + i;
                producer.send(new ProducerRecord<>("my-topic", msg));
            }
        }
    }
}
Rare Earth Juejin Tech Community
Juejin, a tech community that helps developers grow.