Why Kafka? Deep Dive into Architecture, Performance, and Production Deployment
This article explains the need for a messaging system, explores Kafka's core concepts, cluster architecture, performance optimizations like sequential disk writes and zero‑copy, and provides detailed guidance on sizing hardware, configuring producers and consumers, and managing a production Kafka deployment.
Why Use a Messaging System
Messaging decouples services, enables asynchronous processing, and controls traffic spikes such as e‑commerce flash‑sale events. Typical steps (risk control, inventory lock, order creation, SMS notification, data update) can be split so that non‑critical steps are handled later via a message queue, improving latency and throughput.
Kafka Core Concepts
Producer : Sends data to the Kafka cluster.
Consumer : Pulls data from Kafka for processing.
Topic and Partition : Logical topics are divided into partitions stored as directories on disk.
Kafka stores data in log files; each partition may have multiple log segments (default 1 GB each). Sequential disk writes give high write performance, while a sparse index enables fast binary‑search reads.
Cluster Architecture
Each Kafka broker hosts partitions; a Controller node (elected via ZooKeeper) manages metadata. Consumer groups share a group.id; members of the same group coordinate via a coordinator broker to balance partition assignments.
Performance Mechanisms
Sequential Disk Writes : Data is appended, avoiding random writes.
Zero‑Copy (sendfile) : Consumers read from OS cache directly to the network card, reducing data copies.
Log Segmentation and Indexing
Each partition creates a directory (e.g., topic_a-0) containing .log files. Kafka writes an index entry every 4 KB of log data; binary search on this sparse index locates messages quickly.
High‑Concurrency Network Design
Kafka uses a multi‑selector, multi‑thread, queue‑based NIO architecture (Reactor pattern) to handle millions of requests per second.
Redundancy and High Availability
Partitions have replicas (leader + followers).
ISR (in‑sync replica) list tracks replicas that are up‑to‑date.
Writes are considered committed when all ISR members acknowledge (configurable via acks and min.insync.replicas).
Production Deployment Planning
Example scenario: 1 billion daily requests, peak QPS ≈ 55 k, total data ≈ 46 TB per day (×2 replicas, 3‑day retention → 276 TB). Recommended hardware:
5 physical servers, each with 11 × 7 TB disks (≈ 77 TB per server, total ≈ 385 TB).
64 GB RAM (≥ 60 GB for OS cache, plus JVM heap).
16 CPU cores (32 cores preferred).
10 Gbps network cards (1 Gbps may be insufficient for peak traffic).
Cluster planning includes estimating request volume, number of servers, disk type (SSD vs. SAS), memory allocation, CPU cores, and network bandwidth.
Operations Tools and Commands
Common utilities: KafkaManager for UI management, KafkaOffsetMonitor for offset monitoring.
kafka-topics.sh --create --zookeeper hadoop1:2181,hadoop2:2181,hadoop3:2181 \
--replication-factor 1 --partitions 1 --topic test6
kafka-topics.sh --alter --zookeeper hadoop1:2181,hadoop2:2181,hadoop3:2181 \
--partitions 3 --topic test6Reassign partitions, increase replication factor, and balance load using scripts like kafka-reassign-partitions.sh and kafka-reassign-partitions.sh --generate/--execute/--verify.
Producer Configuration and Tuning
buffer.memory: default 32 MB. compression.type: none or lz4. batch.size: default 16 KB; larger batches increase throughput but add latency. linger.ms: default 0 ms; setting e.g., 100 ms allows batching.
Exception handling includes LeaderNotAvailableException, NotControllerException, and NetworkException. Retries can cause duplicate or out‑of‑order messages; setting max.in.flight.requests.per.connection=1 preserves order.
Consumer Configuration and Offset Management
Offsets are stored in the internal topic __consumer_offsets (default 50 partitions). Key format: group.id+topic+partition. Important consumer parameters: fetch.max.bytes (default 1 MB, can be increased). max.poll.records (default 500). enable.auto.commit and auto.commit.interval.ms. auto.offset.reset: earliest, latest, none.
Rebalance strategies: range , round‑robin , and sticky . Coordinator selection uses a hash of group.id modulo the number of __consumer_offsets partitions.
Broker Management Details
LEO (log end offset) tracks the latest offset; HW (high water mark) indicates the highest offset replicated to all ISR members and is visible to consumers.
Controller monitors broker registration ( /broker/ids/) and topic creation ( /broker/topics/), and handles partition reassignment ( /admin/reassign_partitions).
Delayed Tasks and Time‑Wheel Scheduler
Kafka uses a custom time‑wheel for delayed operations (e.g., waiting for follower replicas, empty fetch requests). The wheel provides O(1) insertion and removal, with multiple hierarchical levels to handle tasks ranging from milliseconds to seconds.
Signed-in readers can open the original source through BestHub's protected redirect.
This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactand we will review it promptly.
Code Ape Tech Column
Former Ant Group P8 engineer, pure technologist, sharing full‑stack Java, job interview and career advice through a column. Site: java-family.cn
How this landed with the community
Was this worth your time?
0 Comments
Thoughtful readers leave field notes, pushback, and hard-won operational detail here.
