Kafka FAQs: ZooKeeper Dependency, Retention Policies, Cleanup Rules, Performance Bottlenecks, and Cluster Best Practices
This article answers common Kafka questions: why Kafka has historically depended on ZooKeeper for coordination, its two retention strategies (time-based and size-based), how cleanup behaves when both limits are configured, common performance bottlenecks, and practical guidelines for sizing and configuring Kafka clusters.
152. Can Kafka run without Zookeeper? Why?
Historically, Kafka could not run without ZooKeeper: brokers relied on it for cluster membership, controller election, and storing cluster metadata such as topic configurations and ACLs. Since Kafka 2.8, however, KRaft mode (a built-in Raft-based metadata quorum, production-ready as of Kafka 3.3) removes the ZooKeeper dependency entirely; ZooKeeper support is deprecated and removed in Kafka 4.0.
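In ZooKeeper mode the dependency shows up directly in the broker configuration. A minimal sketch (host names are placeholders):

```properties
# server.properties (ZooKeeper mode)
broker.id=1
# Comma-separated ZooKeeper ensemble; the optional /kafka chroot
# isolates this cluster's znodes within the ensemble
zookeeper.connect=zk1:2181,zk2:2181,zk3:2181/kafka
zookeeper.connection.timeout.ms=18000
```

In KRaft mode, `zookeeper.connect` is replaced by `process.roles` and `controller.quorum.voters` settings pointing at the controller quorum instead.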
153. How many data retention strategies does Kafka have?
Kafka has two data retention strategies: retention by age (delete log segments older than a configured time) and retention by size (cap the amount of log data kept per partition). Whichever threshold is crossed first triggers deletion of the oldest segments.
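At the broker level these two strategies map to a pair of `server.properties` settings (the values below are illustrative, not recommendations):

```properties
# Time-based retention: delete segments older than 7 days
# (log.retention.ms / log.retention.minutes take precedence if set)
log.retention.hours=168
# Size-based retention: cap the log at 10 GB per partition (-1 disables)
log.retention.bytes=10737418240
# How often the log cleaner checks for segments eligible for deletion
log.retention.check.interval.ms=300000
```

A common gotcha: `log.retention.bytes` applies per partition, so a topic with 10 partitions and this setting can retain up to roughly 100 GB in total.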
154. If both a 7‑day and a 10 GB retention limit are set and the data reaches 10 GB on the fifth day, how does Kafka handle it?
Kafka treats the two limits as an OR condition: cleanup is triggered as soon as either threshold is crossed. In this scenario, once the log reaches 10 GB on the fifth day, Kafka begins deleting the oldest log segments even though the 7-day time limit has not yet been reached.
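The either-condition rule can be sketched as a toy decision function. This is an illustration of the policy, not Kafka's internal API; `RETENTION_MS`, `RETENTION_BYTES`, and the `Segment` shape are assumptions for the example:

```python
from dataclasses import dataclass

RETENTION_MS = 7 * 24 * 60 * 60 * 1000  # 7-day time limit
RETENTION_BYTES = 10 * 1024**3          # 10 GB size limit

@dataclass
class Segment:
    age_ms: int      # age of the segment's newest record
    size_bytes: int

def segments_to_delete(segments):
    """Return indices of segments (oldest first) eligible for deletion.

    A segment is removed if it exceeds the time limit OR if the total
    log size still exceeds the size limit after earlier deletions.
    """
    to_delete = []
    total = sum(s.size_bytes for s in segments)
    for i, seg in enumerate(segments):  # segments ordered oldest first
        over_size = total > RETENTION_BYTES
        over_time = seg.age_ms > RETENTION_MS
        if over_size or over_time:
            to_delete.append(i)
            total -= seg.size_bytes
        else:
            break  # remaining segments are newer and within limits
    return to_delete

# Three 4 GB segments written over 5 days: 12 GB total exceeds the
# size limit, so the oldest segment goes despite being under 7 days old.
segs = [Segment(age_ms=5 * 86_400_000, size_bytes=4 * 1024**3)
        for _ in range(3)]
print(segments_to_delete(segs))  # → [0]
```

The same function also deletes a tiny but 8-day-old segment purely on age, mirroring how either condition alone is sufficient.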
155. What situations can cause Kafka to slow down?
CPU performance bottlenecks (e.g. heavy message compression/decompression or TLS overhead)
Disk I/O bottlenecks (Kafka depends on fast sequential writes and the OS page cache)
Network bottlenecks (replication traffic plus consumer fan-out can saturate the NICs)
156. What should be considered when using a Kafka cluster?
Adding brokers increases replication and coordination overhead, so avoid oversizing the cluster; this FAQ's guideline is to keep it at 7 nodes or fewer, though production Kafka clusters often run much larger.
Use an odd number of nodes for the quorum-based coordination layer (the ZooKeeper ensemble, or KRaft controllers): majority voting means an odd-sized quorum of n nodes tolerates the loss of (n-1)/2 of them, giving the best fault tolerance per node. Kafka brokers themselves do not use majority voting, so this rule applies to the coordination quorum rather than the data plane.
(End)
Java Captain
Focused on Java technologies: SSM, the Spring ecosystem, microservices, MySQL, MyCat, clustering, distributed systems, middleware, Linux, networking, multithreading; occasionally covers DevOps tools like Jenkins, Nexus, Docker, ELK; shares practical tech insights and is dedicated to full‑stack Java development.