
Understanding Kafka's ZooKeeper Paths and Their Stored Metadata

This article explains how ZooKeeper stores Kafka's coordination data by detailing the predefined ZK paths, the JSON structures for broker, topic, partition, controller, and consumer information, and the auxiliary nodes used for replica election and partition reassignment.

Big Data Technology & Architecture

ZooKeeper is a widely used distributed coordination service that plays a foundational role for many big‑data components such as HDFS, YARN, HBase, and Kafka. This article briefly describes how ZooKeeper stores information related to Kafka.

At the beginning of the kafka.utils.ZkUtils object, a series of ZK paths are defined:

val AdminPath = "/admin"
val BrokersPath = "/brokers"
val ClusterPath = "/cluster"
val ConfigPath = "/config"
val ControllerPath = "/controller"
val ControllerEpochPath = "/controller_epoch"
val IsrChangeNotificationPath = "/isr_change_notification"
val LogDirEventNotificationPath = "/log_dir_event_notification"
val KafkaAclPath = "/kafka-acl"
val KafkaAclChangesPath = "/kafka-acl-changes"

val ConsumersPath = "/consumers"
val ClusterIdPath = s"$ClusterPath/id"
val BrokerIdsPath = s"$BrokersPath/ids"
val BrokerTopicsPath = s"$BrokersPath/topics"
val ReassignPartitionsPath = s"$AdminPath/reassign_partitions"
val DeleteTopicsPath = s"$AdminPath/delete_topics"
val PreferredReplicaLeaderElectionPath = s"$AdminPath/preferred_replica_election"
val BrokerSequenceIdPath = s"$BrokersPath/seqid"
val ConfigChangesPath = s"$ConfigPath/changes"
val ConfigUsersPath = s"$ConfigPath/users"
val ProducerIdBlockPath = "/latest_producer_id_block"
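
As a minimal sketch of how these constants compose into concrete node paths, the following Python helpers rebuild the broker, topic, and partition state paths described in this article. The function names are hypothetical and exist only for illustration; they are not part of Kafka.

```python
# Hypothetical helpers mirroring a few of the ZkUtils constants; not part of Kafka.
BROKERS_PATH = "/brokers"
BROKER_IDS_PATH = f"{BROKERS_PATH}/ids"
BROKER_TOPICS_PATH = f"{BROKERS_PATH}/topics"


def broker_path(broker_id: int) -> str:
    """Path of the registration node for one broker."""
    return f"{BROKER_IDS_PATH}/{broker_id}"


def topic_path(topic: str) -> str:
    """Path of the node holding a topic's partition map."""
    return f"{BROKER_TOPICS_PATH}/{topic}"


def partition_state_path(topic: str, partition: int) -> str:
    """Path of the state node for one partition of a topic."""
    return f"{topic_path(topic)}/partitions/{partition}/state"


print(broker_path(105))
print(partition_state_path("bl_mall_orders", 0))
```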

Broker registration information is stored under the path /brokers/ids/[broker_id]. An example JSON payload looks like:

{
  "listener_security_protocol_map": {"PLAINTEXT": "PLAINTEXT"},
  "endpoints": ["PLAINTEXT://hadoop7:9092"],
  "jmx_port": 9393,
  "host": "hadoop7",
  "timestamp": "1554349917296",
  "port": 9092,
  "version": 4
}

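The fields give the listener-to-security-protocol map, the advertised endpoints, the JMX port, the host name or IP, the startup timestamp, the TCP port, and the payload version number. The node is ephemeral: it is created when a broker registers with the cluster and removed when that broker's ZooKeeper session ends, so the children of /brokers/ids change whenever a broker joins or leaves.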
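
To make the payload concrete, here is a small sketch that parses the sample above with Python's standard json module. The helper name parse_broker is hypothetical, and the code assumes the timestamp is a string of milliseconds since the Unix epoch, which matches the sample value.

```python
import json
from datetime import datetime, timezone

# Sample payload copied from the example above.
BROKER_JSON = """
{
  "listener_security_protocol_map": {"PLAINTEXT": "PLAINTEXT"},
  "endpoints": ["PLAINTEXT://hadoop7:9092"],
  "jmx_port": 9393,
  "host": "hadoop7",
  "timestamp": "1554349917296",
  "port": 9092,
  "version": 4
}
"""


def parse_broker(raw):
    """Return ([(listener, host, port), ...], startup_time). Hypothetical helper."""
    data = json.loads(raw)
    endpoints = []
    for endpoint in data["endpoints"]:
        # Each endpoint looks like LISTENER://host:port.
        listener, _, address = endpoint.partition("://")
        host, _, port = address.rpartition(":")
        endpoints.append((listener, host, int(port)))
    # Assumption: the timestamp is milliseconds since the Unix epoch.
    started = datetime.fromtimestamp(int(data["timestamp"]) / 1000, tz=timezone.utc)
    return endpoints, started


endpoints, started = parse_broker(BROKER_JSON)
print(endpoints, started.isoformat())
```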

Topic registration information resides at /brokers/topics/[topic_name]. Example data:

{
  "version": 1,
  "partitions": {
    "8": [103], "4": [109], "9": [104], "5": [110], "6": [111],
    "1": [106], "0": [105], "2": [107], "7": [102], "3": [108]
  }
}

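The partitions map lists each partition ID together with the IDs of the brokers assigned as replicas for that partition. This is the replica assignment, not the ISR; the ISR is kept in the partition state node. The node changes when a topic is created or deleted, or when its partition assignment changes.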
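
The following sketch reads the sample topic node and summarizes it in two ways: brokers listed per partition (ordered by partition ID) and number of partitions per broker. Both function names are hypothetical and only illustrate how the map can be traversed.

```python
import json
from collections import Counter

# Sample payload copied from the example above.
TOPIC_JSON = """
{
  "version": 1,
  "partitions": {
    "8": [103], "4": [109], "9": [104], "5": [110], "6": [111],
    "1": [106], "0": [105], "2": [107], "7": [102], "3": [108]
  }
}
"""


def brokers_by_partition(raw):
    """Map partition ID (int) to its broker list, ordered by partition ID."""
    partitions = json.loads(raw)["partitions"]
    # JSON object keys are strings, so convert them to int before sorting.
    return {int(pid): partitions[pid] for pid in sorted(partitions, key=int)}


def partitions_per_broker(raw):
    """Count how many partitions list each broker."""
    counts = Counter()
    for brokers in brokers_by_partition(raw).values():
        counts.update(brokers)
    return counts


print(brokers_by_partition(TOPIC_JSON))
print(partitions_per_broker(TOPIC_JSON))
```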

Partition state information is kept at /brokers/topics/[topic_name]/partitions/[partition_id]/state. Example:

{
  "controller_epoch": 17,
  "leader": 105,
  "version": 1,
  "leader_epoch": 2,
  "isr": [105]
}

It records the controller epoch, current leader broker ID, leader epoch, and the ISR list for the partition.
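
A state node like this can be used for a simple health check. The sketch below compares the ISR with a replica list supplied by the caller and reports whether the leader is in the ISR and whether the partition is under-replicated. The helper name and the replica lists passed in the usage lines are assumptions made for illustration.

```python
import json

# Sample payload copied from the example above.
STATE_JSON = """
{
  "controller_epoch": 17,
  "leader": 105,
  "version": 1,
  "leader_epoch": 2,
  "isr": [105]
}
"""


def check_partition(state_raw, assigned_replicas):
    """Compare the ISR in a state node with the replicas the caller expects."""
    state = json.loads(state_raw)
    isr = set(state["isr"])
    assigned = set(assigned_replicas)
    return {
        "leader_in_isr": state["leader"] in isr,
        # Fewer in-sync replicas than assigned replicas means under-replicated.
        "under_replicated": len(isr) < len(assigned),
        "missing_from_isr": sorted(assigned - isr),
    }


# Illustrative replica lists: a single replica, then a hypothetical second one.
print(check_partition(STATE_JSON, [105]))
print(check_partition(STATE_JSON, [105, 106]))
```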

Controller registration information is stored at /controller with a payload such as:

{
  "version": 1,
  "brokerid": 104,
  "timestamp": "1554349916898"
}

The brokerid identifies the current controller node, and the timestamp marks the last controller change.
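
As a sketch of how a tool might look up the current controller, the code below reads /controller through any client object whose get(path) returns a (bytes, stat) pair, which is the shape returned by the kazoo library's KazooClient.get. To stay runnable without a ZooKeeper ensemble, it uses a hypothetical in-memory FakeZk class seeded with the sample payload.

```python
import json


class FakeZk:
    """In-memory stand-in for a ZooKeeper client (hypothetical, illustration only)."""

    def __init__(self, nodes):
        self._nodes = nodes

    def get(self, path):
        # Return (data_bytes, stat); stat is unused here, so None is enough.
        return self._nodes[path], None


def read_json(zk, path):
    """Fetch a node and decode its JSON payload."""
    data, _stat = zk.get(path)
    return json.loads(data.decode("utf-8"))


def current_controller(zk):
    """Return the broker ID stored in the /controller node."""
    return read_json(zk, "/controller")["brokerid"]


zk = FakeZk({
    "/controller": b'{"version": 1, "brokerid": 104, "timestamp": "1554349916898"}',
})
print(current_controller(zk))
```

Keeping the client behind a single get call means the same functions should work against a real client with that method shape, although that has not been exercised here.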

Consumer subscription information lives under /consumers/[group_id]/ids/[consumer_id]. Example JSON:

{
  "version": 1,
  "subscription": {"bl_mall_orders": 1},
  "pattern": "white_list",
  "timestamp": "1558617131642"
}

This node, which is used by the older ZooKeeper-based consumer, records the topics a consumer subscribes to, the subscription pattern (static, white‑list, or black‑list; the example above uses the value white_list), and the creation timestamp. Consumer group offsets and owners are stored in /consumers/[group_id]/offsets/… and /consumers/[group_id]/owners/… respectively.
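
A short sketch of reading such a registration follows. It assumes the stored pattern values are static, white_list, and black_list (only white_list appears in the sample), and the group and consumer IDs in the usage lines are made up for illustration.

```python
import json

# Assumption: these are the stored spellings of the three patterns.
KNOWN_PATTERNS = {"static", "white_list", "black_list"}

# Sample payload copied from the example above.
CONSUMER_JSON = """
{
  "version": 1,
  "subscription": {"bl_mall_orders": 1},
  "pattern": "white_list",
  "timestamp": "1558617131642"
}
"""


def consumer_registration_path(group_id, consumer_id):
    """Path of one consumer's registration node."""
    return f"/consumers/{group_id}/ids/{consumer_id}"


def parse_subscription(raw):
    """Return (sorted topic names, pattern); reject unknown patterns."""
    data = json.loads(raw)
    pattern = data["pattern"]
    if pattern not in KNOWN_PATTERNS:
        raise ValueError(f"unexpected subscription pattern: {pattern!r}")
    return sorted(data["subscription"]), pattern


# Hypothetical group and consumer IDs.
print(consumer_registration_path("demo_group", "demo_consumer_1"))
print(parse_subscription(CONSUMER_JSON))
```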

Preferred replica election information is created at /admin/preferred_replica_election when the kafka-preferred-replica-election tool is used. Example:

{
  "version": 1,
  "partitions": [
    {"topic": "bl_mall_orders", "partition": 1},
    {"topic": "bl_mall_products", "partition": 0}
  ]
}
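
A payload of this shape is easy to generate. The sketch below builds it from (topic, partition) pairs; the function name is hypothetical, and the code only reproduces the structure shown in the example rather than any Kafka tooling.

```python
import json


def preferred_election_payload(topic_partitions):
    """Build the JSON text for a list of (topic, partition) pairs."""
    payload = {
        "version": 1,
        "partitions": [
            {"topic": topic, "partition": partition}
            for topic, partition in topic_partitions
        ],
    }
    return json.dumps(payload)


print(preferred_election_payload([("bl_mall_orders", 1), ("bl_mall_products", 0)]))
```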

Partition reassignment information is written to /admin/reassign_partitions by the kafka-reassign-partitions tool. Example:

{
  "version": 1,
  "partitions": [
    {"topic": "bl_mall_wish", "partition": 1, "replicas": [0, 1, 3]}
  ]
}
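
Because a reassignment names explicit replica lists, it is worth validating them before anything is written. The sketch below builds the same structure and rejects empty lists and duplicate broker IDs; these checks and the function name are assumptions made for illustration, not rules taken from the Kafka tool.

```python
import json


def reassignment_payload(assignments):
    """Build the JSON text from (topic, partition, replicas) triples."""
    partitions = []
    for topic, partition, replicas in assignments:
        if not replicas:
            raise ValueError(f"{topic}-{partition}: replica list is empty")
        if len(set(replicas)) != len(replicas):
            raise ValueError(f"{topic}-{partition}: duplicate broker IDs in {replicas}")
        partitions.append(
            {"topic": topic, "partition": partition, "replicas": list(replicas)}
        )
    return json.dumps({"version": 1, "partitions": partitions})


print(reassignment_payload([("bl_mall_wish", 1, [0, 1, 3])]))
```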

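When the ISR of a partition changes, the broker leading that partition writes the affected partitions to a node under /isr_change_notification/[isr_change_x], and the controller watches this path to pick up the change, although the exact data format is not documented.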

Together, these ZK nodes let Kafka manage broker registration, topic metadata, partition leadership, consumer groups, and the replica maintenance tasks of preferred replica election and partition reassignment.

Written by

Big Data Technology & Architecture

Wang Zhiwu, a big data expert, dedicated to sharing big data technology.
