How to Minimize Data Movement When Scaling Kafka Replicas
This article explores strategies for batch-scaling Kafka replicas with minimal data migration. It presents two design ideas, walks through the calculation of the broker list, partition count, start index, and replica shift, and gives step-by-step algorithms for computing replica assignments in both expansion and shrinking scenarios.
Background
When scaling Kafka replicas in bulk, using the default --generate reassignment algorithm can cause massive data migration because many partitions change their leader and follower brokers. The goal is to design a method that keeps the movement as small as possible.
Idea 1 – Minimal‑change reassignment
Kafka does not support direct replica scaling, but the kafka-reassign-partitions.sh tool can be used. Manually configuring each topic’s replica placement is error‑prone and leads to unbalanced assignments. The idea is to compute a new assignment based on the existing broker‑to‑replica mapping while only changing the replica count.
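For reference, the reassignment file that kafka-reassign-partitions.sh consumes has the following shape (schematic only; the topic name and broker ids here are illustrative, not from the article):

```json
{
  "version": 1,
  "partitions": [
    {"topic": "my-topic", "partition": 0, "replicas": [0, 1, 4]},
    {"topic": "my-topic", "partition": 1, "replicas": [1, 4, 2]}
  ]
}
```

Hand-writing this file per topic is exactly the error-prone step the two ideas below try to automate.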
Key variables
BrokerList – the ordered list of brokers participating in the assignment.
Partition count – total number of partitions.
Replica count – desired number of replicas per partition.
startIndex – the index of the first replica of the first partition in BrokerList.
nextReplicaShift – a random offset used to compute the position of the second replica relative to the first.
Step‑by‑step calculation (case: no prior partition expansion)
Read the current replica assignment from ZooKeeper (example JSON shown).
Derive BrokerList by taking the first replica of each partition and ordering them into a list that covers every broker exactly once (e.g., {2,3,0,1,4}).
Set partition count = 10 and replica count = 3.
Compute startIndex by locating the first replica of partition 0 in BrokerList (here it is 0).
Determine nextReplicaShift by examining the offset between the first and second replicas of the first few partitions (example value = 3).
With these parameters, calling AdminUtils.assignReplicasToBrokersRackUnaware reproduces the original assignment; changing only the replica count yields a minimal‑change reassignment. If the original assignment was manually specified, this method cannot be used and a full recomputation is required.
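The logic of AdminUtils.assignReplicasToBrokersRackUnaware can be sketched in Python roughly as follows (a port of the upstream Scala algorithm; passing startIndex and nextReplicaShift in explicitly, instead of letting Kafka randomize them, is what makes reproducing the original assignment possible):

```python
def assign_replicas_rack_unaware(n_partitions, replication_factor, broker_list,
                                 start_index, next_replica_shift):
    """Sketch of Kafka's rack-unaware replica assignment.

    Fixing start_index and next_replica_shift lets us reproduce an existing
    assignment and then change only the replication factor, so the original
    replicas stay where they are.
    """
    n_brokers = len(broker_list)
    assignment = {}
    shift = next_replica_shift
    for partition in range(n_partitions):
        # After every full pass over the broker list the shift is bumped,
        # so replica sets are staggered rather than repeating.
        if partition > 0 and partition % n_brokers == 0:
            shift += 1
        first = (partition + start_index) % n_brokers
        replicas = [broker_list[first]]
        for j in range(replication_factor - 1):
            # The j-th follower sits 1 + (shift + j) % (n_brokers - 1) slots
            # after the leader, guaranteeing a different broker.
            offset = 1 + (shift + j) % (n_brokers - 1)
            replicas.append(broker_list[(first + offset) % n_brokers])
        assignment[partition] = replicas
    return assignment
```

Because followers are computed in order independently of the replication factor, raising the factor from 2 to 3 leaves each partition's existing replicas untouched and only appends one new broker, which is the minimal-change property Idea 1 relies on.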
Idea 2 – Simple sequential shift
This approach ignores most of the variables from Idea 1. For each partition, the last replica is moved to the next available broker (or removed when shrinking). It works regardless of whether partitions were previously expanded or manually assigned.
Replica expansion example
"0": [0,1,4] => [0,1,4,2]
"1": [1,4,2] => [1,4,2,3]
"2": [4,2,3] => [4,2,3,1]
"3": [3,4,0] => [3,4,0,1]
"4": [4,0,1] => [4,0,1,2]
"5": [0,2,3] => [0,2,3,4]
When only a single replica exists, the algorithm simply sorts partitions by number, builds a BrokerList, and applies the same sequential shift.
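One plausible implementation of the sequential shift for expansion (my sketch, not the article's unreleased code) is: for each partition, walk forward from the current last replica and append the first broker not already holding a replica. Note that the article's table shows partition 2 gaining broker 1 rather than 0, presumably due to a balancing adjustment like the safeguard it applies when shrinking; this sketch implements only the basic rule.

```python
def expand_replicas(assignment, brokers):
    """Grow each partition's replica set by one, moving no existing data.

    assignment: dict partition -> list of broker ids (current replicas)
    brokers:    ordered list of all broker ids in the cluster
    Only the newly appended replica is copied; every existing replica
    keeps its broker.
    """
    new_assignment = {}
    n = len(brokers)
    for partition, replicas in assignment.items():
        pos = brokers.index(replicas[-1])
        for step in range(1, n + 1):
            candidate = brokers[(pos + step) % n]
            if candidate not in replicas:
                new_assignment[partition] = replicas + [candidate]
                break
        else:
            # Replica count already equals broker count; nothing to add.
            new_assignment[partition] = list(replicas)
    return new_assignment
```

On the example above this reproduces every row except partition 2, and for every partition the old replica list is preserved as a prefix of the new one.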
Replica shrinking example
"0": [0,1,4] => [0,1]
"1": [1,4,2] => [1,4]
"2": [4,2,3] => [4,2]
"3": [3,4,0] => [3,4]
"4": [4,0,1] => [4,0]
"5": [0,2,3] => [0,2]
The method also includes a safeguard: if the last replica of many partitions resides on the same broker, removing them all at once could unbalance the cluster. The algorithm checks for such cases and shifts the removal to the previous broker when necessary.
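The shrink path, including a version of the balancing safeguard, can be sketched as follows. The per-broker removal cap used here (an even share of removals, rounded up) is my assumption; the article does not state the exact bound it checks.

```python
from collections import Counter

def shrink_replicas(assignment, target_rf):
    """Shrink each partition to target_rf replicas.

    Prefers to drop the last replica, but if one broker has already absorbed
    its share of removals, an earlier replica is dropped instead -- a sketch
    of the safeguard against unbalancing the cluster.
    """
    removed = Counter()
    all_brokers = {b for replicas in assignment.values() for b in replicas}
    # Even share of removals per broker, rounded up (assumed bound).
    cap = -(-len(assignment) // len(all_brokers))
    new_assignment = {}
    for partition, replicas in assignment.items():
        kept = list(replicas)
        while len(kept) > target_rf:
            # Prefer the last replica; fall back to an earlier one whose
            # broker has not yet hit the cap.
            victim = next((b for b in reversed(kept) if removed[b] < cap),
                          kept[-1])
            removed[victim] += 1
            kept.remove(victim)
        new_assignment[partition] = kept
    return new_assignment
```

On the six-partition example above with target_rf = 2 this reproduces the article's table; the fallback only kicks in when one broker would otherwise absorb more than its even share of removals.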
Comparison
Idea 1 handles many edge cases and preserves existing placements but requires complex calculations and may interfere with custom assignments.
Idea 2 is straightforward, works with any existing assignment, and changes only the newly added replicas, keeping data movement minimal.
Final solution
The chosen implementation uses Idea 2 by default, falling back to Idea 1 when the original replica count equals 1 or when specific conditions demand the more precise calculation.
Implementation note
The prototype is planned to be visualized with LogIKM; the actual code has not been released yet.
This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contact us and we will review it promptly.
dbaplus Community
