Comprehensive Guide to Flink Partitioners and Their Implementations
This article explains the eight built‑in Flink partitioners, their distribution strategies, key implementation details, and provides Java code examples illustrating how each partitioner selects downstream channels and determines pointwise or all‑to‑all distribution.
Flink provides eight built‑in partitioners—RebalancePartitioner, RescalePartitioner, KeyGroupStreamPartitioner, GlobalPartitioner, ShufflePartitioner, ForwardPartitioner, BroadcastPartitioner, and CustomPartitionerWrapper—each extending the abstract StreamPartitioner class.
The partitioners are:
RebalancePartitioner RescalePartitioner KeyGroupStreamPartitioner GlobalPartitioner ShufflePartitioner ForwardPartitioner CustomPartitionerWrapper BroadcastPartitioner
1. RebalancePartitioner
Distributes records evenly by cycling through all downstream channels (all‑to‑all mode). The first channel is chosen randomly, then a round‑robin counter is incremented.
private int nextChannelToSendTo;
@Override
public void setup(int numberOfChannels) {
super.setup(numberOfChannels);
nextChannelToSendTo = ThreadLocalRandom.current().nextInt(numberOfChannels);
}
@Override
public int selectChannel(SerializationDelegate<StreamRecord<T>> record) {
nextChannelToSendTo = (nextChannelToSendTo + 1) % numberOfChannels;
return nextChannelToSendTo;
}
@Override
public boolean isPointwise() { return false; }2. RescalePartitioner
Chooses downstream channels based on the parallelism of upstream and downstream operators, providing pointwise distribution while allowing some local processing to reduce network I/O.
// round‑robin starting from channel 0
@Override
public int selectChannel(SerializationDelegate<StreamRecord<T>> record) {
if (++nextChannelToSendTo >= numberOfChannels) {
nextChannelToSendTo = 0;
}
return nextChannelToSendTo;
}
@Override
public boolean isPointwise() { return true; }3. GlobalPartitioner
Sends all records to the downstream subtask with ID 0 (all‑to‑all mode).
@Override
public int selectChannel(SerializationDelegate<StreamRecord<T>> record) { return 0; }
@Override
public boolean isPointwise() { return false; }4. ForwardPartitioner
Forwards records only to the locally running downstream subtask (pointwise mode).
@Override
public int selectChannel(SerializationDelegate<StreamRecord<T>> record) { return 0; }
@Override
public boolean isPointwise() { return true; }5. BroadcastPartitioner
Broadcasts each record to all downstream channels; channel selection is unsupported.
@Override
public int selectChannel(SerializationDelegate<StreamRecord<T>> record) {
throw new UnsupportedOperationException("Broadcast partitioner does not support select channels.");
}6. KeyGroupStreamPartitioner
Assigns records to channels based on a key group derived from the record’s key, using a two‑step hash (hashCode then MurmurHash) and modulo arithmetic.
public int selectChannel(SerializationDelegate<StreamRecord<T>> record) {
K key = keySelector.getKey(record.getInstance().getValue());
return KeyGroupRangeAssignment.assignKeyToParallelOperator(key, maxParallelism, numberOfChannels);
}7. ShufflePartitioner
Randomly selects a downstream channel for each record (all‑to‑all mode).
private Random random = new Random();
@Override
public int selectChannel(SerializationDelegate<StreamRecord<T>> record) {
return random.nextInt(numberOfChannels);
}
@Override
public boolean isPointwise() { return false; }8. CustomPartitionerWrapper
Allows users to provide a custom Partitioner and KeySelector to define arbitrary partitioning logic.
public <K> DataStream<T> partitionCustom(Partitioner<K> partitioner, KeySelector<T, K> keySelector) {
return setConnectionType(new CustomPartitionerWrapper<>(clean(partitioner), clean(keySelector)));
}These partitioners are used by Flink’s StreamingJobGraphGenerator to convert a StreamGraph into a JobGraph, where the isPointwise() method determines whether the edge uses DistributionPattern.POINTWISE or DistributionPattern.ALL_TO_ALL.
Signed-in readers can open the original source through BestHub's protected redirect.
This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactand we will review it promptly.
Big Data Technology & Architecture
Wang Zhiwu, a big data expert, dedicated to sharing big data technology.
How this landed with the community
Was this worth your time?
0 Comments
Thoughtful readers leave field notes, pushback, and hard-won operational detail here.
