How to Fix Uneven Data Distribution in Multi‑Threaded Backend Systems

This article analyzes a data‑distribution issue in production, where parallel processing combined with rule‑based assignment left some employees without any tasks. It explains why simply increasing the batch size fails and presents a Redis‑counter solution that balances efficiency with fair workload distribution.

Selected Java Interview Questions

Panic

On a Monday, the author noticed a flood of messages in the company chat and suspected an issue in production.

Root Cause

The system periodically assigns data to employees based on a set of prioritized rules; if none match, a fallback list of employee IDs is used and data is distributed evenly among them.

To improve efficiency, the system processes data in parallel: every n items form a batch, and the resulting k batches are handled concurrently.

If the employee list has m entries, a batch of n items gives each employee ⌊n/m⌋ items, and n mod m employees receive one extra. Each batch is divided on its own, so if the extra items go to the same employees every time, the gap grows with the number of batches and can reach k items after k batches.

When k = 1 and n < m, only n of the m employees receive any data.

When k is large, some employees consistently handle more data than others.
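The two cases above can be reproduced with a small simulation. This is a minimal sketch, not the system's actual code: the class and method names are hypothetical, and it assumes every batch starts again at the first employee in the list, which is one plausible way for independent threads to behave when they share no state.

```java
import java.util.Arrays;

// Hypothetical simulation; names are illustrative and not taken from the original system.
final class BatchSkewDemo {

    /**
     * Distributes totalItems across employeeCount employees in batches of batchSize.
     * Assumption: every batch starts again at employee index 0, as threads that
     * share no state would. Returns the number of items each employee received.
     */
    static int[] distributePerBatch(int employeeCount, int batchSize, int totalItems) {
        int[] received = new int[employeeCount];
        for (int start = 0; start < totalItems; start += batchSize) {
            int itemsInBatch = Math.min(batchSize, totalItems - start);
            for (int i = 0; i < itemsInBatch; i++) {
                received[i % employeeCount]++; // index restarts at 0 in every batch
            }
        }
        return received;
    }

    public static void main(String[] args) {
        // n < m: 5 employees, 3 items per batch, 12 items (k = 4 batches)
        System.out.println(Arrays.toString(distributePerBatch(5, 3, 12))); // [4, 4, 4, 0, 0]
        // n > m: 3 employees, 4 items per batch, 12 items (k = 3 batches)
        System.out.println(Arrays.toString(distributePerBatch(3, 4, 12))); // [6, 3, 3]
    }
}
```

In the first run two employees never receive anything, no matter how many batches are processed. In the second run everyone is covered, yet the first employee ends up with k = 3 more items than the others.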

In practice, m often exceeds n, leaving m − n employees without any data. A quick fix was to increase n so that n > m, assuming this would guarantee coverage.

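However, when daily data volume is low, increasing n does not help, because the total amount of data may still be smaller than n and the same uneven distribution persists. For example, with m = 10 employees and n raised to 20, a day with only 6 items still produces a single batch of 6, and 4 employees receive nothing.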

Root Problem

The core issue is that each parallel batch averages its own items across the employee list without any view of what the other batches have assigned, so evenness within a batch does not add up to overall fairness.

The team recognized this risk before deployment but dismissed it: they assumed data volume would stay high, and they wanted to avoid locking mechanisms that could hurt performance.

Current Solution

The implemented solution uses Redis as a shared counter. Each time a piece of data is assigned, the counter is incremented, and the counter value modulo the employee list length gives the index of the target employee. Because a Redis increment is atomic, concurrent threads need no explicit lock, so the assignment stays efficient while remaining even on average.
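The counter approach can be sketched as follows. This is an illustration under stated assumptions rather than the author's implementation: `CounterAssigner`, `assignNext`, and `simulate` are hypothetical names, and an in-process `AtomicLong` stands in for the Redis counter so the example runs without a server. In a real deployment the supplier would wrap a Redis `INCR` call on a shared key.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.LongSupplier;

// Hypothetical sketch; names are illustrative and not taken from the original system.
final class CounterAssigner {
    private final List<String> employeeIds;
    private final LongSupplier nextCount; // assumed to be backed by Redis INCR in production

    CounterAssigner(List<String> employeeIds, LongSupplier nextCount) {
        if (employeeIds.isEmpty()) {
            throw new IllegalArgumentException("employee list must not be empty");
        }
        this.employeeIds = new ArrayList<>(employeeIds);
        this.nextCount = nextCount;
    }

    /** Takes the next counter value and maps it onto the employee list by modulo. */
    String assignNext() {
        long count = nextCount.getAsLong();
        int index = (int) Math.floorMod(count, (long) employeeIds.size());
        return employeeIds.get(index);
    }

    /** Assigns threads * itemsPerThread items concurrently and tallies items per employee. */
    static Map<String, Integer> simulate(List<String> ids, int threads, int itemsPerThread) {
        AtomicLong localCounter = new AtomicLong(); // stand-in for the shared Redis counter
        CounterAssigner assigner = new CounterAssigner(ids, localCounter::incrementAndGet);
        Map<String, AtomicInteger> tally = new ConcurrentHashMap<>();
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (int t = 0; t < threads; t++) {
            pool.submit(() -> {
                for (int i = 0; i < itemsPerThread; i++) {
                    tally.computeIfAbsent(assigner.assignNext(), id -> new AtomicInteger())
                         .incrementAndGet();
                }
            });
        }
        pool.shutdown();
        try {
            pool.awaitTermination(10, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for workers", e);
        }
        Map<String, Integer> result = new TreeMap<>();
        tally.forEach((id, count) -> result.put(id, count.get()));
        return result;
    }

    public static void main(String[] args) {
        List<String> ids = Arrays.asList("emp-1", "emp-2", "emp-3", "emp-4", "emp-5");
        // 4 threads x 250 items = 1000 items over 5 employees: 200 each
        System.out.println(simulate(ids, 4, 250));
    }
}
```

Every counter value is handed out exactly once, so the totals stay even regardless of how the threads interleave. The trade-off is one round trip to the counter per item, which is the cost this design accepts in exchange for global fairness.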

Conclusion

While the solution appears straightforward, it highlights a common cognitive blind spot: seemingly simple fixes can hide deeper architectural trade‑offs.

Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contact admin@besthub.dev and we will review it promptly.
