Mastering Load Balancing: Common Algorithms and Their Implementations
This article explains what load balancing technology is, reviews common load balancing algorithms such as Round Robin, Weighted Round Robin, Least Connections, and Consistent Hash, and demonstrates their implementations with code examples, while also covering extensions like service discovery, health checks, and slow start mechanisms.
What is Load Balancing Technology
Load balancers are software or hardware devices that distribute network traffic across a group of servers to prevent any single server from being overloaded. Load balancing algorithms are the logical rules used by balancers to allocate traffic, ranging from simple round‑robin to adaptive algorithms based on response status.
Choosing the right algorithm affects performance and service continuity (SLA).
This article introduces common load‑balancing algorithms and shows how they are implemented in popular load‑balancing software or hardware.
Common Load‑Balancing Algorithms
Round Robin
The simplest and most widely used algorithm; client requests are distributed to servers in a rotating order.
Weighted Round Robin
Similar to round robin but assigns a weight to each server according to its capacity, allowing more powerful servers to receive proportionally more requests.
Weighted round robin describes the distribution over a period; different variants may produce different selection sequences.
Least Connections
Also called Least Outstanding Request; directs each request to the server with the fewest active connections.
Weighted Least Connections
Extends Least Connections by considering server weights; the balancer selects the server with the highest weight‑adjusted availability.
Resource‑Based (Adaptive)
Uses health‑check metrics reported by backend servers to compute dynamic weights, adjusting traffic based on real‑time resource usage.
Fixed Weighting
Administrators assign a static weight to each server; the highest‑weight server receives all traffic unless it fails, in which case traffic falls back to the next highest.
Weighted Response Time
Calculates server weight from its response time; the fastest server receives the next request, useful when response latency is critical.
Source IP Hash
Generates a hash from the client’s source IP (or X‑Forwarded‑For header) to consistently route a client to the same server.
Database transactions often use this scenario.
Consistent Hash
Similar to IP hash but can use any request attribute; when the server pool changes, only a minimal amount of traffic is remapped.
Implementation of Common Algorithms
Implementations assume linear request processing without concurrency concerns.
Round Robin Implementation
public class WeightedLoadBalancer {
private final List instances;
private int position;
public WeightedLoadBalancer(List instances) {
this.instances = expandByWeight(instances);
}
public ServiceInstance peek(HttpServletRequest request) {
int peeked = (position++) & Integer.MAX_VALUE;
return instances.get(peeked % instances.size());
}
private List expandByWeight(List instances) {
List newInstances = new ArrayList<>();
for (ServiceInstance instance : instances) {
int bound = instance.getWeight();
for (int w = 0; w < bound; w++) {
newInstances.add(instance);
}
}
Collections.shuffle(newInstances);
return newInstances;
}
}Key points: handle large weights with GCD reduction, shuffle to avoid burstiness, and note that weight changes after construction are not supported.
Upper‑Bound Convergence Selection
Pre‑compute the maximum weight as the upper bound and iterate over instances, selecting those whose weight meets the current bound, then lower the bound until zero.
Weighted Round Robin (WRR) Implementation
public class WeightedLoadBalancer {
private final List instances;
public WeightedLoadBalancer(List instances) {
this.instances = instances;
}
public ServiceInstance peek(HttpServletRequest request) {
ServiceInstance best = null;
int total = 0;
for (ServiceInstance instance : instances) {
total += instance.getWeight();
instance.setCurrentWeight(instance.getCurrentWeight() + instance.getWeight());
if (best == null || instance.getCurrentWeight() > best.getCurrentWeight()) {
best = instance;
}
}
if (best != null) {
best.setCurrentWeight(best.getCurrentWeight() - total);
}
return best;
}
}Notes: suitable for highly dynamic instance sets, O(n) complexity, and requires mutable instance state.
Earliest Deadline First (EDF)
Originally a CPU scheduling algorithm; each server instance carries a deadline derived from its weight, and the balancer always picks the instance with the earliest deadline.
public class LeastConnectionLoadBalancer {
private final List instances;
public LeastConnectionLoadBalancer(List instances) {
this.instances = instances;
}
public ServiceInstance peek(HttpServletRequest request) {
ServiceInstance best = null;
for (ServiceInstance instance : instances) {
if (best == null || instance.getConnections() < best.getConnections()) {
best = instance;
}
}
if (best != null) {
best.setConnections(best.getConnections() + 1);
}
return best;
}
}Heap‑Based Maintenance
All dynamic ordered collections can be implemented with a priority queue; after extracting the top element, its priority is updated and it is re‑inserted.
IP Hash Implementation
public class IpHashLoadBalancer {
private final List instances;
public IpHashLoadBalancer(List instances) {
this.instances = instances;
}
public ServiceInstance peek(HttpServletRequest request) {
int h = hashCode(request);
return instances.get(h % instances.size());
}
private int hashCode(HttpServletRequest request) {
String xForwardedFor = request.getHeader("X-Forwarded-For");
if (xForwardedFor != null) {
return xForwardedFor.hashCode();
} else {
return request.getRemoteAddr().hashCode();
}
}
}When serving public traffic, the balancer may sit behind multiple reverse proxies; it must first check the X‑Forwarded‑For header to obtain the real client IP.
Load‑Balancing Extensions
Service Registry and Discovery
In large clusters, servers are created and removed dynamically. A service registry maintains a list of healthy instances, allowing balancers to fetch the list on demand and cache it locally for resilience.
Health Check
Health checks periodically probe each backend server (TCP connection for L4, protocol‑specific request for L7) and remove unhealthy instances from the pool.
Slow Start
Inspired by TCP congestion control, slow start gradually increases a newly‑started server’s weight to avoid overwhelming it while its JIT compiler and caches warm up.
Conclusion
Load‑balancing technology is a core component of network proxies and gateways. This article introduced the concept, surveyed common algorithms and their implementations, and discussed extensions such as service discovery, health checks, and slow start, providing a foundation for deeper study of proxy technologies.
Signed-in readers can open the original source through BestHub's protected redirect.
This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactand we will review it promptly.
21CTO
21CTO (21CTO.com) offers developers community, training, and services, making it your go‑to learning and service platform.
How this landed with the community
Was this worth your time?
0 Comments
Thoughtful readers leave field notes, pushback, and hard-won operational detail here.
