Operations 21 min read

Mastering Load Balancing: Common Algorithms and Their Implementations

This article explains what load balancing technology is, reviews common load balancing algorithms such as Round Robin, Weighted Round Robin, Least Connections, and Consistent Hash, and demonstrates their implementations with code examples, while also covering extensions like service discovery, health checks, and slow start mechanisms.

21CTO
21CTO
21CTO
Mastering Load Balancing: Common Algorithms and Their Implementations

What is Load Balancing Technology

Load balancers are software or hardware devices that distribute network traffic across a group of servers to prevent any single server from being overloaded. Load balancing algorithms are the logical rules used by balancers to allocate traffic, ranging from simple round‑robin to adaptive algorithms based on response status.

Choosing the right algorithm affects performance and service continuity (SLA).

This article introduces common load‑balancing algorithms and shows how they are implemented in popular load‑balancing software or hardware.

Common Load‑Balancing Algorithms

Round Robin

The simplest and most widely used algorithm; client requests are distributed to servers in a rotating order.

Weighted Round Robin

Similar to round robin but assigns a weight to each server according to its capacity, allowing more powerful servers to receive proportionally more requests.

Weighted round robin describes the distribution over a period; different variants may produce different selection sequences.

Least Connections

Also called Least Outstanding Request; directs each request to the server with the fewest active connections.

Weighted Least Connections

Extends Least Connections by considering server weights; the balancer selects the server with the highest weight‑adjusted availability.

Resource‑Based (Adaptive)

Uses health‑check metrics reported by backend servers to compute dynamic weights, adjusting traffic based on real‑time resource usage.

Fixed Weighting

Administrators assign a static weight to each server; the highest‑weight server receives all traffic unless it fails, in which case traffic falls back to the next highest.

Weighted Response Time

Calculates server weight from its response time; the fastest server receives the next request, useful when response latency is critical.

Source IP Hash

Generates a hash from the client’s source IP (or X‑Forwarded‑For header) to consistently route a client to the same server.

Database transactions often use this scenario.

Consistent Hash

Similar to IP hash but can use any request attribute; when the server pool changes, only a minimal amount of traffic is remapped.

Implementation of Common Algorithms

Implementations assume linear request processing without concurrency concerns.

Round Robin Implementation

public class WeightedLoadBalancer {
    private final List instances;
    private int position;
    public WeightedLoadBalancer(List instances) {
        this.instances = expandByWeight(instances);
    }
    public ServiceInstance peek(HttpServletRequest request) {
        int peeked = (position++) & Integer.MAX_VALUE;
        return instances.get(peeked % instances.size());
    }
    private List expandByWeight(List instances) {
        List newInstances = new ArrayList<>();
        for (ServiceInstance instance : instances) {
            int bound = instance.getWeight();
            for (int w = 0; w < bound; w++) {
                newInstances.add(instance);
            }
        }
        Collections.shuffle(newInstances);
        return newInstances;
    }
}

Key points: handle large weights with GCD reduction, shuffle to avoid burstiness, and note that weight changes after construction are not supported.

Upper‑Bound Convergence Selection

Pre‑compute the maximum weight as the upper bound and iterate over instances, selecting those whose weight meets the current bound, then lower the bound until zero.

Weighted Round Robin (WRR) Implementation

public class WeightedLoadBalancer {
    private final List instances;
    public WeightedLoadBalancer(List instances) {
        this.instances = instances;
    }
    public ServiceInstance peek(HttpServletRequest request) {
        ServiceInstance best = null;
        int total = 0;
        for (ServiceInstance instance : instances) {
            total += instance.getWeight();
            instance.setCurrentWeight(instance.getCurrentWeight() + instance.getWeight());
            if (best == null || instance.getCurrentWeight() > best.getCurrentWeight()) {
                best = instance;
            }
        }
        if (best != null) {
            best.setCurrentWeight(best.getCurrentWeight() - total);
        }
        return best;
    }
}

Notes: suitable for highly dynamic instance sets, O(n) complexity, and requires mutable instance state.

Earliest Deadline First (EDF)

Originally a CPU scheduling algorithm; each server instance carries a deadline derived from its weight, and the balancer always picks the instance with the earliest deadline.

public class LeastConnectionLoadBalancer {
    private final List instances;
    public LeastConnectionLoadBalancer(List instances) {
        this.instances = instances;
    }
    public ServiceInstance peek(HttpServletRequest request) {
        ServiceInstance best = null;
        for (ServiceInstance instance : instances) {
            if (best == null || instance.getConnections() < best.getConnections()) {
                best = instance;
            }
        }
        if (best != null) {
            best.setConnections(best.getConnections() + 1);
        }
        return best;
    }
}

Heap‑Based Maintenance

All dynamic ordered collections can be implemented with a priority queue; after extracting the top element, its priority is updated and it is re‑inserted.

IP Hash Implementation

public class IpHashLoadBalancer {
    private final List instances;
    public IpHashLoadBalancer(List instances) {
        this.instances = instances;
    }
    public ServiceInstance peek(HttpServletRequest request) {
        int h = hashCode(request);
        return instances.get(h % instances.size());
    }
    private int hashCode(HttpServletRequest request) {
        String xForwardedFor = request.getHeader("X-Forwarded-For");
        if (xForwardedFor != null) {
            return xForwardedFor.hashCode();
        } else {
            return request.getRemoteAddr().hashCode();
        }
    }
}

When serving public traffic, the balancer may sit behind multiple reverse proxies; it must first check the X‑Forwarded‑For header to obtain the real client IP.

Load‑Balancing Extensions

Service Registry and Discovery

In large clusters, servers are created and removed dynamically. A service registry maintains a list of healthy instances, allowing balancers to fetch the list on demand and cache it locally for resilience.

Health Check

Health checks periodically probe each backend server (TCP connection for L4, protocol‑specific request for L7) and remove unhealthy instances from the pool.

Slow Start

Inspired by TCP congestion control, slow start gradually increases a newly‑started server’s weight to avoid overwhelming it while its JIT compiler and caches warm up.

Conclusion

Load‑balancing technology is a core component of network proxies and gateways. This article introduced the concept, surveyed common algorithms and their implementations, and discussed extensions such as service discovery, health checks, and slow start, providing a foundation for deeper study of proxy technologies.

Original Source

Signed-in readers can open the original source through BestHub's protected redirect.

Sign in to view source
Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactadmin@besthub.devand we will review it promptly.

Distributed Systemsload balancingAlgorithmshealth check
21CTO
Written by

21CTO

21CTO (21CTO.com) offers developers community, training, and services, making it your go‑to learning and service platform.

0 followers
Reader feedback

How this landed with the community

Sign in to like

Rate this article

Was this worth your time?

Sign in to rate
Discussion

0 Comments

Thoughtful readers leave field notes, pushback, and hard-won operational detail here.