Mastering Load Balancing: Algorithms, Code Samples, and Real‑World Insights
This article explains the concept of load balancing in distributed systems, outlines its benefits for throughput and reliability, compares common architectural layers, evaluates key algorithmic considerations, and provides Python implementations of round‑robin, weighted, random, hash‑based, and least‑connection strategies along with deployment options.
Understanding Load Balancing
Load balancing distributes requests evenly across a group of servers or processes that provide the same function, so that a single Internet service can be backed by multiple nodes (a server farm or server pool). It improves system throughput, reduces response time, and enhances reliability by keeping any one node from being overloaded and by removing single points of failure.
Typical Distributed Architecture
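A typical web architecture consists of client, reverse‑proxy (e.g., Nginx), site, service, and data layers. In this article, "upstream" means the calling layer and "downstream" means the layer being called (Nginx's own configuration uses "upstream" for the backend servers instead). Each downstream layer is called by multiple upstream instances, and the goal is for every upstream to spread its requests uniformly across the downstream instances.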
Client → Reverse‑proxy: DNS round‑robin
Reverse‑proxy → Site: Nginx
Site → Service: connection pool
Data layer: range‑based or hash‑based partitioning
Algorithm Evaluation Criteria
When choosing a load‑balancing algorithm, consider:
Differences in node capacities (CPU, memory, network, location)
Dynamic changes in node performance
Stateful services that require the same client to hit the same node
Who acts as the balancer and whether it can become a bottleneck
Load‑Balancing Algorithms
Round‑Robin
SERVER_LIST = ['10.246.10.1', '10.246.10.2', '10.246.10.3']
def round_robin(server_lst, cur=[0]):
    # cur is a mutable default argument, so the index persists between calls
    length = len(server_lst)
    ret = server_lst[cur[0] % length]
    cur[0] = (cur[0] + 1) % length
    return ret

This simple method gives each node an equal share of requests but ignores capacity differences.
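The default-argument counter in the function above is shared by every caller and is not protected against concurrent access. As a minimal sketch, assuming a fixed server list, the same rotation can be kept inside an object. The names RoundRobinBalancer and next_server are hypothetical and do not come from the original article.

```python
import itertools
import threading


class RoundRobinBalancer:
    """Hypothetical helper: rotates through a fixed list of servers."""

    def __init__(self, servers):
        if not servers:
            raise ValueError("server list must not be empty")
        # Copy the list so later changes to the caller's list are ignored.
        self._cycle = itertools.cycle(list(servers))
        self._lock = threading.Lock()

    def next_server(self):
        # The lock keeps the rotation consistent when several threads ask at once.
        with self._lock:
            return next(self._cycle)


balancer = RoundRobinBalancer(['10.246.10.1', '10.246.10.2', '10.246.10.3'])
picks = [balancer.next_server() for _ in range(4)]
```

After three picks the rotation wraps around, so the fourth pick returns the first server again.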
Weighted Round‑Robin
WEIGHT_SERVER_LIST = {'10.246.10.1': 1, '10.246.10.2': 3, '10.246.10.3': 2}
def weight_round_robin(servers, cur=[0]):
    # Repeat each server as many times as its weight, then rotate through the list
    weighted_list = []
    for k, v in servers.items():
        weighted_list.extend([k] * v)
    length = len(weighted_list)
    ret = weighted_list[cur[0] % length]
    cur[0] = (cur[0] + 1) % length
    return ret

Assigns more requests to higher‑capacity nodes, in proportion to their weights.
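The expanded-list version sends consecutive requests to the same node (three in a row to the weight-3 node). A variant often called smooth weighted round-robin interleaves the picks while keeping the same proportions. The sketch below is an illustration under that assumption, not the article's own method; the function name and the generator interface are hypothetical.

```python
def smooth_weighted_round_robin(weights):
    """Yield servers in proportion to their weights, interleaved.

    `weights` maps server -> positive integer weight (assumed input shape).
    """
    current = {server: 0 for server in weights}
    total = sum(weights.values())
    while True:
        # Every server earns its weight each round.
        for server, weight in weights.items():
            current[server] += weight
        # The server with the highest running score is picked
        # (ties go to the first one in dictionary order).
        chosen = max(current, key=current.get)
        # The chosen server pays back the total, so it waits before its next turn.
        current[chosen] -= total
        yield chosen


picker = smooth_weighted_round_robin(
    {'10.246.10.1': 1, '10.246.10.2': 3, '10.246.10.3': 2}
)
one_cycle = [next(picker) for _ in range(6)]
```

With the weights from the article, one cycle of six picks still gives the three nodes 1, 3, and 2 requests respectively, but the heaviest node is never chosen twice in a row.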
Random Selection
import random
def random_choose(server_lst):
    # The random module seeds itself on import, so no per-call random.seed() is needed
    return random.choice(server_lst)

Weighted Random
def weight_random_choose(servers):
    # Expand each server by its weight, then pick uniformly from the expanded list
    weighted_list = []
    for k, v in servers.items():
        weighted_list.extend([k] * v)
    return random.choice(weighted_list)

Hash‑Based Selection
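import zlib
def hash_choose(request_info, server_lst):
    # request_info is assumed to be a string, e.g. the client IP
    hashed = zlib.crc32(request_info.encode('utf-8'))
    return server_lst[hashed % len(server_lst)]

Maps a request attribute (e.g., the client IP) to a specific node, which is useful for stateful services. A stable hash such as CRC32 is used here because Python's built-in hash() is randomized per process for strings, so separate balancer processes would disagree on the mapping.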
Consistent Hashing
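Consistent hashing improves on plain hash‑based selection, where changing the number of nodes changes the modulus and remaps most keys. Nodes and request keys are hashed onto the same ring, and each key is served by the next node clockwise, so adding or removing a node only remaps the keys adjacent to it. Mapping each physical node to multiple virtual nodes evens out the distribution around the ring.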
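A hash ring with virtual nodes can be sketched with a sorted list and binary search. This is a minimal illustration, not a production implementation: the class name ConsistentHashRing, the replicas parameter, and the choice of MD5 as the stable hash are all assumptions made for the example.

```python
import bisect
import hashlib


def stable_hash(key):
    # MD5 is used only for an even, process-independent spread, not for security.
    return int(hashlib.md5(key.encode('utf-8')).hexdigest(), 16)


class ConsistentHashRing:
    """Hypothetical sketch of a hash ring with virtual nodes."""

    def __init__(self, servers, replicas=100):
        self._replicas = replicas
        self._ring = []  # sorted list of (hash, server) points
        for server in servers:
            self.add(server)

    def add(self, server):
        # Each physical server is placed on the ring at `replicas` points.
        for i in range(self._replicas):
            bisect.insort(self._ring, (stable_hash(f'{server}#{i}'), server))

    def remove(self, server):
        self._ring = [point for point in self._ring if point[1] != server]

    def get(self, key):
        if not self._ring:
            raise LookupError('ring is empty')
        # First point clockwise from the key's hash; wrap to the start if none.
        idx = bisect.bisect_left(self._ring, (stable_hash(key), ''))
        return self._ring[idx % len(self._ring)][1]


ring = ConsistentHashRing(['10.246.10.1', '10.246.10.2', '10.246.10.3'])
keys = [f'client-{n}' for n in range(1000)]
before = {k: ring.get(k) for k in keys}
ring.remove('10.246.10.3')
after = {k: ring.get(k) for k in keys}
moved = [k for k in keys if before[k] != after[k]]
```

In this sketch, the only keys that move after a node is removed are the ones that node used to serve. Every other key keeps its original server, which is the property that plain modulo hashing lacks.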
Least Connection
Chooses the node with the fewest active connections, dynamically adapting to real‑time load.
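As a minimal sketch, assuming the balancer itself tracks how many requests are in flight on each node, least-connection selection could look like the following. The class name LeastConnectionBalancer and its acquire/release methods are hypothetical names introduced for illustration.

```python
import threading


class LeastConnectionBalancer:
    """Hypothetical sketch: tracks active connections per server."""

    def __init__(self, servers):
        self._active = {server: 0 for server in servers}
        self._lock = threading.Lock()

    def acquire(self):
        # Pick the server with the fewest active connections and count the new one.
        with self._lock:
            server = min(self._active, key=self._active.get)
            self._active[server] += 1
            return server

    def release(self, server):
        # Call when the request finishes so the count reflects current load.
        with self._lock:
            if self._active[server] > 0:
                self._active[server] -= 1


lc = LeastConnectionBalancer(['10.246.10.1', '10.246.10.2', '10.246.10.3'])
first = lc.acquire()    # all idle: ties go to the first server listed
second = lc.acquire()
third = lc.acquire()
lc.release(second)      # the second server finishes its request
fourth = lc.acquire()   # it is now the least loaded, so it is chosen again
```

The counts only stay accurate if every acquire is eventually paired with a release, so real callers would typically wrap the request in a try/finally block.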
Stateful Request Handling
For services that keep session state, ensure the same client is routed to the same backend (using consistent hashing or range partitioning) or share state via a common datastore (e.g., Redis, Memcached) or client‑side storage such as cookies.
Where to Place the Load Balancer
Two main approaches:
Client‑side balancing: Clients receive a server list and select a node locally, suitable for simple algorithms.
Proxy‑side balancing: A dedicated load‑balancer (e.g., Nginx, F5, LVS) sits before the server pool, handling complex algorithms and providing a single entry point.
Example with gRPC: the client asks an external balancer, which tracks server load, for a server to use, and then connects directly to the chosen server.
Proxy‑based solutions (e.g., Nginx at layer 7, LVS at layer 4) centralize control but can become bottlenecks; high‑availability designs use active‑passive pairs.
Push vs. Pull Models
Traditional load balancing is a push model (the balancer pushes requests to a node). A pull model uses a message queue where idle workers pull tasks, achieving natural load distribution but adding latency.
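The pull model can be illustrated with an in-process queue standing in for a real message broker. This is a sketch under that assumption; the function run_pull_workers and the worker names are hypothetical.

```python
import queue
import threading


def run_pull_workers(tasks, worker_count=3):
    """Hypothetical demo: workers pull tasks from a shared queue until it is empty."""
    task_queue = queue.Queue()
    for task in tasks:
        task_queue.put(task)

    # Each worker records what it handled in its own list.
    handled = {f'worker-{i}': [] for i in range(worker_count)}

    def worker(name):
        while True:
            try:
                task = task_queue.get_nowait()
            except queue.Empty:
                return  # nothing left to pull
            handled[name].append(task)  # stand-in for real work
            task_queue.task_done()

    threads = [threading.Thread(target=worker, args=(name,)) for name in handled]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return handled


result = run_pull_workers(range(20))
all_tasks = sorted(task for tasks in result.values() for task in tasks)
```

No component decides which worker gets which task: a worker takes the next task whenever it is free, so faster workers naturally end up handling more. How the tasks split between workers varies from run to run, but each task is handled exactly once.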