Deep Dive into Ribbon: Architecture, Components, and Production Configuration

This article provides a comprehensive analysis of Ribbon, the client‑side load‑balancing component of Spring Cloud Netflix, covering its role in microservice calls, core modules, source‑code walkthroughs, health‑check mechanisms, zone‑aware strategies, and real‑world production configuration.

Yang Money Pot Technology Team

May 5, 2022

1. Introduction

Ribbon is a member of the Spring Cloud Netflix suite, responsible for client‑side load balancing. While developers often work with Eureka, Hystrix, and Feign, Ribbon is less familiar. This article analyzes the source code to help readers deeply understand Ribbon's working principles and apply it effectively in practice.

2. What is Ribbon

In a microservice architecture, a service call can be simplified into three steps (ignoring cache, etc.):

Service discovery: obtain all server addresses from Eureka; typically multiple servers provide the same service, so the result is a list of servers.

Load balancing: select one server from the returned list.

Remote invocation: use Feign's dynamic proxy to call the selected server.

The second step is performed by Ribbon. Although client‑side load balancing may seem simple, its complexity in microservices lies in two aspects:

Multi‑AZ + Multi‑Region Production Environments

A Region is a geographic area in which a cloud provider operates one or more data centers. An Availability Zone (AZ) is an independent physical area within a region, with its own power and network. AZs in the same region are connected by low‑latency links.

If client and server are in different AZs/regions, network latency varies:

Within the same AZ, latency is negligible.

Across AZs in the same region, latency can reach ~1 ms, depending on physical distance.

Across different regions, latency can reach tens of milliseconds.

Physical note: light travels at 300 km/ms in vacuum and about 200 km/ms in fiber. A 100 km fiber link adds roughly 0.5 ms, plus hardware processing delay, resulting in ~1 ms latency between AZs; inter‑region latency can be dozens of milliseconds.
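The arithmetic in the note above can be checked with a few lines of Java. This is a minimal sketch: the class and method names are hypothetical, and it models propagation delay only, leaving out the hardware processing delay the note mentions.

```java
// Hypothetical helper, not part of Ribbon: estimates one-way propagation delay in fiber.
class FiberLatencySketch {
    // Approximate speed of light in optical fiber, as stated in the note: 200 km per millisecond.
    static final double FIBER_KM_PER_MS = 200.0;

    // One-way propagation delay in milliseconds for a fiber link of the given length.
    static double oneWayDelayMs(double distanceKm) {
        return distanceKm / FIBER_KM_PER_MS;
    }

    public static void main(String[] args) {
        // 100 km of fiber: 100 / 200 = 0.5 ms one way, before any equipment delay.
        System.out.println(oneWayDelayMs(100.0));
        // A 2000 km inter-region link: 2000 / 200 = 10 ms one way.
        System.out.println(oneWayDelayMs(2000.0));
    }
}
```

A request and its response each cross the link once, so the round trip is at least twice the one-way figure.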

For disaster recovery, most Internet companies deploy services across multiple AZs and regions. In our company, online services run in three AZs of the Alibaba Cloud North China region, so a robust load‑balancing mechanism must aim to keep calls within the same AZ to minimize latency.

High Availability

In microservices, a server may become unavailable due to network issues, overload, or cascading failures from dependent services. Load balancing must be able to filter out unavailable servers promptly.

3. Main Components of Ribbon

Ribbon's core modules consist of four parts:

Rule – the specific load‑balancing strategy (e.g., round‑robin).

Ping – the health‑check module; Ribbon starts a background thread to perform health checks.

ServerList – a static or dynamic list of servers; if dynamic, Ribbon periodically updates it from a data source such as Eureka.

LoadBalancer – integrates the above modules to provide complete load‑balancing service.

We will use LoadBalancer as the entry point and, together with source code, dissect each module's operation and related parameters.

3.1 ILoadBalancer

The ILoadBalancer interface defines the methods that all LoadBalancer subclasses must implement (excerpt shown below):

// The key parameter can be any object; how it is used depends on the subclass implementation
Server chooseServer(Object key);
List<Server> getReachableServers();
List<Server> getAllServers();

The most important method is chooseServer, which is the core of load balancing.

3.2 BaseLoadBalancer

BaseLoadBalancer is a basic implementation of ILoadBalancer, allowing users to customize health‑check mechanisms (IPing & IPingStrategy) and load‑balancing strategies (IRule).

To understand BaseLoadBalancer, we first need to understand IRule and IPing/IPingStrategy.

3.2.1 IRule

IRule represents a load‑balancing rule or strategy, such as round‑robin or response‑time‑based weighting. Its interface is simple:

public Server choose(Object key);
// rule usually needs statistics from the LoadBalancer
public void setLoadBalancer(ILoadBalancer lb);
public ILoadBalancer getLoadBalancer();

Common implementations include:

RoundRobinRule – classic round‑robin strategy; it is the rule BaseLoadBalancer falls back to when none is supplied.

AvailabilityFilteringRule – filters out unavailable nodes on top of round‑robin.

WeightedResponseTimeRule – assigns weights based on response time, achieving weighted round‑robin.
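To make the round‑robin idea concrete, here is a minimal sketch of thread‑safe cyclic selection. It is not Ribbon's RoundRobinRule: the class name is hypothetical and servers are reduced to plain strings.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for a round-robin rule; server addresses are plain strings here.
class RoundRobinSketch {
    private final AtomicInteger nextIndex = new AtomicInteger(0);

    // Returns the next server in cyclic order, or null when the list is empty.
    String choose(List<String> servers) {
        if (servers == null || servers.isEmpty()) {
            return null;
        }
        // CAS loop: compute the successor index modulo the list size and publish it atomically,
        // so two concurrent callers never claim the same slot.
        while (true) {
            int current = nextIndex.get();
            int next = (current + 1) % servers.size();
            if (nextIndex.compareAndSet(current, next)) {
                // The extra modulo guards against a list that shrank since the last call.
                return servers.get(current % servers.size());
            }
        }
    }

    public static void main(String[] args) {
        RoundRobinSketch rule = new RoundRobinSketch();
        List<String> servers = Arrays.asList("10.0.0.1:8080", "10.0.0.2:8080", "10.0.0.3:8080");
        // Four picks over three servers: the fourth pick wraps around to the first server.
        for (int i = 0; i < 4; i++) {
            System.out.println(rule.choose(servers));
        }
    }
}
```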

Our production environment uses AvailabilityFilteringRule. Its source code is shown below:

public Server choose(Object key) {
    int count = 0;
    Server server = roundRobinRule.choose(key); // 1
    while (count++ <= 10) { // 3
        if (predicate.apply(new PredicateKey(server))) { // 2
            return server;
        }
        server = roundRobinRule.choose(key);
    }
    return super.choose(key);
}

Explanation:

Obtain the next server according to the round‑robin rule.

Use a predicate to determine whether the server is available.

If available, return it immediately.

Otherwise continue round‑robin and re‑check.

If none of the candidates checked by the loop is available (the condition count++ <= 10 allows up to 11 checks), fall back to the parent class's algorithm so the call still returns quickly.
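The pattern in this method (pick a candidate, test it, retry a bounded number of times, then fall back) can be isolated from Ribbon's types. The following is a sketch under that assumption: BoundedRetryChooserSketch and its parameters are hypothetical names, and the suppliers stand in for the round‑robin rule and the parent class's algorithm.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;
import java.util.function.Supplier;

// Hypothetical helper illustrating "filter on top of round-robin" with a bounded retry loop.
class BoundedRetryChooserSketch {
    // Pulls candidates from `next` until one passes `isAvailable`.
    // After maxAttempts failed checks it gives up and returns fallback.get().
    static <T> T choose(Supplier<T> next, Predicate<T> isAvailable, int maxAttempts, Supplier<T> fallback) {
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            T candidate = next.get();
            if (candidate != null && isAvailable.test(candidate)) {
                return candidate;
            }
        }
        return fallback.get();
    }

    public static void main(String[] args) {
        List<String> servers = Arrays.asList("a", "b", "c");
        int[] cursor = {0};
        // A simple (non thread-safe) round-robin source, sufficient for a single-threaded demo.
        Supplier<String> roundRobin = () -> servers.get(cursor[0]++ % servers.size());

        // Only "c" passes the availability check, so the loop skips "a" and "b".
        System.out.println(choose(roundRobin, s -> s.equals("c"), 11, () -> "fallback"));
        // Nothing passes: once the attempts are used up, the fallback supplier decides.
        System.out.println(choose(roundRobin, s -> false, 11, () -> "fallback"));
    }
}
```

Capping the number of attempts keeps the worst‑case cost of a selection constant even when most servers are unhealthy.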

The predicate is an AvailabilityPredicate whose apply method encapsulates the actual availability logic:

public boolean apply(@Nullable PredicateKey input) {
    LoadBalancerStats stats = getLBStats(); // 1
    if (stats == null) {
        return true;
    }
    return !shouldSkipServer(stats.getSingleServerStat(input.getServer())); // 2
}

private boolean shouldSkipServer(ServerStats stats) {
    if (stats.isCircuitBreakerTripped() || stats.getActiveRequestsCount() >= activeConnectionsLimit.get()) { // 3
        return true;
    }
    return false;
}

Explanation:

Obtain runtime statistics of the load balancer.

Retrieve statistics for the specific server.

A server is considered unavailable if its circuit breaker is tripped or its active request count reaches or exceeds the configured limit (the comparison is >=).
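The two skip conditions are easy to express on their own. The sketch below uses a hypothetical ServerStatsSketch value class instead of Ribbon's ServerStats, and passes the connection limit as a plain int rather than a dynamic property.

```java
// Hypothetical snapshot of per-server statistics; Ribbon's ServerStats tracks far more.
class ServerStatsSketch {
    final boolean circuitBreakerTripped;
    final int activeRequests;

    ServerStatsSketch(boolean circuitBreakerTripped, int activeRequests) {
        this.circuitBreakerTripped = circuitBreakerTripped;
        this.activeRequests = activeRequests;
    }
}

class AvailabilityCheckSketch {
    // A server is skipped when its breaker is open or it already holds `limit` or more requests.
    static boolean shouldSkip(ServerStatsSketch stats, int activeConnectionsLimit) {
        return stats.circuitBreakerTripped || stats.activeRequests >= activeConnectionsLimit;
    }

    // With no statistics at all the server is treated as available, in the same spirit
    // as the stats == null branch of the quoted apply method.
    static boolean isAvailable(ServerStatsSketch stats, int activeConnectionsLimit) {
        if (stats == null) {
            return true;
        }
        return !shouldSkip(stats, activeConnectionsLimit);
    }

    public static void main(String[] args) {
        int limit = 50; // illustrative value, not taken from the article
        System.out.println(isAvailable(new ServerStatsSketch(false, 10), limit)); // healthy and lightly loaded
        System.out.println(isAvailable(new ServerStatsSketch(true, 0), limit));   // breaker tripped
        System.out.println(isAvailable(new ServerStatsSketch(false, 50), limit)); // exactly at the limit
    }
}
```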

3.2.2 IPing

IPing defines the health‑check interface that subclasses must implement:

boolean isAlive(Server server);

Common implementations:

DummyPing – does nothing.

PingUrl – sends an HTTP request to a configured URL (standard health‑check pattern).

NIWDiscoveryPing – relies on the service‑discovery cache (e.g., Eureka) to determine liveness, which is more efficient than real HTTP calls.

3.2.3 IPingStrategy

IPingStrategy defines how to ping a list of servers given an IPing implementation:

boolean[] pingServers(IPing ping, Server[] servers);

The only implementation provided by Ribbon is SerialPingStrategy, which pings servers sequentially:

public boolean[] pingServers(IPing ping, Server[] servers) {
    int numCandidates = servers.length;
    boolean[] results = new boolean[numCandidates];

    for (int i = 0; i < numCandidates; i++) {
        results[i] = false; // Default answer is DEAD.
        try {
            if (ping != null) {
                results[i] = ping.isAlive(servers[i]);
            }
        } catch (Exception e) {
            logger.error("Exception while pinging Server: '{}'", servers[i], e);
        }
    }
    return results;
}

Note: SerialPingStrategy should not be paired with PingUrl when the server list is large. Every ping is then a real network call, and because the calls run one after another, a full pass takes roughly the sum of all individual ping times. Netflix's production setup uses NIWDiscoveryPing, which avoids actual network calls.

Ribbon separates health‑check functionality into IPing (what to ping) and IPingStrategy (how to ping). This design reduces class explosion: with M IPing subclasses and N strategies, you need only M+N classes instead of M×N.
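To illustrate the M+N point, the sketch below adds one new strategy that works with any ping implementation. It is an assumption‑laden example: PingSketch and PingStrategySketch are hypothetical, simplified stand‑ins for IPing and IPingStrategy, and Ribbon itself ships no parallel strategy.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical equivalents of IPing and IPingStrategy, with servers reduced to strings.
interface PingSketch {
    boolean isAlive(String server);
}

interface PingStrategySketch {
    boolean[] pingServers(PingSketch ping, String[] servers);
}

// Runs the pings concurrently. Not part of Ribbon; shown only to illustrate the composition.
class ParallelPingStrategySketch implements PingStrategySketch {
    private final int threads;
    private final long timeoutMs;

    ParallelPingStrategySketch(int threads, long timeoutMs) {
        this.threads = threads;
        this.timeoutMs = timeoutMs;
    }

    @Override
    public boolean[] pingServers(PingSketch ping, String[] servers) {
        boolean[] results = new boolean[servers.length]; // every entry defaults to false (dead)
        if (ping == null || servers.length == 0) {
            return results;
        }
        ExecutorService pool = Executors.newFixedThreadPool(Math.min(threads, servers.length));
        try {
            List<Future<Boolean>> futures = new ArrayList<>();
            for (String server : servers) {
                Callable<Boolean> task = () -> ping.isAlive(server);
                futures.add(pool.submit(task));
            }
            for (int i = 0; i < servers.length; i++) {
                try {
                    // The timeout applies to each wait separately.
                    results[i] = futures.get(i).get(timeoutMs, TimeUnit.MILLISECONDS);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt(); // preserve the interrupt, keep "dead"
                } catch (ExecutionException | TimeoutException e) {
                    results[i] = false; // a failed or slow ping counts as dead
                }
            }
        } finally {
            pool.shutdownNow();
        }
        return results;
    }

    public static void main(String[] args) {
        PingSketch byName = server -> server.startsWith("up");
        boolean[] alive = new ParallelPingStrategySketch(4, 1000)
                .pingServers(byName, new String[] {"up-1", "down-1", "up-2"});
        System.out.println(java.util.Arrays.toString(alive));
    }
}
```

Because the strategy only depends on the ping interface, swapping in a different liveness check requires no change to the strategy class.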

3.2.4 BaseLoadBalancer Summary

Key fields of BaseLoadBalancer:

// injected via constructor
IRule rule;
IPingStrategy pingStrategy;
IPing ping;
// dynamically maintained server lists
volatile List<Server> allServerList;
volatile List<Server> upServerList;

The core method chooseServer simply delegates to the rule:

public Server chooseServer(Object key) {
    if (rule == null) {
        return null;
    } else {
        try {
            return rule.choose(key);
        } catch (Exception e) {
            logger.warn("LoadBalancer [{}]: Error choosing server for key {}", name, key, e);
            return null;
        }
    }
}

The health check is performed by a periodic task started in the constructor (setupPingTask()). Simplified logic:

if (!pingInProgress.compareAndSet(false, true)) { return; } // 1
boolean[] results = null;
try {
    Server[] allServers = allServerList.toArray(new Server[allServerList.size()]);
    int numCandidates = allServers.length;
    results = pingStrategy.pingServers(ping, allServers); // 2
    List<Server> newUpList = new ArrayList<>();
    for (int i = 0; i < numCandidates; i++) {
        if (results[i]) {
            newUpList.add(allServers[i]);
        }
    }
    upServerList = newUpList; // 3
} finally {
    pingInProgress.set(false);
}

Explanation:

AtomicBoolean prevents overlapping executions.

Ping strategy returns availability for each server.

Update upServerList with the servers that are alive.
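The guard in step 1 can be tried in isolation. The sketch below is a simplified, hypothetical PingCycleSketch: servers are strings, the liveness check is a Predicate, and scheduling is left out so that the effect of the AtomicBoolean is easy to observe.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.function.Predicate;

// Hypothetical reduction of the ping task: one pass over all servers, guarded against overlap.
class PingCycleSketch {
    private final AtomicBoolean pingInProgress = new AtomicBoolean(false);
    private final List<String> allServers;
    private volatile List<String> upServers = Collections.emptyList();

    PingCycleSketch(List<String> allServers) {
        this.allServers = new ArrayList<>(allServers);
    }

    // Runs one health-check pass. Returns false, doing nothing, if a pass is already running.
    boolean runOnce(Predicate<String> isAlive) {
        if (!pingInProgress.compareAndSet(false, true)) {
            return false;
        }
        try {
            List<String> newUpList = new ArrayList<>();
            for (String server : allServers) {
                if (isAlive.test(server)) {
                    newUpList.add(server);
                }
            }
            // Swap in the finished list with a single volatile write, so readers
            // never see a half-built list.
            upServers = Collections.unmodifiableList(newUpList);
            return true;
        } finally {
            pingInProgress.set(false); // always release the guard, even if a ping throws
        }
    }

    List<String> getUpServers() {
        return upServers;
    }

    public static void main(String[] args) {
        PingCycleSketch cycle = new PingCycleSketch(Arrays.asList("a", "b", "c"));
        cycle.runOnce(server -> !server.equals("b")); // pretend "b" is down
        System.out.println(cycle.getUpServers());
    }
}
```

In a real setup runOnce would be invoked from a scheduler; the guard matters when one pass takes longer than the scheduling interval.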

3.3 DynamicServerListLoadBalancer

DynamicServerListLoadBalancer extends BaseLoadBalancer, adding the ability to fetch server lists from a dynamic source and filter them using ServerList and ServerListFilter.

3.3.1 ServerList

ServerList represents a source of servers (e.g., Eureka or a config file). Interface:

public List<T> getInitialListOfServers();
public List<T> getUpdatedListOfServers();

Typical implementations:

ConfigurationBasedServerList – loads servers from Ribbon configuration (<clientName>.<nameSpace>.listOfServers).

DiscoveryEnabledNIWSServerList – obtains servers from Eureka (<clientName>.<nameSpace>.DeploymentContextBasedVipAddresses).
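For the static case, the work behind a configuration‑based list amounts to turning one property value into a list of addresses. The sketch below assumes a comma‑separated host:port format and uses a hypothetical class name; it does not reproduce ConfigurationBasedServerList.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical parser for a listOfServers-style property value.
class StaticServerListSketch {
    // Splits "host1:port1, host2:port2" on commas, trimming whitespace and dropping empty entries.
    static List<String> parse(String listOfServers) {
        List<String> servers = new ArrayList<>();
        if (listOfServers == null) {
            return servers;
        }
        for (String entry : listOfServers.split(",")) {
            String trimmed = entry.trim();
            if (!trimmed.isEmpty()) {
                servers.add(trimmed);
            }
        }
        return servers;
    }

    public static void main(String[] args) {
        // Example addresses are made up for the demo.
        System.out.println(parse("10.0.0.1:8080, 10.0.0.2:8080"));
    }
}
```

A static list like this never changes at runtime, which is why the dynamic, Eureka‑backed variant is the usual choice for services that scale in and out.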

3.3.2 ServerListFilter

ServerListFilter has a single method:

List<T> getFilteredListOfServers(List<T> servers);

Common implementations:

ZoneAffinityServerListFilter – filters out servers that are not in the specified zone.

ServerListSubsetFilter – builds on ZoneAffinityServerListFilter and randomly keeps a subset of servers to save resources.

Example: ZoneAffinityServerListFilter source code (core method):

public List<T> getFilteredListOfServers(List<T> servers) {
    if (zone != null && servers != null && servers.size() > 0) { // 1
        List<T> filteredServers = Lists.newArrayList(Iterables.filter(
                servers, this.zoneAffinityPredicate.getServerOnlyPredicate())); // 2
        if (shouldEnableZoneAffinity(filteredServers)) { // 3
            return filteredServers;
        }
    }
    return servers; // 4
}

Explanation:

Check that a zone is configured and the server list is non‑empty.

Filter out servers not belonging to the configured zone.

If the filtered list satisfies availability criteria, return it.

Otherwise fall back to the original list.

The method shouldEnableZoneAffinity decides whether the servers left after filtering are healthy enough for zone affinity to be applied:

private boolean shouldEnableZoneAffinity(List<T> filtered) {
    LoadBalancerStats stats = getLoadBalancerStats();
    if (stats == null) {
        return zoneAffinity;
    } else {
        ZoneSnapshot snapshot = stats.getZoneSnapshot(filtered);
        double loadPerServer = snapshot.getLoadPerServer();
        int instanceCount = snapshot.getInstanceCount();
        int circuitBreakerTrippedCount = snapshot.getCircuitTrippedCount();
        if (((double) circuitBreakerTrippedCount) / instanceCount >= blackOutServerPercentageThreshold.get()
                || loadPerServer >= activeReqeustsPerServerThreshold.get()
                || (instanceCount - circuitBreakerTrippedCount) < availableServersThreshold.get()) {
            return false;
        } else {
            return true;
        }
    }
}

Explanation:

Obtain statistics for the filtered zone.

Calculate average load per server and other metrics.

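Check three conditions: the share of servers with a tripped circuit breaker, the load per server, and the number of servers still available. If any threshold is breached, zone affinity is disabled and the caller falls back to the unfiltered server list.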
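The three checks can be exercised without Ribbon by passing the snapshot values and thresholds as plain parameters. This is a sketch with hypothetical names; the threshold values used in main are illustrative assumptions, not values taken from the article.

```java
// Hypothetical, dependency-free version of the zone-affinity decision.
class ZoneAffinityDecisionSketch {
    static boolean shouldEnableZoneAffinity(int instanceCount, int circuitTrippedCount, double loadPerServer,
                                            double blackoutThreshold, double maxLoadPerServer,
                                            int minAvailableServers) {
        if (instanceCount == 0) {
            return false; // guard added in this sketch: an empty zone cannot serve traffic
        }
        double trippedShare = (double) circuitTrippedCount / instanceCount;
        boolean tooManyTripped = trippedShare >= blackoutThreshold;
        boolean tooLoaded = loadPerServer >= maxLoadPerServer;
        boolean tooFewAvailable = (instanceCount - circuitTrippedCount) < minAvailableServers;
        // Any single breached threshold is enough to give up zone affinity.
        return !(tooManyTripped || tooLoaded || tooFewAvailable);
    }

    public static void main(String[] args) {
        // Illustrative thresholds: 80% tripped, 0.6 active requests per server, 2 available servers.
        double blackout = 0.8;
        double maxLoad = 0.6;
        int minAvailable = 2;
        System.out.println(shouldEnableZoneAffinity(10, 0, 0.1, blackout, maxLoad, minAvailable)); // healthy zone
        System.out.println(shouldEnableZoneAffinity(10, 9, 0.1, blackout, maxLoad, minAvailable)); // mostly tripped
        System.out.println(shouldEnableZoneAffinity(1, 0, 0.1, blackout, maxLoad, minAvailable));  // too few servers
    }
}
```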

3.4 ZoneAwareLoadBalancer

ZoneAwareLoadBalancer is the default LoadBalancer used in production. It extends DynamicServerListLoadBalancer and adds a map of zone‑specific balancers:

// key: zone, value: LoadBalancer for that zone
ConcurrentHashMap<String, BaseLoadBalancer> balancers;

The core logic excludes zones that are fully blacked out, as well as the worst‑performing zone, and then randomly picks one of the remaining zones:

public Server chooseServer(Object key) {
    if (getLoadBalancerStats().getAvailableZones().size() <= 1) { // 1
        return super.chooseServer(key);
    }
    Server server = null;
    try {
        LoadBalancerStats lbStats = getLoadBalancerStats();
        Map<String, ZoneSnapshot> zoneSnapshot = ZoneAvoidanceRule.createSnapshot(lbStats); // 2
        Set<String> availableZones = ZoneAvoidanceRule.getAvailableZones(
                zoneSnapshot, triggeringLoad.get(), triggeringBlackoutPercentage.get()); // 3
        if (availableZones != null && availableZones.size() < zoneSnapshot.keySet().size()) {
            String zone = ZoneAvoidanceRule.randomChooseZone(zoneSnapshot, availableZones); // 4
            if (zone != null) {
                BaseLoadBalancer zoneLoadBalancer = getLoadBalancer(zone); // 5
                server = zoneLoadBalancer.chooseServer(key);
            }
        }
    } catch (Exception e) {
        logger.error("Error choosing server using zone aware logic for load balancer={}", name, e);
    }
    if (server != null) {
        return server;
    } else {
        return super.chooseServer(key); // 6
    }
}

Explanation:

If at most one zone is available, skip the zone‑aware logic and delegate to the parent class's chooseServer.

Gather statistics for each zone.

Determine available zones by removing blackout zones and the worst‑performing zone based on configured thresholds (triggeringLoad and triggeringBlackoutPercentage).

If at least one zone was excluded, randomly select one of the remaining zones.

Delegate the request to the selected zone's LoadBalancer.

If no server is found, fall back to the parent class's algorithm.
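The zone selection steps can be approximated in plain Java. The sketch below is deliberately simplified and is not ZoneAvoidanceRule: the class names are hypothetical, the most loaded zone is dropped deterministically once its load reaches triggeringLoad, and the random pick is weighted by instance count as an assumption of this example.

```java
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Random;
import java.util.Set;

// Hypothetical per-zone statistics.
class ZoneStatSketch {
    final int instanceCount;
    final int circuitTrippedCount;
    final double loadPerServer;

    ZoneStatSketch(int instanceCount, int circuitTrippedCount, double loadPerServer) {
        this.instanceCount = instanceCount;
        this.circuitTrippedCount = circuitTrippedCount;
        this.loadPerServer = loadPerServer;
    }
}

class ZoneChoiceSketch {
    // Drops empty and blacked-out zones, then drops the most loaded zone if it reaches triggeringLoad.
    static Set<String> availableZones(Map<String, ZoneStatSketch> snapshot,
                                      double triggeringLoad, double triggeringBlackoutPercentage) {
        Set<String> available = new LinkedHashSet<>();
        String worstZone = null;
        double maxLoad = -1.0;
        for (Map.Entry<String, ZoneStatSketch> entry : snapshot.entrySet()) {
            ZoneStatSketch stat = entry.getValue();
            if (stat.instanceCount == 0) {
                continue; // nothing to call in this zone
            }
            double trippedShare = (double) stat.circuitTrippedCount / stat.instanceCount;
            if (trippedShare >= triggeringBlackoutPercentage) {
                continue; // zone is blacked out
            }
            available.add(entry.getKey());
            if (stat.loadPerServer > maxLoad) {
                maxLoad = stat.loadPerServer;
                worstZone = entry.getKey();
            }
        }
        // Keep at least one zone: only drop the worst one when another zone remains.
        if (worstZone != null && maxLoad >= triggeringLoad && available.size() > 1) {
            available.remove(worstZone);
        }
        return available;
    }

    // Picks a zone at random, weighted by instance count so larger zones receive more traffic.
    static String chooseZone(Map<String, ZoneStatSketch> snapshot, Set<String> zones, Random random) {
        int total = 0;
        for (String zone : zones) {
            total += snapshot.get(zone).instanceCount;
        }
        if (total == 0) {
            return null;
        }
        int ticket = random.nextInt(total);
        for (String zone : zones) {
            ticket -= snapshot.get(zone).instanceCount;
            if (ticket < 0) {
                return zone;
            }
        }
        return null; // not reached when total > 0
    }

    public static void main(String[] args) {
        Map<String, ZoneStatSketch> snapshot = new LinkedHashMap<>();
        snapshot.put("zone-a", new ZoneStatSketch(10, 0, 0.1));
        snapshot.put("zone-b", new ZoneStatSketch(10, 10, 0.0)); // every breaker tripped
        snapshot.put("zone-c", new ZoneStatSketch(10, 0, 0.9));  // overloaded
        // Threshold values are illustrative assumptions.
        Set<String> zones = availableZones(snapshot, 0.2, 0.99999);
        System.out.println(zones);
        System.out.println(chooseZone(snapshot, zones, new Random()));
    }
}
```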

4. Production Ribbon Configuration

IPing: DummyPing – because the ServerList (DiscoveryEnabledNIWSServerList) already performs health checks via Eureka.

IRule: AvailabilityFilteringRule.

ServerList: YqgZoneAwareServerList – a thin wrapper around DiscoveryEnabledNIWSServerList that adds zone‑aware logic.

ServerListFilter: ZoneAffinityServerListFilter.

LoadBalancer: ZoneAwareLoadBalancer.

With the foundations described above, it becomes straightforward to see how these classes collaborate to provide a highly available client‑side load‑balancing component.

5. Conclusion

Combined with previous blog posts, this microservice technical series now covers most components of the Spring Cloud Netflix suite. Readers are encouraged to read the source code and practice systematic learning of excellent open‑source projects to improve coding skills.

6. Appendix

Links:

[1] https://blog.fintopia.tech/60a1e74c2078082a378ec5e5/

[2] https://blog.fintopia.tech/60868c70ce7094706059f126/

[3] https://blog.fintopia.tech/607d1e7ece7094706059f124/
