
High‑Performance Layer‑4 Software Load Balancer TDLB Based on DPVS and DPDK

The article describes how Trip.com built a high‑performance, software‑based layer‑4 load balancer (TDLB) using DPVS and DPDK, detailing its lock‑free session handling, user‑IP pass‑through, asynchronous logging, cluster session synchronization, resource isolation, configuration management via etcd/operator, health‑check strategies, and multi‑dimensional monitoring.

Ctrip Technology

Introduction

Trip.com’s traffic ingress architecture combines layer‑4 and layer‑7 load balancing, but the hardware‑based layer‑4 solution suffered from high cost, long procurement cycles, and limited HA capabilities. To meet rapid business growth, the team sought an open‑source, high‑performance software alternative.

TDLB High‑Performance Implementation

DPDK Integration

DPVS, an open‑source layer‑4 virtual server derived from LVS, is combined with DPDK to bypass the kernel network stack: packets are polled directly in user space, which removes interrupt and context‑switch overhead and improves cache hit rates.

Lock‑Free Session Design

TDLB adopts a full‑NAT mode with per‑core session tables, ensuring that both inbound and outbound traffic for a flow are processed by the same core, eliminating inter‑core lock contention.

User Source IP Pass‑Through

Both TOA (TCP Option Address) and Proxy Protocol are supported, so backend services can recover the original client IP: TOA embeds the client address in a TCP option and is typically read via a small kernel module on the backend, while Proxy Protocol prepends a plain‑text header that any protocol‑aware application can parse without kernel changes.

Asynchronous Log Writing

Log messages are queued per core and written by a dedicated logging core, avoiding I/O lock contention that could disrupt packet processing or BGP sessions.

Cluster Session Synchronization

In multi‑active mode, session information is synchronized across cores and nodes. Per‑core internal IPs (SNAT IPs) and FDIR are used so that return traffic is steered to the same core that originated the session, preserving flow affinity.

Two synchronization types are provided: incremental sync for new connections and full sync when a new server joins the cluster.

Resource Isolation

RSS and FDIR distribute packets to specific cores, isolating data paths. NUMA‑aware allocation ensures that each core uses local NIC resources, avoiding cross‑NUMA traffic.

Control‑plane traffic (BGP, health checks) is isolated from data‑plane traffic by assigning it to a dedicated queue.

Cluster Configuration Management

Configuration is stored in etcd. Each TDLB instance runs an operator that watches etcd keys, applies changes, and writes back version information, guaranteeing consistent configuration across the cluster.

Health‑Check Strategy

Health checks are performed on every NIC; failures on one NIC affect only the services bound to that NIC, improving fault tolerance.

Multi‑Dimensional Monitoring

Metrics are collected per‑cluster, per‑server, per‑service, and per‑core using DPDK latency stats, and are exported to Prometheus/Grafana with alerts for rapid fault localization.

Conclusion

TDLB, built on DPVS and DPDK, has operated stably for nearly two years, supporting Trip.com’s services with lower cost, higher performance, and seamless integration into the private cloud, demonstrating the value of adopting open‑source solutions.

Tags: Cloud Native, Kubernetes, Load Balancing, network performance, DPDK, DPVS
Written by Ctrip Technology

Official Ctrip Technology account, sharing and discussing growth.