How Intel’s HDSLB Redefines High‑Performance L4 Load Balancing for Cloud and Edge

Intel's HDSLB is a high-density, scalable L4 load balancer that uses Intel Xeon CPU instructions and E810 NIC acceleration to deliver throughput that scales linearly across cores, along with a robust feature set for cloud and edge networking that outperforms traditional LB solutions.

AI Cyberspace

May 15, 2024

Introduction and Background

In the era of rapid cloud computing, SDN, and NFV adoption, the growing number of cloud users and expanding data‑center scale make cost‑effective scaling essential. Network performance remains a perpetual concern because the architecture of cloud‑centric systems ties overall cost and competitiveness to the efficiency of the underlying network.

Cloud networking performance goals fall into four areas: physical bandwidth, virtual tunnel forwarding, L4 load balancing, and application-layer I/O processing. As data-center and edge bandwidth demands rise, the performance of the load balancer at the network entry point becomes critical. This series therefore focuses on Intel HDSLB, a high-density, scalable L4 load balancer built with a software-hardware co-acceleration approach.

Limitations of Traditional Load‑Balancing Technologies

Traditional LB solutions, such as LVS, HAProxy, or Nginx paired with Keepalived, provide high availability and L4-L7 proxy capabilities. In large-scale cloud infrastructures, however, they run into performance bottlenecks, poor scalability, and limited cloud-native adaptability. An LB design for this environment has to answer several questions:

LB Algorithms : How to intelligently distribute traffic according to diverse application scenarios?

High Availability : Should the LB node use active‑passive or active‑active redundancy?

Reverse Proxy : Support for TCP, UDP, SSL, HTTP, FTP, ALG, and other protocols?

High Performance : Achieve higher bandwidth, lower latency, higher CPS, and larger backend server pools?

Clusterability : Provide horizontal scaling capabilities?

Features and Advantages of Intel HDSLB

Intel initiated the High‑Density Scalable Load Balancer (HDSLB) project to deliver a best‑in‑class L4 load balancer. Its key characteristics are:

High Density : Extremely high TCP concurrent connections and throughput per node.

Scalable : Performance scales linearly with CPU core count or total resources.

HDSLB is available in two editions: HDSLB-DPVS, which is open source on GitHub, and HDSLB-VPP, a commercial version with additional advanced features.

Higher Performance : Single‑node throughput up to 150 Mpps, 100 M concurrent TCP connections, 10 M new TCP connections per second.

Hardware Acceleration : Leverages Intel Xeon AVX2/AVX-512 instructions and on-CPU accelerators such as DLB and DSA, together with Intel E810 100 GbE NIC features such as SR-IOV, FDIR, RSS, DDP, and ADQ.

Multi‑core Scaling : Throughput grows linearly with core count.

Flexible Horizontal Scaling : Native NFV support for dynamic scaling.

Multiple LB Algorithms : RR, WLC, Consistent Hash, etc.

Various LB Modes : FULL‑NAT, SNAT, DNAT, DR, IP‑IP tunneling.

HA Clustering : Keepalived‑based active‑passive high availability with session sync.
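
The scheduling algorithms named in the list above (RR, WLC, Consistent Hash) can be illustrated with a minimal Python sketch. This is a generic illustration of the techniques, not HDSLB code: the function names, the `(active_connections, weight)` tuple layout, and the `vnodes` parameter are all assumptions made for the example.

```python
import bisect
import hashlib
from itertools import cycle


def round_robin(servers):
    """RR: return an endless iterator that rotates through the servers."""
    return cycle(servers)


def weighted_least_connections(servers):
    """WLC: pick the server with the lowest active-connections / weight ratio.

    `servers` maps name -> (active_connections, weight); weights must be > 0.
    This dict layout is a hypothetical choice for the sketch.
    """
    return min(servers, key=lambda name: servers[name][0] / servers[name][1])


class ConsistentHashRing:
    """Hash ring with virtual nodes; a key maps to the next point clockwise."""

    def __init__(self, servers, vnodes=100):
        # Each server contributes `vnodes` points so load spreads evenly.
        self._ring = sorted(
            (self._hash(f"{server}#{i}"), server)
            for server in servers
            for i in range(vnodes)
        )
        self._points = [point for point, _ in self._ring]

    @staticmethod
    def _hash(key):
        digest = hashlib.sha256(key.encode()).digest()
        return int.from_bytes(digest[:8], "big")

    def lookup(self, key):
        # Wrap around to the first point when the key hashes past the last one.
        idx = bisect.bisect(self._points, self._hash(key)) % len(self._ring)
        return self._ring[idx][1]
```

With consistent hashing, removing one backend only remaps the flows that were assigned to it; flows on the surviving backends keep their mapping, which is why the technique suits pools that scale in and out.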

Performance Metrics

Benchmark results from Volcano Engine show that HDSLB‑VPP achieves 8 Mpps single‑core throughput and 880 K TCP CPS, both exhibiting linear scaling across cores.
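
To make the linear-scaling claim concrete, the following back-of-the-envelope sketch projects node throughput from the per-core figures. It assumes ideal linear scaling and treats the 880 K CPS figure as a per-core value; both are assumptions for the arithmetic, not measured results, and the function names are hypothetical.

```python
import math

PER_CORE_MPPS = 8.0      # single-core throughput quoted in the article
PER_CORE_CPS = 880_000   # TCP CPS quoted in the article, assumed to be per core


def projected_mpps(cores, per_core_mpps=PER_CORE_MPPS):
    """Throughput in Mpps under the ideal-linear-scaling assumption."""
    return cores * per_core_mpps


def cores_needed(target, per_core):
    """Smallest whole number of cores whose combined rate reaches `target`."""
    return math.ceil(target / per_core)


if __name__ == "__main__":
    print(projected_mpps(16))                      # 16 cores at 8 Mpps each
    print(cores_needed(150, PER_CORE_MPPS))        # cores for 150 Mpps
    print(cores_needed(10_000_000, PER_CORE_CPS))  # cores for 10 M CPS
```

Under these assumptions, 16 cores project to 128 Mpps, a 150 Mpps node needs 19 forwarding cores, and 10 M new connections per second needs 12. Real deployments would fall short of the ideal wherever memory bandwidth or NIC limits intervene.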

Comparison with Competitors

Compared with a typical open-source L4 LB (tested on an E5-2650 CPU with 2K-10K TCP sessions per core and 64-byte UDP packets), HDSLB-VPP on a 3rd-gen Xeon-SP (10 M TCP sessions per core) delivers more than three times the single-core throughput in FNAT IPv4 tests and five times the TCP CPS.

Optimizations in the VPP‑based data structures enable up to 100 M concurrent TCP sessions with the same memory footprint, extending to 500 M in FNAT mode and 1 B in NAT mode.
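
For readers unfamiliar with the FNAT (FULL-NAT) mode used in these tests: as commonly implemented in LVS-style balancers, it rewrites both the source and the destination of each packet, so the backend replies to the balancer rather than to the client. The sketch below illustrates that general technique under stated assumptions; the class and field names are hypothetical and do not reflect HDSLB's internal data structures.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Endpoint:
    ip: str
    port: int


@dataclass(frozen=True)
class Packet:
    src: Endpoint
    dst: Endpoint


@dataclass(frozen=True)
class FnatSession:
    """One FULL-NAT session: four endpoints that define both rewrites."""

    client: Endpoint  # original source of the connection
    vip: Endpoint     # virtual service address the client connects to
    local: Endpoint   # address the balancer uses toward the backend
    real: Endpoint    # backend (real server) chosen by the scheduler

    def inbound(self, pkt: Packet) -> Packet:
        """Rewrite client -> vip into local -> real."""
        if (pkt.src, pkt.dst) != (self.client, self.vip):
            raise ValueError("packet does not belong to this session")
        return Packet(src=self.local, dst=self.real)

    def outbound(self, pkt: Packet) -> Packet:
        """Rewrite real -> local back into vip -> client."""
        if (pkt.src, pkt.dst) != (self.real, self.local):
            raise ValueError("packet does not belong to this session")
        return Packet(src=self.vip, dst=self.client)
```

Because every session holds four endpoints instead of two, FULL-NAT session tables tend to be larger per entry than plain NAT ones, which is consistent with the different session ceilings quoted for the two modes.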

Typical Application Scenarios

HDSLB is primarily deployed as an L4 network element in cloud and edge computing environments.

Massive Baseline Traffic : Cloud tenants generate huge, rapidly changing traffic; HDSLB’s horizontal and multi‑core scaling meets this demand while reducing server procurement costs.

Elephant Flows : Unpredictable high-volume flows require high per-core packet-processing performance; HDSLB-VPP's Intel DLB acceleration delivers near-line-rate handling for 96-512 byte packets.

Low‑Latency Edge Workloads : OT/CT vertical industries demand strict latency and jitter; Intel E810/IPU NIC optimizations provide low‑latency, jitter‑resistant transmission.

Strong Edge‑Node Capability : Limited physical space at edge sites calls for high per‑node performance, which HDSLB achieves through comprehensive CPU, SmartNIC, and IPU tuning.
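
To put "near-line-rate" in the elephant-flow scenario into numbers, the theoretical packet rate of an Ethernet link can be computed from the frame size plus the fixed 20 bytes of per-frame overhead (7-byte preamble, 1-byte start-of-frame delimiter, 12-byte inter-frame gap). The sketch below assumes a 100 GbE port and that the quoted packet sizes are Ethernet frame sizes including the FCS; the function name is hypothetical.

```python
ETHERNET_OVERHEAD_BYTES = 20  # 7 preamble + 1 SFD + 12 inter-frame gap


def line_rate_mpps(link_gbps, frame_bytes):
    """Theoretical maximum packet rate in Mpps for a given frame size."""
    bits_on_wire = (frame_bytes + ETHERNET_OVERHEAD_BYTES) * 8
    return link_gbps * 1e9 / bits_on_wire / 1e6


if __name__ == "__main__":
    for size in (64, 96, 512):
        print(f"{size:>4} B frames on 100 GbE: {line_rate_mpps(100, size):.2f} Mpps")
```

On 100 GbE this works out to roughly 148.8 Mpps for 64-byte frames, 107.8 Mpps for 96-byte frames, and 23.5 Mpps for 512-byte frames. The 64-byte figure is close to the 150 Mpps single-node throughput quoted earlier.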

Future Roadmap

Support for hundreds of millions of concurrent TCP connections.

Single‑core throughput exceeding 8 Mpps with linear scaling.

TCP connection‑setup rate above 800 K CPS per core, scaling linearly.

Elephant‑flow processing powered by 4th‑gen Xeon‑SP accelerators.

QoS traffic‑shaping capabilities.

Integrated Anti‑DDoS security features.

With Intel’s heterogeneous compute ecosystem and open‑source communities such as DPDK and VPP, HDSLB is positioned to expand into more scenarios as cloud‑gateway NFV and hardware‑based edge gateways converge.

References

https://blog.csdn.net/weixin_37097605/article/details/131098713

https://www.intel.cn/content/www/cn/zh/customer-spotlight/cases/volcano-engine-edge-cloud-balance-hdslb.html

https://www.intel.cn/content/dam/www/central-libraries/cn/zh/documents/2022-11/22-cmf233-vivo-works-with-to-optimize-hdslb-significantly-improving-load-balancing-systems-soution-briefs.pdf

https://blog.csdn.net/Jmilk/article/details/129939424

https://www.intel.cn/content/dam/www/central-libraries/cn/zh/documents/2023-01/23-22cmf255-volcano-engine-edge-cloud-sees-great-optimazation-in-four-tier-load-balancing-performance-with-hdslb-built-on-intel-hardware-and-software-case-study.pdf

https://networkbuilders.intel.com/solutionslibrary/intel-dynamic-load-balancer-intel-dlb-accelerating-elephant-flow-technology-guide

https://networkbuilders.intel.com/docs/networkbuilders/high-density-scalable-load-balancer-a-vpp-based-layer-4-load-balancer-technology-guide-1701169184.pdf
