
How UCloud Scales IPv6 with NAT64 and P4: Architecture, Performance & Lessons

This article explains how UCloud addresses IPv4 limitations by deploying a free IPv6 conversion service built on stateful NAT64 and programmable P4 switches, detailing the strategic rollout, system architecture, high‑availability and security mechanisms, load‑balancing algorithms, P4 table optimizations, and performance test results.

UCloud Tech

IPv4's address exhaustion, together with its security, QoS, and routing limitations, hinders cloud computing development. IPv6 offers a vastly larger address space and improved security, which addresses these problems.

Since early 2018, UCloud has developed a free public‑entry IPv6 conversion service based on NAT64 and programmable P4 switches. After obtaining an EIP, users can enable IPv6 with a single click and without any infrastructure changes. The service now supports cloud hosts, EIPs, load balancers, container clusters, bastion hosts, and more. A single cluster (16 NAT64 servers, 4 P4 switches) can handle up to 3.2 M new connections per second (CPS) and 1.6 G concurrent connections, and can be scaled out smoothly in the future.

UCloud IPv6 Evolution Strategy

UCloud began IPv6 research years ago and follows a phased roadmap:

2018: Launch public‑entry IPv6 conversion, enabling over 50% of products to support IPv6 without business changes.

2018: Complete IPv6 upgrade of the management network, allowing internal cloud services to use IPv6.

2019 Q2: Major products such as VPC and ULB gain IPv6 capability.

2019: Finish dual‑stack IDC upgrade, providing full IPv6 support inside data centers.

NAT64 and Its Technical Challenges

UCloud implements stateful NAT64, which needs at least one IPv4 address and a /96 IPv6 prefix to translate between IPv4 and IPv6. With a /96 prefix, the IPv4 address is embedded in the low 32 bits of the IPv6 destination address, and the translator creates an address mapping that can be configured manually or generated automatically.
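The address embedding described above can be sketched in a few lines of Python. This is a minimal illustration that assumes the RFC 6052 address format for a /96 prefix. The well-known prefix 64:ff9b::/96 is used here as a stand-in, and the helper names are hypothetical; the article does not state which prefix or tooling UCloud uses.

```python
import ipaddress

# Assumption: the RFC 6052 well-known prefix stands in for the operator's own /96.
NAT64_PREFIX = ipaddress.IPv6Network("64:ff9b::/96")


def embed_ipv4(prefix, ipv4):
    """Place a 32-bit IPv4 address in the low 32 bits of a /96 IPv6 prefix."""
    if prefix.prefixlen != 96:
        raise ValueError("this sketch only handles /96 prefixes")
    v4 = ipaddress.IPv4Address(ipv4)
    return ipaddress.IPv6Address(int(prefix.network_address) | int(v4))


def extract_ipv4(prefix, ipv6):
    """Recover the embedded IPv4 address from an IPv6 address inside the prefix."""
    addr = ipaddress.IPv6Address(ipv6)
    if addr not in prefix:
        raise ValueError("address is outside the NAT64 prefix")
    return ipaddress.IPv4Address(int(addr) & 0xFFFFFFFF)


synthesized = embed_ipv4(NAT64_PREFIX, "192.0.2.33")
print(synthesized)                              # IPv6 form seen by IPv6 clients
print(extract_ipv4(NAT64_PREFIX, synthesized))  # IPv4 address behind it
```

An IPv6 client connects to the synthesized address; the translator recovers the IPv4 target from the low 32 bits and keeps per-connection state for the return path.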

Key challenges are high availability and security protection, as the stateful service must preserve existing connections during node changes and resist DDoS attacks.

System Architecture

The architecture combines NAT64 with P4 switches. NAT64 Access runs on P4 hardware, using consistent hashing for high availability and CPS rate‑limiting for DDoS mitigation.

NAT64 Access and a physical switch form a Layer 3 network and announce a /96 IPv6 prefix via BGP. POP points share the same prefix for load balancing and disaster recovery. Behind the Access tier sit the NAT64 servers, whose VIPs are announced via BGP to the Access switches.

When P4 Meets NAT64

To achieve high availability, NAT64 Access selects backend nodes using a consistent‑hash gateway. DPDK, a common solution, has drawbacks such as reliance on hardware load‑balancing and high CPU costs at higher speeds. Therefore, UCloud adopted Barefoot Tofino‑based P4 switches, which provide 1.8–6.4 Tbps forwarding, stable performance independent of CPU load, 100 Gbps line rate, rich programmability, and a strong ecosystem.

Choosing and Validating Maglev Hash

UCloud selected Google’s Maglev hash for its stable lookup table size and minimal disruption when backend nodes change. Tests showed that with a lookup table size of 1024, about 2 % of connections were affected during scaling; increasing the table size to >2000× the number of backends reduced disruption to <0.01 %.
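The table-population step of Maglev hashing, and one way to measure disruption when a backend is removed, might look like the sketch below. It follows the published Maglev algorithm, which requires a prime table size, so the sizes used here (1031 and 65537) are illustrative assumptions rather than the sizes UCloud tested. The backend names, the BLAKE2 hash choice, and the function names are also hypothetical.

```python
import hashlib


def _h(name, salt):
    """Deterministic 64-bit hash of a backend name (hash choice is an assumption)."""
    digest = hashlib.blake2b(name.encode(), digest_size=8, salt=salt).digest()
    return int.from_bytes(digest, "big")


def maglev_table(backends, size):
    """Fill a lookup table of prime length `size` with backend names."""
    if not backends:
        raise ValueError("need at least one backend")
    # Each backend gets its own permutation of slots: offset + j * skip (mod size).
    offsets = [_h(b, b"offset") % size for b in backends]
    skips = [_h(b, b"skip") % (size - 1) + 1 for b in backends]
    next_idx = [0] * len(backends)
    table = [None] * size
    filled = 0
    while True:
        # Backends take turns claiming their next preferred free slot.
        for i, name in enumerate(backends):
            slot = (offsets[i] + next_idx[i] * skips[i]) % size
            while table[slot] is not None:
                next_idx[i] += 1
                slot = (offsets[i] + next_idx[i] * skips[i]) % size
            table[slot] = name
            next_idx[i] += 1
            filled += 1
            if filled == size:
                return table


def disrupted_fraction(before, after, removed):
    """Share of slots that changed owner although their old owner is still present."""
    moved = sum(1 for old, new in zip(before, after)
                if old != removed and old != new)
    return moved / len(before)


backends = [f"nat64-{i}" for i in range(16)]
for size in (1031, 65537):
    before = maglev_table(backends, size)
    after = maglev_table(backends[:-1], size)
    print(size, disrupted_fraction(before, after, backends[-1]))
```

Because backends claim slots in turn, each one ends up with an almost equal share of the table. The printed fraction counts only slots whose owner changed even though that owner was not removed, which is the kind of collateral disruption the table-size tests above are concerned with.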

NAT64 Access Work Flow

The Access device holds a Maglev‑generated lookup table mapping VIPs to backend MAC addresses. Incoming packets are hashed (source/destination IP and ports) to obtain an entry index, which selects the appropriate backend MAC.
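The lookup step can be illustrated as follows. This is a software sketch only: on the switch the hash and table lookup run in the P4 pipeline, and the CRC32 hash, the table contents, and the MAC addresses below are assumptions made for the example.

```python
import ipaddress
import struct
import zlib


def flow_index(src_ip, dst_ip, src_port, dst_port, table_size):
    """Hash source/destination IP and ports to a lookup-table index."""
    key = (ipaddress.ip_address(src_ip).packed
           + ipaddress.ip_address(dst_ip).packed
           + struct.pack("!HH", src_port, dst_port))
    # CRC32 is an assumption; any stable hash of the flow fields would do here.
    return zlib.crc32(key) % table_size


def select_backend_mac(lookup_tables, vip, src_ip, src_port, dst_port):
    """Pick the backend MAC for a packet addressed to `vip`."""
    table = lookup_tables[vip]
    return table[flow_index(src_ip, vip, src_port, dst_port, len(table))]


# Hypothetical table for one VIP: in practice it would be Maglev-generated.
lookup_tables = {
    "64:ff9b::c000:221": [
        "02:00:00:00:00:01",
        "02:00:00:00:00:02",
        "02:00:00:00:00:03",
        "02:00:00:00:00:01",
        "02:00:00:00:00:02",
    ],
}

mac = select_backend_mac(lookup_tables, "64:ff9b::c000:221",
                         "2001:db8::1", 40000, 443)
print(mac)
```

Every packet of a flow carries the same addresses and ports, so it hashes to the same index and reaches the same NAT64 server, which is what a stateful translator needs.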

DDoS Protection

Each EIP’s inbound/outbound TCP SYN packets are rate‑limited (default 50 000 pps) as a secondary safeguard, complementing UCloud’s upstream DDoS mitigation.
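A per‑EIP SYN limit of this kind is commonly modeled as a token bucket. The sketch below is a software illustration under that assumption; the article does not say how the limit is implemented on the switch, and the class name, burst size, and timestamps are hypothetical.

```python
class SynRateLimiter:
    """Token-bucket limiter keyed by EIP (a sketch, not the switch implementation)."""

    def __init__(self, rate_pps=50_000, burst=None):
        self.rate = rate_pps                                # tokens added per second
        self.burst = burst if burst is not None else rate_pps
        self.buckets = {}                                   # eip -> (tokens, last_seen)

    def allow(self, eip, now):
        """Return True if a SYN for `eip` arriving at time `now` (seconds) may pass."""
        tokens, last = self.buckets.get(eip, (self.burst, now))
        # Refill in proportion to elapsed time, capped at the burst size.
        tokens = min(self.burst, tokens + (now - last) * self.rate)
        if tokens >= 1:
            self.buckets[eip] = (tokens - 1, now)
            return True
        self.buckets[eip] = (tokens, now)
        return False


limiter = SynRateLimiter(rate_pps=50_000)
# The caller is expected to invoke allow() only for TCP SYN packets.
print(limiter.allow("203.0.113.10", now=0.0))
```

Each EIP has its own bucket, so a SYN flood against one address uses up only that address's budget and leaves other EIPs unaffected.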

P4 Table Configuration Optimization

Tofino chips have four pipelines with twelve stages each. By reducing inter‑table dependencies and splitting logic between ingress and egress, resource utilization rose from ~30 % to ~70 %.

System Performance Testing

Testing a single NAT64 server (32‑core CPU, 64 GB RAM, dual 10 Gb NIC) with bidirectional UDP traffic yielded the following results:

CPS peak: 3.2 M CPS per region (16 servers).

Concurrent connections: 1.6 G.
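For a rough sense of scale, the cluster figures can be divided by the number of servers. This back‑of‑the‑envelope calculation assumes that load is spread evenly across the 16 servers and that "1.6 G" means 1.6 × 10^9 connections; neither is stated explicitly in the results.

```python
SERVERS = 16
CLUSTER_CPS = 3_200_000                  # 3.2 M new connections per second
CLUSTER_CONCURRENT = 1_600_000_000       # 1.6 G concurrent connections (assumed 1.6e9)

# Assumption: traffic is spread evenly across all servers.
per_server_cps = CLUSTER_CPS // SERVERS
per_server_concurrent = CLUSTER_CONCURRENT // SERVERS

print(per_server_cps)          # 200000
print(per_server_concurrent)   # 100000000
```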

The service is currently in free public beta.
