How UCloud Transformed Its VPC: From Classic Networks to Cloud‑Native Performance
This article traces the evolution of UCloud's VPC from the early classic Layer 2 network, through the SDN-based VPC 2.0, to the hardware-integrated VPC 3.0 architecture. It covers microservice migration, telemetry, dynamic flow learning, and high-performance hardware offload, all aimed at meeting modern cloud networking demands.
Early Classic Network
In the initial stage, UCloud's data centers used a classic Layer 2 network in which cloud hosts and hypervisors shared one large L2 domain, relying on the Linux bridge for forwarding and on iptables/ebtables for isolation.
The classic approach suffered from scale limits, performance bottlenecks, and inflexible IP allocation.
Scale issue: Broadcast domains limited growth, causing storms and MAC table exhaustion.
Performance issue: Linux bridge and growing iptables rules reduced forwarding efficiency.
Topology issue: Uniform IP allocation prevented customers from designing their own address spaces.
VPC 2.0 Architecture: SDN‑Based VPC
At the end of 2016, UCloud launched VPC 2.0, introducing SDN virtualization with Open vSwitch, OpenFlow, and an SDN controller to manage flows.
Flow rules are delivered via a hybrid Packet‑In and push mechanism: routing and ACL flows are pushed proactively, while point‑to‑point flows use Packet‑In to request controller assistance.
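The hybrid delivery described above can be sketched as a toy model. This is only an illustration under stated assumptions: the `Controller` and `VSwitch` classes, their fields, and the returned strings are hypothetical and are not UCloud's implementation or the Open vSwitch API. ACL state is installed before traffic arrives, while point-to-point state is fetched from the controller on a table miss, which is where the first-packet round trip comes from.

```python
from dataclasses import dataclass, field


@dataclass
class Controller:
    """Hypothetical SDN controller holding the authoritative VM locations."""
    vm_locations: dict          # destination VM IP -> hypervisor tunnel endpoint
    packet_in_count: int = 0    # how many table misses reached the controller

    def handle_packet_in(self, dst_ip):
        # Reactive path: the switch asks where dst_ip lives.
        self.packet_in_count += 1
        return self.vm_locations.get(dst_ip)


@dataclass
class VSwitch:
    """Hypothetical per-host virtual switch with two kinds of flow state."""
    controller: Controller
    acl_denied: set = field(default_factory=set)    # pushed proactively
    p2p_flows: dict = field(default_factory=dict)   # filled in via Packet-In

    def push_acl(self, denied_ip):
        # Proactive path: installed before any traffic arrives.
        self.acl_denied.add(denied_ip)

    def forward(self, dst_ip):
        if dst_ip in self.acl_denied:
            return "drop"
        if dst_ip not in self.p2p_flows:
            # Table miss: the first packet pays a controller round trip.
            next_hop = self.controller.handle_packet_in(dst_ip)
            if next_hop is None:
                return "drop"
            self.p2p_flows[dst_ip] = next_hop
        return f"tunnel:{self.p2p_flows[dst_ip]}"
```

In this model, only the first packet toward a new destination reaches the controller; later packets hit the cached point-to-point flow.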
DPDK‑based gateways (load‑balancer, hybrid‑cloud, bare‑metal) interact directly with the VPC for east‑west and north‑south traffic.
Operational challenges of VPC 2.0 included first‑packet latency from Packet‑In, coupling of data‑plane and control‑plane traffic, heterogeneous network integration complexity, and limited OVS performance.
Packet‑In caused noticeable first‑packet delay, especially for latency‑sensitive workloads.
Control‑plane traffic became a target for DDoS and internal scans.
Multiple heterogeneous gateways forced VPC control logic to be scattered.
OVS performance was constrained by kernel‑level locking and queuing.
VPC 3.0 Architecture: Integrated Soft‑Hardware VPC
To address VPC 2.0 shortcomings, UCloud built VPC 3.0, a tightly coupled soft‑hardware solution featuring kernel OVS, hardware‑offload OVS, smart NICs, P4 programmable pipelines, and DPDK.
The control plane introduces four layers: Model, Middle, Mapping, and DataPath. Business objects (e.g., Subnet) are created in the Model layer, routed through the Middle layer, mapped to specific forwarding entities (OpenFlow, P4, TC) in the Mapping layer, and finally pushed to the appropriate data‑path devices.
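One way to picture the four layers is as a small pipeline. The sketch below is an assumption-heavy illustration: the `Subnet` fields, the `device_index` structure, and the rendered entry strings are all invented for this example and are not real OpenFlow, P4, or TC syntax.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Subnet:
    """Model layer: a business object, independent of any device type."""
    vpc_id: str
    cidr: str


def middle_route(obj, device_index):
    """Middle layer: decide which devices need to learn about the object."""
    return device_index.get(obj.vpc_id, [])


# Mapping layer: one translator per forwarding technology.
# The rendered strings are placeholders, not real device syntax.
MAPPERS = {
    "openflow": lambda s: f"openflow-entry vpc={s.vpc_id} dst={s.cidr}",
    "p4": lambda s: f"p4-table-entry vpc={s.vpc_id} dst={s.cidr}",
    "tc": lambda s: f"tc-filter vpc={s.vpc_id} dst={s.cidr}",
}


def publish(obj, device_index):
    """DataPath layer stand-in: return the (device, entry) pairs to push."""
    pushed = []
    for device_name, kind in middle_route(obj, device_index):
        pushed.append((device_name, MAPPERS[kind](obj)))
    return pushed
```

The point of the split is that the business object is written once, and only the Mapping layer knows how each device type expresses it.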
Dynamic learning complements proactive push with on-the-fly flow discovery, using a P4-based BGW gateway and a Datapath Control Protocol (DCP) to offload flows back to OVS.
Learning occurs on the data plane, delivering higher performance than control‑plane pushes.
Traffic continues to flow during learning, eliminating first‑packet delay.
On‑demand learning reduces the number of pushed flows, improving efficiency.
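The three properties above can be illustrated with a toy model of learning on the data plane. The article does not describe the DCP message format or the exact exchange, so everything here is assumed: the `Gateway` and `Host` classes are hypothetical, and the `learn` call merely stands in for a DCP-style notification from the gateway to the sending host.

```python
class Gateway:
    """Hypothetical BGW stand-in that knows every VM location."""

    def __init__(self, vm_locations):
        self.vm_locations = vm_locations   # VM IP -> hosting hypervisor
        self.forwarded = 0                 # packets relayed by the gateway

    def relay(self, dst_ip, source_host):
        self.forwarded += 1
        target = self.vm_locations[dst_ip]
        # Assumed DCP-style notification: tell the sender where dst_ip lives.
        source_host.learn(dst_ip, target)
        return target


class Host:
    """Hypothetical host switch that starts with no point-to-point flows."""

    def __init__(self, gateway):
        self.gateway = gateway
        self.flows = {}

    def learn(self, dst_ip, target):
        self.flows[dst_ip] = target

    def send(self, dst_ip):
        """Return (delivered_to, path_taken) for one packet."""
        if dst_ip in self.flows:
            return self.flows[dst_ip], "direct"
        # No flow yet: the packet is neither queued nor dropped,
        # it simply takes the gateway path while the flow is learned.
        return self.gateway.relay(dst_ip, self), "via-gateway"
```

In this sketch the first packet is delivered through the gateway rather than waiting on a controller, and only destinations that are actually used ever get a flow installed.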
Control‑Plane Middle Services
Common capabilities such as object routing, consistency caching, sharding, and gray‑release are provided by a middle‑layer service, enabling rapid reuse across products and forwarding devices.
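As a rough illustration of two of these shared capabilities, sharding and gray-release can both be built on a stable hash of the object ID. This is a generic technique, not something the article confirms UCloud uses; the function names and the `gray:` salt are hypothetical.

```python
import zlib


def shard_for(object_id, shard_count):
    """Map an object to a shard; the same ID always lands on the same shard."""
    return zlib.crc32(object_id.encode()) % shard_count


def use_new_version(object_id, rollout_percent):
    """Gray-release gate: a stable bucket in [0, 100) decides the version."""
    bucket = zlib.crc32(b"gray:" + object_id.encode()) % 100
    return bucket < rollout_percent
```

Because the bucket is stable, raising `rollout_percent` only ever adds objects to the new version; objects already migrated stay migrated.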
Hardware Offload
Host forwarding has progressed from the kernel Linux bridge to hardware-offloaded OVS and smart NICs. Fast-instance hosts now reach up to 25 Gbps of bandwidth and 10 Mpps of packet forwarding, with up to 10 Gbps of external bandwidth.
Gateways have also moved from pure DPDK to P4 programmable chips, delivering features such as ARP proxy, flow offload, ECMP‑based load balancing, Maglev hashing, and support for both IPv4 and IPv6 overlays.
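For readers unfamiliar with Maglev hashing, the sketch below follows the lookup-table population scheme from the published Maglev design: each backend gets a permutation of table slots derived from an offset and a skip value, and backends take turns claiming their next free preferred slot. The hash function (SHA-256 with a salt), the backend names, and the table size are assumptions made for this example; how UCloud's P4 gateways implement it is not described in the article.

```python
import hashlib


def _h(name, salt):
    """Salted 64-bit hash; SHA-256 is an arbitrary choice for this sketch."""
    digest = hashlib.sha256(f"{salt}:{name}".encode()).digest()
    return int.from_bytes(digest[:8], "big")


def build_table(backends, size=65537):
    """Populate a Maglev lookup table. size must be prime and > len(backends)."""
    if not backends:
        return []
    offsets = [_h(b, "offset") % size for b in backends]
    skips = [_h(b, "skip") % (size - 1) + 1 for b in backends]
    next_idx = [0] * len(backends)   # position in each backend's permutation
    table = [None] * size
    filled = 0
    while True:
        for i, backend in enumerate(backends):
            # Walk backend i's permutation until a free slot is found.
            slot = (offsets[i] + next_idx[i] * skips[i]) % size
            while table[slot] is not None:
                next_idx[i] += 1
                slot = (offsets[i] + next_idx[i] * skips[i]) % size
            table[slot] = backend
            next_idx[i] += 1
            filled += 1
            if filled == size:
                return table


def pick(table, flow_key):
    """Choose a backend for a flow, e.g. keyed by its 5-tuple as a string."""
    return table[_h(flow_key, "flow") % len(table)]
```

Because backends claim slots in turn, their shares of the table differ by at most one slot, and a given flow key always maps to the same backend while the backend set is unchanged.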
Heterogeneous Network Decoupling
UXR‑style centralized gateways decouple heterogeneous networks from VPC, shrinking the network boundary and simplifying integration.
Microservice Migration
UCloud transitioned from a monolithic framework (TCP + Protobuf) to a microservice architecture using Istio, Kubernetes, and gRPC, gaining tighter service cohesion, rapid iteration, elastic scaling, fine‑grained traffic gray‑release, and advanced traffic management (retries, rate‑limiting, circuit breaking).
Telemetry and Fault Localization
To keep pace with the growing scale of cloud networks, UCloud built a high-performance end-to-end telemetry system whose only requirement is that overlay and underlay devices support traffic mirroring (ERSPAN). By injecting In-band Network Telemetry (INT) packets and collecting the mirrored traffic, the system reconstructs end-to-end communication status, approximate latency, and the actual path taken.
Combined with active‑flow analysis, this enables rapid verification of traffic changes and quick detection of anomalies.
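The reconstruction step can be sketched as follows, assuming each mirrored copy of a probe yields a record with a probe identifier, the mirroring device, and a capture timestamp. The `MirrorRecord` fields and the `reconstruct` function are hypothetical; the article does not specify the record format. Latency is only approximate here, as in the article, because it depends on how well device clocks agree.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class MirrorRecord:
    probe_id: int       # identifier carried by the injected probe (assumed)
    device: str         # device that mirrored the packet
    timestamp_us: int   # capture time in microseconds


def reconstruct(records, expected_last_hop):
    """Group mirrored copies by probe and derive path, latency, and status."""
    by_probe = defaultdict(list)
    for rec in records:
        by_probe[rec.probe_id].append(rec)

    report = {}
    for probe_id, recs in by_probe.items():
        recs.sort(key=lambda r: r.timestamp_us)
        path = [r.device for r in recs]
        report[probe_id] = {
            "path": path,
            "latency_us": recs[-1].timestamp_us - recs[0].timestamp_us,
            "reached": path[-1] == expected_last_hop,
        }
    return report
```

A probe whose last mirrored hop is not the expected destination points at the segment where traffic was lost.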
Summary
Under the latest VPC 3.0 architecture, UCloud VPC delivers high‑performance forwarding (up to 10 Mpps intra‑network, 10 Gbps per EIP), native IPv6 support, and fine‑grained ACL/security‑group controls. The team will continue to monitor networking hardware and software advances to provide secure, stable, and high‑throughput cloud services.
UCloud Tech
UCloud is a leading neutral cloud provider in China, developing its own IaaS, PaaS, AI service platform, and big data exchange platform, and delivering comprehensive industry solutions for public, private, hybrid, and dedicated clouds.