How UCloud Leveraged P4 Programmable Switches to Revolutionize Cloud Networking
This article traces UCloud's evolution from early Open vSwitch‑based SDN to DPDK gateways, reviews the VXLAN and OpenFlow alternatives it ruled out, and explains how adopting Barefoot's P4 programmable switches enabled higher performance, a flexible control plane, sharding, and seamless grey‑release upgrades for its cloud network.
DPDK's shortcomings
With the rapid adoption of 25G networks in 2017, UCloud faced new challenges. DPDK‑based applications reach high packet rates only by spreading load across multiple servers and CPU cores, and that distribution cannot be software‑defined, so a single large "elephant" flow can still congest one NIC or CPU core. Scaling to higher bandwidths (40G, 100G) also demands more powerful, costlier CPUs.
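The elephant‑flow problem can be illustrated with a minimal sketch. It assumes packets are steered to cores by hashing the 5‑tuple, with CRC32 standing in for a NIC's receive‑side hash; the core count, addresses, and function names are hypothetical and not taken from UCloud's gateways.

```python
import zlib
from collections import Counter

NUM_CORES = 8  # hypothetical number of worker cores


def core_for_flow(src_ip, dst_ip, src_port, dst_port, proto):
    """Pick a core from a hash of the 5-tuple (CRC32 is a stand-in hash)."""
    key = f"{src_ip}|{dst_ip}|{src_port}|{dst_port}|{proto}".encode()
    return zlib.crc32(key) % NUM_CORES


def load_per_core(packets):
    """Sum bytes per core for an iterable of (five_tuple, size_in_bytes)."""
    load = Counter()
    for five_tuple, size in packets:
        load[core_for_flow(*five_tuple)] += size
    return load


# One elephant flow plus 200 small "mice" flows (all values are made up).
elephant = ("10.0.0.1", "10.0.0.2", 40000, 443, "tcp")
mice = [(f"10.0.1.{i}", "10.0.0.2", 1024 + i, 80, "tcp") for i in range(200)]

packets = [(elephant, 1500)] * 10_000 + [(m, 1500) for m in mice]
load = load_per_core(packets)
elephant_core = core_for_flow(*elephant)
```

Because every packet of a flow hashes to the same value, all of the elephant's bytes land on `elephant_core` no matter how many cores are added; adding cores only spreads the mice.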
Two excluded solutions
UCloud evaluated two alternatives in 2017. The first, VXLAN VTEP‑based solutions from switch vendors, offered a complete SDN stack but were closed, non‑standard, limited to ~100k MAC addresses, and lacked Ethernet‑over‑GRE support. The second, hardware switches supporting OpenFlow 1.3, fell well short of OVS: they could neither import OpenFlow flows nor support Ethernet‑over‑GRE or flow‑based tunneling.
P4 enters the scene
In Q4 2017, UCloud began researching Barefoot's P4‑programmable Tofino switches. P4, a language whose co‑creators include Nick McKeown (a co‑founder of both Nicira and Barefoot), describes packet processing in a protocol‑independent way, so forwarding behavior can be reconfigured in the field and expressed independently of the underlying device.
P4 Switch Architecture
The control plane was built from scratch rather than trimming the original switch.p4. The NOS layer leverages Linux: non‑GRE packets are sent to the CPU via a virtual NIC, then processed by the kernel (ARP, routing). User‑space programs like Quagga run BGP over the virtual interface, while a bf_switchd plugin learns kernel state over netlink and programs the Tofino chip through the generated P4 APIs.
The control channel initially used Apache Thrift, which limited batch configuration. Replacing it with a gRPC server increased configuration throughput eightfold, and UCloud plans to adopt P4Runtime and Stratum in the future.
Sharding
A switch's table capacity is constrained by its on‑chip SRAM and TCAM. To overcome single‑chip resource limits, UCloud shards data per tenant using a 64‑port P4 switch. The lower six bits of the VNI select one of 2^6 = 64 shards, each assigned a next‑hop, enabling horizontal scaling of the cluster.
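The shard lookup can be sketched as follows. Only the six‑bit mask comes from the description above; the switch names and the round‑robin assignment of shards to switches are hypothetical, and in practice this mapping would live in a match‑action table rather than in Python.

```python
SHARD_BITS = 6
NUM_SHARDS = 1 << SHARD_BITS   # 64 shards
SHARD_MASK = NUM_SHARDS - 1    # 0x3F, keeps the lower six bits


def shard_of(vni: int) -> int:
    """Return the shard id encoded in the lower six bits of the VNI."""
    if vni < 0:
        raise ValueError("VNI must be non-negative")
    return vni & SHARD_MASK


def build_next_hops(switches):
    """Assign each of the 64 shards a next-hop (round-robin is an assumption)."""
    if not switches:
        raise ValueError("at least one switch is required")
    return {shard: switches[shard % len(switches)] for shard in range(NUM_SHARDS)}


def next_hop_for(vni: int, table) -> str:
    """Look up the next-hop switch that holds this tenant's shard."""
    return table[shard_of(vni)]


table = build_next_hops(["sw-a", "sw-b", "sw-c", "sw-d"])  # hypothetical names
hop = next_hop_for(0x12345, table)  # 0x12345 & 0x3F == 5, so shard 5
```

With this layout, growing the cluster means re‑pointing some shards at new switches; tenants keep the same shard id because it is derived from the VNI alone.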
Grey‑release capability
UCloud implements per‑account grey releases for switch software upgrades. The process includes deploying a grey switch with the new version, defining new data shards for selected accounts, routing traffic to the grey switch based on VNI + IP, performing automatic post‑upgrade connectivity tests, and gradually expanding the grey traffic until the entire VPC and eventually all VPCs are migrated, after which the old switches are decommissioned.
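The VNI + IP steering step can be sketched as an override table consulted before the normal shard lookup. This is a simplified model under stated assumptions: the class, the rule format (VNI plus a CIDR prefix), the first‑match semantics, and the switch names are all hypothetical and are not UCloud's implementation.

```python
import ipaddress


class GreyRouter:
    """Route to a grey switch when a (VNI, IP prefix) rule matches."""

    def __init__(self, shard_next_hops):
        self.shard_next_hops = shard_next_hops  # shard id -> stable switch
        self.grey_rules = {}                    # vni -> [(network, grey switch)]

    def add_grey_rule(self, vni, cidr, grey_switch):
        """Send traffic for this VNI and prefix to the grey switch."""
        network = ipaddress.ip_network(cidr)
        self.grey_rules.setdefault(vni, []).append((network, grey_switch))

    def route(self, vni, dst_ip):
        """Return the grey switch on a rule match, else the stable shard owner."""
        addr = ipaddress.ip_address(dst_ip)
        for network, grey_switch in self.grey_rules.get(vni, []):
            if addr in network:  # first matching rule wins (an assumption)
                return grey_switch
        return self.shard_next_hops[vni & 0x3F]


stable = {shard: f"stable-{shard % 2}" for shard in range(64)}
router = GreyRouter(stable)
router.add_grey_rule(100, "10.0.0.0/24", "grey-1")  # start with one subnet
first_hop = router.route(100, "10.0.0.7")           # matches the grey rule
```

Expanding the release then amounts to adding wider prefixes for the same VNI, for example `0.0.0.0/0` to move the entire VPC, and repeating the process for further VNIs.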
P4 Switch Applications
UCloud plans to use P4 switches for various scenarios: enhanced tenant switching and routing, bare‑metal access to virtual networks, consistent‑hash ECMP load balancing, traffic shaping and billing, ARP proxy, and more.
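As an illustration of the consistent‑hash idea behind the ECMP item, the sketch below uses rendezvous (highest‑random‑weight) hashing. This is a stand‑in chosen for brevity, not necessarily the scheme UCloud or the Tofino pipeline uses; the backend names and flow‑key format are hypothetical.

```python
import hashlib


def _score(flow_key: str, backend: str) -> int:
    """Deterministic pseudo-random weight for a (flow, backend) pair."""
    digest = hashlib.sha256(f"{flow_key}|{backend}".encode()).digest()
    return int.from_bytes(digest[:8], "big")


def pick_backend(flow_key: str, backends):
    """Choose the backend with the highest score for this flow."""
    if not backends:
        raise ValueError("no backends available")
    return max(backends, key=lambda b: _score(flow_key, b))


backends = ["gw-0", "gw-1", "gw-2", "gw-3"]  # hypothetical next-hops
flows = [f"10.0.0.{i % 250}:{1000 + i}->10.1.0.1:80/tcp" for i in range(1000)]

before = {f: pick_backend(f, backends) for f in flows}
after = {f: pick_backend(f, backends[:-1]) for f in flows}  # "gw-3" removed

# Only flows that were on the removed next-hop get remapped.
moved = [f for f in flows if before[f] != after[f]]
```

The property that matters for load balancing is that removing one next‑hop moves only the flows that were mapped to it, whereas a plain `hash % n` would reshuffle most flows.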
Currently, the next‑generation UXR gateway built on P4 switches has been tested and deployed in a region with grey‑traffic validation.
Conclusion
After the Tofino chip entered mass production in early 2018, P4 programmable switches began appearing in the market. Although still emerging and with some limitations, UCloud finds that they offer engineers unparalleled flexibility compared to traditional switches. Collaboration with Barefoot reinforces the belief that "Software is eating the network".
UCloud Tech
UCloud is a leading neutral cloud provider in China, developing its own IaaS, PaaS, AI service platform, and big data exchange platform, and delivering comprehensive industry solutions for public, private, hybrid, and dedicated clouds.
