Evolution of Data Center Network Architecture at Qunar: From Traditional STP to Leaf‑Spine and VXLAN
The article outlines Qunar's data‑center network evolution, describing the limitations of traditional STP‑based designs, the adoption of vPC for active‑active redundancy, the transition to leaf‑spine topology for scalability, and the implementation of VXLAN to support large‑scale multi‑tenant cloud environments.
Background: The rapid development of data‑center networks has introduced virtualization, private/public/hybrid clouds, SDN, and big‑data technologies. Traffic has shifted from north‑south client‑server flows to east‑west intra‑datacenter communication, demanding higher performance, scalability, and reliability.
Backbone network architecture: Same‑city data centers are linked by physical fiber, while remote sites connect via IP tunnels. Design principles include a ring topology to avoid single points of failure, flexible routing for easy management, and wavelength‑division multiplexing for smooth bandwidth scaling.
Current production network architectures
First‑generation network: Traditional STP‑based Layer‑2 design.
The traditional STP network suffers from convergence delays after link or switch failures, under‑utilized blocked links, sub‑optimal forwarding due to a single root switch, lack of ECMP, and broadcast storms, all of which degrade reliability.
Second‑generation network: Switch virtualization (vPC)
Virtual PortChannel (vPC) allows downstream devices to connect to a pair of switches, which appear as a single logical switch. Both links remain active, enabling active‑active forwarding and hash‑based load balancing, eliminating the need for STP and its blocked ports.
Benefits of vPC include cross‑link redundancy for downstream devices, removal of STP, increased uplink bandwidth, minimal impact from a single link failure, high reliability with dual‑active operation, simplified topology, and smoother network upgrades.
However, MAC table flushing during switch flaps can still cause brief server interruptions.
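The vPC pairing described above can be sketched with an NX‑OS‑style configuration. This is an illustrative fragment, not Qunar's actual configuration; the domain ID, port‑channel numbers, and keepalive addresses are placeholders.

```
! Hypothetical NX-OS vPC sketch (IDs and addresses are illustrative)
feature vpc
feature lacp

vpc domain 10
  peer-keepalive destination 10.0.0.2 source 10.0.0.1

! Peer-link between the two switches in the vPC pair
interface port-channel1
  switchport mode trunk
  vpc peer-link

! Downstream-facing port-channel; the same "vpc 20" number is
! configured on both peers so they present one logical switch
interface port-channel20
  switchport mode trunk
  vpc 20
```

The downstream device simply bundles its two uplinks into one LACP port‑channel; both links stay active and traffic is hashed across them.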
Third‑generation network: Leaf‑Spine architecture
The leaf‑spine (distributed core) design separates leaf switches (connecting servers) from spine switches (inter‑connecting leaves), providing low‑latency, non‑blocking connectivity and supporting ECMP for equal‑cost multipath routing.
In Qunar's CNB data center, core1 and core2 act as spines, leaf switches connect servers, and the spine‑leaf fabric runs OSPF with ECMP, isolating L2 broadcast domains so that link flaps affect only a single leaf group.
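The per‑flow behavior of ECMP between leaf and spine can be sketched in a few lines: a hash over the flow 5‑tuple selects one of the equal‑cost next hops, so all packets of a flow take the same path and arrive in order. The hash function and device names here are illustrative, not what the switches actually use.

```python
# Sketch of per-flow ECMP path selection: hash the 5-tuple, pick one
# of N equal-cost next hops. Spine names are placeholders.
import hashlib

def ecmp_next_hop(src_ip, dst_ip, proto, src_port, dst_port, next_hops):
    # Build a flow key from the 5-tuple and hash it deterministically.
    key = f"{src_ip}|{dst_ip}|{proto}|{src_port}|{dst_port}".encode()
    digest = int.from_bytes(hashlib.sha256(key).digest()[:8], "big")
    # Modulo over the next-hop list spreads distinct flows across paths.
    return next_hops[digest % len(next_hops)]

spines = ["spine1", "spine2"]
path = ecmp_next_hop("10.1.1.10", "10.2.2.20", 6, 40000, 443, spines)
# The same flow always maps to the same spine, avoiding reordering.
assert path == ecmp_next_hop("10.1.1.10", "10.2.2.20", 6, 40000, 443, spines)
```

Real switches compute this hash in hardware per packet; the key property shown is determinism per flow combined with spreading across flows.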
VXLAN private‑cloud implementation
VXLAN (Virtual Extensible LAN) encapsulates Layer‑2 frames in UDP over an IP network, enabling large‑scale multi‑tenant overlays beyond the 4094 VLAN limit, supporting overlapping IP spaces, VM mobility, and overcoming ToR MAC table constraints.
Drivers for adopting VXLAN:
- The 12‑bit VLAN ID space is insufficient for massive cloud data centers.
- Multi‑tenant isolation with overlapping IP space is required.
- VMs must be able to migrate flexibly across the fabric.
- ToR switch MAC table sizes are limited.
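The scale advantage over VLANs comes directly from the VXLAN header format defined in RFC 7348: a 24‑bit VNI field yields about 16 million segments versus 4094 usable VLAN IDs. A minimal sketch of packing that 8‑byte header:

```python
# Sketch of the 8-byte VXLAN header (RFC 7348):
# flags(1 byte, 0x08 sets the "valid VNI" I bit) + reserved(3) +
# VNI(3) + reserved(1). Shows why the VNI space is 2^24.
import struct

def vxlan_header(vni: int) -> bytes:
    if not 0 <= vni < 2**24:
        raise ValueError("VNI must fit in 24 bits")
    return struct.pack(
        "!B3s3sB",
        0x08,                      # flags: I bit set, VNI is valid
        b"\x00\x00\x00",           # reserved
        vni.to_bytes(3, "big"),    # 24-bit VXLAN Network Identifier
        0,                         # reserved
    )

hdr = vxlan_header(10000)
assert len(hdr) == 8
# 2**24 = 16777216 possible VNIs, versus 4094 usable VLAN IDs.
```

On the wire this header sits inside a UDP datagram (destination port 4789), which in turn carries the original Ethernet frame.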
The VXLAN design uses two core routers as route reflectors; core1 and core2 run iBGP with access switches, and all VTEPs share an Anycast Gateway MAC for tenant VNI reachability. MP‑BGP EVPN carries both L2 and L3 reachability, providing bridging and routing in the overlay.
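The EVPN pieces described above (iBGP with route reflectors, an anycast gateway MAC shared by all VTEPs, and VNIs advertised over MP‑BGP) map onto NX‑OS‑style configuration roughly as follows. This is a hedged sketch: the AS number, VLAN/VNI values, MAC address, and neighbor addresses are placeholders, not Qunar's real values.

```
! Hypothetical NX-OS EVPN/VTEP sketch (all values illustrative)
feature bgp
feature nv overlay
feature vn-segment-vlan-based
nv overlay evpn

! Same gateway MAC on every VTEP: the distributed anycast gateway
fabric forwarding anycast-gateway-mac 0000.2222.3333

! Map a tenant VLAN to its VXLAN VNI
vlan 100
  vn-segment 10100

! VTEP interface; host reachability learned via MP-BGP EVPN
interface nve1
  source-interface loopback0
  host-reachability protocol bgp
  member vni 10100

! iBGP session toward a core acting as route reflector
router bgp 65001
  neighbor 10.255.0.1 remote-as 65001
    address-family l2vpn evpn
      send-community extended
```

On the cores, the matching sessions would carry `route-reflector-client` so that access switches need not peer with each other.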
Qunar Tech Salon
Qunar Tech Salon is a learning and exchange platform for Qunar engineers and industry peers, sharing cutting‑edge technology trends and offering mid‑to‑senior technical professionals a free venue to exchange ideas and learn.