
How to Build a 10,000‑Server Data Center Network in Hours with Big Bang

This article explains how UCloud's Big Bang system automates data‑center network construction. It covers the pain of manual wiring and IP allocation, the layered system design, resource allocation, configuration generation, and zero‑touch provisioning, which together allow a ten‑thousand‑server network to be deployed in a few hours.

UCloud Tech

1. Limitations of the Previous System

Before Big Bang, UCloud's physical network team operated a first‑generation data‑center construction system. As the business and the network architecture evolved, four shortcomings became apparent:

Inadequate support for multi‑plane Fabric networks: the system was not flexible enough to describe them.

Inflexible architecture rule descriptions: network design was tightly coupled to development, so every architecture iteration required heavy developer effort.

Unintuitive atomic command‑library templates: their output was hard to predict and test, which made maintenance costly and manual edits error‑prone.

High failure rates in overseas deployments, caused by network latency.

From this analysis we derived the requirements for a new system: flexible multi‑version, multi‑plane support; architecture iteration without developer involvement; intuitive template editing with low‑cost automated testing; and resilience under weak‑network conditions.

2. Big Bang Design Overview

2.1 The Pain of Manual Operations

Consider a simple two‑tier CLOS architecture: 4 spine switches and 32 leaf switches (36 devices) forming a typical POD. Engineers must first produce a wiring table (e.g., "spine01.p1 connects to leaf01.p49 via QSFP28 100G cable"), resulting in 4 × 32 = 128 inter‑connect relationships. After wiring, IP address allocation is required for each inter‑connect, management, and business network, leading to hundreds of IP segments that must be manually entered into an IP management system and matched one‑by‑one with devices.

Both tables grow quickly as devices and interfaces are added: in a full‑mesh two‑tier design the link count alone is the product of the spine and leaf counts. Additional constraints, such as role‑specific wiring rules, cross‑site cable selection, and material mapping, increase the complexity further.
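To make the scale concrete, the wiring table for the POD described above can be generated in a few lines. This is a minimal sketch, not Big Bang's actual code: the port‑numbering rule (spine port n faces leaf n, leaf uplinks start at p49) is an assumption extrapolated from the single "spine01.p1 connects to leaf01.p49" example, and the function and field names are hypothetical.

```python
from itertools import product

SPINES = [f"spine{i:02d}" for i in range(1, 5)]    # 4 spine switches
LEAVES = [f"leaf{i:02d}" for i in range(1, 33)]    # 32 leaf switches
LEAF_UPLINK_BASE = 49  # assumption: leaf uplinks start at p49, as in "leaf01.p49"


def build_wiring_table(spines, leaves):
    """Return one row per spine-leaf link in a full-mesh two-tier CLOS."""
    rows = []
    for (s_idx, spine), (l_idx, leaf) in product(enumerate(spines), enumerate(leaves)):
        rows.append({
            "a_end": f"{spine}.p{l_idx + 1}",                # spine port n faces leaf n
            "z_end": f"{leaf}.p{LEAF_UPLINK_BASE + s_idx}",  # leaf uplink n faces spine n
            "cable": "QSFP28 100G",
        })
    return rows


wiring = build_wiring_table(SPINES, LEAVES)
print(len(wiring))  # 128
print(wiring[0])    # {'a_end': 'spine01.p1', 'z_end': 'leaf01.p49', 'cable': 'QSFP28 100G'}
```

Even this toy version shows why the work suits a program: every row follows from a rule, and changing the spine or leaf count regenerates the whole table.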

Data center network example

2.2 Automate or Die

The construction process involves massive repetitive work:

Complex wiring and IP address allocation.

Repeated translation of configuration files.

Role‑specific configuration differentiation.

Doing this work by hand is slow and error‑prone; it is exactly the kind of task at which programs excel.

2.3 System Layered Design

We abstract the construction process into three loosely coupled components that correspond to the three stages of the Big Bang system:

Planning (Architecture Library): a declarative specification, similar to a structural design standard.

Allocation (Resource Allocator): computes the required materials and generates a connection graph.

Construction (Configuration Generator): renders device configurations and pushes them to the switches.

Benefits include loose coupling, isolated changes, and easy module replacement.

3. Architecture Library

To describe a data‑center topology we need a graph consisting of nodes (devices) and edges (links) together with IP attributes. We abstract devices into roles and IP ranges into assignable segments, yielding two trees – a device tree and an IP‑segment tree – whose leaves are linked.

Device tree and IP tree

These trees capture the five essential pieces of information:

Which roles (devices) exist.

Which IP segments are available for allocation.

How roles interconnect.

Which IP segments each role uses for its own addresses (e.g., loopback).

Which IP segments are used for the inter‑role links.

Additional attributes such as business tags on IP segments or port‑allocation rules on roles are treated as leaf‑node properties.
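The five pieces of information above can be illustrated with a small declarative description and a consistency check. The article does not show the real format of the architecture library, so every key, value, and function name below is hypothetical; the sketch only demonstrates the idea of two trees whose leaves are linked by rules.

```python
# Hypothetical architecture description: all keys and values are illustrative.
ARCHITECTURE = {
    "device_tree": {                      # 1. which roles exist
        "pod": {
            "spine": {"count": 4},
            "leaf": {"count": 32},
        }
    },
    "ip_tree": {                          # 2. which IP segments can be allocated
        "10.0.0.0/16": {
            "10.0.0.0/24": {"tag": "loopback"},
            "10.0.1.0/24": {"tag": "interconnect"},
        }
    },
    "links": [                            # 3 and 5. role interconnects and their segment
        {"a": "spine", "z": "leaf", "segment": "interconnect"},
    ],
    "role_segments": {                    # 4. segments each role uses for itself
        "spine": {"loopback": "loopback"},
        "leaf": {"loopback": "loopback"},
    },
}


def leaves(tree):
    """Yield (name, attributes) for every leaf node of a nested dict tree."""
    for name, child in tree.items():
        # A node is a leaf when none of its values is itself a dict.
        if any(isinstance(v, dict) for v in child.values()):
            yield from leaves(child)
        else:
            yield name, child


def validate(arch):
    """Check that link and role rules reference only declared roles and segments."""
    roles = {name for name, _ in leaves(arch["device_tree"])}
    tags = {attrs["tag"] for _, attrs in leaves(arch["ip_tree"])}
    errors = []
    for link in arch["links"]:
        for end in (link["a"], link["z"]):
            if end not in roles:
                errors.append(f"unknown role in link: {end}")
        if link["segment"] not in tags:
            errors.append(f"unknown segment in link: {link['segment']}")
    for role, uses in arch["role_segments"].items():
        if role not in roles:
            errors.append(f"unknown role: {role}")
        for tag in uses.values():
            if tag not in tags:
                errors.append(f"unknown segment for {role}: {tag}")
    return errors


print(validate(ARCHITECTURE))  # []
```

Keeping the description as plain data is what lets network architects change a design without touching program code: a new architecture version is a new document, and a check like validate() can reject inconsistent ones before any resource is allocated.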

4. Resource Allocator

The allocator receives the architecture library output, pulls resources from CMDB/IPAM, assigns locally unique ID groups (similar to variable scopes), and produces a connection graph containing:

Device information (ID, management IP, etc.).

Link information (endpoints, IPs, ports, etc.).

This graph is later consumed by the configuration generator.
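As a rough illustration of what such a connection graph might contain, the sketch below allocates management addresses and /31 point‑to‑point subnets with Python's standard ipaddress module. It does not talk to a CMDB or IPAM; the address pools, the port‑numbering rule, and all field names are assumptions made for the example.

```python
import ipaddress


def allocate(spine_count, leaf_count, link_pool, mgmt_pool):
    """Build a connection graph: devices with management IPs, links with /31 pairs."""
    spines = [f"spine{i:02d}" for i in range(1, spine_count + 1)]
    leaves = [f"leaf{i:02d}" for i in range(1, leaf_count + 1)]

    # next() raises StopIteration if a pool is too small for the topology.
    mgmt_hosts = iter(ipaddress.ip_network(mgmt_pool).hosts())
    devices = {
        name: {"id": idx, "mgmt_ip": str(next(mgmt_hosts))}
        for idx, name in enumerate(spines + leaves, start=1)
    }

    p2p_subnets = ipaddress.ip_network(link_pool).subnets(new_prefix=31)
    links = []
    for s_idx, spine in enumerate(spines):
        for l_idx, leaf in enumerate(leaves):
            subnet = next(p2p_subnets)
            a_ip, z_ip = list(subnet)  # a /31 holds exactly two addresses
            links.append({
                "a": {"device": spine, "port": f"p{l_idx + 1}", "ip": str(a_ip)},
                "z": {"device": leaf, "port": f"p{49 + s_idx}", "ip": str(z_ip)},
                "mask": str(subnet.netmask),
            })
    return {"devices": devices, "links": links}


# Hypothetical pools: a /24 provides exactly the 128 /31 subnets that 4 x 32 links need.
graph = allocate(4, 32, "10.68.176.0/24", "192.168.0.0/24")
print(len(graph["devices"]), len(graph["links"]))  # 36 128
print(graph["links"][0])
```

Because the result is a plain data structure, the configuration generator can consume it without knowing how the addresses were chosen.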

5. Configuration Generator

5.1 Generating Configurations

Device commands fall into three categories: global settings (AAA, security policies), role‑specific settings, and inter‑role link settings (BGP peers, layer‑3 interfaces, etc.). A Jinja2 template renders these commands using variables from the connection graph. A rendered result looks like this:

hostname LAS.xxxxxx
#
interface HundredGigabitEthernet 0/1
 no switchport
 description To_LCS
 ip address 10.68.176.53 255.255.255.254

The same template is used for all devices; the renderer substitutes variables such as {{ device_name }}, {{ pair["local_interface_name"] }}, {{ conn["peer_device"]["device_name"] }}, {{ ip["ip"] }}, and {{ ip["mask"] }}.
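A minimal rendering sketch with the Jinja2 Python package follows. The article lists the variable names but not how they are nested in the real template, so the loop structure and the shape of the context dictionary here are assumptions; the values are taken from the sample output above.

```python
from jinja2 import Environment, StrictUndefined

# Assumed template layout; only the variable names come from the article.
TEMPLATE = """\
hostname {{ device_name }}
#
{% for conn in connections %}
interface {{ conn["local_interface_name"] }}
 no switchport
 description To_{{ conn["peer_device"]["device_name"] }}
 ip address {{ conn["ip"]["ip"] }} {{ conn["ip"]["mask"] }}
{% endfor %}
"""

# trim_blocks drops the newline after each {% ... %} tag so no blank lines appear;
# StrictUndefined makes a missing variable raise an error instead of rendering empty.
env = Environment(trim_blocks=True, undefined=StrictUndefined)
template = env.from_string(TEMPLATE)

context = {
    "device_name": "LAS.xxxxxx",
    "connections": [
        {
            "local_interface_name": "HundredGigabitEthernet 0/1",
            "peer_device": {"device_name": "LCS"},
            "ip": {"ip": "10.68.176.53", "mask": "255.255.255.254"},
        }
    ],
}

config = template.render(**context)
print(config)
```

Failing loudly on an undefined variable is a deliberate choice in this sketch: a silently empty IP address in a switch configuration is much harder to find after deployment than a rendering error before it.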

5.2 Zero‑Touch Provisioning (ZTP)

After the configuration files are generated, they are delivered to the switches via ZTP, a DHCP/TFTP workflow similar to PXE boot: each device fetches and loads its startup configuration automatically, and the same mechanism works across multiple vendors. Three checks then verify the result:

Connectivity checks verify that all devices are reachable.

Config diffing ensures the running config matches the intended startup config.

LLDP validation confirms correct cabling.
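Two of these checks reduce to simple comparisons once the data has been collected. The sketch below assumes the running configuration and the LLDP neighbor table have already been fetched from the devices (how that happens is out of scope here); the function names and data shapes are hypothetical.

```python
import difflib


def config_diff(intended, running):
    """Return unified-diff lines between intended and running config (empty if equal)."""
    return list(difflib.unified_diff(
        intended.splitlines(), running.splitlines(),
        fromfile="intended", tofile="running", lineterm="",
    ))


def lldp_mismatches(expected_links, observed_neighbors):
    """Compare planned links with the LLDP neighbors reported by devices.

    expected_links: iterable of ((device, port), (peer_device, peer_port))
    observed_neighbors: dict mapping (device, port) -> (peer_device, peer_port)
    """
    problems = []
    for local, peer in expected_links:
        seen = observed_neighbors.get(local)
        if seen is None:
            problems.append((local, peer, "no LLDP neighbor"))
        elif seen != peer:
            problems.append((local, peer, f"wired to {seen[0]}.{seen[1]}"))
    return problems


expected = [
    (("spine01", "p1"), ("leaf01", "p49")),
    (("spine01", "p2"), ("leaf02", "p49")),
]
# Simulated field data: the two cables on spine01 were swapped.
observed = {
    ("spine01", "p1"): ("leaf02", "p49"),
    ("spine01", "p2"): ("leaf01", "p49"),
}

for problem in lldp_mismatches(expected, observed):
    print(problem)
print(config_diff("hostname A\n", "hostname A\n"))  # []
```

The expected links come straight from the connection graph, so the same data that generated the wiring table also verifies that the cabling was done correctly.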

6. Conclusion

Building a data‑center network capable of supporting ten thousand servers can be completed in a few hours using the Big Bang system. The approach shifts engineers from tedious manual tasks to higher‑value work, delivering a reliable, scalable network foundation.
