What Went Wrong in a Real-World Network Migration? Lessons Learned
An in‑depth post‑mortem of a flawed network cut‑over project between two merged units, detailing the original design, traffic analysis, OSPF and PBR configurations, step‑by‑step migration procedures, encountered issues, root‑cause investigations, and key takeaways for future network operations.
Today I share a real‑world cut‑over case that turned out to be imperfect; the reasons become clear by the end.
Background
Two business units, A and B, ran similar services and needed to merge. Both required access to the Beijing headquarters and the Guangdong branch, so upper management decided to consolidate the two networks to avoid duplicate investment and improve ROI.
The cut‑over goal was to migrate all B‑unit services and links to the A‑unit network while keeping B’s hardware.
Network Architecture
Both units used a two‑layer architecture: an access layer for workstation connectivity and a core layer for VLAN routing.
A’s core is a pair of H3C switches stacked with IRF; B’s core is a pair of Huawei S7706 switches stacked with CSS. For the transition, two independent inter‑core links were provisioned: a Layer 3 link for forwarding traffic between the sites and a Layer 2 link carrying all VLANs for the cut‑over.
North‑bound connections: one link on Huawei to the Beijing data center, another on H3C to the Guangdong branch, then onward to Beijing.
IP segments: 10.6.0.0/16 (segment A) uses the H3C north‑bound interface (g0/0/1) to Beijing; 10.40.0.0/16 (segment B) uses the Huawei north‑bound interface (g0/0/2) to Guangdong. Both segments must inter‑communicate but retain separate north‑bound exits.
Requirement Analysis
Traffic analysis showed:
Mutual access between A and B accounts for less than 10% of traffic.
Access to Beijing headquarters exceeds 80% of traffic; A reaches Beijing via H3C → Guangdong, B directly via Huawei.
Access to Guangdong branch is under 10%; A uses H3C, B uses Huawei via Beijing.
Routing Design
Before migration, A’s core ran static and default routes; B’s core ran OSPF, with an ABR exchanging LSAs and advertising a 10.0.0.0/7 summary route.
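A quick way to reason about that summary route is to check which segments it covers and what happens when a more specific prefix appears next to it. The following is a minimal sketch using Python's standard ipaddress module; the host address 10.6.1.10 and the helper name longest_prefix_match are illustrative assumptions, not taken from the original configuration.

```python
import ipaddress

# Summary route advertised by B's ABR, as stated in the article.
SUMMARY = ipaddress.ip_network("10.0.0.0/7")

# The two user segments named in the article.
SEGMENT_A = ipaddress.ip_network("10.6.0.0/16")
SEGMENT_B = ipaddress.ip_network("10.40.0.0/16")


def longest_prefix_match(destination, routes):
    """Return the most specific network in `routes` that contains `destination`.

    Returns None when no route covers the destination.
    """
    addr = ipaddress.ip_address(destination)
    candidates = [net for net in routes if addr in net]
    if not candidates:
        return None
    return max(candidates, key=lambda net: net.prefixlen)


# Both segments fall inside the summary.
print(SEGMENT_A.subnet_of(SUMMARY), SEGMENT_B.subnet_of(SUMMARY))

# With only the summary present, a host in segment A (hypothetical address)
# is reached via the summary route.
print(longest_prefix_match("10.6.1.10", [SUMMARY]))

# Once a more specific /16 is also advertised, it wins over the summary.
print(longest_prefix_match("10.6.1.10", [SUMMARY, SEGMENT_A]))
```

The last call illustrates why advertising detailed prefixes alongside a summary can change the path a remote router chooses: routers prefer the longest matching prefix regardless of how the summary was intended to be used.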
During the transition, OSPF and PBR were configured on the H3C core, and its direct and static routes were redistributed into OSPF toward Huawei to keep the B segment reachable. The intended routing policies, where the type numbers refer to OSPF LSA types, were:
A‑VLAN → A‑VLAN: direct + static routes.
A‑VLAN → B‑VLAN: OSPF type 1/2.
A‑VLAN → external: default route (later changed to PBR).
B‑VLAN → A‑VLAN: OSPF type 1/2.
B‑VLAN → B‑VLAN: direct + OSPF type 1/2.
B‑VLAN → external: OSPF type 3.
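The policy list above can be written down as a small lookup table, which makes it easier to review the design against test traffic before a cut‑over. This is only a sketch: the function names classify and intended_route are hypothetical, and the mapping of segment A to 10.6.0.0/16 and segment B to 10.40.0.0/16 follows the IP segment description earlier in the article.

```python
import ipaddress

# Segment definitions as given in the article.
SEGMENTS = {
    "A": ipaddress.ip_network("10.6.0.0/16"),
    "B": ipaddress.ip_network("10.40.0.0/16"),
}

# Intended routing policy per (source zone, destination zone),
# copied from the list in the article.
INTENDED = {
    ("A", "A"): "direct + static routes",
    ("A", "B"): "OSPF type 1/2",
    ("A", "external"): "default route (later changed to PBR)",
    ("B", "A"): "OSPF type 1/2",
    ("B", "B"): "direct + OSPF type 1/2",
    ("B", "external"): "OSPF type 3",
}


def classify(address):
    """Map an IP address to zone 'A', 'B', or 'external'."""
    addr = ipaddress.ip_address(address)
    for name, network in SEGMENTS.items():
        if addr in network:
            return name
    return "external"


def intended_route(src, dst):
    """Return the routing mechanism the design intends for this flow."""
    src_zone = classify(src)
    if src_zone == "external":
        raise ValueError("the policy table only covers flows sourced from A or B")
    return INTENDED[(src_zone, classify(dst))]


# Example flows with hypothetical host addresses.
print(intended_route("10.6.1.10", "10.40.1.10"))
print(intended_route("10.40.1.10", "203.0.113.5"))
```

Keeping the design in a reviewable table like this is a design choice for documentation and pre‑checks; it does not replace verifying the actual routing tables on the devices.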
Preparation Work
Key steps before the cut‑over:
Analyze business requirements and traffic patterns.
Document existing configurations and create a migration window.
Plan OSPF and PBR changes.
Engineering Operations
Cut‑over steps:
Move one Huawei‑to‑Beijing Layer 3 link to H3C and enable OSPF on H3C to exchange routes with Beijing and Huawei.
Add all B‑segment VLANs to H3C, delete them from Huawei, and attach them via the Layer 2 inter‑core link.
Replace B‑segment external PBR with OSPF type 3 routing; change traffic between A and B from pure OSPF to direct + OSPF.
Deploy PBR on all A‑segment VLAN SVIs to route external traffic via Guangdong (replacing static routes).
Migrate the second Huawei north‑bound Layer 3 link and the remaining B‑segment links to H3C, making H3C the sole core.
Issues Encountered
Insufficient preparation (no proper requirement analysis, traffic analysis, pre‑cut‑over testing, or rollback plan) caused many problems. Specific incidents included:
After VLAN SVI migration, both A and B could not reach the Internet; the fix was to change A to PBR and B to OSPF.
PBR ACL misconfiguration caused all internal traffic to be sent to the Guangdong north‑bound interface. The original ACL snippet:
acl advanced 3000
 rule 5 deny ip destination 10.40.0.0 0.0.255.255
 rule 10 deny ip destination 10.6.0.0 0.0.255.255
 rule 15 permit ip source 10.40.0.0 0.0.255.255
traffic classifier tc1 operator and
 if-match acl 3000
traffic behavior tb1
 apply next-hop 10.40.254.2
qos policy q1
 classifier tc1 behavior tb1
control-plane
 qos apply policy q1 inbound
Switching to interface‑based policy‑based routing and adjusting the ACL resolved the issue:
acl advanced 3000
 rule 5 permit ip destination 10.40.0.0 0.0.255.255
 rule 10 permit ip destination 10.6.0.0 0.0.255.255
acl advanced 3001
 rule 5 permit ip source 10.40.0.0 0.0.255.255
policy-based-route test deny node 5
 if-match acl 3000
policy-based-route test permit node 10
 if-match acl 3001
 apply next-hop 10.40.254.2
interface vlan 10
 ip policy-based-route test
Later, after the cut‑over, ping and traceroute succeeded but A‑segment users could not access Beijing web services. Investigation revealed that H3C had inadvertently redistributed all direct and static routes into OSPF, causing Beijing to use detailed routes instead of the aggregated summary, breaking stateful firewall checks. Removing the import‑route commands restored connectivity.
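The corrected policy‑based‑route configuration depends on node ordering, which can be modeled offline before touching a device. The sketch below assumes the commonly documented evaluation order: nodes are checked in ascending number, a packet matching a deny node is forwarded by the normal routing table, a packet matching a permit node gets the apply action, and unmatched packets also fall back to the routing table. The data structures, the function names, and the test addresses are hypothetical; only the prefixes, node numbers, and next hop come from the configuration above.

```python
import ipaddress


def net(cidr):
    """Shorthand for building a network object."""
    return ipaddress.ip_network(cidr)


# Each ACL is a list of (source network, destination network) rules;
# None means "any". These mirror ACL 3000 and ACL 3001 in the fixed config.
ACL_3000 = [(None, net("10.40.0.0/16")), (None, net("10.6.0.0/16"))]
ACL_3001 = [(net("10.40.0.0/16"), None)]

# Policy "test": (node number, mode, ACL, next hop).
POLICY_TEST = [
    (5, "deny", ACL_3000, None),
    (10, "permit", ACL_3001, "10.40.254.2"),
]

ROUTING_TABLE = "routing-table"  # marker for normal destination-based forwarding


def acl_matches(acl, src, dst):
    """True if any rule in the ACL matches the packet."""
    for src_net, dst_net in acl:
        src_ok = src_net is None or src in src_net
        dst_ok = dst_net is None or dst in dst_net
        if src_ok and dst_ok:
            return True
    return False


def pbr_decision(src, dst, policy=POLICY_TEST):
    """Return the PBR next hop, or ROUTING_TABLE when PBR does not apply."""
    s = ipaddress.ip_address(src)
    d = ipaddress.ip_address(dst)
    for _number, mode, acl, next_hop in sorted(policy, key=lambda node: node[0]):
        if acl_matches(acl, s, d):
            # A deny node exempts the packet from PBR entirely.
            return ROUTING_TABLE if mode == "deny" else next_hop
    return ROUTING_TABLE


# Internal traffic is left to the routing table; external traffic from
# 10.40.0.0/16 is steered to the configured next hop.
print(pbr_decision("10.40.1.10", "10.6.1.10"))
print(pbr_decision("10.40.1.10", "203.0.113.5"))
```

Running a handful of representative flows through a model like this is a cheap way to catch an ACL that would redirect internal traffic, although the device's actual behavior should still be confirmed in a lab.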
Work Summary & Lessons
Key take‑aways:
Never start a cut‑over without a complete requirement and traffic analysis.
Use the local routing table to debug outbound path problems, and the OSPF LSDB to debug inbound problems, since the LSDB shows which of your prefixes other routers learn.
Ping/traceroute alone do not guarantee application‑level connectivity; asymmetric routing can break TCP sessions.
Routing design must match business needs; source‑address‑based PBR for A and dynamic OSPF for B proved optimal.
Avoid injecting external routes into OSPF unless required.
Test all configuration changes in a lab that mirrors production to catch ACL/PBR errors early.
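The take‑away about ping and traceroute can be turned into a small application‑level probe that is run before and after each cut‑over step. The following is a minimal sketch using Python's standard http.client module; the function name http_status and the idea of probing a specific web service are assumptions for illustration, and the host you pass in would be one of your own services.

```python
import http.client


def http_status(host, port=80, path="/", timeout=3.0):
    """Perform a real HTTP GET and return the status code, or None on failure.

    Unlike ICMP ping, this completes a TCP handshake and a request/response
    exchange, so it fails when return traffic is dropped on the way back.
    """
    conn = http.client.HTTPConnection(host, port, timeout=timeout)
    try:
        conn.request("GET", path)
        response = conn.getresponse()
        response.read()  # drain the body so the exchange fully completes
        return response.status
    except (OSError, http.client.HTTPException):
        # Covers refused connections, timeouts, resets, and malformed replies.
        return None
    finally:
        conn.close()


# Usage (hostname is a placeholder, not from the article):
#   status = http_status("intranet.example.com")
#   if status is None:
#       print("web service unreachable even if ping succeeds")
```

A probe like this would have flagged the Beijing web access failure immediately, where ping and traceroute reported success.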
Open Source Linux
Focused on sharing Linux/Unix content, covering fundamentals, system development, network programming, automation/operations, cloud computing, and related professional knowledge.