Tag

network operations


Alibaba Cloud Infrastructure
Jun 11, 2025 · Cloud Computing

How Alibaba’s Qi Tian Platform Secures Large-Scale Cloud Networks

This article examines Alibaba Cloud’s Qi Tian integrated operation‑management platform, detailing the challenges of massive cloud network management and the innovative data‑fusion, automated change, intent‑aware monitoring, and multi‑plane self‑healing technologies that enable secure, high‑performance operation at million‑device scale.

AI · Cloud Computing · data management
0 likes · 11 min read
Zhuanzhuan Tech
Feb 21, 2024 · Operations

Network Operations Incident Report: BGP Routing Failure and Resolution

This report details a network operations incident where a BGP routing change caused an EBGP neighbor to go idle, outlines the step‑by‑step troubleshooting, analysis of the root cause, and the implemented solution involving a new L3 node and redundant EBGP peers.

BGP · Cloud Networking · incident response
0 likes · 8 min read
Mike Chen's Internet Architecture
Feb 7, 2024 · Operations

Understanding Load Balancing: Principles, Types, and Application Scenarios

This article explains the fundamentals of load balancing, covering its principles, classifications from layer 2 to layer 7, common software implementations, and typical application scenarios such as high traffic handling, horizontal scaling, fault tolerance, and multi‑zone disaster recovery.

Distributed Systems · High Availability · Load Balancing
0 likes · 8 min read
DataFunSummit
Dec 13, 2023 · Artificial Intelligence

Building an Event Knowledge Graph for Telecom Network Operations: AI R&D Center Case Study

This article details how China Telecom's AI R&D Center constructs a network‑operation event knowledge graph using AI techniques and Neo4j, covering the operational challenges, ontology design, data extraction pipelines, system architecture, practical applications, and future outlook.

AI · Neo4j · UIE model
0 likes · 17 min read
DataFunTalk
Oct 15, 2023 · Artificial Intelligence

Building an Event Knowledge Graph for Telecom Network Operations

This article describes how China Telecom's AI R&D Center designs and implements a network operations event knowledge graph using AI techniques, graph databases, and UIE models to improve fault handling, automate recommendations, and enhance intelligent assistance for telecom network maintenance.

AI · Neo4j · UIE model
0 likes · 16 min read
vivo Internet Technology
Aug 2, 2023 · Operations

sFlow-Based Network Traffic Analysis System Design and Implementation

The paper presents a scalable sFlow‑based traffic analysis system built from high‑performance agents, collectors, and analyzers. It extends Elastiflow with sFlowtool, Logstash, Kafka, and Elasticsearch/Kibana, and adds CMDB integration, Druid storage, and Celery stream processing to achieve sub‑30‑second latency for data‑center monitoring, anomaly detection, and IP‑level analytics. It also discusses future needs for broader protocol support and adaptive collection.

Celery · Druid · ELK
0 likes · 12 min read
Bilibili Tech
May 5, 2023 · Operations

DWDM-Based Data Center Interconnect Architecture and Operational Optimization at Bilibili

Bilibili's system department network team describes how it designed and optimized a DWDM-based data center interconnect architecture across the optical and electrical layers, covering the multi‑active DC architecture, wavelength evolution, high‑speed coherent modules, phased deployment, automated fault detection, and ROADM/flex‑grid improvements that boosted spectrum efficiency by ~20%.

DWDM · Flex Grid · Optical Networking
0 likes · 17 min read
Efficient Ops
Apr 14, 2023 · Operations

Agile Perception, Precise Decisions: AI‑Driven Smart Network Operations

At the 20th GOPS Global Operations Conference in Shenzhen, Huawei expert Liu Yuliang outlined how AI and data can transform telecom network management from a network‑centric to a business‑centric, self‑driving model, highlighting key solutions such as ChatOps, EDNS, AABD Pro, and cross‑vendor topology reconstruction.

AI · AIOps · Automation
0 likes · 5 min read
Architects' Tech Alliance
Apr 10, 2023 · Operations

Common Routing Loop Pitfalls and Mitigation Strategies in Enterprise Networks

This article examines three real‑world routing loop incidents caused by static‑route misconfiguration, missing passive‑interface settings in OSPF, and improper bidirectional redistribution, and provides detailed analysis and practical recommendations to prevent such loops in enterprise network operations.

BGP · Best Practices · OSPF
0 likes · 13 min read
Top Architect
Jun 29, 2022 · Operations

Understanding DNS Load Balancing, CDN, and SOA Mechanisms

This article explains the limitations of traditional load‑balancing, describes how CDNs and DNS use distributed, hierarchical mechanisms such as SOA to achieve traffic distribution and fault tolerance, and outlines practical DNS‑based load‑balancing implementations and supported service providers.

CDN · DNS · Load Balancing
0 likes · 7 min read
Alibaba Cloud Infrastructure
Feb 11, 2022 · Operations

Proactive Identification of Double 11 Transaction‑Surge Traffic Risks Using an AI‑Driven Network Monitoring Solution

The article presents a case study of how Alibaba Cloud’s network operations team tackled the massive, unpredictable traffic spikes of the 2021 Double 11 shopping festival by identifying transaction‑promotion traffic risks early through AI‑powered analysis, overcoming the limitations of manual rule‑based detection, and achieving precise, automated capacity risk control.

AI · Capacity Planning · Cloud Computing
0 likes · 8 min read
ByteDance Web Infra
Oct 13, 2021 · Operations

DNS Resolution Failure for goofy.app in Singapore Office Caused by DNSSEC Misconfiguration

An internal investigation revealed that the goofy.app domain could not be resolved from Singapore offices because a misconfigured DNSSEC DS record caused validation failures, while Chinese DNS resolvers ignored DNSSEC, leading to successful resolution there; removing the erroneous DS key restored global accessibility.

DNS · DNSSEC · Domain Resolution
0 likes · 10 min read
Alibaba Cloud Infrastructure
Oct 1, 2021 · Operations

Understanding Alibaba Cloud’s Infrastructure Network: History, Challenges, and Operational Practices

The article traces the evolution of China’s internet from the first email in 1987 to the digital‑economy era, explains the significance of the 2021 Key Information Infrastructure Security Regulation, and details Alibaba Cloud’s large‑scale network operation team, the technical and managerial challenges they face, and the safety‑first methodologies they employ to ensure stable, secure cloud infrastructure.

Cloud Computing · Digital Economy · infrastructure security
0 likes · 10 min read
Efficient Ops
Jul 6, 2021 · Operations

Mastering TCP TIME-WAIT: When to Optimize and How

This article explains the purpose of the TCP TIME-WAIT state, the scenarios it protects against, common misconceptions, and practical Linux kernel tweaks—such as fast recycle, socket reuse, and tw_buckets settings—to manage TIME-WAIT efficiently on high‑concurrency servers.

Linux · Performance Tuning · TCP
0 likes · 8 min read
Top Architect
May 5, 2021 · Operations

Using autossh for Secure SSH Tunneling, Automatic Reconnection, and Port Forwarding

This article explains how autossh automates SSH connections, provides reliable automatic reconnection, and supports local, remote, and dynamic port forwarding on Linux systems, including installation methods, key command‑line options, example usages, service configuration for auto‑start, and scripting tips.

Linux · Port Forwarding · autossh
0 likes · 9 min read
DevOps
Dec 16, 2020 · Operations

The Role of DevOps in 5G Deployment and Network Operations

This article explains how 5G expands communication beyond people to devices, outlines its three core scenarios, and argues that DevOps and agile practices are essential for overcoming the technical and operational challenges of deploying 5G networks.

5G · DevOps · Edge Computing
0 likes · 5 min read
Efficient Ops
Sep 26, 2019 · Operations

How to Enable Inter‑Subnet Communication Across Multiple Routers with Static Routes

This guide explains how to configure static routes and precise subnet masks so that computers on different subnets and behind separate routers can communicate with each other, covering single‑router, dual‑router, and multi‑router scenarios and the concept of route aggregation.

Networking · network operations · router configuration
0 likes · 8 min read