Tagged articles
14 articles
Alibaba Cloud Big Data AI Platform
Oct 29, 2025 · Artificial Intelligence

How AI Powers Proactive Risk Detection in Massive Cloud Platforms

This article outlines Alibaba Cloud's AI‑driven "Smart Sentinel" system, which tackles the three major challenges of large‑scale cloud operations—hard‑to‑detect anomalies, alarm storms, and difficult root‑cause analysis—by deploying multi‑layered detection, intelligent alarm grading, and an end‑to‑end automated response loop.

anomaly detection · cloud computing · intelligent monitoring
0 likes · 11 min read
Efficient Ops
Jun 20, 2024 · Operations

How Intelligent Ops Platforms Transform Distributed Banking Systems

This article explains how Chinese commercial banks are adopting intelligent operation platforms to collect, analyze, and visualize distributed system data in real time, enabling rapid root‑cause detection, full‑link tracing, and automated solution recommendations for complex financial services.

Banking · Distributed Systems · Root Cause Analysis
0 likes · 8 min read
Alibaba Cloud Big Data AI Platform
Nov 29, 2023 · Operations

How AIOps and DataOps Transform Big Data Operations: Lessons from ABM Platform

This article examines the challenges of big‑data operations, explains how DataOps and AIOps complement each other, and details the ABM intelligent operations architecture, platform components, and real‑world use cases such as Flink hotspot detection, ChatOps assistants, and dynamic MaxCompute resource optimization.

Big Data Operations · DataOps · AIOps
0 likes · 11 min read
Efficient Ops
Oct 6, 2023 · Operations

How China Post’s Next‑Gen IT Monitoring Platform Drives Smart Operations

The article details China Post’s new‑generation intelligent operations monitoring platform for IT infrastructure, highlighting its architecture, data collection, stream‑batch processing, AI‑driven algorithms, and one‑stop portal, and explains how the solution exemplifies cutting‑edge digital transformation practices showcased at the 2023 China International Fair for Trade in Services.

AI · Automation · Big Data
0 likes · 9 min read
Huolala Tech
Aug 11, 2022 · Operations

How Huolala Built an AI‑Powered Intelligent Monitoring Platform at Scale

This article details Huolala's journey from basic monitoring to an AI‑driven intelligent observability platform, covering AIOps concepts, a comprehensive monitoring framework, practical implementations, automated alert analysis, lessons learned, and future directions for large‑scale operations.

DevOps · Huolala · Observability
0 likes · 18 min read
AntTech
Apr 29, 2022 · Operations

Alipay Double‑11 System Stability Practices: Distributed Architecture, Elastic Scaling, Service Mesh, Full‑Chain Load Testing, Intelligent Monitoring, and OceanBase

The presentation details Alipay's evolution through three stability phases—capacity, elastic cloud‑native architecture, and green computing—covering unit‑based deployment, elastic scaling, ServiceMesh, full‑chain load testing, intelligent monitoring, and the OceanBase distributed database, illustrating how these techniques achieved 99.99% availability during the 2021 Double‑11 peak.

Cloud Native · Load Testing · OceanBase
0 likes · 11 min read
58 Tech
Nov 4, 2019 · Operations

Intelligent Operations Practices: Multi‑Dimensional Anomaly Detection, Alarm Merging, Knowledge‑Graph Construction, and Root‑Cause Analysis

This article summarizes the keynote on intelligent operations presented at the 13th GOPS Global Operations Conference, covering multi‑dimensional anomaly detection, smart alarm aggregation, the construction of an operations knowledge graph, and AI‑driven root‑cause analysis techniques for large‑scale server environments.

Knowledge Graph · Operations · Root Cause Analysis
0 likes · 9 min read
58 Tech
Dec 26, 2018 · Operations

Overview of the 58 Intelligent Monitoring System and Its Multi‑Dimensional Architecture

The 58 Intelligent Monitoring System provides a flexible, 24/7, multi‑dimensional monitoring solution that covers network, server, system, application and business layers, incorporates AI‑driven prediction, anomaly detection, alarm merging, root‑cause analysis and self‑healing, and offers both PC and WeChat interfaces for operators.

Alerting · Automation · Operations
0 likes · 16 min read
Efficient Ops
Dec 11, 2018 · Operations

How Alibaba’s AI‑Powered Monitoring Tackles Complex Business Anomalies

In this talk, Alibaba senior tech expert Wang Zhaogang explains how intelligent monitoring, powered by machine‑learning algorithms and multi‑metric analysis, addresses the challenges of diverse business scenarios, enhances anomaly detection, improves root‑cause analysis, and shapes the future of smart operations.

Operations · Root Cause Analysis · anomaly detection
0 likes · 23 min read
Architects' Tech Alliance
Sep 26, 2018 · Operations

How Goldeneye Enables Adaptive, Intelligent Business Monitoring at Scale

Goldeneye, Alimama's monitoring platform, uses big‑data pipelines, dynamic threshold prediction, mean‑shift change‑point detection, and automated metric discovery to replace manual alarm settings, reduce false alerts, and provide intelligent, scalable business monitoring across hundreds of services.

Big Data · Operations · business monitoring
0 likes · 19 min read
Efficient Ops
Jan 8, 2018 · Operations

360° Intelligent IT Operations: From Scripts to AI‑Driven Automation

This article summarizes a GOPS 2017 Shanghai talk that outlines a comprehensive, data‑driven IT operations framework for large enterprises, covering management systems, business‑centric monitoring, big‑data log analysis, multi‑dimensional reporting, monitoring platform evolution, and autonomous fault‑healing with AI.

IT Operations · intelligent monitoring
0 likes · 17 min read
Efficient Ops
Dec 12, 2017 · Operations

Sogou’s AI‑Powered Ops: Smart Circuit Breaker, Fault Localization & Chatbot

This article examines the three major pain points faced by Sogou's operations engineers—worry cost, insufficient intelligence, and annoyance cost—and explains how the company applies AI through intelligent circuit breaking, fault localization, and a chatbot to streamline reliability and reduce manual effort.

Chatbot · Fault Localization · intelligent monitoring
0 likes · 10 min read