Tagged articles
307 articles
Alibaba Cloud Big Data AI Platform
Apr 17, 2023 · Operations

How to Break Through Scale‑Out Ops Bottlenecks in the Cloud‑Native Era

This article analyzes the three main bottlenecks—stability, cost, and efficiency—encountered in large‑scale operations, presents a six‑stage pipeline and open‑source toolchain, and explains how cloud‑native technologies such as Kubernetes and AIOps can transform and automate massive infrastructure management.

Kubernetes · Scalability · aiops
0 likes · 18 min read
DataFunSummit
Apr 15, 2023 · Operations

Observability and Intelligent Alert Management Practices

This presentation outlines the observability ecosystem, the role and value of alerts within it, core functionalities of an intelligent alarm management platform, best‑practice recommendations, and a real‑world case study of deploying a unified observability solution for a large state‑owned investment group.

Alert Management · IT Operations · aiops
0 likes · 11 min read
Efficient Ops
Apr 14, 2023 · Operations

Agile Perception, Precise Decisions: AI‑Driven Smart Network Operations

At the 20th GOPS Global Operations Conference in Shenzhen, Huawei expert Liu Yuliang outlined how AI and data can transform telecom network management from a network‑centric to a business‑centric, self‑driving model, highlighting key solutions such as ChatOps, EDNS, AABD Pro, and cross‑vendor topology reconstruction.

AI · Digital Twin · aiops
0 likes · 5 min read
Efficient Ops
Apr 7, 2023 · Operations

What Do China’s Latest DevOps & AIOps Maturity Assessments Reveal About Enterprise Success?

The China Academy of Information and Communications Technology (CAICT) announced the newest evaluation results for its DevOps and AIOps capability maturity models, showing that standardization and tool empowerment have helped over 75 leading enterprises across banking, securities, telecom, and internet sectors improve quality, efficiency, and market competitiveness.

DevOps · Enterprise · Maturity Model
0 likes · 8 min read
Efficient Ops
Mar 15, 2023 · Operations

How Human‑Machine Collaboration Is Redefining Operations with AIOps

The article explores how AIOps, a human‑machine collaborative approach powered by data, algorithms, and contextual knowledge, transforms modern operations by enabling real‑time insight, predictive decision‑making, automated execution, and continuous feedback, especially in complex, security‑sensitive environments like finance.

@Data · Operations · aiops
0 likes · 11 min read
Efficient Ops
Mar 14, 2023 · Artificial Intelligence

How NetEase Games Built an AIOps Platform to Transform IT Operations

This article explains how NetEase Games leveraged AI, big data, and machine learning to create an AIOps platform that automates anomaly detection, log analysis, and fault localization, improving quality assurance, cost management, and operational efficiency across complex gaming infrastructures.

IT Operations · aiops · anomaly detection
0 likes · 12 min read

How Time-Series Decomposition Boosts Microservice Root Cause Localization to 84% Accuracy

This paper presents StudRank, a microservice root‑cause localization method that decomposes call‑chain traces into time‑series, detects anomalies, builds an abnormal propagation graph, and applies a personalized PageRank random‑walk algorithm, achieving 84% top‑1 accuracy and a 97.6% improvement over MicroRCA on public AIOps data.

Microservices · StudRank · aiops
0 likes · 23 min read
AntTech
Mar 7, 2023 · Cloud Native

Introduction to HoloInsight: A Cloud‑Native Lightweight Observability Platform

HoloInsight is an open‑source, cloud‑native observability platform derived from Ant Group's AntMonitor, offering integrated log‑based monitoring, business metric analysis, and AI‑driven AIOps capabilities while providing a lightweight, modular architecture and extensive extensibility for modern software stacks.

Observability · aiops · cloud-native
0 likes · 13 min read
Python Programming Learning Circle
Mar 6, 2023 · Operations

Intelligent Operations: AI‑Driven Anomaly Detection, Alarm Compression, and Log Analysis Techniques

This article presents an AI‑enhanced operations framework that combines metric anomaly detection, alarm compression, log anomaly detection, and intelligent analysis using machine learning methods such as DBSCAN clustering, SARIMAX modeling, Apriori association rules, and LSTM‑based log parsing to improve fault detection and reduce operational costs.

Operations · aiops · anomaly detection
0 likes · 15 min read
Efficient Ops
Feb 22, 2023 · Operations

Zero‑Downtime Secrets: TT Voice’s Multi‑Cloud, AIOps & Resource Optimization

During the 2022 TT Voice Annual Summit, the technical team tackled stability, real‑time risk control, and resource utilization challenges by implementing strict change management, multi‑cloud high‑availability networking, AIOps‑driven monitoring, big‑data processing, and cloud‑native scaling strategies, ultimately delivering zero‑fault operation.

Resource Optimization · aiops · cloud-native
0 likes · 15 min read
dbaplus Community
Feb 21, 2023 · Operations

How Standardized Application Monitoring Boosts Operational Efficiency

This article reviews G Bank's multi‑year journey to standardize application monitoring, detailing the methodology, models, metrics, automation mechanisms, and quantitative evaluation that together improve visibility, early fault detection, and overall operations management for both traditional and distributed systems.

Metrics · Operations · aiops
0 likes · 18 min read
dbaplus Community
Feb 6, 2023 · Operations

How Vivo Built a Scalable, Cloud‑Native Monitoring Platform for Millions of Services

This article outlines Vivo's multi‑year journey of designing, evolving, and operating a cloud‑native, AIOps‑enabled monitoring platform that supports tens of thousands of hosts, databases, containers, and services, detailing its architecture, challenges, and future directions for observability and reliability.

Observability · Operations · System Architecture
0 likes · 18 min read
Efficient Ops
Jan 16, 2023 · Artificial Intelligence

How China Mobile’s AIOps Platform Achieved Top‑Tier Evaluation and What It Means for Intelligent Operations

This article explains the concept of AIOps, details China Mobile Information Technology's successful comprehensive‑level assessment of its centralized operations management platform's fault‑self‑healing module, shares insights from an interview with the project director, and introduces the national AIOps capability maturity model.

AI in IT · Capability Maturity Model · China Mobile
0 likes · 9 min read
Efficient Ops
Jan 16, 2023 · Operations

How China Mobile’s AIOps Evaluation Sets a New Benchmark for Intelligent IT Operations

China Mobile Information Technology’s Management Information Domain Operation Management System passed the comprehensive-level AIOps fault‑prediction assessment, highlighting the growing importance of AI‑driven operations, the new AIOps capability maturity model, and insights from a project‑manager interview on future development.

China Mobile · IT Operations · Maturity Model
0 likes · 8 min read
Efficient Ops
Jan 16, 2023 · Operations

How China Mobile’s Disk‑Health AI System Earned Top Marks in AIOps Evaluation

This article explains the AIOps concept, details China Mobile Information Technology's award‑winning Disk Health Intelligent Detection System, and shares an interview with its cloud architect on the evaluation process, future plans, and the broader AIOps capability maturity model.

Capability Maturity Model · China Mobile · Disk Health Monitoring
0 likes · 8 min read
Efficient Ops
Jan 16, 2023 · Operations

How China Mobile’s Centralized AIOps Platform Achieved Top‑Tier Evaluation

This article presents an interview with China Mobile Information about its centralized AIOps platform, covering the recent excellent‑level assessment by the China Academy of Information and Communications Technology, the system's key modules, future plans, and the broader significance of AI‑driven IT operations.

Automation · IT Operations · Root Cause Analysis
0 likes · 11 min read
Efficient Ops
Jan 16, 2023 · Operations

How China Mobile’s AIOps Tools Achieved Top‑Tier Evaluation and What It Means for Smart Operations

The article explains AIOps concepts, details China Mobile Information's award‑winning intelligent operations tools, shares an interview with deputy general manager Liang Enlei on their development, the evaluation experience, and future plans, and introduces the national AIOps maturity model and its key modules.

Capacity Forecasting · IT Operations · Knowledge Base
0 likes · 11 min read
Efficient Ops
Jan 9, 2023 · Operations

How Guotai Junan’s AIOps Platform Achieved Top‑Tier Evaluation in Intelligent Operations

Guotai Junan’s Intelligent Operations Service Platform, powered by AI‑driven AIOps, passed the China Academy of Information and Communications Technology’s excellence assessment for anomaly detection, showcasing advanced data‑driven monitoring, digital‑transformation initiatives, and future plans for fault prediction, self‑healing, and comprehensive operations intelligence.

Digital Transformation · IT Operations · Intelligent Operations
0 likes · 15 min read
Efficient Ops
Jan 9, 2023 · Operations

Guotai Junan’s AIOps Success: Inside the Award‑Winning Intelligent Operations Platform

The article explains how AIOps (AI‑driven IT operations) has become a strategic trend, details Guotai Junan’s award‑winning intelligent operations platform, which achieved the top‑level evaluation in the “anomaly detection” module, and shares interview insights on implementation, challenges, and future directions.

Digital Transformation · IT Operations · Intelligent Operations
0 likes · 16 min read
vivo Internet Technology
Jan 4, 2023 · Artificial Intelligence

Root Cause Localization Algorithm and Its Implementation for Service Fault Diagnosis

The article describes a root‑cause localization algorithm implemented in vivo’s monitoring platform. It automatically analyzes latency spikes by splitting service timelines, computing variance, clustering the results with K‑means, and recursively tracing downstream services. The approach achieves over 85% accuracy for dependency failures but still requires human verification; the article also outlines future AI‑driven enhancements.

Fault Localization · K-Means · Root Cause Analysis
0 likes · 13 min read
Efficient Ops
Dec 30, 2022 · Operations

How China Agricultural Bank Earned Top AIOps Rating – Inside the Evaluation

An interview with senior leaders of China Agricultural Bank reveals how their AIOps‑driven operations platform achieved an Excellent rating in the CAICT root‑cause analysis module, showcasing the bank’s intelligent operations strategy, implementation details, and future plans for expanding AI‑based monitoring across cloud and micro‑service environments.

AI · Digital Transformation · IT Operations
0 likes · 9 min read
Architecture Digest
Dec 30, 2022 · Operations

Vivo Monitoring Platform: Architecture, Evolution, and Future Directions

The article details the evolution, architecture, capabilities, challenges, and future plans of Vivo's comprehensive monitoring platform, covering its transition from simple Zabbix setups to a cloud‑native, AIOps‑enabled system that ensures service availability across massive infrastructure.

Observability · Reliability · aiops
0 likes · 16 min read
vivo Internet Technology
Dec 28, 2022 · Operations

Monitoring Service System Construction and Exploration Practice

The article outlines vivo’s evolution from simple Zabbix monitoring to a self‑built, unified monitoring platform that now covers infrastructure, containers, databases, and user experience at massive scale. The platform integrates AIOps, cloud‑native observability, and unified alerting to ensure end‑to‑end service reliability, with intelligent, one‑stop monitoring as the future direction.

Vivo · aiops · architecture
0 likes · 28 min read
Efficient Ops
Dec 27, 2022 · Operations

What Is AIOps? Exploring the New AI‑Driven Operations Maturity Model

AIOps combines AI techniques such as machine learning with data science to enhance IT operations, and the Chinese CAICT has released the first international AIOps capability maturity model, detailing evaluation criteria, modules like anomaly detection and root‑cause analysis, and announcing recent certified enterprises.

IT Operations · Maturity Model · aiops
0 likes · 4 min read
Efficient Ops
Dec 27, 2022 · Operations

How China Agricultural Bank Reached Industry‑Leading DevOps and AIOps Maturity

China Agricultural Bank’s R&D center shares its journey of passing multiple CAICT DevOps and AIOps maturity assessments, detailing the evaluated projects, metrics improvements, challenges overcome, and future plans, illustrating how standardized, tool‑enabled practices boost quality, efficiency, security, and digital transformation in a large financial institution.

DevOps · Digital Transformation · Financial Services
0 likes · 16 min read
Efficient Ops
Dec 26, 2022 · Operations

What Is AIOps? Exploring China’s New AI‑Driven Operations Maturity Model

The article introduces the AIOps (Artificial Intelligence for IT Operations) capability maturity model developed by the China Academy of Information and Communications Technology (CAICT), explains its two parts (general capabilities and system/tool technical requirements), lists the evaluated modules, and announces the upcoming certification ceremony and contact details for participation.

IT Operations · Maturity Model · Operations
0 likes · 5 min read
Efficient Ops
Dec 26, 2022 · Operations

China Agricultural Bank’s DevOps & AIOps Success: Key Lessons for Enterprises

China Agricultural Bank’s recent DevOps and AIOps assessments, covering 17 projects across continuous delivery, security, application design, and intelligent operations, showcase how standardized processes, tool empowerment, and rigorous evaluation boosted efficiency, safety, and digital transformation, offering actionable insights for large enterprises seeking similar maturity.

DevOps · Digital Transformation · Enterprise Standards
0 likes · 16 min read
Efficient Ops
Dec 19, 2022 · Operations

How Tencent CDN Achieves Seamless Business Continuity with AI‑Powered SRE

This article details Tencent CDN's challenges and solutions for business continuity, covering bandwidth and device resource constraints, massive request handling, fault‑management lifecycle, automation bottlenecks, and the implementation of AIOps, intelligent alerts, capacity planning, and root‑cause analysis to ensure reliable service.

Automation · CDN · Operations
0 likes · 21 min read
Tencent Architect
Nov 28, 2022 · Operations

How Tencent CDN Achieves Business Continuity with Intelligent Operations

This article details Tencent CDN's extensive business continuity challenges—including bandwidth, device resources, and massive request volumes—and explains how a fault‑management lifecycle, AIOps components, intelligent alerting, and automated capacity planning together enable resilient, automated operations.

CDN · Intelligent Operations · SRE
0 likes · 17 min read
DataFunTalk
Nov 27, 2022 · Operations

Best Practices for Full‑Stack Operations Monitoring and Cost Reduction Using Alibaba Cloud Elasticsearch

This article presents a comprehensive, three‑part guide on the current state of full‑stack operations monitoring, common challenges and solutions, and a real‑world use case, illustrating how Alibaba Cloud Elasticsearch can improve observability, boost performance, and cut costs for complex distributed systems.

Cost Optimization · Elasticsearch · Observability
0 likes · 13 min read
Efficient Ops
Nov 16, 2022 · Operations

Building a 99.95% Uptime Cloud‑Native Platform: Guoxin Securities’ Ops Journey

Guoxin Securities’ QianKun centralized operation platform showcases a cloud‑native, micro‑service architecture that achieved 99.95% availability through containerization, multi‑region deployment, AI‑driven capacity forecasting, and comprehensive DevOps practices, offering a 24/7 seamless account‑opening experience and setting industry benchmarks.

Cloud Native · DevOps · Operations
0 likes · 14 min read
dbaplus Community
Nov 8, 2022 · Databases

Mastering MySQL Performance: 5 Key Issues and Proven Tuning Strategies

This comprehensive guide outlines five common MySQL performance problems, a step‑by‑step investigation methodology, detailed Java middleware and database analyses, practical tuning tactics such as index optimization and sharding, and a governance framework for sustainable performance management.

DevOps · Tuning · aiops
0 likes · 23 min read
Efficient Ops
Oct 31, 2022 · Operations

Key Takeaways from the 2022 GOPS Global Operations Conference Shanghai – DevOps, AIOps & Cloud Insights

The two‑day 2022 GOPS Global Operations Conference in Shanghai featured 16 tracks, over 80 speakers, new DevOps standards, extensive assessment results, and a wealth of sessions on DevOps, AIOps, cloud‑native practices, security, and industry case studies, offering a comprehensive snapshot of modern operations engineering.

DevOps · Operations · aiops
0 likes · 14 min read
Efficient Ops
Oct 28, 2022 · Operations

How China Mobile Cloud Achieved Leading‑Edge AIOps Maturity

China Mobile Cloud’s Intelligent Operations (AIOps) platform passed the CAICT comprehensive evaluation, reaching industry‑leading L2 maturity, and the interview reveals the project’s goals, outcomes, and future plans for advancing AI‑driven IT operations.

IT Operations · Maturity Assessment · aiops
0 likes · 8 min read
DataFunSummit
Oct 24, 2022 · Databases

Intelligent Operations: Challenges and Solutions with the IoTDB Time‑Series Database

This article examines the data challenges faced by intelligent operations (AIOps), evaluates IoTDB against other time‑series databases through performance benchmarks, outlines Cloudwise's architecture and open‑source contributions, and presents real‑world case studies demonstrating anomaly detection and root‑cause analysis in industrial settings.

Big Data · IoTDB · Time Series Database
0 likes · 15 min read
Efficient Ops
Sep 28, 2022 · Operations

How Event‑Driven Alert Centers Revolutionize Intelligent Operations

This article presents a comprehensive overview of an event‑centric intelligent alert analysis platform, covering its evolution, core challenges, the concept of alert events, AI‑driven correlation techniques, and the MC‑Stack platform that powers modern operations.

Alert Management · aiops · event-driven monitoring
0 likes · 13 min read
NetEase Game Operations Platform
Sep 19, 2022 · Artificial Intelligence

Applying AIOps to Game Operations: Roadmap, Anomaly Detection, and Fault Localization

This article describes NetEase's AIOps journey for game operations, explaining the Gartner definition of intelligent operations, the implementation roadmap, detailed anomaly‑detection techniques for business, performance, and log data, and a comprehensive fault‑localization workflow that combines resource, code, and historical analysis.

Fault Localization · aiops · anomaly detection
0 likes · 12 min read
Huolala Tech
Aug 11, 2022 · Operations

How Huolala Built an AI‑Powered Intelligent Monitoring Platform at Scale

This article details Huolala's journey from basic monitoring to an AI‑driven intelligent observability platform, covering AIOps concepts, a comprehensive monitoring framework, practical implementations, automated alert analysis, lessons learned, and future directions for large‑scale operations.

DevOps · Huolala · Observability
0 likes · 18 min read
Efficient Ops
Aug 9, 2022 · Operations

How ICBC Accelerated Digital Transformation with XOps: From DevOps to MLOps

ICBC’s software development center outlines its multi‑year journey adopting XOps practices—DevOps, DevSecOps, DataOps, MLOps, AIOps, ChatOps and BizDevOps—to boost development efficiency, enhance security, accelerate data‑driven AI, and cut costs, showcasing measurable improvements in release frequency, defect rates, and operational automation.

DataOps · DevOps · DigitalTransformation
0 likes · 13 min read
dbaplus Community
Aug 7, 2022 · Databases

Overcoming Performance and Compatibility Gaps Switching from Oracle to Chinese Databases

In this interview, senior database expert Kong Zaihua explains the main performance, functional, and usability shortcomings of domestic Chinese databases compared with Oracle, and outlines practical strategies, tools, and migration techniques to evaluate compatibility, handle DDL conversion, manage online/offline data transfer, and reduce reliance on stored procedures.

Domestic Databases · Oracle compatibility · aiops
0 likes · 10 min read
Efficient Ops
Jul 29, 2022 · Operations

Key Takeaways from the 2022 XOps Industry Summit: DevOps, FinOps & AIOps Insights

The 2022 inaugural XOps Industry Ecosystem Summit in Beijing gathered over six hundred thousand viewers to explore digital governance, DevOps, FinOps, AIOps and related standards, featuring keynote speeches, forum discussions, assessment result releases, and the launch of the Digital Governance Alliance and FinOps alliance.

DevOps · Digital Governance · FinOps
0 likes · 16 min read
Efficient Ops
Jul 28, 2022 · Operations

What Are the 2022 XOps Top 10 Keywords Shaping Modern IT Operations?

At the 2022 XOps Industry Ecosystem Summit in Beijing, the China Academy of Information and Communications Technology unveiled the "2022 XOps Top Ten Keywords", highlighting trends such as DevOps, DevSecOps, CO‑DevOps, Continuous Testing, R&D efficiency metrics, AIOps, BizDevOps, FinOps, ArchOps and Digital Governance that are driving the evolution of modern IT operations.

BizDevOps · DevOps · Digital Transformation
0 likes · 13 min read
Efficient Ops
Jul 28, 2022 · Operations

How China’s New DevOps Standards Are Accelerating Enterprise Efficiency

The article reports on the 2022 XOps Industry Ecosystem Summit hosted by CAICT, announces the results of multiple DevOps and AIOps capability maturity model assessments across dozens of leading enterprises, explains the standards' origins and scope, and provides contact details for further evaluation inquiries.

Capability Maturity Model · DevOps · Enterprise Evaluation
0 likes · 9 min read
dbaplus Community
Jul 21, 2022 · Operations

How Huolala Built an AI‑Powered End‑to‑End Monitoring Platform

This article details Huolala's journey from a fragmented monitoring stack to a unified, AI‑enhanced observability platform, covering AIOps concepts, the design of a comprehensive monitoring framework, concrete implementation of metrics, tracing, logging, alerting, and lessons learned for large‑scale operations.

DevOps · Observability · aiops
0 likes · 19 min read
AntTech
Jun 28, 2022 · Operations

AntMonitor: Evolution, Features, and Core Technologies of Ant Group’s Observability Platform

The article details Ant Group’s AntMonitor observability platform, covering its development timeline, holographic monitoring capabilities, integrated performance analysis, efficient data integration, built‑in AI‑driven analytics, Monitoring‑as‑a‑Service, and the underlying high‑performance time‑series database and cloud‑native architecture that support massive real‑time data processing.

CloudNative · Observability · TimeSeriesDatabase
0 likes · 17 min read
Efficient Ops
Jun 14, 2022 · Operations

Unlocking XOps: From DevOps Metrics to AIOps, BizDevOps, and FinOps

This article summarizes Professor Niu Xiaoling’s GNSEC 2022 keynote, outlining the XOps framework and its five pillars—XOps, DevOps research and operational efficiency metrics, AIOps, BizDevOps, and FinOps—while detailing their drivers, maturity models, implementation examples, and the role of standards in guiding enterprises toward intelligent, cost‑effective, and business‑value‑focused software delivery.

BizDevOps · DevOps · FinOps
0 likes · 18 min read
Efficient Ops
Jun 1, 2022 · Operations

What Can Aircraft Monitoring Teach Us About Building Effective IT Operations Monitoring?

The article explores how aviation‑grade monitoring concepts—such as multi‑level alarm classification, diverse alert delivery methods, and comprehensive sensor coverage—can inspire centralized, data‑driven IT operations monitoring architectures that reduce missed alerts, false positives, and improve response times.

Alert Management · Digital Twin · aiops
0 likes · 33 min read
Efficient Ops
May 30, 2022 · Operations

How AIOps Transforms Enterprise Operations: Insights from China’s 2022 Tech Salon

The 2022 online AIOps Technology Salon hosted by the China Academy of Information and Communications Technology gathered over 13,000 viewers, featured expert talks on standards, practical implementations, anomaly‑detection algorithms, and real‑world case studies from major enterprises, offering actionable insights for modern IT operations.

Enterprise · IT Operations · Tech Talk
0 likes · 4 min read
Efficient Ops
May 15, 2022 · Artificial Intelligence

What’s the Current State of AIOps in China? Join the 2022 Survey

The article introduces the 2022 China AIOps Current Situation Survey, explains how rapid data growth and digital transformation are driving intelligent operations, outlines the joint effort of over 60 enterprises to compile the report, and invites industry professionals to contribute their insights through an online questionnaire.

China · IT Operations · aiops
0 likes · 6 min read
Efficient Ops
Apr 17, 2022 · Operations

Inside China’s 2022 AIOps Survey: How Intelligent Operations Drive Digital Growth

The 2022 “China AIOps Status Survey”, led by CAICT and major telecom and cloud firms, examines the rise of intelligent operations since Gartner’s 2016 definition, detailing its role in digital transformation and infrastructure upgrades and laying out a detailed five‑month research timeline.

Digital Transformation · IT Operations · aiops
0 likes · 4 min read
DevOps
Apr 12, 2022 · Operations

Understanding Observability: Core Concepts, SRE Methodology, AIOps, and Business Architecture

The article explains the rising importance of observability in modern operations, defines its control‑theory roots, breaks it down into metrics, traces and logs, and argues that successful implementation requires three pillars—SRE practices, AIOps algorithms, and deep business‑architecture knowledge—together with well‑designed SLOs and critical‑path mapping.

Observability · SRE · aiops
0 likes · 10 min read
Shopee Tech Team
Apr 7, 2022 · Operations

MDAP: A Multi‑Dimensional Real‑Time Monitoring and Analysis Platform for Mobile Applications

MDAP is a multi‑dimensional real‑time monitoring platform for mobile apps. It gathers metrics, logs, and traces via lightweight SDKs and processes the data through micro‑service back‑ends using Flink, Spark, and ClickHouse. Its intelligent analysis covers smoothness scoring, memory‑snapshot optimization, stack de‑obfuscation, crash clustering, and URL templating, and the platform aims to extend end‑to‑end observability and predictive issue detection.

aiops · data analysis · mobile monitoring
0 likes · 26 min read
Efficient Ops
Mar 28, 2022 · Operations

Zhejiang Mobile’s AI‑Driven Self‑Healing: Pioneering Intelligent Network Operations

This article examines the challenges of intelligent telecom network operation, presents Zhejiang Mobile’s AI‑powered self‑healing practice—including process re‑design, system reconstruction, talent transformation, and measurable results—and outlines the AIOps maturity model and future outlook for digital network management.

Digital Transformation · aiops · network automation
0 likes · 11 min read
Efficient Ops
Mar 17, 2022 · Operations

Inside China’s AIOps Standard: Key Insights from the 4th Draft Meeting

The article reports on the fourth draft discussion of China’s Cloud Computing Intelligent Operations (AIOps) Capability Maturity Model – Part 2, detailing the meeting’s participants, the finalized system and tool technical requirements, and the progress toward a comprehensive AIOps standard that addresses quality, cost, efficiency, and security across multiple functional modules.

Operations · aiops · artificial intelligence
0 likes · 5 min read
IT Architects Alliance
Mar 13, 2022 · Backend Development

Meituan Instant Logistics: Evolution of Distributed System Architecture and Practices

The article details Meituan's five‑year journey in instant logistics, describing how distributed, high‑concurrency system architecture evolved through layered upgrades, microservices, fault‑tolerance mechanisms, AI‑driven optimization, and AIOps platforms to achieve scalability, low latency, high availability, and cost efficiency.

Distributed Systems · Meituan · Microservices
0 likes · 12 min read
Architect
Mar 12, 2022 · Backend Development

Meituan Instant Logistics: Evolution of Distributed System Architecture and Technical Challenges

The article describes Meituan's five‑year journey in instant logistics, detailing how its distributed, high‑concurrency architecture has evolved through layered upgrades, micro‑service adoption, and AI integration to achieve low latency, high availability, cost efficiency, and scalability while addressing challenges such as massive order matching, peak traffic, data consistency, and fault tolerance.

Distributed Systems · Meituan · Microservices
0 likes · 11 min read
DevOps
Jan 28, 2022 · Operations

Continuous Operations: Definition, Stages, and Practices

This article presents a comprehensive study of continuous operations, defining its meaning, outlining the three key stages of continuous deployment, operation, and feedback, reviewing ITIL and DevOps practices, and sharing real-world case studies from major tech companies to illustrate effective implementation.

Continuous Operations · DevOps · ITIL
0 likes · 46 min read
Efficient Ops
Dec 25, 2021 · Artificial Intelligence

How Zhejiang Mobile’s AIOps Achieved National‑Level Excellence in Fault Management

The article explains AIOps fundamentals, details Zhejiang Mobile’s successful assessment in the national AIOps capability maturity model, shares insights from an interview with the company’s network‑management deputy director, and outlines future plans and industry recommendations for AI‑driven IT operations.

Capability Maturity Model · IT Operations · Zhejiang Mobile
0 likes · 9 min read
Efficient Ops
Dec 16, 2021 · Artificial Intelligence

How AI-Powered AIOps is Shaping Cloud Service Management and Standards

The article defines AIOps, the application of AI to IT operations, outlines its role in cloud service management, and summarizes the recent ITU‑T SG13 plenary that adopted an international standard on AI‑based operation management, along with domestic industry contributions, maturity assessments, and the upcoming 2021 GOLF+ IT governance forum.

ITU-T · Intelligent Operations · Maturity Model
0 likes · 5 min read
Efficient Ops
Dec 6, 2021 · Operations

How Scenario‑Based AIOps Transforms IT Operations: Insights from GOPS 2023

The article summarizes a GOPS conference presentation by Dingmao Technology on AIOps scenario‑driven construction, detailing challenges, definition of scenarios, technical methods, roadmap planning, and future prospects, while showcasing practical examples and supporting technologies for intelligent IT operations.

Data Integration · IT Operations · Scenario-based
0 likes · 8 min read
dbaplus Community
Dec 2, 2021 · Operations

How Cloud‑Native Is Redefining Operations: Expert Views on DevOps, AIOps and Automation

In this panel discussion, three seasoned operations leaders share how traditional IT operations evolve into cloud‑native practices, covering continuous iteration, container‑based automation, DevOps collaboration, observability, chaos engineering, and the strategic balance between specialization and versatility for modern SRE teams.

Automation · Cloud Native · aiops
0 likes · 20 min read
Alibaba Cloud Native
Nov 3, 2021 · Operations

Unlocking Smart Anomaly Detection in Alibaba Cloud Prometheus

This article explains the fundamentals of time‑series anomaly detection, the limitations of static threshold rules in open‑source Prometheus, and how Alibaba Cloud Prometheus introduces template‑based and smart detection operators to handle spikes, periodic patterns, and data quality issues in AIOps scenarios.

Cloud Native · Prometheus · Smart Operator
0 likes · 11 min read
Efficient Ops
Nov 1, 2021 · Operations

How AIOps Is Transforming IT Operations: Inside the First Domestic Evaluation

An interview with Qingchuang Technology’s co‑founder reveals how their Sherlock AIOps platform passed the first domestic AIOps system and tool assessment, illustrating the role of AI‑driven alarm convergence, the significance of the CAICT maturity model, and future plans for intelligent IT operations across industries.

DevOps · IT Operations · Maturity Model
0 likes · 7 min read
Efficient Ops
Nov 1, 2021 · Operations

How AIOps Is Empowering Enterprise Digital Transformation

The article explains how AIOps, built on DevOps principles and leveraging AI and big‑data analytics, helps enterprises overcome governance challenges, improve operational efficiency, and accelerate digital transformation, highlighting standards, real‑world evaluations, and key benefits such as real‑time analysis and noise reduction.

DevOps · Digital Transformation · IT Governance
0 likes · 7 min read
Efficient Ops
Oct 22, 2021 · Artificial Intelligence

Zhejiang Mobile’s AIOps Wins First‑Stage Evaluation – Insights for Future IT Ops

The article reports that Zhejiang Mobile’s AIOps system and tool modules, including anomaly detection and alarm convergence, successfully passed the first‑stage assessment by the China Academy of Information and Communications Technology, highlighting the growing importance of AI‑driven operations, sharing interview insights on implementation, benefits, and future plans.

DevOps · IT Operations · aiops
0 likes · 10 min read
Efficient Ops
Oct 22, 2021 · Operations

How Guangdong Mobile’s AIOps Platform Passed the First‑Stage Maturity Assessment – Insights and Future Plans

The article explains the concept of AIOps, details Guangdong Mobile’s AI‑plus‑knowledge experience fault‑diagnosis platform passing the first‑stage AIOps maturity assessment, shares interview insights from senior managers, and outlines future development directions for intelligent IT operations.

DevOps · IT Operations · Intelligent Operations
0 likes · 11 min read
Efficient Ops
Oct 22, 2021 · Operations

AIOps Revolution in IT Ops: Huatai Securities' First-Evaluation Success

The article explains how AIOps leverages AI to enhance IT operations, details Huatai Securities' successful first‑batch evaluation by the China Academy of Information and Communications Technology, and outlines future plans and the emerging AIOps maturity model standards.

DevOps · IT Operations · Maturity Model
0 likes · 6 min read
Efficient Ops
Oct 22, 2021 · Operations

What Is AIOps? Inside the First Certified AIOps System & Tool Evaluations

The article explains AIOps—AI‑driven IT operations—its Gartner‑defined role, the debut of the first certified AIOps system and tool assessments announced at the 2021 DevOps International Summit in Beijing, and provides details on the evaluated enterprises, modules, and the new AIOps Capability Maturity Model.

Capability Maturity Model · IT Operations · aiops
0 likes · 5 min read
Efficient Ops
Aug 29, 2021 · Operations

How Digital Transformation Redefines IT and Operations Value in the Age of Intelligent Everything

This article explores the shift to an intelligent‑everything era, outlines how IT value is transmitted through digital transformation, and details the eight operational challenges and four key digital concepts that enable organizations to enhance risk protection, accelerate delivery, improve customer experience, and raise service quality.

IT value · aiops
0 likes · 23 min read
IT Architects Alliance
Aug 17, 2021 · Backend Development

Meituan Instant Logistics: Distributed System Architecture, Practices, and Future Challenges

The article details Meituan’s five‑year evolution of its instant logistics platform, describing the distributed backend architecture, AI‑driven optimization, scalability and high‑availability practices, as well as future challenges in microservice complexity and operational automation.

Distributed Systems · Logistics · Microservices
0 likes · 10 min read
ITFLY8 Architecture Home
Aug 17, 2021 · Backend Development

How Meituan Scaled Instant Logistics with Distributed Systems and AI

This article details Meituan's five‑year journey building a high‑availability, low‑latency instant logistics platform, describing the distributed architecture evolution, AI‑driven optimizations, fault‑tolerance techniques, and future challenges in scaling micro‑services for massive order and rider volumes.

AI logistics · Distributed Systems · Microservices
0 likes · 12 min read
Efficient Ops
Jul 20, 2021 · Operations

How AI-Driven Operations (AIOps) Are Shaping Global Cloud Service Standards

Intelligent Operations (AIOps) applies AI to DevOps, analyzing logs and monitoring data to enable root‑cause analysis, fault prediction, and capacity planning. A new international standard, approved at the ITU‑T SG13 meeting, defines functional requirements and architecture for AI‑based cloud service management, marking a milestone for global adoption.

IT Standards · Intelligent Operations · Operations Management
0 likes · 6 min read
Youzan Coder
Jun 25, 2021 · Operations

Building an Event-Driven Automated Operations Platform (Whale)

Whale is an event‑driven automated operations platform that lets developers package atomic tasks, users compose workflows, and a rule‑matching engine trigger them in real time via an event center, employing a StackStorm‑based execution engine for fault‑tolerant, cross‑datacenter orchestration and future AI‑enhanced self‑healing.

DevOps · Event-driven · Operations Automation
0 likes · 7 min read
Baidu Geek Talk
Jun 15, 2021 · Industry Insights

What Baidu Unveiled at QCon 2021: Key Takeaways from 7 Cutting‑Edge Sessions

This article compiles Baidu experts' presentations at QCon 2021, covering unified quality‑efficiency delivery for feed recommendation, software engineering capabilities, AIOps fault‑management practices, Apache Doris real‑time analytics, large‑scale Service Mesh deployment, massive service‑governance techniques, and deep‑learning platform innovations, with speaker details and audience benefits.

AI · Baidu · Big Data
0 likes · 12 min read
DevOps
Jun 10, 2021 · Operations

Operations Is Not Simple: Challenges, Methodologies, and Paths to Sustainable Improvement

This article explores the complexity of IT operations, outlining common misconceptions, essential capabilities, organizational and individual pain points, and presents self‑help strategies such as SRE, DevOps, automation, and AIOps to achieve sustainable, scalable, and intelligent operations within enterprises.

Automation · DevOps · SRE
0 likes · 28 min read
Efficient Ops
Jun 1, 2021 · Artificial Intelligence

How Time‑Series Analysis Powers AIOps: Overcoming Real‑World Challenges

At the 16th GOPS Global Operations Conference, Shen Hui of DingMao Technology explained how time‑series data analysis underpins AIOps, outlining its four‑step workflow, key challenges, and the company’s three‑pipeline solution that enables trend forecasting, fault prediction, and a robust AI‑driven operational platform.

AI · Operations · Time Series Analysis
0 likes · 7 min read
Efficient Ops
May 30, 2021 · Operations

How Intelligent Operations Are Redefining IT Management – Key Takeaways from the 2021 GOPS Conference

The 2021 GOPS Global Operations Conference in Shenzhen highlighted the shift toward intelligent, AI‑driven IT operations, presenting practical solutions, a three‑principle six‑step framework, and four core capabilities that help enterprises digitize, govern, and automate their operational data for higher efficiency.

Data Governance · IT Operations · Intelligent Operations
0 likes · 7 min read
Beijing SF i-TECH City Technology Team
May 17, 2021 · Artificial Intelligence

AIOps Overview: Concepts, Applications, and Case Studies

This article provides a comprehensive overview of AIOps, covering its definition, evolution from manual to AI-driven operations, core capabilities, and real-world applications in capacity prediction, anomaly detection, and alarm merging, illustrated with case studies from a food‑retail giant and internal logistics.

Big Data · Capacity Prediction · IT Operations
0 likes · 13 min read
Cloud Native Technology Community
Mar 25, 2021 · Operations

What Are the Top DevOps Trends Shaping 2021 and Beyond?

This article analyzes the most influential DevOps trends for 2021, including the rise of DevSecOps, AI‑driven AIOps, infrastructure automation, chaos engineering, serverless adoption, hybrid cloud, GitOps, and edge computing, backed by market forecasts and expert predictions.

CloudNative · DevOps · DevSecOps
0 likes · 10 min read
21CTO
Mar 16, 2021 · Operations

How Cloud Computing Is Redefining Operations: Trends, Challenges, and Strategies

The article examines how the rapid adoption of cloud computing, DevOps, AIOps, and FinOps is reshaping the role of IT operations, highlighting new trends, evolving work boundaries, and the essential characteristics of a modern, automated, secure, and cost‑optimized operations system.

Automation · Cost Optimization · DevOps
0 likes · 18 min read
ITFLY8 Architecture Home
Mar 15, 2021 · Operations

How Meituan Scales Instant Delivery with a Distributed Architecture

Meituan's instant logistics platform evolved over five years, adopting distributed, fault‑tolerant systems, AI‑driven optimization, and multi‑IDC strategies to handle massive order volumes, extreme traffic spikes, and stringent real‑time reliability requirements while continuously improving scalability and cost efficiency.

AI Optimization · Distributed Systems · Microservices
0 likes · 10 min read
DevOps Cloud Academy
Mar 13, 2021 · Operations

2021 DevOps Trends and Predictions: Microservices, DevSecOps, IA, AIOps, AgileOps, AI/ML, Kubernetes, and Cloud Management

The article outlines eight major 2021 DevOps trends—including the rise of microservices, increased DevSecOps adoption, infrastructure automation, predictive analytics in AIOps, AgileOps, AI/ML‑driven pipelines, Kubernetes integration, and cloud management platforms—highlighting their benefits and future impact on software delivery.

Cloud Management · DevOps · DevSecOps
0 likes · 7 min read
DevOps
Feb 9, 2021 · Operations

Choosing Between DataOps, MLOps, and AIOps: A Guide for Data Teams

The article examines how data teams can select the appropriate Ops framework—DataOps, MLOps, or AIOps—by comparing their origins, principles, responsibilities, and tooling, and stresses that cultural principles outweigh technology choices for efficient delivery of data and machine‑learning products.

DataOps · DevOps · MLOps
0 likes · 12 min read
Efficient Ops
Feb 7, 2021 · Artificial Intelligence

How NLP Transforms Big Data Operations: Real-World AIOps Case Studies

This article explores the intersection of natural language processing and operations, outlines common text‑handling challenges, and presents three concrete AIOps case studies—log Q&A, anomaly detection, and ticket recommendation—while reflecting on a closed‑loop AI workflow and future research directions.

Big Data · NLP · aiops
0 likes · 9 min read
JD Cloud Developers
Jan 15, 2021 · Artificial Intelligence

AIOps Revolution: From Manual Scripts to Intelligent IT Operations

Since Gartner introduced AIOps in 2016, the IT operations landscape has evolved through five stages—from manual scripting to standardized tools, platform automation, DevOps, and now AI-driven AIOps—enabling real-time anomaly detection, root‑cause analysis, noise reduction, and predictive maintenance through big data and machine learning.

Automation · IT Operations · aiops
0 likes · 11 min read
JD Tech Talk
Jan 8, 2021 · Artificial Intelligence

AIOps: Background, Scenarios, Capability Building, and Practical Implementation by JD Digital Operations Team

This article explains the evolution of IT operations toward AIOps, outlines its key scenarios, describes the team roles and capability‑building roadmap, and details JD Digital Operations' practical implementations—including fault detection, localization, and automated repair—leveraging AI, big data, and knowledge‑graph technologies.

Automation · IT Operations · aiops
0 likes · 12 min read
Efficient Ops
Dec 1, 2020 · Operations

Zero‑Downtime Ops: Inside Tencent’s Panshi High‑Availability Platform

At the 2020 GOPS Global Operations Conference, Tencent’s senior operations engineer Xie Hailin detailed the design and implementation of the Panshi platform—a comprehensive, high‑availability solution that unifies change management, fault handling, continuous operation, and disaster recovery to ensure uninterrupted payment services for billions of daily transactions.

Operations · aiops · change management
0 likes · 24 min read
High Availability Architecture
Oct 22, 2020 · Artificial Intelligence

AIOps at Meituan: Architecture, Design, and Practice of the Horae Time‑Series Anomaly Detection System

This article presents Meituan's AIOps exploration, focusing on the design and implementation of the Horae time‑series anomaly detection platform, covering background, technical roadmap, fault‑discovery workflow, time‑series classification, feature engineering, model training, real‑time detection, and future directions.

Horae · Meituan · aiops
0 likes · 31 min read
Meituan Technology Team
Oct 15, 2020 · Artificial Intelligence

AIOps at Meituan: Architecture and Practice of Time‑Series Anomaly Detection (Part 1)

Meituan’s AIOps initiative replaces manual rule‑based monitoring with the Horae platform, which automatically classifies time‑series metrics, applies CNN and XGBoost models to detect periodic anomalies, achieves over 90 % precision in production, and paves the way for broader metric types, forecasting, and advanced fault‑localization.

Horae · Meituan · Operations
0 likes · 33 min read