Tag

operations automation

DataFunSummit
Apr 7, 2025 · Artificial Intelligence

Bridging the Gap Between Large Models and Real‑World Applications with RAG and Agents

This article examines how Retrieval‑Augmented Generation (RAG) and multi‑agent technologies narrow the gap between large language models and practical deployment, highlighting their roles in operations automation, financial risk control, intelligent data governance, database localization, edge inference, and future AI‑driven solutions.

AI applications · Agents · Data Governance
8 min read

vivo Internet Technology
Mar 5, 2025 · Cloud Native

Beidou Container Operations Management Platform: Architecture, Automation, and Capabilities

The Beidou Operations Management Platform, created by vivo’s Internet Server team, unifies management of over twenty Kubernetes clusters and tens of thousands of nodes, automates scaling, inspections, event collection, and Helm‑based application deployment, achieving more than 90% UI‑driven operations and dramatically improving stability and operational efficiency.

Container Management · DevOps · Kubernetes
20 min read

Python Programming Learning Circle
Sep 15, 2024 · Operations

Using Python Scripts for Operations Automation: Remote Execution, Log Parsing, Monitoring, Deployment, and Backup

This article explains how operations engineers can leverage Python scripts and popular libraries such as paramiko, regex, psutil, fabric, and shutil to automate tasks like remote command execution, log analysis, system monitoring with alerts, batch software deployment, and file backup and recovery, enhancing efficiency and reducing manual errors.

Remote Execution · operations automation · scripting
9 min read

Efficient Ops
Sep 8, 2024 · Operations

Boost Ops Efficiency: 5 Python Scripts Every Sysadmin Should Use

This article explains how Python can automate common operations tasks—remote command execution, log parsing, system monitoring with alerts, batch software deployment, and backup/restore—providing code examples and highlighting the benefits for sysadmins.

Automation · DevOps · Python
9 min read

DataFunSummit
Jul 15, 2024 · Operations

Intelligent Operations (AIOps) Insights, Planning, and Large‑Model Agent Practices at ByteDance

The article summarizes ByteDance's intelligent operations (AIOps) strategy, covering frontier concepts, a five‑level automation roadmap, large‑model applications for fault diagnosis and smart Q&A, and a comprehensive AIOps platform that accelerates algorithm deployment, improves efficiency, and reduces operational costs.

AI agents · AIOps · Intelligent Operations
21 min read

Efficient Ops
May 14, 2024 · Artificial Intelligence

How Large‑Model Agents Are Revolutionizing AIOps and Modern Operations

This article explores why large‑model Agent technology is essential for AIOps, explains single‑ and multi‑Agent architectures, memory and tool integration, and demonstrates practical applications such as anomaly detection, fault diagnosis, automated remediation, ChatOps, and future directions for intelligent, autonomous operations.

AI agents · AIOps · LLM
14 min read

ByteDance SYS Tech
May 9, 2024 · Operations

How Large‑Model Agents Transform AIOps: From Automation to Self‑Healing Operations

The presentation explains how large‑model agents empower AIOps by automating routine tasks, enhancing anomaly detection, fault diagnosis, and remediation, while outlining architectural components, multi‑agent collaboration, and future directions for building self‑healing, observability‑driven operations platforms.

AIOps · Self-healing · agent
15 min read

DataFunSummit
Apr 21, 2024 · Operations

The Value, Challenges, and Future of AIOps in Modern Enterprises

AIOps leverages AI to automate IT monitoring, predict failures, and optimize resources, offering modern enterprises reduced operational workload and higher reliability, while facing challenges such as data governance, automation, hierarchical monitoring, and large‑model hallucinations that must be addressed for successful deployment.

AIOps · Enterprise IT · IT Operations
2 min read

Efficient Ops
Feb 26, 2024 · Operations

Measuring Ops Automation Rate and Building a Coding Platform with Taishan‑Qilin

This article explains how to measure the operations automation rate, outlines the challenges of manual ops, and provides a step‑by‑step guide to creating a coding‑based automation platform on Taishan‑Qilin, including formulas, code examples, deployment, and real‑world results.

CRD · DevOps · Kubernetes
20 min read

JD Retail Technology
Feb 20, 2024 · Operations

Measuring Operations Automation Rate and Building a Self‑Coding Automation Platform

This article explains the challenges of manual operations, defines an automation‑rate metric, introduces the Tai‑Shan Kirin platform for self‑coded operational automation, provides step‑by‑step implementation guidance with code examples, and shares a case study demonstrating significant efficiency and stability gains.

Automation Metrics · CRD · DevOps
19 min read

vivo Internet Technology
Jun 28, 2023 · Operations

Certificate Management Platform Practice: From Manual to Platform-Based Operations at Scale

vivo replaced fragile, engineer‑driven certificate handling with a centralized Vue‑2/Go platform that automates application, secure key storage, renewal alerts, and multi‑environment pushes, eliminating availability incidents and paving the way for future blockchain‑based, immutable certificate distribution.

DevOps · Platform Development · SSL/TLS
7 min read

Efficient Ops
May 30, 2023 · Operations

Mastering Fault Self-Healing: Automate Disk Alerts and Scale Operations

Discover how to transform nightly disk‑space alerts into automated self‑healing workflows, covering prerequisite standards, multi‑dimensional monitoring, CMDB integration, script‑based remediation, and multi‑channel notifications to scale operations across thousands of servers without manual intervention.

CMDB · DevOps · fault self-healing
10 min read

Efficient Ops
Mar 20, 2023 · Artificial Intelligence

How AI‑Powered Digital Employees Transform IT Operations and Boost Efficiency

This article describes how the Industrial and Commercial Bank of China's software development center created the AI‑driven digital employee “Ruan Xiaoyan,” detailing its functions, user touchpoints, and practical applications such as intelligent customer service, smart workflow automation, and proactive reminders that enhance IT operations efficiency.

AI · RPA · digital employee
11 min read

Efficient Ops
Jan 9, 2023 · Operations

Boost Ops Efficiency: 5 Python Scripts Every Sysadmin Should Use

This article explains how Python can automate common operations tasks—remote command execution, log parsing, system monitoring with alerts, bulk software deployment, and backup/restore—providing code examples for each and highlighting additional tools that help sysadmins improve efficiency and reduce errors.

Deployment · Python · Sysadmin Scripts
9 min read

Youzan Coder
Jun 25, 2021 · Operations

Building an Event-Driven Automated Operations Platform (Whale)

Whale is an event‑driven automated operations platform that lets developers package atomic tasks, users compose workflows, and a rule‑matching engine trigger them in real time via an event center, employing a StackStorm‑based execution engine for fault‑tolerant, cross‑datacenter orchestration and future AI‑enhanced self‑healing.

AIOps · DevOps · StackStorm
7 min read

Efficient Ops
Dec 10, 2019 · Operations

How Minsheng Bank Uses AIOps to Revolutionize Intelligent Operations

In this talk, the head of Minsheng Bank's intelligent operations platform shares the bank's journey of applying AIOps to tackle massive data, complex dependencies, and operational challenges, outlining the evolution of their technology stack, AI-driven processes, and practical use‑case scenarios.

AI in Operations · AIOps · Banking Technology
16 min read

360 Tech Engineering
Oct 31, 2019 · Operations

AIOps Implementation Practice at 360: Architecture, Models, and Automation

The article details 360's AIOps deployment, covering external speaker insights, internal architecture, data collection pipelines, AI models for resource recycling, alarm reduction, and correlation, as well as visualization dashboards, labeling platforms, and self‑healing mechanisms, illustrating a comprehensive AI‑driven operations framework.

AI monitoring · AIOps · Self-healing
14 min read

360 Tech Engineering
Sep 6, 2019 · Operations

StackStorm-Based ChatOps Solution for Automated Monitoring Alert Self‑Healing

This article introduces a StackStorm‑driven ChatOps framework that consolidates monitoring alerts, applies rule‑based root‑cause analysis, and automatically executes self‑healing actions, outlining its architecture, components, workflow definitions, and practical deployment results within an enterprise operations environment.

ChatOps · Self-healing · StackStorm
6 min read

360 Tech Engineering
Jul 12, 2019 · Operations

StackStorm‑Based Monitoring Alert Auto‑Remediation Solution

This article introduces a StackStorm‑driven monitoring and alert auto‑remediation architecture that converges alarms, performs root‑cause analysis, and executes self‑healing actions, detailing its components, workflow, configuration examples, and real‑world deployment outcomes.

Alert Convergence · Auto‑Remediation · StackStorm
7 min read

Efficient Ops
Aug 19, 2018 · Artificial Intelligence

How AIOps Transforms DevOps: Real-World Cases from Tencent

This article explores the emerging field of AIOps, comparing rule‑based operations with AI‑driven approaches, outlining a five‑level AIOps maturity model, and presenting several Tencent case studies that demonstrate cost reduction, quality improvement, root‑cause analysis, and automated scaling.

AIOps · Case Study · DevOps
18 min read