Tagged articles

Operations

3329 articles · Page 12 of 34

Jun 30, 2023 · Information Security

How WeChat’s Security Data Warehouse Powers Billions of Daily Feature Reads

This article explains the origins, evolution, and current architecture of WeChat’s security data warehouse, detailing its unified feature storage, data quality guarantees, multi‑IDC synchronization, and operational system that streamlines feature management, analysis, and deployment to support the platform’s massive security strategy.

Big Data · Data Warehouse · Feature Management

0 likes · 15 min read

MaGe Linux Operations

Jun 30, 2023 · Operations

What Went Wrong When Vipshop Crashed? Lessons on High‑Concurrency Failures

The article examines the March 29 Vipshop data‑center outage that caused over a billion‑yuan loss, explains the cooling‑system failure that triggered a 12‑hour P0 incident, discusses its impact on Tencent services, and analyzes why high‑concurrency crashes remain common, offering availability tier insights and mitigation strategies.

Incident Management · Operations · availability

0 likes · 7 min read

Efficient Ops

Jun 29, 2023 · Operations

China Life (Overseas) Boosts DevOps Maturity: OnePartner Platform Success

China Life (Overseas) detailed how its OnePartner insurance marketing platform achieved advanced DevOps continuous‑delivery maturity through CAICT assessment, highlighting the benefits, challenges, and future plans of standardised, tool‑enabled digital transformation for the insurance industry.

Case Study · Operations · continuous delivery

0 likes · 12 min read

Efficient Ops

Jun 29, 2023 · Operations

How ICBC Trust Achieved Leading DevOps Maturity: A 3‑Level Continuous Delivery Success

ICBC Trust Fund Management’s Transaction Management Platform passed the CAICT DevOps Continuous Delivery Level 3 assessment, showcasing how standardized DevOps practices, tool empowerment, and cultural change dramatically cut build times, accelerate releases, and boost overall digital transformation efficiency.

Case Study · Maturity Model · Operations

0 likes · 14 min read

Efficient Ops

Jun 29, 2023 · Operations

How China’s CFFEX Tech Company Achieved Top‑Tier DevOps Continuous Delivery Rating

The China Academy of Information and Communications Technology (CAICT) announced that Shanghai Financial Futures Information Technology Co., the tech subsidiary of the China Financial Futures Exchange, passed the DevOps Continuous Delivery Level 3 assessment. It is the first such achievement among domestic securities and futures exchanges and shows how standardized DevOps practices can boost digital transformation, quality, and efficiency.

Case Study · Operations · continuous delivery

0 likes · 15 min read

Efficient Ops

Jun 29, 2023 · Operations

What Do the Latest DevOps Maturity Assessments Reveal About Chinese Enterprises?

The China Academy of Information and Communications Technology released the newest DevOps Capability Maturity Model evaluation results, showing that 78 leading firms across banking, finance, internet, and telecom sectors have collectively completed 224 projects, highlighting the impact of standardization and tool empowerment on enterprise competitiveness.

China · Enterprise · Maturity Model

0 likes · 5 min read

Efficient Ops

Jun 29, 2023 · Operations

What Do the Latest DevOps Maturity Model Results Reveal About Enterprise Adoption?

The June 2023 release by the China Academy of Information and Communications Technology (CAICT) details how 78 leading firms across banking, finance, telecom, and internet sectors have passed the DevOps Capability Maturity Model assessments, highlighting the impact of standardized pipelines, tool empowerment, and industry‑wide adoption on quality, efficiency, and competitiveness.

Capability Maturity Model · Enterprise Assessment · Operations

0 likes · 6 min read

Efficient Ops

Jun 29, 2023 · Operations

How China Life (Overseas) Reached Advanced DevOps Maturity and Boosted Digital Transformation

China Life (Overseas) passed the CAICT DevOps continuous delivery Level 2 assessment, showcasing how standardized DevOps practices and a one‑stop insurance marketing platform dramatically improved development efficiency, quality, and market competitiveness while highlighting challenges, outcomes, and future plans.

Case Study · Operations · continuous delivery

0 likes · 11 min read

Architects' Tech Alliance

Jun 26, 2023 · Fundamentals

Understanding Linux Ext Filesystems, RAID, and LVM

This article explains the structure of Linux Ext (2/3/4) filesystems, covering superblocks, inode tables, data blocks, and block groups, then outlines the differences between hardware and software RAID and the principles and risks of using LVM for flexible storage management.

Filesystem · LVM · Linux

0 likes · 5 min read

Programmer DD

Jun 26, 2023 · Operations

What’s New in Grafana 10? Explore Correlations, Scenes, and Powerful New Panels

Grafana 10 introduces a suite of enhancements—including Correlations for cross‑data‑source linking, the Scenes front‑end library for building stunning dashboards, new Canvas, Trends, and Datagrid panels, CSV drag‑and‑drop support, sub‑folder organization, and improved data‑source selection—aimed at boosting analysis, collaboration, and efficiency for monitoring teams.

Grafana · New Features · Operations

0 likes · 7 min read

Efficient Ops

Jun 25, 2023 · Operations

How to Build a Next‑Gen “Big Operations” System for Reliability and Observability

This article outlines the evolution from manual operations to DevOps and SRE‑driven “big operations,” detailing system reliability and continuity practices, observability concepts, and the development of AIOps maturity standards, offering a comprehensive guide for building stable, efficient, and secure operational frameworks.

AIOps · Observability · Operations

0 likes · 14 min read

Test Development Learning Exchange

Jun 25, 2023 · Operations

Building a Simple Network Monitoring Tool with Python and ping3

This tutorial explains how to develop a lightweight Python network monitoring utility using the ping3 library to send ICMP echo requests, measure latency, handle errors, and extend the script for batch pinging, timeout configuration, result logging, and packet‑loss calculation.

Operations · Scripting · ping

0 likes · 7 min read

Ops Development Stories

Jun 22, 2023 · Operations

How to Write an Ops Resume That Actually Gets You Interviews

The article examines three common resume pitfalls for operations candidates—unclear focus, breadth without depth, and vague personal planning—and offers concrete strategies to highlight strengths, showcase impactful projects, and present a clear career trajectory to attract interview opportunities.

Career Advice · Operations · job interview

0 likes · 7 min read

Efficient Ops

Jun 20, 2023 · Operations

Mastering SRE: How Error Budgets and SLOs Drive System Reliability

This article explains the fundamentals of Site Reliability Engineering, detailing how SRE combines development and operations to improve stability through metrics like MTBF and MTTR, the roles of SLI/SLO, the VALET selection method, and the practical use of error budgets for quantifying work and guiding alerts.

Error Budget · MTBF · Operations

0 likes · 14 min read

JD Cloud Developers

Jun 14, 2023 · Operations

How to Ensure System Stability During Mega Sales Events like 618

This article examines the technical and operational challenges of the 618 shopping festival, presenting data‑driven insights and detailed strategies—including modular deployment, monitoring, logging, fast‑failure, rate limiting, database and cache optimizations, and emergency response plans—to help teams maintain system stability under massive traffic spikes.

Operations · large‑scale promotion · monitoring

0 likes · 13 min read

Open Source Linux

Jun 14, 2023 · Operations

Master Log Analysis: Fast Techniques to Pinpoint Errors in Massive Logs

This guide walks you through practical Linux commands—tail, head, cat, grep, sed, wc, and pagination tools—to quickly locate, filter, and examine specific error entries within large log files, boosting troubleshooting efficiency for system administrators.

Linux · Operations · Troubleshooting

0 likes · 11 min read

DevOps

Jun 13, 2023 · Operations

Why DevOps Is Not Dead: The Rise of Platform Engineering and Its Impact on Modern Operations

The article argues that DevOps is still alive, explains the shortcomings of isolated operational practices, introduces platform engineering as the next evolution, and discusses practical considerations such as third‑party software selection, cloud‑native adoption, and the role of internal developer platforms in improving organizational efficiency.

Operations · Platform Engineering · cloud-native

0 likes · 10 min read

Architecture and Beyond

Jun 10, 2023 · Operations

What Is Systemic Risk in Technology and How to Manage It Effectively

The article explains the concept of systemic risk in both economics and technology, compares it with non‑systemic risk, describes how it propagates, lists common sources, outlines its impact on technical teams and business value, and provides a step‑by‑step framework for modeling, identifying, and governing such risks.

Governance · Operations · risk assessment

0 likes · 23 min read

Test Development Learning Exchange

Jun 9, 2023 · Operations

Common Linux Commands for System Stress Testing and Performance Monitoring

This article introduces a collection of essential Linux command‑line tools for stress testing and monitoring system resources such as CPU, memory, disk, network, processes, load, temperature, and other aspects, helping administrators assess stability and performance.

Linux · Operations · performance monitoring

0 likes · 5 min read

Architecture & Thinking

Jun 9, 2023 · Backend Development

Why Do Message Queues Get Backlogged and How to Fix It Fast?

This article examines why message queues become backlogged—covering producer overload, broker persistence failures, and consumer bottlenecks—and outlines a step‑by‑step scaling and remediation strategy to restore smooth processing, including temporary queue expansion, load‑balanced forwarding, and post‑recovery cleanup.

Backlog · Operations · scaling

0 likes · 6 min read

Qunar Tech Salon

Jun 8, 2023 · Operations

System Complexity Modeling and Anti‑Corruption Governance at Qunar

This article describes how Qunar's technology center defined, measured, and managed system complexity through a custom modeling framework, implemented a dashboard for continuous monitoring, and established an anti‑corruption governance process that limits complexity growth to maintain low maintenance costs across hundreds of applications and systems.

Governance · Operations · Qunar

0 likes · 14 min read

JD Cloud Developers

Jun 6, 2023 · Operations

How openKylin’s Community Board Drove Open‑Source Growth and Governance

The second openKylin community board meeting in Beijing detailed governance rules, controlled open‑source initiatives, open‑build infrastructure, innovation projects, ecosystem expansion, and the nomination of new board members, highlighting the community’s rapid growth, extensive SIG groups, and strategic plans for future development.

Linux · OpenKylin · Operations

0 likes · 7 min read

Tongcheng Travel Technology Center

Jun 6, 2023 · Operations

Root Cause Analysis and GC Parameter Optimization for Elasticsearch OOM Issues in the Membership Service

This article details a comprehensive investigation of an out‑of‑memory crash in a critical Elasticsearch cluster, explains how GC logs and heap dumps revealed a to‑space‑exhausted condition, and describes the G1GC tuning parameters that eliminated the nightly spikes and stabilized performance.

Elasticsearch · G1GC · JVM Tuning

0 likes · 9 min read

Ops Development Stories

Jun 6, 2023 · Operations

When Fancy PPTs Meet Real Outages: Lessons from a Major E‑commerce Crash

The article examines Vipshop's massive March 2023 outage caused by an IDC cooling failure, critiques superficial PPT‑driven reliability claims, and offers practical SRE insights on fault drills, true multi‑active architectures, and how ops teams can gain influence despite budget constraints.

Operations · SRE · fault tolerance

0 likes · 7 min read

dbaplus Community

Jun 5, 2023 · Operations

Mastering Production Faults: Diagnose and Fix Network, Server, Database Issues

This guide outlines the most common production failures—including network, server, database, software, security, storage, configuration, and third‑party service issues—and provides step‑by‑step methods to detect, troubleshoot, and resolve each problem, helping maintain system stability and reliability.

Operations · Production · Server

0 likes · 30 min read

Efficient Ops

Jun 1, 2023 · Operations

How Tencent’s On‑Call System Transforms Incident Management and Quality Ops

This article explores how Tencent builds and practices its SRE quality operation system, focusing on On‑Call incident management, standardized channels, alert handling, data quality models, and the resulting improvements in reliability, MTTR reduction, and data‑driven decision making.

Observability · On-Call · Operations

0 likes · 14 min read

Open Source Linux

May 30, 2023 · Operations

Essential Linux Ops Interview Questions & Answers for Sysadmins

A comprehensive collection of Linux operations interview questions covering topics such as system administration, RAID configurations, load balancing, middleware, MySQL troubleshooting, network monitoring, security, scripting, and best practices for optimizing and maintaining Linux servers.

Linux · LoadBalancing · Operations

0 likes · 38 min read

FunTester

May 30, 2023 · Operations

Software Performance Testing: Process, Tools, and Required Skills

The article explains why software performance testing is essential, outlines a comprehensive testing workflow, reviews popular load‑testing tools, offers guidance on selecting the right tool, and lists the technical and analytical skills needed to become an effective performance testing engineer.

Operations · load testing · performance testing

0 likes · 13 min read

dbaplus Community

May 29, 2023 · Operations

How Bilibili Built a High‑Availability Multi‑Active Architecture for SRE

This article details Bilibili's SRE team's design and implementation of a high‑availability multi‑active architecture, covering zone types, same‑city and cross‑region deployments, traffic routing, cache consistency, message handling, governance, and practical lessons learned from real‑world incidents.

Bilibili · Operations · SRE

0 likes · 20 min read

Data Thinking Notes

May 28, 2023 · Operations

Why Do State‑Owned Enterprises Struggle with Digital Transformation? Key Challenges and Solutions

This analysis examines why Chinese state‑owned enterprises face unclear digital‑transformation goals, weak strategic positioning, fragmented data, talent shortages, and inadequate technology ecosystems, and it outlines the root causes, typical case studies, and recommended actions to achieve effective digital change.

Data Governance · Operations · State‑owned enterprises

0 likes · 16 min read

Efficient Ops

May 28, 2023 · Operations

Essential Linux Ops Tools: Install & Use Nethogs, IOZone, IOTop and More

A concise guide for Linux administrators that introduces thirteen practical monitoring and security tools, ranging from network bandwidth trackers like Nethogs to scanners like Nmap, complete with installation steps, usage examples, and key configuration tips.

Network Tools · Operations

0 likes · 12 min read

MaGe Linux Operations

May 27, 2023 · Operations

Choosing the Right Log Collection Tool: Logstash vs Fluentd, Fluent Bit & Vector

This article compares four popular open‑source log collection tools—Logstash, Fluentd, Fluent Bit, and Vector—examining their key features, performance, resource usage, scalability, security, and ecosystem to help enterprises select the most suitable solution for their specific logging needs.

Fluent Bit · Fluentd · Logstash

0 likes · 6 min read

JD Retail Technology

May 23, 2023 · Operations

Analysis of Serverless Scaling Failure Due to Full GC and Sentinel Protection Rules

The article analyzes a serverless scaling failure where newly added instances suffered high CPU and frequent Full GC leading to JVM crashes, reproduces the issue under load, and demonstrates how Sentinel's CPU‑based circuit‑breaker rule mitigates the problem across cold and hot start scenarios.

FullGC · JVM · Operations

0 likes · 7 min read

360 Tech Engineering

May 23, 2023 · Operations

Data‑Driven Growth: Underlying Logic, Case Studies, and Essential Factors

The article explains how data‑driven thinking replaces traditional money‑burning growth tactics by establishing logical loops, experimental validation, and concrete case studies in acquisition, activation, and targeting, while outlining the essential collaborative factors needed for successful data‑powered operations.

Analytics · Data-Driven · Growth

0 likes · 10 min read

NetEase Smart Enterprise Tech+

May 23, 2023 · Information Security

How to Seamlessly Migrate and Validate Anti‑Cheat Services Across Environments

This article details the end‑to‑end process of migrating an anti‑cheat service to a new data center, verifying strategy effectiveness, building a real‑sample regression pipeline, and automating integration steps using GoAPI and traffic‑comparison platforms to ensure functional consistency and security.

Operations · Regression testing · anti-cheat

0 likes · 9 min read

Tencent Cloud Developer

May 22, 2023 · Artificial Intelligence

Application of AI Large Language Models in the Full Software Development Lifecycle

The article shows how AI large‑language models such as ChatGPT can support every stage of the software development lifecycle—from extracting requirements and designing solutions to generating code, tests, deployment scripts, and operational diagnostics—while warning about model inaccuracies, hallucinations, intellectual‑property and privacy risks.

AI · ChatGPT · Deployment

0 likes · 8 min read

Efficient Ops

May 21, 2023 · Operations

From Apollo to Google: How Margaret Hamilton Shaped Modern SRE

This article traces the origins of Site Reliability Engineering from Margaret Hamilton’s pioneering work on the Apollo program, through Google’s formal SRE team creation, and highlights the key differences between SRE and traditional operations practices.

Google · Margaret Hamilton · Operations

0 likes · 7 min read

Efficient Ops

May 17, 2023 · Operations

How JD Built a Scalable H5 Observability Platform to Boost Performance and Reduce Costs

This article details JD's end‑to‑end H5 observability solution, covering the challenges of hybrid app development, the design of a three‑stage UEM platform, deep active and passive monitoring, automated quality gates, and real‑world case studies that demonstrate cost savings and performance improvements.

Frontend · H5 · Hybrid App

0 likes · 15 min read

MaGe Linux Operations

May 17, 2023 · Operations

How to Prevent Split‑Brain in High‑Availability Clusters: Practical Strategies & Scripts

This guide outlines practical methods to avoid split‑brain incidents in production HA clusters, including dual‑cable heartbeats, arbitration mechanisms, monitoring alerts, disk‑locking techniques, and a detailed keepalived‑based script for automated detection and recovery.

High Availability · Keepalived · Operations

0 likes · 10 min read

Wukong Talks Architecture

May 17, 2023 · Operations

Common Production Faults and Their Handling Guide

This guide outlines the most common production failures—including network, server, database, software, security, storage, configuration, and third‑party service issues—and provides detailed steps for detecting, diagnosing, and resolving each type to maintain system stability and reliability.

Operations · Production · fault handling

0 likes · 30 min read

php Courses

May 17, 2023 · Operations

Scene Management in RunnerGo: Overview and Usage

This article explains RunnerGo's scene management module, covering the interface between left‑hand navigation and the main scene area, how to create and link interfaces and controllers into executable business scenarios, configure scene settings, debug scenes, and manage test case sets, with links to the project's repositories.

Operations · RunnerGo · Scene Management

0 likes · 6 min read

Laravel Tech Community

May 16, 2023 · Operations

Linux System Commands Cheat Sheet

This article presents a comprehensive reference of common Linux/Unix command-line utilities covering system information, date handling, shutdown/reboot, file and directory management, searching, mounting, disk usage, user/group administration, permissions, special attributes, compression, package management, backup, and networking, providing a handy guide for system administrators and developers.

Linux · Operations · Shell

0 likes · 37 min read

Efficient Ops

May 14, 2023 · Operations

How 17 Leading Chinese Enterprises Excel with DevOps Maturity: 20 Assessment Highlights

This article presents a comprehensive overview of 17 top Chinese enterprises that have passed the DevOps Capability Maturity Model assessment across 20 evaluation items, detailing their project implementations, performance improvements, and the broader significance of the model for digital transformation.

Case Studies · Maturity Model · Operations

0 likes · 12 min read

Efficient Ops

May 14, 2023 · Operations

How China’s Telecom Giants Accelerate IT Efficiency with DevOps Maturity Models

Amid a nationwide digital transformation, leading Chinese telecom operators have leveraged the CAICT‑backed DevOps Capability Maturity Model to evaluate and improve their IT performance, integrating team resources and talent to better support business systems, with detailed case studies and measurable outcomes across dozens of projects.

IT efficiency · Maturity Model · Operations

0 likes · 14 min read

Huolala Tech

May 11, 2023 · Operations

How Huolala Built a Robust Event‑Tracking Quality System to Boost Data Reliability

This article outlines Huolala's comprehensive approach to event‑tracking data quality, covering goal definition, industry research, a six‑step transformation plan, real‑time monitoring, tool development, and future outlook for their in‑house tracking platform.

Data Quality · Operations · Platform Engineering

0 likes · 9 min read

Efficient Ops

May 10, 2023 · Operations

How Chinese Banks Accelerate Digital Transformation with DevOps Maturity Models

Amid digital transformation, nine Chinese city‑commercial banks and financial institutions adopted the CAICT‑led DevOps Capability Maturity Model, achieving significant IT efficiency gains, integrating resources, and enhancing business support across continuous delivery, technical operation, security, and system tooling, with detailed project case studies and a comprehensive overview of the standard.

Maturity Model · Operations · banking

0 likes · 16 min read

Efficient Ops

May 10, 2023 · Operations

Mastering XOps: From DevOps to FinOps – A Comprehensive Guide

This article presents a systematic overview of the emerging XOps ecosystem—including DevOps, BizDevOps, AIOps, FinOps, and SRE—detailing their relationships, maturity models, standards, and practical guidance for enterprises seeking to achieve efficient, secure, and data‑driven digital transformation.

AIOps · BizDevOps · FinOps

0 likes · 13 min read

JD Retail Technology

May 10, 2023 · Product Management

Using ChatGPT 4.0 to Boost Product Manager Efficiency: Methods, Prompts, and Case Studies

The article outlines how ChatGPT 4.0 can significantly improve product managers' workflow across research, planning, design, project execution, and iteration by providing prompt engineering techniques, practical examples, and actionable recommendations while emphasizing security and information‑risk considerations.

AI Prompt Engineering · ChatGPT · Operations

0 likes · 31 min read

NetEase Smart Enterprise Tech+

May 10, 2023 · Operations

How to Streamline RTC Audio Issue Troubleshooting: Frameworks, Tools, and Automation

This article explores the challenges of real‑time communication audio problems, outlines their common manifestations and characteristics, and presents a comprehensive troubleshooting framework with standardized processes, automation tools, and perception models to improve efficiency and service quality.

Operations · RTC · Real‑time communication

0 likes · 12 min read

DevOps

May 8, 2023 · Operations

Key Strategies for Successful Digital Transformation and Overcoming Organizational Resistance

The article outlines why many digital transformation initiatives fail, emphasizes the importance of bottom‑up empowerment over top‑down mandates, and provides practical guidance on building small elite pilot teams, addressing dissent, and sustaining change to achieve long‑term organizational success.

Change Management · Leadership · Operations

0 likes · 7 min read

Liangxu Linux

May 7, 2023 · Cloud Native

Unlock Hidden kubectl Tricks: Advanced Commands for Kubernetes Mastery

This article presents a collection of advanced kubectl techniques—including API inspection, status‑based pod filtering and deletion, node‑specific pod listing, distribution counting, and proxy usage—to help experienced Kubernetes users solve ad‑hoc tasks more efficiently.

CLI · Commands · Kubernetes

0 likes · 7 min read

DevOps Operations Practice

May 5, 2023 · Operations

Monitoring HTTP Services with Prometheus and Blackbox Exporter

This guide explains how to use Prometheus together with the Blackbox Exporter to monitor HTTP applications, covering installation, configuration, Prometheus job setup, Grafana visualization, and alert rule creation for reliable HTTP service observability.

Blackbox Exporter · HTTP monitoring · Operations

0 likes · 7 min read

Programmer DD

May 5, 2023 · Operations

Boost Development Efficiency with GitLab CI/CD: A Hands‑On Guide

This article explains why efficiency matters in software delivery, introduces CI/CD concepts and tools like Jenkins and GitLab, details installing GitLab Runner, walks through pipeline configuration with key YAML keywords, and emphasizes that mastering DevOps principles and tools dramatically improves development productivity.

CI/CD · Continuous Integration · GitLab

0 likes · 10 min read

Java Architect Essentials

May 4, 2023 · Operations

Easy-Jenkins: A One‑Click Deployment Tool for Vue Front‑Ends and Java JAR Back‑Ends

The article introduces Easy‑Jenkins, a lightweight one‑click deployment tool that supports Vue and JAR projects, explains its pipeline architecture, shows step‑by‑step installation, configuration, branch management, and deployment operations, and provides practical screenshots and command examples for developers.

Deployment · Operations · Vue

0 likes · 7 min read

MaGe Linux Operations

May 1, 2023 · Cloud Native

Unlock Hidden kubectl Tricks: Boost Your Kubernetes Workflow

This article shares a collection of practical kubectl commands and tips—including API debugging, pod filtering and deletion, node‑wise pod statistics, and proxy usage—to help Kubernetes users work more efficiently and avoid writing custom client code.

Kubernetes · Operations · Tips

0 likes · 8 min read

MaGe Linux Operations

May 1, 2023 · Operations

How to Keep SSH Sessions Alive: Prevent Timeout on Client and Server

This tutorial explains why SSH connections close, shows how to configure both client and server settings to keep sessions alive, and discusses why permanently disabling timeouts may not be advisable, especially on cloud platforms.

Operations · SSH · Server

0 likes · 6 min read

IT Services Circle

May 1, 2023 · Operations

Understanding Internet Incident Levels and Prevention – The March 29 Tencent Outage

The article explains the classification of internet service incidents into four levels based on severity and impact, illustrates each level with the March 29 Tencent outage, and outlines practical prevention measures such as security defenses, backup plans, monitoring, training, and emergency response.

Incident Management · Operations · Tencent

0 likes · 5 min read

DataFunTalk

Apr 29, 2023 · Operations

WeChat NLP Algorithm Microservice Governance: Challenges and Solutions

This article examines the governance of WeChat NLP algorithm microservices, outlining the management, performance, and scheduling challenges they pose, and presents solutions including automated CI/CD pipelines, task‑aware auto‑scaling, DAG‑based service composition, custom Python interpreter PyInter, and an improved Joint‑Idle‑Queue load‑balancing algorithm.

AI · NLP · Operations

0 likes · 13 min read

Efficient Ops

Apr 26, 2023 · Operations

Building a Chaos Engineering Platform for Financial Services: Key Lessons

This talk outlines the challenges of maintaining system stability in fast‑moving, cloud‑native financial services, describes a risk‑identification model, high‑fidelity fault simulation, and a comprehensive stability engineering platform, and shares future plans for automated, data‑driven risk mitigation.

Operations · SRE · Stability Platform

0 likes · 15 min read

Efficient Ops

Apr 26, 2023 · Operations

How the New DevOps Efficiency Measurement Model Boosts Software Delivery Performance

The article introduces a comprehensive DevOps efficiency measurement model jointly created by CAICT and over 30 leading tech companies, detailing its maturity and tool components, evaluation process, benefits for enterprises, and the upcoming registration timeline for participation.

Maturity Model · Operations · devops

0 likes · 7 min read

Data Thinking Notes

Apr 25, 2023 · Operations

Why Data Quality Matters: A Practical Guide to Governance and Seven‑Dimensional Evaluation

This article explains why data quality is critical for businesses, outlines common data quality problems, their root causes, and presents a comprehensive governance framework—including monitoring rules, alerting, full‑link monitoring, and a seven‑dimensional evaluation model—to ensure high‑quality data delivery.

Big Data · Data Governance · Data Quality

0 likes · 12 min read

Efficient Ops

Apr 19, 2023 · Operations

How BizDevOps Drives Value Delivery in Cloud-Adapted Banking

The presentation outlines the evolution of lean management, the characteristics and expectations of the cloud era, and the practical implementation of BizDevOps at China Merchants Bank, detailing the 1‑3‑5 framework, goals, capabilities, key practices, and the bank's cloud adaptation strategy.

BizDevOps · Cloud · DigitalTransformation

0 likes · 16 min read

Ops Development Stories

Apr 19, 2023 · Operations

Mastering Alert Management with Nightingale: Rules, Silencing, Escalation, and Self‑Healing

Learn how to efficiently configure Nightingale’s alert rules, silence unwanted alerts, set up escalation policies, and implement self‑healing scripts using ibex, with step‑by‑step guidance, screenshots, and practical tips for robust monitoring in cloud‑native environments.

Nightingale · Operations · ibex

0 likes · 11 min read

MaGe Linux Operations

Apr 16, 2023 · Operations

How Netflix’s Telltale Transforms Application Monitoring and Alerting

The article details Netflix’s self‑built Telltale monitoring system, explaining how it consolidates data sources, reduces alert fatigue, provides intelligent alerts, and continuously optimizes application health assessment for over 100 production services, ultimately improving operational efficiency and reliability.

Alerting · Netflix · Operations

0 likes · 11 min read

Laravel Tech Community

Apr 13, 2023 · Operations

Guide to Using nssm for Windows Service Management and .NET 6 Web Application Deployment

This article introduces nssm, a Windows service wrapper, explains its features, installation steps, configuration options, common commands, and demonstrates turning a .NET 6 web application into a manageable Windows service with practical command‑line examples.

.NET · Deployment · Operations

0 likes · 4 min read

Ops Development Stories

Apr 13, 2023 · Operations

How to Deploy N9e: A Step‑by‑Step Guide to Unified Observability

This article walks through the challenges of observability for small‑to‑medium companies and provides a detailed, hands‑on guide to installing, configuring, and using the N9e monitoring platform—including architecture options, component setup, and adding data sources—so readers can achieve integrated alerting, metrics, logs, and tracing in a single pane.

N9e · Operations · monitoring

0 likes · 13 min read

Laravel Tech Community

Apr 12, 2023 · Operations

Introduction to Supervisor Process Management Tool and Installation Guide

This article introduces Supervisor, a Python‑based Linux process manager, explains the issues caused by not using a daemon, and provides step‑by‑step instructions for installing, configuring, and enabling Supervisor with example configuration and common control commands.

Installation · Linux · Operations

0 likes · 5 min read

Ops Development Stories

Apr 12, 2023 · Operations

Essential System Performance Metrics Every Ops Engineer Should Track

This article explains how to categorize and deeply understand key system performance metrics—including infrastructure, application, user experience, and business indicators—so engineers can monitor stability, efficiency, and business impact under high load and concurrency.

Operations · application performance · infrastructure

0 likes · 10 min read

dbaplus Community

Apr 10, 2023 · Operations

Can Ops Roles Disappear? Exploring Self‑Service Platforms, COE Experts, and SaaS in Modern Monitoring

The article examines whether traditional operations positions can become obsolete by analyzing a self‑service platform + COE + Business Partner model, detailing essential monitoring tools, the role of COE specialists, SaaS alternatives, and practical career pathways for newcomers, mid‑level, and senior engineers.

COE · Career · Operations

0 likes · 8 min read

Continuous Delivery 2.0

Apr 10, 2023 · Operations

Five Best Practices for Applying DevOps in Real Projects

This article outlines five practical DevOps best practices—test automation, deployment automation, trunk‑based development, security left‑shift, and loose‑coupled architecture—explaining their importance, implementation tips, and the benefits they bring to continuous delivery and high‑quality software production.

Automation · Operations · software engineering

0 likes · 7 min read

Python Programming Learning Circle

Apr 8, 2023 · Operations

Using Python for Operations Automation: Remote Execution, Log Parsing, Monitoring, Deployment, and Backup

The article explains how operations engineers can leverage Python scripts and popular libraries such as paramiko, regex, psutil, fabric, and shutil to automate common tasks like remote command execution, log analysis, system monitoring with alerts, batch software deployment, and file backup and recovery, providing code examples for each scenario.

Automation · Operations · Python

0 likes · 9 min read

Efficient Ops

Apr 8, 2023 · Operations

South Grid’s CloudYan Platform Wins Top DevOps Maturity Rating – Lessons Learned

At the 20th GOPS Global Operations Conference in Shenzhen, the China Academy of Information and Communications Technology (CAICT) announced that South Grid’s Digital Platform Technology (Guangdong) Co., Ltd. achieved excellent ratings for its CloudYan Platform DevOps subsystem, demonstrating how standardized DevOps pipelines and toolchains can dramatically improve software delivery quality, speed, and safety.

Maturity Assessment · Operations · continuous delivery

0 likes · 12 min read

Efficient Ops

Apr 8, 2023 · Information Security

How China Postal Savings Bank Reached Advanced DevSecOps Maturity – Lessons and Practices

The article details China Postal Savings Bank's successful DevSecOps assessment at the 2023 GOPS Global Operations Conference, sharing the bank's project background, interview insights on culture, processes, and tooling, and outlining the benefits and future plans of adopting standardized DevSecOps practices.

DevSecOps · Information Security · Maturity Model

0 likes · 17 min read

Efficient Ops

Apr 8, 2023 · Operations

How Guotai Junan Achieved Industry‑Leading DevOps Maturity at GOPS 2023

The article reports on Guotai Junan's successful completion of the CAICT DevOps technical‑operation 2+ assessment at the 20th GOPS Global Operations Conference, detailing the standards, project implementations, interview insights, industry statistics, and the broader DevOps maturity model.

CaseStudy · DigitalTransformation · Operations

0 likes · 16 min read

Efficient Ops

Apr 7, 2023 · Operations

Guotai Junan’s Journey to Leading DevOps 2+ Certification – A Case Study

At the 20th GOPS Global Operations Conference in Shenzhen, Guotai Junan’s data center team detailed how their “Central Operations” and “Junhong Junrong” trading projects passed the China Academy of Information and Communications Technology (CAICT) DevOps Technical Operations 2+ level assessment, showcasing the company’s leading digital transformation and smart operations practices.

Case Study · Operations · assessment

0 likes · 17 min read

Efficient Ops

Apr 7, 2023 · Operations

How South Grid’s Cloud Yan Platform Secured Top DevOps Maturity Scores

The article details South Grid’s successful DevOps maturity assessment at the 20th GOPS Global Operations Conference, highlighting the Cloud Yan platform’s excellent ratings in build‑and‑integration and pipeline modules, and shares insights from a Q&A on the impact of standardized DevOps practices.

Maturity Assessment · Operations · cloud platform

0 likes · 12 min read

Efficient Ops

Apr 7, 2023 · Operations

What Do China’s Latest DevOps & AIOps Maturity Assessments Reveal About Enterprise Success?

The China Academy of Information and Communications Technology (CAICT) announced the newest evaluation results for its DevOps and AIOps capability maturity models, showing that standardization and tool empowerment have helped over 75 leading enterprises across banking, securities, telecom, and internet sectors improve quality, efficiency, and market competitiveness.

AIOps · Enterprise · Maturity Model

0 likes · 8 min read

Huolala Tech

Apr 7, 2023 · Operations

How Huolala Built a Scalable Tech Stability System – Key Lessons for Reliability

This article details Huolala's journey in establishing a comprehensive technical stability framework, covering organizational challenges, risk governance, incident response, cultural initiatives, and future automation to enhance system reliability at scale.

Operations · Risk Governance · SRE

0 likes · 16 min read

Continuous Delivery 2.0

Apr 5, 2023 · Operations

Understanding DevOps: Building and Running Software as the Core of Continuous Delivery

The article argues that DevOps is a practical approach to achieving continuous delivery, emphasizing that the teams who build the software must also run it, monitor production, and take responsibility for reliability, especially during unexpected incidents such as off‑hour outages.

Operations · team collaboration

0 likes · 13 min read

Architecture Digest

Apr 4, 2023 · Operations

Understanding Logs, Their Value, and Practices for Observability and Operations

This article explains what logs are, when to record them, their importance in troubleshooting, performance optimization, security monitoring, and business decisions, and describes how centralized logging, metrics, tracing, and tools like ELK, Prometheus, and OpenTracing enable effective observability in modern distributed systems.

APM · Operations · Tracing

0 likes · 19 min read

Liangxu Linux

Apr 3, 2023 · Operations

How to Quickly Pinpoint High CPU Usage in Java Services: A Step‑by‑Step Guide

When a data platform server suddenly shows CPU usage above 90%, this guide walks you through using Linux tools and a custom script to identify the offending Java process, trace the problematic thread, pinpoint the exact code line, and apply a fix that reduces load dramatically.

CPU profiling · Java performance · Linux troubleshooting

0 likes · 6 min read

Architecture Digest

Apr 3, 2023 · Operations

Design and Implementation of a Multi‑Layer Load‑Balancing Platform (VGW)

This article explains the need for reliable load balancing in large‑scale services, analyzes the problems of request distribution and fault isolation, and details the design of a three‑layer and four‑layer load‑balancing architecture—including DNS, Nginx, LVS, FULLNAT, and VGW—along with health‑check, redundancy, and performance optimization techniques.

DPDK · FullNAT · Network Architecture

0 likes · 21 min read

Zhuanzhuan Tech

Mar 29, 2023 · Operations

Design and Implementation of a Warehouse Control System (WCS) for Automated Warehouse Operations

The article details the evolution from a basic inventory system to a full‑featured WMS, introduces a dedicated Warehouse Control System (WCS) architecture, explains the use of HTTP, SSE, WebSocket and TCP protocols for hardware integration, and demonstrates how various automated devices empower inbound, outbound and auxiliary warehouse processes, ultimately improving operational efficiency.

Device Integration · Operations · WCS

0 likes · 9 min read

Efficient Ops

Mar 28, 2023 · Operations

Why SRE Matters: Bridging Product Development and Reliability Engineering

This article explains the role of Site Reliability Engineering (SRE), its responsibilities, how it complements product development, the software lifecycle perspective, and practical approaches to ensure system stability through controllability, observability, and best‑practice implementation.

Observability · Operations · Reliability Engineering

0 likes · 14 min read

MaGe Linux Operations

Mar 28, 2023 · Operations

Essential Ops Lessons: Prevent Data Disasters and Boost Server Reliability

Drawing from three and a half years of sysadmin experience, this guide shares practical rules for safe online operations, data protection, security hardening, monitoring, performance tuning, and the right mindset to avoid costly outages and maintain stable, secure services.

Operations · backup · monitoring

0 likes · 12 min read

NetEase Yanxuan Technology Product Team

Mar 27, 2023 · Industry Insights

How to Build a Scalable E‑Commerce Supply System: Lessons from Industry Leaders

This article examines the challenges of rapid‑growth e‑commerce supply chains, compares global and domestic supply‑chain software, outlines core SCM concepts, and proposes a framework of design principles, value metrics, and ROI calculations for constructing a flexible, high‑performance supply system.

Industry Analysis · Operations · e-commerce

0 likes · 11 min read

Ops Development Stories

Mar 25, 2023 · Operations

What The Phoenix Project Reveals About Mastering Company Culture and Leadership

Drawing on lessons from the novel The Phoenix Project, this article explains how understanding a company’s vision, mission, structure, culture, and a manager’s personality can boost personal professionalism, improve teamwork, and drive organizational success across any industry.

Leadership · Operations · career development

0 likes · 7 min read

Java Architect Essentials

Mar 24, 2023 · Operations

List of Domestic Software Outsourcing Companies and How to Identify Them

This article compiles a list of Chinese software outsourcing firms, explains how to distinguish outsourcing companies, discusses the pros and cons of working for them, and outlines common outsourcing models and situations where outsourcing may be a suitable career choice.

Career Advice · IT employment · Operations

0 likes · 7 min read

MaGe Linux Operations

Mar 24, 2023 · Operations

How to Reduce False Alarms in Distributed Systems with Interval Detection

This article explains the challenges of monitoring highly distributed applications, why static alert thresholds often fail, and how interval detection using algorithms like Local Outlier Factor can improve alert accuracy while reducing noise across tools such as Grafana, Zabbix, and Open‑Falcon.

Alerting · Operations · interval detection

0 likes · 16 min read

MaGe Linux Operations

Mar 24, 2023 · Operations

Why Most Monitoring Strategies Fail and How the CAR Framework Fixes Them

This article explains why typical monitoring approaches miss the mark, outlines four root causes of persistent incidents, and introduces the CAR framework—Customer, Application, Resource—to build user‑centric observability that reduces noise, restores trust, and improves reliability.

CAR framework · Incident Management · Operations

0 likes · 11 min read

Efficient Ops

Mar 23, 2023 · Operations

How ICBC Transformed Banking with DevOps: A Deep Dive into Operations Excellence

This article examines Industrial and Commercial Bank of China's four‑year DevOps journey, detailing its top‑level design, toolchain integration, end‑to‑end pipelines, team benchmarking, data‑driven management, and coach development, and shows how these practices boosted delivery speed, reduced defects, and supported digital transformation in banking.

Operations · banking · continuous delivery

0 likes · 14 min read

Aikesheng Open Source Community

Mar 23, 2023 · Databases

Step-by-Step Guide to Deleting a Tenant in OceanBase

This article explains the background, environment setup, overall process, and detailed SQL commands required to safely lock, kill sessions, and permanently delete a MySQL tenant in OceanBase, including optional recycle‑bin handling.

Database Management · OceanBase · Operations

0 likes · 6 min read

Volcano Engine Developer Services

Mar 22, 2023 · Fundamentals

How ByteDance Scales Data Governance: Challenges, Distributed Solutions, and Best Practices

This article examines ByteDance's data governance journey, outlining business, organizational, and cultural challenges, the six-stage evolution framework, real‑world case studies, and the shift from centralized to distributed autonomous governance to improve quality, security, cost, and team efficiency.

Big Data · Data Governance · Data Quality

0 likes · 18 min read

Alibaba Terminal Technology

Mar 22, 2023 · Operations

How Taobao Scaled IPv6 to 95%+ Users: Lessons from a Mobile PaaS Giant

This article details Taobao's multi‑year journey of deprecating SPDY, boosting IPv6 traffic, building an end‑to‑end IPv6 networking stack, establishing a large‑scale operational system, and sharing technical insights such as APN6, BIERv6, and multi‑connection optimization for future IPv6+ applications.

IPv6 · Operations · Performance

0 likes · 23 min read

Tencent Architect

Mar 21, 2023 · Operations

Introducing oc-ops: A One‑Stop OS Operations Toolset for Linux Kernel Management

The article presents oc-ops, a unified command‑line toolset for OpenCloudOS that streamlines Linux kernel management by offering standardized syntax, sub‑commands for memory cost analysis, I/O latency monitoring, and IRQ latency detection, along with detailed usage parameters and best‑practice recommendations.

Linux · Operations · System Monitoring

0 likes · 12 min read

dbaplus Community

Mar 20, 2023 · Operations

How Xianyu’s Messaging Team Built a Zero‑Incident System with Gray Releases, Monitoring, and Automated Regression

The article details how Xianyu’s messaging team systematically improved system stability by classifying risks, implementing gray‑release traffic, establishing dedicated monitoring and alerting dashboards, integrating automated regression into CI/CD, and managing strong‑weak dependencies, ultimately reducing online incidents to near zero.

Operations · automated regression · dependency management

0 likes · 10 min read

Liangxu Linux

Mar 19, 2023 · Operations

Master Log Analysis: Fast Linux Commands to Pinpoint Errors

This guide shows programmers how to quickly locate errors in massive server logs using essential Linux commands such as tail, cat, grep, sed, and pagination tools, providing step‑by‑step examples and tips for efficient debugging.

Linux · Operations · Shell Commands

0 likes · 12 min read

Sohu Tech Products

Mar 16, 2023 · Operations

Spug: Lightweight Agentless Automation Platform with Docker Deployment Guide

Spug is a lightweight, agentless automation platform for small and medium enterprises that integrates host management, batch execution, online terminal, deployment, scheduling, configuration, monitoring and alerting, and the article provides step‑by‑step Docker and docker‑compose installation instructions to set up the system.

Automation · Docker · Operations

0 likes · 4 min read

Qunhe Technology Quality Tech

Mar 16, 2023 · Operations

Automating IS Regression Testing with SSIM Image Comparison and Async Rendering

This article describes how the Inspiration Spaces (IS) platform implements an automated regression testing pipeline that uses SSIM image similarity, asynchronous rendering, and pre‑defined sample rooms to dramatically reduce manual effort, improve detection of rendering bugs, and streamline cross‑team collaboration.

Automation · Operations · Regression testing

0 likes · 11 min read

Efficient Ops

Mar 15, 2023 · Operations

How Human‑Machine Collaboration Is Redefining Operations with AIOps

The article explores how AIOps, a human‑machine collaborative approach powered by data, algorithms, and contextual knowledge, transforms modern operations by enabling real‑time insight, predictive decision‑making, automated execution, and continuous feedback, especially in complex, security‑sensitive environments like finance.

@Data · AIOps · Operations

0 likes · 11 min read
