Tagged articles

Operations

3329 articles · Page 23 of 34

Apr 21, 2020 · Operations

How RPA Transformed Suning’s Workforce Efficiency: 30‑Fold Gains and 10,000+ Automated Processes

This article examines how Suning leveraged Robotic Process Automation to dramatically boost employee productivity, cut repetitive work, and achieve massive efficiency gains across finance, HR, and retail operations, illustrating the broader potential of RPA for modern enterprises.

Digital Workforce · HR automation · Operations

0 likes · 6 min read

FunTester

Apr 20, 2020 · Operations

Quick‑Start Guide to Arthas: Debugging Java Applications in Minutes

Learn how to install and launch Alibaba’s open‑source Arthas tool, explore its dashboard, run essential commands like thread and watch, and see a practical Java demo, all in a concise step‑by‑step tutorial that gets you debugging Java processes fast.

Arthas · Java · Operations

0 likes · 3 min read

FunTester

Apr 19, 2020 · Operations

How to Load Test Phone Number Binding with Dynamic UID‑Based Numbers

This article walks through the challenges of load‑testing a phone‑binding feature that swaps between two number prefixes while preserving the original UID‑derived number, detailing validation rules, a configurable solution, test design, and the full Groovy‑based load‑test script.

API automation · Groovy · Operations

0 likes · 7 min read

DevOps Engineer

Apr 18, 2020 · Operations

Comprehensive Guide to DevOps Terms, Definitions, Implementation, and CI/CD Practices

This article provides a comprehensive overview of DevOps concepts, including definitions of CI, CD, continuous deployment, version control, tooling, implementation steps, benefits, and distinctions from Agile and Lean IT, offering guidance for effective adoption and interview preparation.

CI/CD · Continuous Integration · Operations

0 likes · 15 min read

MaGe Linux Operations

Apr 17, 2020 · Operations

Mastering System Monitoring: Goals, Methods, Tools, and Best Practices

This comprehensive guide explains why monitoring is vital for operations, outlines monitoring objectives, methods, and core processes, and surveys open‑source and commercial tools (including Zabbix, Open‑Falcon, and MRTG), while also covering metrics, alert handling, and interview preparation for effective system monitoring.

Operations · Zabbix · system metrics

0 likes · 19 min read

Aikesheng Open Source Community

Apr 17, 2020 · Operations

Exploring Database Operations Automation at Tiancheng Financial – Interview with Senior DBA Ge Haoqiang

In this interview, senior DBA Ge Haoqiang from Tiancheng Financial discusses the scale of their database clusters, the efficiency gains from automation, DevOps culture, language choices, and how they adapt to new database features, offering insights into modern database operations automation.

DBA · Database Automation · MySQL

0 likes · 6 min read

Open Source Linux

Apr 16, 2020 · Operations

Essential Linux Server Hardening: 10 Steps to Optimize After Installation

This guide walks you through ten practical steps—including switching to local yum mirrors, installing key packages, disabling SELinux and the firewall, trimming startup services, tightening SSH settings, syncing time, raising file descriptor limits, and disabling ping—to boost the performance and security of a freshly installed Linux server.

Linux · Operations · security

0 likes · 7 min read

dbaplus Community

Apr 15, 2020 · Operations

How to Diagnose and Fix a Dual‑Leader ZooKeeper Cluster

This article walks through a real‑world ZooKeeper incident where a five‑node cluster showed two leaders, explains the election rules, analyzes log and configuration mismatches, assesses business impact, and provides a step‑by‑step recovery plan to restore normal service without data loss.

High Availability · Operations · Troubleshooting

0 likes · 10 min read

FunTester

Apr 14, 2020 · Operations

Spot Performance Problems Without Writing a Single Line of Code

Experienced developers can often identify performance bottlenecks simply by reviewing code implementations, configuration settings such as timeouts, intervals, database and Redis parameters, as well as service monitoring data, container and JVM configurations, allowing them to avoid unnecessary test scripts and code changes.

Operations · Optimization · Performance

0 likes · 2 min read

Liangxu Linux

Apr 13, 2020 · Operations

How to Prevent Accidental Deletion in Linux Shell Scripts

This article explains common Linux shell pitfalls—empty variables, spaces in paths, special characters, and failed cd commands—that can cause accidental file deletion, and provides concrete code examples and best‑practice solutions to avoid such disasters.

Operations · Safety

0 likes · 5 min read

Continuous Delivery 2.0

Apr 13, 2020 · Operations

Facebook Configuration Management: Practices, Statistics, and Cultural Insights

This article summarizes Facebook's holistic configuration management practices, presenting cultural influences, storage growth, size distribution, update frequency, change magnitude, and author collaboration statistics, while linking to a series of translated articles that explore tools such as Configerator, GateKeeper, and MobileConfig.

Operations · configuration management · large-scale systems

0 likes · 10 min read

Efficient Ops

Apr 12, 2020 · Operations

Master Incident Management: Definitions, Processes, and Best Practices

This guide explains fault management fundamentals—from ITIL‑based definitions and why it matters, to fault level classification, monitoring, emergency response, recovery, post‑mortem analysis, continuous improvement, and practical advice for practitioners—providing a comprehensive, actionable framework for reliable operations.

ITIL · Incident Management · Operations

0 likes · 11 min read

DevOps

Apr 10, 2020 · Operations

Spotify’s Scaled Agile Framework: Organizational Structure and Practices

The article examines Spotify’s scaled agile model, detailing its organizational units—Squads, Tribes, Chapters, and Guilds—along with their characteristics, governance, dependency management, and comparison to other large‑scale agile frameworks such as SAFe, LeSS, and Scrum@Scale.

Operations · Scaled Agile · Spotify

0 likes · 18 min read

DevOps Cloud Academy

Apr 9, 2020 · Operations

Why DevOps Is Essential for Modern IT Operations

The article explains how traditional IT silos hinder rapid incident response, outlines common symptoms of poorly managed applications, and argues that adopting DevOps—supported by cloud‑native infrastructure, automation, and shared responsibility—delivers higher transparency, employee autonomy, operational quality, and customer satisfaction.

Automation · IT Culture · Operations

0 likes · 7 min read

High Availability Architecture

Apr 8, 2020 · Operations

Slack's Deployment Process: Balancing Speed and Reliability

This article explains how Slack’s engineering team designs a multi‑stage deployment pipeline—including release branches, staging, dogfood, canary, and percentage rollouts—while emphasizing rapid iteration, visibility, and reliability through fast and atomic deployment mechanisms.

Continuous Integration · Deployment · Operations

0 likes · 8 min read

Architects Research Society

Apr 5, 2020 · Cloud Native

Lessons from Google, eBay, and Amazon on Large‑Scale Multi‑Language Microservice Architecture

The article examines how Google, eBay, Twitter and Amazon evolved their massive systems into multi‑language microservice ecosystems, highlighting the organic growth of services, incentive‑driven design, standards emergence, service ownership, operational practices, and anti‑patterns for building and scaling cloud‑native architectures.

Operations · Service Architecture · Standardization

0 likes · 20 min read

NetEase Game Operations Platform

Apr 4, 2020 · Operations

Understanding Nginx Failure Retry Mechanism and Common Pitfalls

This article explains Nginx's built‑in failure retry mechanism, detailing how failures are defined via proxy_next_upstream, the default and custom error types, retry limits, backup servers, and common pitfalls, with configuration examples and practical scenarios.

Nginx · Operations · backend

0 likes · 14 min read

360 Tech Engineering

Apr 1, 2020 · Operations

Using Supervisor to Manage and Auto‑Restart Python Scripts on Linux

This tutorial explains how to install Supervisor on CentOS and Ubuntu, configure its main and program-specific files, and use it to monitor a continuously running Python script, automatically restarting the script if it crashes or is killed.

Automation · Linux · Operations

0 likes · 5 min read

DevOps

Apr 1, 2020 · Operations

From Waterfall to Agile: Lessons from the Rapid Construction of Huoshenshan and Leishenshan Hospitals

This article analyzes the rapid construction of Huoshenshan and Leishenshan hospitals, using it as a case study to compare waterfall, spiral, iterative, and Scrum development models, discuss the shift toward agile practices, and outline quality‑built operations principles for complex, time‑critical projects.

Case Study · Operations · devops

0 likes · 8 min read

Java Captain

Apr 1, 2020 · Operations

Comprehensive Guide to Online Environment Deployment and Operations Practices

This article provides a thorough overview of planning, provisioning, and managing online production environments—including user sizing, bandwidth estimation, database design, OS versus container deployment, middleware selection, security, monitoring, SSH shortcuts, file transfer tools, automation scripts, Docker setup, and log viewing techniques—aimed at giving developers a complete operational perspective.

Automation · Deployment · Docker

0 likes · 16 min read

360 Quality & Efficiency

Mar 31, 2020 · Operations

Using Supervisor for Process Management on Linux: Installation, Configuration, and Practical Example

This article explains why nohup cannot monitor scripts, introduces Supervisor as a Python‑based process monitor, shows how to install it on CentOS, Ubuntu, and via pip, details the supervisord.conf and program .ini configurations, demonstrates a sample Python script, and outlines common commands for managing and restarting services.

Automation · Linux · Operations

0 likes · 6 min read

JD Retail Technology

Mar 31, 2020 · Operations

How Sigma’s Event Management Evolved from Zero to Maturity: Standards, Processes, and Platform Insights

The article outlines the Sigma Quality Management Platform’s event management journey across three development stages—establishing basic standards, expanding processes and channels, and achieving mature, integrated governance—while highlighting current challenges, continuous standard refinement, efficiency gains, and practical implementation details.

Event Management · Operations · Platform Development

0 likes · 11 min read

DevOps

Mar 30, 2020 · Operations

Efficient Value Stream in the Construction of Huoshenshan and Leishenshan Hospitals: A DevOps Case Study

This article presents a detailed DevOps case study of the Huoshenshan and Leishenshan hospital construction, outlining a ten‑day timeline of parallel and serial value‑streams, highlighting extreme efficiency, short lead times, and a high proportion of value‑added activities across infrastructure, power, communications, medical systems, and IT equipment.

Case Study · Hospital Construction · Operations

0 likes · 6 min read

DevOps Engineer

Mar 29, 2020 · Operations

Top 14 CI/CD Tools and Their Key Features

This article presents a comprehensive overview of the 14 most popular CI/CD tools, describing their main functionalities, licensing models, and official websites to help teams choose the most suitable solution for fast and reliable software delivery.

Automation · CI/CD · Continuous Integration

0 likes · 20 min read

Programmer DD

Mar 27, 2020 · Operations

How to Choose Reliable Software Outsourcing Platforms: Global and Domestic Options

This guide reviews the most trustworthy international and Chinese software outsourcing platforms, outlines the essential skills for foreign contracts, highlights key features of each service, and offers practical advice on managing client expectations, expanding your freelance channels, and building a personal brand for long‑term success.

Operations · domestic platforms · freelance platforms

0 likes · 12 min read

Programmer DD

Mar 27, 2020 · Operations

How to Collect Nginx Access and Error Logs with Filebeat, Logstash, and Rsyslog

This guide explains three practical methods for gathering Nginx access and error logs—directly with Filebeat to Elasticsearch, via Filebeat to Logstash then Elasticsearch, and using rsyslog to forward logs to Logstash—complete with configuration snippets and visualization steps in Kibana.

Logstash · Operations · filebeat

0 likes · 9 min read

Continuous Delivery 2.0

Mar 26, 2020 · Operations

Facebook Configuration Management: Challenges, Design, and Large‑Scale Distribution

The article examines Facebook’s massive, real‑time configuration management system, describing its rapid change frequency, the engineering challenges of configuration sprawl, authoring, validation, dependency handling, and the scalable, reliable distribution mechanisms that keep billions of devices and servers consistently updated.

Deployment · Operations · configuration management

0 likes · 10 min read

Liangxu Linux

Mar 25, 2020 · Operations

Essential Linux Ops Tricks: xargs, nohup, ps, multitail, netstat & SSH Port Forwarding

This guide walks you through practical Linux command-line techniques for system administrators, covering concise xargs usage, background execution with nohup, process sorting with ps, multi‑log viewing via multitail, network diagnostics using ping, netstat, top‑IP analysis, and secure SSH port forwarding.

Operations · nohup · xargs

0 likes · 11 min read

Dual-Track Product Journal

Mar 25, 2020 · Product Management

Mastering E‑Commerce Category Design: From Backend Foundations to Frontend Mapping

This article explains how well‑designed product categories serve as the backbone of an e‑commerce platform, covering the concepts of backend and frontend categories, the construction of category trees, and various mapping strategies that help both users find items quickly and operators manage large inventories efficiently.

Frontend · Mapping · Operations

0 likes · 8 min read

DevOps

Mar 25, 2020 · Operations

DevOps Case Study: ‘Small Team, Big Backend’ Organizational Structure in the Rapid Construction of Huoshenshan and Leishenshan Hospitals

This article reviews the rapid ten‑day construction of Huoshenshan and Leishenshan hospitals from a DevOps perspective, highlighting how a ‘small team, big backend’ organizational model—mirroring agile and networked structures—enabled efficient coordination across multiple industries and swift project delivery.

Case Study · Operations · construction

0 likes · 6 min read

Liangxu Linux

Mar 24, 2020 · Operations

What Do Linux Professionals Actually Do? Exploring Ops and Development Careers

This article breaks down the diverse career paths within Linux, detailing the core responsibilities of operations roles—ensuring stable services and data security—and the various development tracks, from application and embedded programming to low‑level kernel and driver engineering.

Career · Operations · software engineering

0 likes · 10 min read

21CTO

Mar 24, 2020 · Operations

Mastering System Resilience: Rate Limiting, Circuit Breaking, and Degradation

To keep systems highly available under sudden traffic spikes, developers rely on three core strategies: rate limiting, circuit breaking, and service degradation. These respectively control request flow, isolate failures, and gracefully reduce functionality to maintain stability; the article explains each with practical examples and algorithmic approaches.

Operations · circuit breaking · rate limiting

0 likes · 5 min read

Efficient Ops

Mar 22, 2020 · Operations

Why Nightingale Is Shaping the Future of Enterprise Monitoring

Nightingale, an open‑source enterprise monitoring platform from Didi, combines cloud‑native design, high availability, flexible plugins, and a powerful object‑tree navigation to meet the monitoring needs of both small clusters and massive deployments, while extending and improving upon Open‑Falcon.

Alerting · Nightingale · Operations

0 likes · 10 min read

Architects' Tech Alliance

Mar 22, 2020 · Operations

How to Migrate Legacy Mainframe Workloads to x86: A Step‑by‑Step Guide

This article outlines a comprehensive methodology for migrating small‑mainframe platforms—including hardware assessment, solution design, implementation steps, risk evaluation, and three common data‑migration techniques—so that businesses can safely transition workloads to modern x86 servers while preserving data integrity and service continuity.

Data Migration · LVM · Operations

0 likes · 12 min read

Efficient Ops

Mar 20, 2020 · Operations

How Zhejiang Mobile Revamped IT Operations with AIOpsDev and SRE

Zhejiang Mobile’s IT Operations team announced a strategic shift from reactive ticket‑driven maintenance to a proactive, AI‑powered AIOpsDev model, establishing new departments, adopting SRE practices, and leveraging cloud‑native technologies to dramatically improve efficiency, reliability, and digital transformation.

AIOps · ITIL · Operations

0 likes · 7 min read

MaGe Linux Operations

Mar 19, 2020 · Operations

Mastering Game Operations: From RAID Configurations to Load Balancer Choices

This article explains the fundamentals of operations and game operations, outlines server management strategies for hundreds of machines, compares RAID levels, evaluates load balancers (LVS, Nginx, HAProxy), discusses proxy servers (Squid, Varnish, Nginx), and clarifies middleware, JDK, Tomcat ports, and CDN concepts.

Middleware · Operations · RAID

0 likes · 8 min read

ITPUB

Mar 19, 2020 · Operations

How to Install and Configure Mailx for Automated Log Monitoring on Linux

This guide walks you through installing Mailx on a Linux server, fixing compilation issues, configuring SMTP settings, testing email delivery, and building a keyword‑based log‑monitoring script that triggers alerts via email.

Linux · Log Monitoring · Operations

0 likes · 4 min read

Open Source Linux

Mar 19, 2020 · Operations

Why Is My Server CPU at 99%? Pinpoint Java Thread Bottlenecks Fast

After an alert showed a data platform server’s CPU usage soaring to 98.94%, this article walks through a systematic investigation—from spotting the high‑load process with top, tracing the offending Java thread using pwdx and jstack, to optimizing the time‑conversion utility that caused the overload.

CPU · Java · Operations

0 likes · 7 min read

Open Source Linux

Mar 17, 2020 · Operations

Ultimate Linux Command Cheat Sheet for System Administration

This guide presents a comprehensive cheat sheet of essential Linux commands, covering online queries, file and directory management, content viewing, compression, information display, file searching, user and permission handling, network operations, disk and filesystem tasks, system monitoring, shutdown/reboot procedures, and process management.

Linux · Operations · command-line

0 likes · 3 min read

DevOps

Mar 16, 2020 · Operations

JD.com DevOps Case Study: Agile Transformation, Continuous Delivery, and Organizational Practices

This case study examines JD.com’s evolution into a technology‑driven enterprise, detailing its corporate culture, the “ABCDE” technology strategy, the implementation of DevOps and agile practices through the CALMS framework, and how unified continuous‑delivery platforms and operational metrics have driven growth, efficiency, and pandemic response.

Big Data · JD.com · Operations

0 likes · 16 min read

FunTester

Mar 14, 2020 · Operations

Why Load Testing Is Essential for Every CI Pipeline

Load testing, which simulates thousands of real users, is crucial for uncovering performance bottlenecks that functional tests miss, and integrating automated load tests into every CI cycle helps prevent crashes, protect revenue, and ensure reliable software delivery.

CI/CD · Jenkins · Operations

0 likes · 5 min read

Efficient Ops

Mar 11, 2020 · Operations

How to Elevate Your Monitoring System: Proven Practices from Top DevOps Models

This article explains why modern services depend on highly available, scalable monitoring, outlines a systematic way to assess and improve monitoring capabilities using open‑source tools and the DevOps Capability Maturity Model, and details concrete improvement points across data collection, management, and application.

Observability · Operations · devops

0 likes · 9 min read

Efficient Ops

Mar 10, 2020 · Operations

How to Build Anti‑Fragile Operations in the Cloud Era

This article explains the anti‑fragility concept, illustrates how cloud‑based systems become increasingly vulnerable to unexpected events, and offers practical strategies—including risk reduction, choice diversification, proactive experimentation, and biologically inspired resilience—to transform operations and turn shocks into opportunities.

Anti-Fragility · Cloud Computing · Operations

0 likes · 19 min read

Youku Technology

Mar 10, 2020 · Operations

Big Drama Quality Assurance Process at Alibaba Entertainment

Alibaba Entertainment’s Big Drama Assurance framework applies automated and manual quality checks across production, operations, rights, playback, online monitoring, and emergency response, using a unified platform that detects and resolves issues before and after launch to protect revenue, uphold paid‑member rights, and ensure a seamless viewing experience.

Alibaba · Operations · content assurance

0 likes · 7 min read

Efficient Ops

Mar 8, 2020 · Operations

How We Scaled a Live‑Streaming Platform from 10K to 1M Concurrent Users in 3 Days

This article recounts how a pandemic‑era live‑streaming service scaled from ten thousand to one million concurrent viewers within three days, covering the pre‑deployment assessment, container‑based scaling, monitoring, emergency response plans, and post‑launch optimizations.

Live Streaming · Operations · cloud-native

0 likes · 11 min read

Big Data Technology Architecture

Mar 7, 2020 · Operations

How to Perform a Graceful Shutdown of an Elasticsearch Node

This article outlines a step‑by‑step procedure for safely taking an Elasticsearch node offline—checking master‑eligible settings, adjusting minimum_master_nodes, excluding the node from routing, waiting for shard relocation, stopping the service, and restoring the cluster routing—ensuring no data loss or service interruption.

Elasticsearch · Graceful Shutdown · Operations

0 likes · 6 min read

Continuous Delivery 2.0

Mar 6, 2020 · Operations

Google Incident Postmortem Checklist

The article presents a detailed Google‑derived post‑mortem checklist covering event data collection, root‑cause analysis, lessons learned, actionable improvement items, and review procedures to ensure systematic, blameless incident handling.

Incident Management · Operations · Root Cause Analysis

0 likes · 5 min read

Liangxu Linux

Mar 5, 2020 · Operations

Essential Linux Commands Every Engineer Should Master

This guide compiles the most essential Linux commands for directory handling, file manipulation, text processing, compression, system monitoring, networking, and routine administration, providing concise examples and practical tips to help beginners and seasoned users alike navigate and manage Unix-like environments efficiently.

Operations · Unix · command-line

0 likes · 13 min read

MaGe Linux Operations

Mar 5, 2020 · Operations

How to Build a Service Tree in CMDB for Efficient Resource Management

This guide explains the concept, design, and step‑by‑step implementation of a service tree in a CMDB, showing how to map departments, products, and services to resources for better visibility, cost tracking, and operational control.

CMDB · IT infrastructure · Operations

0 likes · 5 min read

Java Backend Technology

Mar 5, 2020 · Operations

How a Massive Delete-Database Crisis at Weimeng Reveals Key Ops Lessons

On Feb 23, Weimeng suffered a large‑scale system outage caused by a core operations staff member mistakenly deleting production databases, prompting a multi‑day recovery effort with Tencent Cloud support; the article examines the incident’s background, historical parallels, crisis response, and broader operational insights for DevOps and reliability engineering.

Database Recovery · Operations · crisis management

0 likes · 16 min read

Youku Technology

Mar 4, 2020 · Operations

Youku Playback Testing Platform: Unified Automation Framework, Services, and System Design

Youku’s unified playback testing platform consolidates a modular automation framework, a comprehensive service chain, and a layered platform ecosystem to standardize workflows, support multiple device types, and provide transparent, real‑time monitoring, thereby reducing development complexity and paving the way for intelligent case recommendation and dynamic verification.

Operations · automation framework · playback testing

0 likes · 11 min read

Top Architect

Mar 3, 2020 · Databases

MySQL Performance Tuning Tools: mysqltuner.pl, tuning-primer.sh, pt-variable-advisor, and pt-query-digest

This article introduces several MySQL performance‑tuning utilities, including mysqltuner.pl, tuning‑primer.sh, pt‑variable‑advisor, and pt‑query‑digest, and explains how to download, install, and run them and how to interpret their reports to identify configuration issues and optimize database performance.

Database Tools · MySQL · Operations

0 likes · 9 min read

Liangxu Linux

Mar 2, 2020 · Operations

Master Linux CLI with kmdr: Interactive Command Explanation Tool

This article introduces the free, open‑source kmdr CLI tool, explains how to install it with Node.js or use the web demo, and demonstrates its ability to break down complex Linux commands into readable modules, covering a wide range of common utilities.

CLI · Command-line tools · Node.js

0 likes · 8 min read

Liangxu Linux

Mar 2, 2020 · Operations

Master Linux Terminal: Fix Common Command Errors and Essential Shortcuts

This guide explains typical Linux terminal pitfalls such as incomplete commands, filename typos, and wrong directories, and provides practical shortcuts like tab completion, history navigation, and quick command substitution to boost productivity for developers and system operators.

Operations · Shortcuts · Terminal

0 likes · 6 min read

dbaplus Community

Mar 2, 2020 · Operations

How Jiangsu Mobile Built a Billion‑Call Real‑Time Monitoring Platform with Prometheus

Facing the explosion of 5G traffic and billions of daily call records, Jiangsu Mobile’s IT operations team adopted Prometheus as the core time‑series database, designing a high‑availability, low‑latency monitoring platform that captures, stores, visualizes and predicts performance metrics across their massive billing system.

5G · Operations · Prometheus

0 likes · 9 min read

Efficient Ops

Mar 2, 2020 · Databases

Why Did the Weimob Data Deletion Take So Long? A Deep Dive into Database Recovery Challenges

The article analyzes the recent Weimob data‑deletion incident, explaining why recovery is complex, comparing on‑premise, hybrid, and full‑cloud database architectures, and outlining the technical steps and obstacles involved in restoring massive lost data.

Cloud Computing · Database Recovery · Disaster Recovery

0 likes · 11 min read

DevOps Cloud Academy

Feb 28, 2020 · Operations

How the Mall Team Leveraged a DevOps Toolchain for Remote Development During the Pandemic

During the COVID‑19 pandemic, the mall development team adopted a comprehensive DevOps toolchain—including JIRA, GitLab, Jenkins, Sonar, Docker, and Wiki—to enable end‑to‑end remote development, automated pipelines, and continuous delivery, resulting in improved efficiency, reliable releases, and seamless collaboration.

CI/CD · Docker · Jenkins

0 likes · 8 min read

DevOps Cloud Academy

Feb 27, 2020 · Operations

Jenkins Infrastructure, Project Management, and Configuration‑as‑Code Overview

This article introduces Jenkins infrastructure setup, including installation via Ansible, Puppet, Chef or Docker, outlines management tools such as CLI, REST API, python‑jenkins and Jenkins‑client, describes project creation plugins like Job DSL, Job Builder and Jenkinsfile, and explains system configuration using Groovy scripts and the Configuration‑as‑Code plugin.

CI/CD · Jenkins · Operations

0 likes · 3 min read

Efficient Ops

Feb 26, 2020 · Operations

What the Weimob Delete‑Database Outage Teaches About Modern Ops

After a core operations staff member accidentally deleted Weimob’s production database in February, the platform endured a multi‑day outage, prompting a transparent crisis response, extensive Tencent Cloud support, and a deep analysis of recovery challenges, operational best practices, and the broader lessons for modern DevOps teams.

Database Recovery · Operations · crisis management

0 likes · 15 min read

ITPUB

Feb 26, 2020 · Information Security

What We Learned from the Weimob Data Deletion Disaster: Backup and Permission Strategies

The article analyzes the recent Weimob database deletion incident, explains why recovery took 36 hours, and provides practical guidance on backup practices, minimal‑privilege management, and cloud‑based disaster recovery to prevent similar data loss in small and large organizations.

Database Security · Information Security · Operations

0 likes · 9 min read

21CTO

Feb 25, 2020 · Operations

Inside the Massive SaaS Data Deletion: How a Core Engineer Wiped Out Millions

A Chinese SaaS provider suffered a catastrophic data loss when a core operations employee maliciously deleted its production databases, prompting emergency repairs, police involvement, and a multi‑day recovery effort that exposed critical gaps in permission management and backup strategies.

Operations · SaaS · backup

0 likes · 8 min read

Efficient Ops

Feb 24, 2020 · Operations

How to Build an Effective Operations Monitoring Platform: Tools, Design, and Best Practices

This article explains why monitoring is essential for operations, reviews popular monitoring tools such as Cacti, Nagios, Zabbix, Ganglia, Centreon, Prometheus and Grafana, outlines a six‑layer unified monitoring platform architecture, offers selection guidance for different enterprise sizes, and shares evolution lessons from small to large scale deployments.

Grafana · Operations · Prometheus

0 likes · 20 min read

Big Data Technology Architecture

Feb 24, 2020 · Operations

Evolution and Optimization of JD.com Order Center Elasticsearch Cluster Architecture

This article details how JD.com’s order center migrated its Elasticsearch cluster through multiple architectural stages—initial deployment, isolation, replica tuning, master‑slave adjustments, and real‑time dual‑cluster backup—while addressing data synchronization, scaling, and performance pitfalls to achieve high availability and query stability.

Cluster Architecture · Data synchronization · Elasticsearch

0 likes · 13 min read

Qunar Tech Salon

Feb 24, 2020 · Cloud Computing

QQ’s Full Migration to Tencent Cloud: Architecture, Challenges, and Lessons Learned

The article details how QQ migrated all of its services to Tencent Cloud, describing the business scenarios, migration timeline, technical approaches, challenges such as cost, security, and performance, and the operational and architectural lessons gained from the full cloud transition.

Cloud Migration · Operations · QQ

0 likes · 12 min read

Efficient Ops

Feb 20, 2020 · Operations

How to Monitor Nginx Logs with ELK: From Logstash Config to Kibana Dashboard

This guide walks through setting up ELK to collect, parse, and visualize Nginx access logs, covering Logstash configuration, Grok patterns, GeoIP enrichment, Elasticsearch indexing, and Nginx proxy with basic authentication, enabling real-time log analysis and dashboard creation.

ELK · Elasticsearch · Kibana

0 likes · 13 min read

Fighter's World

Feb 20, 2020 · Cloud Computing

How I Grounded My First Year at Alibaba: Building OceanBase Cloud and Real‑Time Monitoring

The author recounts his first year at Alibaba, detailing the transition from a solo newcomer to leading the OceanBase cloud platform, establishing large‑scale real‑time monitoring with JStorm, evolving team processes, and adapting to Alibaba’s fast‑paced, iterative culture.

Alibaba Cloud · JStorm · OceanBase

0 likes · 9 min read

Qunar Tech Salon

Feb 20, 2020 · Operations

Design and Implementation of Business‑Driven Monitoring Systems at JD Cloud

This article explains why monitoring is essential for operations, outlines the four‑layer monitoring standard (infrastructure, liveliness, performance, business), breaks down functional modules and data flows, and showcases JD Cloud's practical design, alarm‑convergence project, and future AI‑driven observability directions.

Cloud · JD Cloud · Observability

0 likes · 12 min read

DevOps Cloud Academy

Feb 19, 2020 · Operations

Configuring Crowd Authentication for Jenkins (Crowd 3.7.1 / Jenkins 2.220)

This guide walks through setting up Crowd 3.7.1 with Jenkins 2.220 by creating user directories, groups, and applications in Crowd, then configuring Jenkins to use the Crowd2 plugin for global security and login, including screenshots for each step.

CI/CD · Crowd · Jenkins

0 likes · 3 min read

DevOps

Feb 19, 2020 · Operations

Single‑Point Breakthrough in Enterprise DevOps Transformation: JD.com Case Study

The article explains how focusing on a single critical point—such as the deployment stage—can dramatically accelerate an organization’s end‑to‑end DevOps transformation, illustrated with JD.com’s journey from manual releases to an automated, high‑efficiency continuous delivery platform.

JD.com · Operations · Single-Point Breakthrough

0 likes · 11 min read

Didi Tech

Feb 18, 2020 · Operations

Didi's National Carpool Day: Technical Insights into Stability Assurance

Didi's National Carpool Day on Dec 3, 2019 attracted 3.1 million passengers. This article describes the six pillars that kept the platform stable: an organized task force, capacity forecasting with rapid container scaling, comprehensive monitoring with a fire‑fighting map, a robust contingency platform, strict process standards, and coordinated third‑party preparation.

Carpool Day · Didi · Operations

0 likes · 13 min read

Youzan Coder

Feb 12, 2020 · Operations

How Bond Evolved into a Robust Distributed‑Lock Middleware for Scalable Services

This article chronicles the design, evolution, and performance evaluation of Bond, a distributed‑lock SDK used internally at Youzan, covering its multi‑phase roadmap, storage choices, timeout‑retry strategies, lock‑TTL recommendations, and practical pitfalls for reliable operations.

Aerospike · Distributed Lock · Etcd

0 likes · 20 min read

HomeTech

Feb 12, 2020 · Operations

Design and Architecture of an IBPM Workflow Platform

This article outlines the design, architecture, and key features of an IBPM workflow platform, detailing its background, core concepts, design principles, extensibility, and future direction for creating a configurable, integrated, and intelligent business process management solution.

BPM · Operations · Platform design

0 likes · 4 min read

MaGe Linux Operations

Feb 8, 2020 · Operations

Why OpenTSDB Is the Ultimate Time‑Series Monitoring Solution for Scalable Operations

This article introduces OpenTSDB, a highly scalable time‑series monitoring system built on HBase, explains its architecture, demonstrates how it solves common monitoring challenges, and shows practical usage examples including data modeling, collector integration, and real‑world deployment insights.

HBase · OpenTSDB · Operations

0 likes · 9 min read

Mafengwo Technology

Feb 8, 2020 · Operations

How a Travel Platform Engineered a Pandemic‑Era Emergency Response: Operations Lessons

During the 2020 Chinese New Year lockdown, a travel platform mobilized its development, product, and operations teams to rapidly build refund systems, coordinate with suppliers, and ensure continuous online services, showcasing a user‑first, cross‑functional emergency strategy that balanced technical delivery with intense customer pressure.

Operations · Product Management · incident response

0 likes · 13 min read

Efficient Ops

Feb 5, 2020 · Operations

Balancing Stability and Speed: Google SRE Lessons for Modern Ops Teams

This article examines the inherent tension between operations and development, explains Google’s error‑budget and SLO approach, and shares practical DevOps, on‑call, automation, and talent strategies that help ops teams improve efficiency while maintaining product reliability.

Automation · Error Budget · On-Call

0 likes · 9 min read

Alibaba Cloud Developer

Feb 2, 2020 · Backend Development

How Chinese Developers Built a Rapid COVID-19 Travel Query Tool in One Day

In early 2020, a small team of Chinese developers swiftly created a COVID-19 travel companion query tool—designing, coding, and deploying a searchable web service within a single day, then scaling it to millions of users using CDN, static site generation, and cloud storage, while emphasizing data accuracy and rapid response.

COVID-19 · Operations · Rapid Prototyping

0 likes · 11 min read

Architects' Tech Alliance

Jan 17, 2020 · Fundamentals

Overview of Server Benchmark Standards: TPC and SPEC

The article explains the origins, metrics, and test suites of TPC and SPEC benchmarks, describes their various models for CPU, web, HPC and storage performance, shows how to query official results, and notes a promotional bundle of technical e‑books.

Benchmark · CPU · Operations

0 likes · 9 min read

Tencent Tech

Jan 17, 2020 · Cloud Computing

How QQ Tackled Massive Cloud Migration Challenges – Tencent’s Strategy Revealed

Tencent’s QQ service migrated over a million servers to public cloud, detailing comprehensive planning, phased execution, and solutions to security, dependency, disaster recovery, and gray‑scale challenges, while highlighting infrastructure upgrades, database migration, cloud‑native tools, and operational transformations that ensured zero user impact.

Cloud Migration · Operations · QQ

0 likes · 20 min read

Efficient Ops

Jan 16, 2020 · Operations

Mastering WAS Memory Overflow: Elegant Strategies for Resolution

This article explains IBM WebSphere Application Server's memory architecture, common causes of Java OutOfMemoryError in WAS, and provides a step‑by‑step guide—including log collection, heap analysis, and preventive measures—to diagnose, resolve, and avoid memory overflow incidents in production environments.

Garbage Collection · Operations · WAS

0 likes · 16 min read

DevOps

Jan 15, 2020 · Operations

Building Trust, Respect, and Accountability: The Role of Culture in DevOps Transformation

The article explains how a strong, transparent enterprise culture—characterized by trust, respect, and accountability—is the foundational prerequisite for successful DevOps transformation, illustrating key concepts, cultural barriers, and real‑world case studies that show why cultural change must precede technical adoption.

Culture · Enterprise · Operations

0 likes · 10 min read

Efficient Ops

Jan 8, 2020 · Operations

How a Bank Built an Automated Operations Platform and CMDB Middle‑Platform

This article details how Ping An Bank tackled rapid growth and complex regulatory demands by creating an automated operations middle‑platform, designing a CMDB with data‑closure and subscription mechanisms, and implementing orchestration, gray‑scale deployment, and high‑risk detection to achieve resilient, scalable infrastructure management.

Automation · CMDB · Operations

0 likes · 21 min read

macrozheng

Jan 8, 2020 · Operations

How to Set Up Jenkins Automated Deployment for the Mall Project

This guide walks you through preparing scripts, uploading them, making them executable, and creating Jenkins jobs for each module of the multi‑module Mall project to achieve fully automated deployment using free‑style projects and SSH execution.

Automation · CI/CD · Deployment

0 likes · 8 min read

DevOps

Jan 7, 2020 · Operations

DevOps Planning and Practice in a Large State‑Owned Commercial Bank

This article outlines how a major state‑owned commercial bank designed and implemented a DevOps framework—including goals, architecture, the three main pillars of tools, processes, and standards—and shares practical insights, maturity assessment methods, and Q&A for large‑scale financial institutions.

Maturity Assessment · Operations · banking

0 likes · 12 min read

Java Backend Technology

Jan 7, 2020 · Backend Development

Mastering Retry and Idempotency: Prevent Timeout Failures in High‑Concurrency Systems

This article examines a real‑world group‑buy scenario, explains why timeout‑prone interfaces need robust retry and idempotency handling, distinguishes read and write timeouts, outlines key idempotency practices for services and messages, and introduces Guava‑retrying and Spring‑retry as elegant solutions.

Operations · distributed systems · retry

0 likes · 13 min read

Qunar Tech Salon

Jan 7, 2020 · Operations

Comprehensive Dependency Governance for High‑Availability Backend Systems

This article outlines a systematic approach to dependency governance in high‑traffic backend services, covering service classification, rate limiting, Dubbo, HTTP, database, and message‑queue management to enhance availability, reduce failure impact, and improve overall system stability.

Dubbo · Operations · dependency management

0 likes · 10 min read

Architect's Tech Stack

Jan 4, 2020 · Operations

Understanding CPU Usage Spikes in Java Applications: Causes, Diagnosis, and Real‑World Example

This article examines why Java applications experience high CPU usage, covering infinite loops, frequent Young GC, thread count and state, user‑space versus kernel‑space metrics, and provides a step‑by‑step troubleshooting guide illustrated with a real incident.

CPU · GC · Operations

0 likes · 9 min read

MaGe Linux Operations

Jan 2, 2020 · Operations

Mastering RPM and YUM: Essential Commands for Linux Package Management

This guide explains Linux package naming conventions, how to inspect binary dependencies and installed libraries, and provides a comprehensive collection of RPM and YUM commands—including installation, query, verification, removal, and repository configuration—to help administrators manage software efficiently.

CLI · Linux · Operations

0 likes · 7 min read

Youku Technology

Jan 2, 2020 · Operations

Quality Assurance and Stability Strategies for Alibaba Double 11 "Cat Night" Live Streaming

The QA team delivered a seamless, globally stable Double 11 “Cat Night” live stream across three apps and dozens of devices by applying client‑ and server‑side stability measures, international latency simulation, IPv6 support, cost‑effective CDN strategies, full‑chain monitoring, and automated asset‑loss safeguards, achieving zero financial loss.

Operations · mobile · performance testing

0 likes · 16 min read

dbaplus Community

Dec 30, 2019 · Operations

How Alibaba’s ECS Team Built a Scalable SRE System for Massive Cloud Services

This article explains the origins of Site Reliability Engineering (SRE), outlines the responsibilities of SRE teams, and details Alibaba Cloud’s ECS SRE practices—including capacity planning, performance optimization, full‑stack stability governance, automated release pipelines, on‑call processes, and the core principles and mindset that guide modern SRE work.

Automation · Cloud Computing · Operations

0 likes · 28 min read

Youzan Coder

Dec 30, 2019 · Operations

How to Measure and Improve Project Efficiency: A Practical Guide

This article explains why measurement is essential for management, outlines a step‑by‑step process for collecting and analyzing efficiency metrics, and shows how to turn data‑driven insights into concrete conclusions and actionable improvement plans for software projects.

Operations · Project Management · continuous improvement

0 likes · 10 min read

DevOps Cloud Academy

Dec 30, 2019 · Operations

How to Implement an Effective CI/CD Pipeline

Implementing an effective CI/CD pipeline involves understanding continuous integration, delivery, and deployment, recognizing their benefits such as faster feedback and early error detection, and following key stages—from commit and build to testing and production deployment—while selecting appropriate tools and practices to streamline software delivery.

CI/CD · Continuous Integration · Operations

0 likes · 6 min read

Efficient Ops

Dec 28, 2019 · Operations

What the 2019 IT Operations Whitepaper Reveals About Enterprise Ops Trends

The 2019 Enterprise IT Operations Whitepaper, released at the national Operations Conference, systematically examines the definition, value, key capabilities, industry applications, challenges, and future trends of IT operations across telecom, finance, Internet, and manufacturing sectors.

Artificial Intelligence · Big Data · IT Operations

0 likes · 6 min read

G7 EasyFlow Tech Circle

Dec 27, 2019 · Operations

Mastering Incident Reviews: The Three Golden Questions for Real Improvement

This article explains how focusing on three key questions during incident post‑mortems, balancing business speed with system stability, and establishing clear SLOs can turn failures into actionable improvements and better fault‑tolerance strategies.

Incident Management · Operations · SLO

0 likes · 8 min read

Qunar Tech Salon

Dec 27, 2019 · Operations

Qunar Ticket Test‑Environment Governance and Automated Monitoring Framework

This article describes Qunar Ticket’s comprehensive test‑environment governance framework, including the “Mirror‑Inspect” monitoring service, configuration and data synchronization strategies, and automated allocation management, highlighting how these practices reduced environment‑related project delays from up to 20% to below 8%.

Operations · configuration management · monitoring

0 likes · 11 min read

Efficient Ops

Dec 26, 2019 · Operations

How China Telecom’s DICT Leverages DevOps for Agile Cloud‑Native Development

At the 2019 Operations Conference in Beijing, China Telecom’s R&D leader detailed the company’s transformation journey, the DICT capability center’s strategy, the Biying cloud platform, and the implementation of an integrated DevOps platform that streamlines end‑to‑end software delivery using containers, CI and security automation.

CI/CD · China Telecom · Operations

0 likes · 3 min read

Efficient Ops

Dec 26, 2019 · Operations

Inside Jiangsu Telecom’s Leap to Level‑3 DevOps Continuous Delivery

The article recounts Jiangsu Telecom’s successful Level‑3 DevOps continuous delivery assessment at the 2019 Beijing Operations Conference, highlighting the role of standardization and tooling, sharing interview insights on the intelligent pre‑processing system, and outlining the broader DevOps standard ecosystem in China.

Case Study · Operations · Standardization

0 likes · 10 min read

Efficient Ops

Dec 26, 2019 · Operations

How CITIC Bank Achieved Level‑3 DevOps Continuous Delivery: Key Lessons

CITIC Bank’s software development center shares how three flagship projects passed the level‑3 DevOps continuous‑delivery assessment, revealing the role of standardization, tool empowerment, agile practices, and container‑based pipelines in accelerating delivery and boosting team morale.

CITIC Bank · Case Study · Operations

0 likes · 15 min read

Aikesheng Open Source Community

Dec 25, 2019 · Operations

Deploying Thanos for Unified Prometheus Monitoring and Long‑Term Storage

This guide explains the background, key features, architecture, and step‑by‑step deployment of Thanos—including Sidecar, Store, Query, Compact, Bucket, Rule, and Check components—to provide a unified, high‑availability Prometheus monitoring view with unlimited historical data storage using object storage.

Deployment · Long‑term Storage · Operations

0 likes · 9 min read

HomeTech

Dec 25, 2019 · Operations

Automation in Brand Advertising Testing and Monitoring to Enhance Efficiency and Quality

This project addresses challenges in brand advertising testing by implementing automated testing, monitoring, and data construction solutions, significantly improving efficiency, reducing manual effort, and enhancing product quality through real-time issue detection and resolution.

Automation · Data Construction · Operations

0 likes · 5 min read

FunTester

Dec 25, 2019 · Industry Insights

Why DevTestOps Is the Next Evolution in DevOps Automation

This article explains the evolution from traditional DevOps to DevTestOps, detailing continuous testing, the benefits of integrating automated testing into DevOps pipelines, practical implementation steps, and why organizations should adopt DevTestOps to enhance software quality and delivery speed.

Automation · ContinuousTesting · DevTestOps

0 likes · 8 min read
