Tagged articles
3281 articles
Laravel Tech Community
Sep 22, 2020 · Databases

Common Redis Latency Issues and How to Diagnose Them

This article explains why Redis latency can suddenly increase—covering high‑complexity commands, large keys, concentrated expirations, memory limits, fork overhead, CPU binding, AOF settings, swap usage, and network saturation—and provides practical diagnostic steps and mitigation techniques.

Latency · Operations · database
0 likes · 17 min read
58UXD
Sep 22, 2020 · Operations

How Flexible Staffing and Digital Transformation Can Revive Post‑Pandemic SMEs

The article explores how small and medium‑sized enterprises can recover from pandemic setbacks by adopting flexible employment models, leveraging digital tools for management and customer insight, and shifting to stronger online promotion while controlling costs and improving resilience.

Business strategy · Digital Transformation · Operations
0 likes · 9 min read
Alibaba Cloud Native
Sep 21, 2020 · Operations

Why Chaos Engineering Is Essential for Cloud‑Native High Availability

This article explains the need for chaos engineering in modern distributed and cloud‑native systems, outlines the challenges faced by architects, developers, testers and product teams, and provides step‑by‑step guidance on using ChaosBlade and Alibaba's AHAS platform for effective fault‑injection experiments.

Cloud Native · Operations · chaos engineering
0 likes · 9 min read
High Availability Architecture
Sep 21, 2020 · Operations

Full‑Link Load Testing Practices for iQIYI Payment System

This article describes the iQIYI payment team's approach to full‑link load testing, covering background challenges, systematic problem exploration, preparation of test environments, traffic modeling, execution safeguards, practical results, and future plans to improve capacity verification and system reliability.

Load Testing · Operations · capacity planning
0 likes · 10 min read
MaGe Linux Operations
Sep 18, 2020 · Operations

Essential Linux Operations Metrics for Effective Monitoring

This guide enumerates the key Linux system metrics—covering CPU, memory, disk, I/O, network, kernel parameters, RAID, SMART, NTP, and process information—that Open-Falcon agents collect every minute to enable comprehensive operations monitoring and timely issue detection.

Metrics · Open-Falcon · Operations
0 likes · 12 min read
IT Architects Alliance
Sep 14, 2020 · Operations

Implementation of Service Chain Monitoring and End-to-End Process Monitoring

This article explains how to design and implement service‑chain (APM) monitoring and end‑to‑end process monitoring in distributed systems, covering concepts such as spans and traces, TRACE_ID generation, logging practices, visualisation techniques, and a practical expense‑report use case with code examples.

APM · Distributed Tracing · Microservices
0 likes · 15 min read
Efficient Ops
Sep 13, 2020 · Operations

Master Nginx: Reverse Proxy, Load Balancing, and High‑Availability Essentials

This guide explains Nginx’s core concepts—including reverse proxy, load balancing, static‑dynamic separation, common commands, configuration blocks, and high‑availability setup with Keepalived—providing step‑by‑step examples and practical diagrams for reliable web service deployment.

Configuration · Operations · high availability
0 likes · 11 min read
TAL Education Technology
Sep 10, 2020 · Cloud Native

Accelerating Project Deployment with a Container Platform and Domain Convergence

This article describes how the infrastructure team reduced new project deployment time to under an hour by combining a container platform with domain convergence, detailing the processes, automation pipelines, Kubernetes-based deployment, autoscaling, logging, and security considerations for efficient, cloud‑native operations.

Cloud Native · Deployment Automation · Kubernetes
0 likes · 17 min read
Efficient Ops
Sep 9, 2020 · Operations

Mastering Incident Management: Core Principles and Practical Methods

This guide outlines essential incident management principles—prioritizing business restoration and timely escalation—followed by detailed methodologies such as restart, isolation, and degradation, and explains role responsibilities, user impact handling, and post‑incident summarization for continuous improvement.

Operations · fault handling · incident management
0 likes · 10 min read
Efficient Ops
Sep 8, 2020 · Operations

From Firefighting to Arson: Mastering Ops Availability in Three Stages

The article outlines a three‑stage ops maturity model—firefighting, fire prevention, and arson—explains how proactive fault‑injection drills, continuous availability improvements, and aligning technical metrics with business value can transform operations from reactive responders into strategic value creators.

Availability · Fault Injection · Operations
0 likes · 8 min read
58UXD
Sep 7, 2020 · Operations

Designing a High‑Impact Brand‑Driven Operation Campaign on a Tight Timeline

This article details how, despite limited resources and time, a product team designed and executed the “Part‑time Gold Rush” operation—defining goals, targeting young users, building a memorable brand, applying 5W1H strategy, leveraging AARRR growth tactics, and achieving revenue and traffic targets.

AARRR · Operations · brand design
0 likes · 9 min read
dbaplus Community
Sep 6, 2020 · Operations

Building a High‑Performance Monitoring Alert System with Akka, Dubbo, and Ignite

The article outlines G Bank’s transition from a single‑threaded commercial monitoring solution to a self‑developed, open‑source based alert system that leverages Akka for parallel collection, Apache Dubbo for distributed processing, and Apache Ignite for in‑memory storage, achieving million‑level alert capacity, sub‑100 ms latency, and linear scalability.

Akka · Apache Dubbo · Apache Ignite
0 likes · 17 min read
Efficient Ops
Sep 3, 2020 · Operations

What Recent Cloud and Data Center Incidents Reveal About Industry Risks?

A roundup of recent tech news covering a Cisco sabotage case, a London data‑center fire, Linux's 29th anniversary, Gartner's China ICT trends, major cloud investments, Windows 95 milestones, Didi's GPU server launch, Hainan's DNS project, Dell’Oro's market report, executive share reductions, and an upcoming global operations conference.

GPU · Linux · Operations
0 likes · 10 min read
Efficient Ops
Sep 2, 2020 · Operations

Why Consistent Shell Script Standards Matter: A Practical Guide

This guide explains the importance of shell script coding standards, outlines core principles such as correctness, readability, maintainability, and consistency, and provides detailed recommendations on file naming, encoding, line length, indentation, comments, testing, and safe use of commands to improve script quality and reduce maintenance costs.

Bash · Operations · coding standards
0 likes · 26 min read
Architecture Digest
Aug 30, 2020 · Cloud Native

Migrating Docker Images, Containers, and Volumes: Practical Techniques

This article explains how to migrate Docker images, containers, and data volumes using save/load, export/import, and backup/restore commands, offering practical steps for offline environments, complex production services, and volume handling while highlighting the limitations of conventional approaches.

Container Migration · Operations · Volume Backup
0 likes · 7 min read
Tencent Cloud Developer
Aug 28, 2020 · Databases

Automating Data Balancing for ClickHouse Clusters on Tencent Cloud

Tencent Cloud’s managed ClickHouse service now includes an automated data‑balancing feature that, after user authorization and bandwidth configuration, creates migration plans to redistribute tables across new or decommissioned nodes, eliminating manual rebalancing, reducing operational overhead, and ensuring balanced storage during elastic scaling.

Operations · clickhouse · data balancing
0 likes · 8 min read
Laravel Tech Community
Aug 25, 2020 · Operations

NetBox 2.9.1 Release Highlights and New Features

NetBox 2.9.1, an IP address and data center infrastructure management tool built on Django and PostgreSQL, introduces several enhancements including SLAAC address status, nested LAG support, version details on error pages, and a backward‑compatible remote authentication backend parameter.

DCIM · Django · IPAM
0 likes · 2 min read
Efficient Ops
Aug 25, 2020 · Operations

How to Build an Enterprise‑Grade Observability System and Master Incident Response

This article explains how enterprises adopting SRE can design a comprehensive observability platform—covering metrics, logs, and tracing—while also detailing effective incident response, post‑mortem practices, testing, capacity planning, automation tool development, and user‑experience focus to improve overall operational reliability.

Operations · SRE · capacity planning
0 likes · 17 min read
DevOps Cloud Academy
Aug 25, 2020 · Operations

A Simple Four‑Step Process for Prioritizing DevOps Work

This article outlines a practical four‑step process—Define, Scope, Experiment, Analyze—to help DevOps engineers prioritize automation tasks, assess pain points, and align improvements with business value, offering actionable guidance for effective pipeline and workflow optimization.

DevOps · Operations · automation
0 likes · 6 min read
Ops Development Stories
Aug 25, 2020 · Operations

ESrally Guide: Install, Configure, and Benchmark Elasticsearch Performance

ESrally is the official Elasticsearch benchmarking tool; this guide walks through its installation prerequisites, step‑by‑step setup of Python, JDK, and Git, configuration of tracks, cars, pipelines, and challenges, and demonstrates real‑world performance comparisons across Elasticsearch versions and hardware platforms.

Benchmarking · ESrally · Elasticsearch
0 likes · 16 min read
DevOps
Aug 25, 2020 · Operations

IDCF Phase 5 DevOps Case Study: Traditional Banking Practice and Lessons Learned

This article details a month‑long DevOps case study conducted by the IDCF team on traditional banking, describing the four guiding principles, the six‑stage workflow from team formation to retrospection, the research findings across major Chinese banks, and the resulting best‑case award and future digital‑transformation discussions.

DevOps · FinTech · Operations
0 likes · 7 min read
Aikesheng Open Source Community
Aug 24, 2020 · Operations

Prometheus Data Query Basics and Practical Usage Guide

This article introduces Prometheus' query language PromQL, explains instant and range vector selectors, label matching, offset handling, storage design, common functions and aggregation operators, and provides practical advice for efficient querying and avoiding performance issues.

Operations · PromQL · Prometheus
0 likes · 13 min read
DevOps Cloud Academy
Aug 22, 2020 · Operations

Common Mistakes in DevOps Implementation and How to Avoid Them

The article outlines ten frequent pitfalls that organizations encounter when adopting DevOps—such as out‑of‑order delivery, misunderstandings of DevOps roles, lack of flexibility, speed over quality, isolated teams, unautomated databases, insufficient incident handling, limited expertise, security neglect, and team fatigue—and provides practical guidance to prevent these errors for more successful DevOps outcomes.

Continuous Delivery · DevOps · Operations
0 likes · 11 min read
DevOps Cloud Academy
Aug 20, 2020 · Operations

How DevOps Can Reduce Technical Debt During Cloud Migration

This article explains what technical debt is, why it accumulates in both development and operations, and outlines four DevOps‑driven strategies—including building cross‑functional teams, automation, containerization, and API‑centric design—to identify, track, and repay technical debt while improving cloud migration outcomes.

Containers · DevOps · Infrastructure as Code
0 likes · 10 min read
Efficient Ops
Aug 19, 2020 · Operations

How End-State‑Oriented Monitoring Transforms Operations and AIOps

This article explains the concept of end‑state‑oriented monitoring, its significance for modern operations, the shortcomings of existing solutions, and a layered design approach that leverages real‑time data, service catalogs, and AI to achieve secure, stable, efficient, and low‑cost operations.

DevOps · Operations · aiops
0 likes · 13 min read
Senior Brother's Insights
Aug 19, 2020 · Operations

Essential Ops Lessons: Avoid Disasters with Backups, Monitoring, and Secure Practices

This guide shares hard‑earned lessons from real‑world server administration, emphasizing careful testing, confirming commands before execution, limiting simultaneous operators, always backing up configurations, protecting data, tightening SSH and firewall security, implementing comprehensive monitoring, and applying disciplined performance‑tuning practices to maintain stable, reliable services.

Backup · Operations · System Administration
0 likes · 12 min read
dbaplus Community
Aug 17, 2020 · Operations

Master Server Troubleshooting: Diagnose, Optimize, and Keep Your Backend Stable

This article shares practical experience on backend troubleshooting, outlining common failure types, a step‑by‑step diagnosis workflow, essential tools, and systematic optimization techniques for performance, stability and maintainability, helping engineers quickly stop losses, pinpoint root causes, and implement robust fixes.

Backend · Operations · maintainability
0 likes · 21 min read
Open Source Linux
Aug 17, 2020 · Operations

Step-by-Step Guide to Install and Configure Zabbix on CentOS 7

This tutorial walks you through installing Zabbix on CentOS 7, covering prerequisite disabling of SELinux and firewalls, adding repositories, installing server, web, and database components, configuring files, securing MariaDB, starting services, and completing the web‑based setup with language customization.

CentOS · Installation · Linux
0 likes · 7 min read
FunTester
Aug 15, 2020 · Operations

Why Quality Management Is Critical for Project Success

This article explains the importance of quality management in projects, outlines its two main dimensions—process quality and product quality—details the multiple benefits of systematic quality control, and provides an eight‑step framework for creating an effective quality management plan.

Operations · Project Management · QA
0 likes · 5 min read
DevOps Cloud Academy
Aug 13, 2020 · Operations

Integrating DevOps Toolchains for Enterprise‑Scale End‑to‑End Communication and Collaboration

The article explains how integrating DevOps toolchains can achieve enterprise‑scale end‑to‑end communication and collaboration without forcing teams to change their workflows, discusses common bottlenecks, presents unified versus loosely‑coupled integration approaches, and offers practical recommendations for building an inclusive, interconnected DevOps ecosystem.

Collaboration · DevOps · Enterprise
0 likes · 10 min read
DevOps Cloud Academy
Aug 12, 2020 · Operations

10 International Companies That Successfully Transformed to DevOps in 2020

This article reviews ten well‑known enterprises—including Adidas, Capital One, Verizon, Disney, and Starbucks—that have undertaken large‑scale DevOps and cloud‑native transformations, detailing the challenges they faced, the cultural and technical changes implemented, and the measurable business benefits achieved.

Cloud Native · DevOps · Digital Transformation
0 likes · 13 min read
Efficient Ops
Aug 11, 2020 · Operations

How Multi‑Cloud Disaster Recovery Boosts Site Availability: Lessons from Real‑World DR Drills

This article shares a detailed case study of building multi‑cloud site disaster‑recovery and fault‑drill practices at Kaixin Network, covering high‑availability concepts, architectural redesign, pain points, automated one‑click switching, and future self‑healing with chaos engineering to improve reliability.

Operations · disaster recovery · fault drills
0 likes · 15 min read
Java Architect Essentials
Aug 11, 2020 · Operations

Four Essential Linux Monitoring Tools for Operations Engineers

This article introduces four widely used Linux monitoring tools—iotop, htop, IPTraf, and Monit—explaining their features, usage scenarios, and how they help operations engineers diagnose performance issues without a GUI, including real‑time I/O tracking, visual CPU/memory graphs, network traffic analysis, and flexible alerting.

IPTraf · Linux · Monit
0 likes · 7 min read
IT Architects Alliance
Aug 6, 2020 · Operations

Eight Essential Steps for Successful Disaster Recovery Drills

This guide outlines eight practical steps—including defining scope, forming a planning team, setting clear objectives, designing realistic scenarios, creating evaluation checklists, assigning roles, conducting pre‑drill briefings, and performing post‑drill reviews—to help organizations execute effective, repeatable disaster recovery exercises that strengthen business continuity.

Operations · Planning · best practices
0 likes · 9 min read
StarRing Big Data Open Lab
Jul 28, 2020 · Operations

How DevOps and SRE Transform Modern Software Delivery and Operations

This article explains the evolution from traditional C/S to B/S architectures, compares DevOps and SRE principles, discusses their roles in the container and cloud eras, and showcases StarRing's TDC platform that integrates automated pipelines, monitoring, and deployment for efficient software delivery.

Containerization · DevOps · Operations
0 likes · 14 min read
Xianyu Technology
Jul 28, 2020 · Operations

ShenTan: Automated Fault Localization System for Online Services

ShenTan is an automated fault‑localization platform for online services that quickly (under five seconds) pinpoints server‑side issues with developer‑level accuracy by aggregating real‑time metrics, applying a decision‑tree model enriched by expert knowledge and dynamic thresholds, and presenting results through an integrated alert and visualization system, while planning broader endpoint coverage and multi‑tenant support.

Big Data · Fault Localization · Operations
0 likes · 12 min read
IT Architects Alliance
Jul 27, 2020 · Operations

Why Tape Backup Is Failing and How Disk Backup Can Save Your Data

The article analyzes the growing limitations of tape backup, outlines a step‑by‑step migration to disk‑based backup using deduplication, compression and modern storage technologies, and explains how this transition improves reliability, cost efficiency and recovery speed for enterprises.

Backup · Data Protection · Operations
0 likes · 11 min read
Zhongtong Tech
Jul 25, 2020 · Operations

How ZTO Express Leveraged Technology to Become China’s Logistics Leader

This presentation details ZTO Express’s rapid rise from a modest startup to the world’s largest courier by exploring its technology‑driven business model, crowd‑funded expansion, electronic waybills, smart routing, AI customer service, employee equity schemes, and future digital logistics strategies.

AI · Business Model · Logistics
0 likes · 27 min read
Open Source Linux
Jul 23, 2020 · Operations

5 Essential Steps to Become a Successful DevOps Engineer

This article outlines the five key practices—adopting a developer mindset, mastering system engineering, gaining cloud experience, learning containers, and developing soft skills—required to become an effective DevOps engineer in today’s rapidly evolving tech landscape.

Containers · DevOps · Operations
0 likes · 6 min read
dbaplus Community
Jul 20, 2020 · Operations

How to Build Reliable Monitoring for Low‑Frequency Financial Services

After two years transitioning from e‑commerce to finance, the team shares practical monitoring strategies for low‑frequency financial services, contrasting e‑commerce traffic‑based methods with finance‑specific challenges, and detailing point‑based metrics, hourly success‑rate alerts, aspect‑oriented exception handling, white‑list filtering, and Sentinel‑based circuit breaking.

Alerting · Aspect Oriented Programming · Circuit Breaking
0 likes · 16 min read
Swan Home Tech Team
Jul 20, 2020 · Backend Development

Design and Evolution of a Reconciliation Center: From Version 1.0 to 3.0

This article introduces the concept, core capabilities, and architectural evolution of a reconciliation center—from its initial 1.0 design through 2.0 and 3.0 upgrades—highlighting problem statements, solution approaches, and the applicable scenarios that make it essential for large‑scale data consistency in modern micro‑service systems.

Backend · Operations · Reconciliation
0 likes · 14 min read
DevOps
Jul 20, 2020 · Operations

Bank 4.0 DevOps Case Study: Practices, Challenges, and Solutions in Traditional Banking

This case study analyzes the Bank 4.0 transformation of traditional Chinese banks, detailing industry characteristics, historical challenges, open‑banking drivers, the ABCDII technology framework, DevOps tooling, metric systems, and a future ecosystem vision to guide digital and operational improvement.

AI · Banking · CloudComputing
0 likes · 18 min read
Qunhe Technology Quality Tech
Jul 17, 2020 · Operations

How We Built a Robust Monitoring System for Construction Drawing Production

This article describes how our team designed and implemented a comprehensive online monitoring system for construction drawing generation, covering business background, technical architecture analysis, metric definition, monitoring methods, and the resulting dashboards that improve quality, stability, and rapid issue resolution.

Metrics · Operations · construction drawing
0 likes · 10 min read
DevOps
Jul 17, 2020 · Operations

Agile vs DevOps: Understanding Their Overlap, Differences, and Evolution

This article explores the relationship between Agile and DevOps, explaining their origins, narrow and broad definitions, how they address gaps between business, development, and operations, and presenting a capability growth model that highlights continuous delivery and lean principles as shared goals.

Continuous Delivery · DevOps · Lean
0 likes · 9 min read
Youku Technology
Jul 16, 2020 · Operations

How Alibaba Entertainment Automates Capacity Management and Elastic Scaling

Alibaba Entertainment transformed its capacity management from manual, experience‑based decisions to a fully automated system that continuously evaluates single‑machine performance, identifies performance and success‑rate breakpoints, and drives elastic scaling, dramatically improving resource utilization, availability, and development efficiency across all its applications.

Operations · Performance Testing · automation
0 likes · 10 min read
MaGe Linux Operations
Jul 14, 2020 · Operations

How Keepalived Enables High-Availability Load Balancing with VRRP

Keepalived, originally designed for LVS load balancing, provides VRRP-based high‑availability by managing LVS nodes, performing health checks, and offering failover for services like Nginx, HAProxy, and MySQL, while also addressing split‑brain scenarios and non‑preemptive configurations.

Operations · VRRP · failover
0 likes · 10 min read
Efficient Ops
Jul 13, 2020 · Operations

What 13,966 Ops Job Listings Reveal About Salary, Skills, and Hot Cities

This article analyzes 13,966 Chinese operations‑engineer job postings scraped from 51job, cleaning the data with Python and Pandas, then visualizing industry demand, city concentration, salary ranges, education requirements, company size distribution, and keyword trends to guide job seekers and recruiters.

Data visualization · Operations · Python
0 likes · 14 min read
Architects Research Society
Jul 13, 2020 · Operations

A Digital Transformation Framework for Asset Management: Integrating Business, Culture, and Technology

The article presents a Digital Transformation Framework (DTF) that helps asset‑management firms model, evaluate, and implement disruptive digital strategies across front, middle, and back‑office functions, emphasizing composable enterprises, cultural change, BPaaS, API‑driven architectures, and value‑based prioritization to achieve sustainable competitive advantage.

API · BPaaS · Operations
0 likes · 14 min read
Efficient Ops
Jul 12, 2020 · Operations

How Full-Path Packet Loss Monitoring Transforms Network Reliability

This article explains the concept of full‑path packet loss monitoring, its importance for banking networks, the causes of packet loss, and detailed technical implementations—including traffic splitting, collection, automatic analysis engines, TCP retransmission detection, and algorithms for pinpointing loss locations—to dramatically reduce troubleshooting time.

Network Monitoring · Operations · Packet Loss
0 likes · 11 min read
iQIYI Technical Product Team
Jul 10, 2020 · Operations

iQIYI IPv6 Large‑Scale Deployment: Technical Challenges, Solutions, and Management Practices

iQIYI’s IPv6 rollout, responding to the national deployment plan, coordinated multiple technical teams to redesign its network and introduced the “iQIYI IPv6 Cloud Control” scheme that manages IPv4/IPv6 switching and fallback, reaching more than 200 million active IPv6 users and 800 GB traffic peaks, guided by long‑term strategic value, clear milestones, and engineers’ curiosity to expand IPv6‑driven service quality and cost savings.

IPv6 · Infrastructure · Operations
0 likes · 12 min read
转转QA
Jul 9, 2020 · Operations

Testing Scenario Extraction and Tool Selection for Business Operations

The article explains how to isolate testing scenarios and choose appropriate testing methods for various business contexts—storefront changes, order processing, and cross‑platform integrations—by establishing baseline data, comparing results, and leveraging tools like YApi to improve quality and efficiency.

Operations · QA · Test Strategy
0 likes · 7 min read
AntTech
Jul 2, 2020 · Operations

Innovative Design and Implementation of the Barad‑Dur Custom Monitoring Dashboard

This article introduces the Barad‑Dur custom monitoring dashboard of Ant Monitoring, detailing its WYSIWYG editor, advanced interaction features, controller concept, extensible data‑source architecture, unified time‑series format, scene‑graph inspired layout engine, and future roadmap for cloud‑native observability.

Dashboard · DataSource · Operations
0 likes · 12 min read
DevOps Cloud Academy
Jul 2, 2020 · Operations

Design and Extension of DevOps Platform Tasks Based on Jenkins Pipeline

This article explains how the PuYuan DevOps platform extends Jenkins pipeline tasks by categorizing atomic tasks, designing flexible database schemas for task templates and attributes, and implementing container-based environment isolation to support scalable, secure continuous integration and deployment across diverse enterprise environments.

Containerization · DevOps · Jenkins
0 likes · 10 min read
Big Data Technology & Architecture
Jun 29, 2020 · Operations

Meizu's Automation Journey and Continuous Delivery Platform Evolution

The article outlines Meizu's transition from a music‑player company to a mobile and internet service provider, detailing the operational challenges faced across three internet eras, the development of a comprehensive automation and continuous delivery platform, and the role of big‑data‑driven insights in improving quality, efficiency, cost, and security.

Continuous Delivery · DevOps · Operations
0 likes · 14 min read
DevOps Coach
Jun 29, 2020 · Operations

How China’s DevOps Community Chose the Best SaaS Platform: GitLab vs Jira vs CODING

The Chinese DevOps community evaluated three SaaS platforms—GitLab (free), Jira Cloud (free), and CODING (Tencent Cloud DevOps)—against requirements such as private repository collaboration, OKR management, Scrum planning, CI/CD pipelines, artifact storage, and cloud deployment, ultimately concluding that CODING offers the most suitable integrated solution.

DevOps · GitLab · Operations
0 likes · 15 min read
Dual-Track Product Journal
Jun 27, 2020 · Operations

How Modern Procurement Management Systems Streamline Supply Chains

This article explains what procurement is, its strategic importance, the components and architecture of a procurement management system, detailed workflow steps, and key functions such as item management, order handling, pricing maintenance, and supplier returns, highlighting how effective procurement reduces costs and boosts competitiveness.

Operations · Supply Chain · inventory
0 likes · 13 min read
DevOps Cloud Academy
Jun 27, 2020 · Operations

Linux Service and Process Management with Nginx

This guide explains how to install Nginx on a Linux server, manage it with systemctl commands, verify its operation using netstat, and control related processes via ps and kill utilities, providing practical command examples for each step.

Linux · Operations · Service
0 likes · 3 min read
DevOps Cloud Academy
Jun 26, 2020 · Operations

Linux System User and Group Management Tutorial

This tutorial explains Linux user and group management, covering login prompts, user information commands, adding, modifying, and deleting users, switching users, password handling, file permission changes, and group administration with practical command examples and code snippets.

Linux · Operations · Shell Commands
0 likes · 7 min read
DevOps Cloud Academy
Jun 26, 2020 · Operations

Linux File and Directory Permission Management Tutorial

This tutorial explains Linux file and directory permission management, covering permission categories and how to view, add, revoke, and recursively apply permissions with commands such as ls and chmod, and demonstrating permission notation with examples.

File Permissions · Linux · Operations
0 likes · 3 min read
Qunar Tech Salon
Jun 23, 2020 · Operations

A Simple Gray Release Solution for High‑Concurrency Flight Ticket Systems

This article presents a lightweight gray release approach for complex flight ticket services, comparing traditional hardware and soft‑routing isolation methods, describing the authors' traffic‑based gray identification, business‑focused monitoring, implementation details, and automated safeguards to enable safe incremental deployments.

Backend · Deployment · Operations
0 likes · 8 min read
Suning Technology
Jun 22, 2020 · Operations

How Suning Moved 26,888 Servers in 75 Days – Key Takeaways

This article recounts how Suning’s data center team migrated a record-breaking 26,888 servers in 75 days, detailing the planning, tight time windows, intensive communication, cross‑team coordination, risk management, and efficiency gains that enabled zero‑downtime migration and significant cost savings for future operations.

Infrastructure · Operations · cloud computing
0 likes · 7 min read
JD Retail Technology
Jun 17, 2020 · Operations

How JD’s Data Platforms Scaled for the 618 Mega‑Sale: Operations, Stress‑Testing, and Dual‑Stream Architecture

The article details JD’s data product teams’ systematic preparation for the 618 shopping festival, covering pressure estimation, capacity expansion, stress testing, emergency downgrade strategies, dual‑data‑center isolation, high‑fidelity end‑to‑end testing, and continuous monitoring to ensure stable, real‑time data services during massive traffic spikes.

Big Data · Data Platform · JD.com
0 likes · 10 min read
Qunar Tech Salon
Jun 16, 2020 · Operations

Qunar's Multi-IDC Deployment and Fault Self‑Healing Architecture

This article describes how Qunar scaled its IDC infrastructure, introduced multi‑IDC deployment, automated DNS‑based load balancing, open‑source DNSDB, and an IDC proxy built on Squid to achieve rapid fault self‑healing and transparent traffic switching for both user and third‑party access.

DNS · Operations · Proxy
0 likes · 8 min read
Efficient Ops
Jun 15, 2020 · Operations

Which Monitoring Approach Truly Delivers End-to-End Business Performance Insight?

This article examines why traditional network‑centric NPMD tools, agent‑based APM solutions, and their combination fall short of true end‑to‑end business performance monitoring, and argues that Business Performance Monitoring (BPM) using passive traffic mirroring offers the most complete, non‑intrusive full‑link visibility for application operations.

APM · BPM · Full‑Link Monitoring
0 likes · 9 min read
Full-Stack DevOps & Kubernetes
Jun 15, 2020 · Cloud Native

How to Diagnose and Fix Common Kubernetes Pod Issues

This guide walks through systematic Kubernetes troubleshooting steps for pods stuck in Pending, Waiting, CrashLoopBackOff, or Running incorrectly, and also covers controller, service, and network debugging using kubectl commands, log inspection, validation flags, and endpoint verification.

Cloud Native · Kubernetes · Operations
0 likes · 9 min read
JD Retail Technology
Jun 15, 2020 · Operations

JD Digital Technology's 6.18 Promotion Technical Preparation Overview

The article details JD Digital Technology's comprehensive technical preparation for the 6.18 shopping festival, describing how 13 cross‑functional teams performed system scaling, multi‑layer performance testing, architecture upgrades, risk management, data pipeline enhancements, and 24/7 monitoring to ensure a stable, high‑throughput promotion environment.

6.18 · Backend · JD
0 likes · 16 min read