Tagged articles
125 articles
MaGe Linux Operations
Dec 30, 2018 · Operations

Step‑by‑Step Guide to Building an ELK Stack on CentOS 6.7

This tutorial walks you through installing Java, Elasticsearch 2.1.0, Logstash 2.1.1, Kibana 4.3.1, and NGINX on a CentOS 6.7 server, configuring each component, linking them together, and troubleshooting common time‑zone issues so you can visualize logs with Kibana.

CentOS · ELK · Elasticsearch
0 likes · 8 min read
dbaplus Community
Dec 11, 2018 · Databases

How We Fixed MongoDB Outages and Boosted Performance in Production

This article outlines MongoDB's key features, describes a real‑world outage caused by misconfigured connection limits, details the root‑cause analysis and temporary remediation, and presents a comprehensive set of configuration, sharding, and hardware optimizations that dramatically improved the system's reliability and throughput.

Configuration · MongoDB · Ops
0 likes · 14 min read
dbaplus Community
Dec 10, 2018 · Databases

How to Run Percona MongoDB HotBackup with a Simple PHP Script

This guide explains why the community edition of MongoDB lacks native hot backup, how Percona MongoDB adds online backup support, the underlying backup and restore principles, and provides a step‑by‑step PHP script with environment setup, configuration, execution, and scheduling instructions.

Database Backup · HotBackup · MongoDB
0 likes · 7 min read
37 Interactive Technology Team
May 25, 2018 · Operations

Optimization and Redesign of Open-Falcon Monitoring System for the 37 Monitoring Platform

The project redesigns the Open‑Falcon monitoring system for the 37 platform by integrating it with the existing CMDB, adding distributed‑lock high‑availability for judge and alarm modules, optimizing cross‑region agent data transmission, fixing timezone inconsistencies, and enabling redundant query/graph services, thereby unifying disparate monitoring tools into a scalable, reliable solution.

CMDB · Open-Falcon · Ops
0 likes · 11 min read
dbaplus Community
Jan 25, 2018 · Cloud Native

How to Build a Lightweight Private Cloud with Docker and Ansible

This article explains the challenges of lightweight private‑cloud deployment, classifies distributed‑system types, and presents a practical solution that combines a standard OS layer, Docker containerization, and Ansible automation, illustrated with a real‑world RabbitMQ cluster example and supporting GitHub resources.

Ansible · Deployment · Docker
0 likes · 18 min read
Efficient Ops
Nov 28, 2017 · Fundamentals

Master Python Fast: A Practical Guide for Ops Engineers to Automate Tasks

This comprehensive tutorial walks operations engineers through Python fundamentals, from installation and basic syntax to data structures, functions, modules, and debugging, illustrating each concept with clear examples and diagrams to enable rapid automation development in real‑world DevOps environments.

Automation · Ops
0 likes · 31 min read
MaGe Linux Operations
Oct 24, 2017 · Operations

Top 20 Python Libraries Every Sysadmin Should Know

This article lists and briefly describes twenty essential Python libraries—from psutil and Ansible to SaltStack and scapy—that empower system administrators to monitor resources, automate tasks, manage configurations, and build robust DevOps workflows.

DevOps · Ops · libraries
0 likes · 5 min read
Efficient Ops
Sep 23, 2017 · Operations

Why Ops Teams Feel Stuck: 6 Common Pitfalls and How to Fix Them

The article explores why operations professionals often feel exhausted, unrecognized, and demoralized. It identifies six systemic shortcomings (lack of a holistic ops framework, unclear positioning, closed mindset, insufficient authority, stagnant improvement, and missing cultural integration) and offers practical guidance for turning these weaknesses into strengths.

Ops · Team Culture
0 likes · 8 min read
Efficient Ops
Sep 10, 2017 · Operations

How We Built a Scalable, High‑Availability Monitoring Platform with Service Trees

This article details the challenges of traditional monitoring systems, the design and implementation of a custom high‑availability monitoring platform using a Golang‑based service tree, Raft‑backed storage, InfluxDB for time‑series data, and a modular architecture that supports Windows agents, third‑party reporting, and AI‑driven future enhancements.

InfluxDB · Ops · aiops
0 likes · 13 min read
MaGe Linux Operations
Sep 9, 2017 · Operations

Master Ansible Basics: Essential Modules and Commands for Automation

This guide walks you through Ansible's core architecture, host inventory setup, variable definitions, and the most commonly used modules—including group, user, copy, cron, shell, and ping—showing practical command examples and how to retrieve module help with ansible‑doc.

Ansible · Automation · Configuration Management
0 likes · 10 min read
DevOps
Jul 12, 2017 · Cloud Native

Container Monitoring: Challenges, Metrics Collection, and Best Practices

This article examines the unique challenges of monitoring containers, outlines three categories of metrics to collect, and compares host‑centric and layered monitoring architectures. It then details how to gather CPU, memory, I/O, and network data via cgroup files and Docker commands, and closes with practical insights, tooling recommendations, and a Q&A session on container observability.

Docker · Ops · Prometheus
0 likes · 18 min read
MaGe Linux Operations
May 10, 2017 · Operations

Step‑by‑Step: Monitor Nginx and PHP‑FPM Status with Zabbix

This guide walks through configuring Zabbix to monitor Nginx and PHP‑FPM status, covering software installation paths, enabling status modules, creating extraction scripts, setting up Zabbix agent userparameters, restarting services, testing data retrieval, and adding server‑side templates for items, triggers, and graphs.

Linux · NGINX · Ops
0 likes · 9 min read
MaGe Linux Operations
Jan 5, 2017 · Operations

Mastering Puppet: Automate Server Deployment and Configuration

This article explains how Puppet automates large‑scale server provisioning by describing its architecture, workflow, manifest examples, class inheritance, and module structure, helping operations teams reduce manual effort and avoid errors in configuration management.

Automation · Configuration Management · Infrastructure as Code
0 likes · 8 min read
MaGe Linux Operations
Nov 14, 2016 · Operations

Master Ansible: From Basics to Advanced Automation without Agents

This comprehensive guide introduces Ansible, explains its agentless architecture, core components, installation, SSH key setup, inventory configuration, essential commands, and common modules, providing a practical roadmap for automating system administration and deployment tasks.

Ansible · Automation · Configuration Management
0 likes · 17 min read
Qunar Tech Salon
Nov 8, 2016 · Operations

Building a Scalable Elasticsearch-as-a-Service Platform on Mesos, Marathon, and Docker at Qunar

This article describes how Qunar's operations team designed and implemented a cloud‑native Elasticsearch‑as‑a‑Service platform using Mesos, Marathon, and Docker, covering requirements analysis, technology selection, resource quota management, cluster isolation, service discovery, data reliability, monitoring, automated deployment, and future improvements.

Docker · Elasticsearch · Marathon
0 likes · 17 min read
21CTO
Mar 16, 2016 · Backend Development

How Badoo Saved $1M by Migrating Hundreds of Servers to PHP 7

Badoo migrated its massive PHP codebase to PHP 7 across hundreds of servers, overcoming engine bugs, HHVM limitations, and extension incompatibilities, while revamping testing infrastructure and deployment processes, ultimately achieving up to 40% faster response times, eight‑fold memory reduction, and roughly one million dollars in cost savings.

Backend · Ops · migration
0 likes · 22 min read
dbaplus Community
Jan 25, 2016 · Operations

Mastering Application Performance Diagnosis: Layered & Segment Approaches

This article outlines a comprehensive performance testing workflow, introduces layered and segment diagnostic methods, presents a detailed Apache/Tomcat/Linux/Oracle case study with LoadRunner and Nmon, and discusses monitoring metrics, analysis results, and practical recommendations for optimizing system performance.

Ops · application monitoring · diagnostics
0 likes · 14 min read
Efficient Ops
Dec 20, 2015 · Operations

What Makes a Truly Effective Ops Engineer and Architect?

This article outlines the essential skills, mindset, and learning ability required for a qualified operations engineer and details the four key competencies—communication, emergency response, continuous reflection, and strong learning—that define an outstanding ops architect.

Engineering · ITIL · Ops
0 likes · 9 min read
MaGe Linux Operations
Jun 16, 2015 · Operations

Inside Dianping’s Ops: Building Scalable Monitoring, Automation, and Self‑Service Platforms

This article details how Dianping’s sub‑40‑person operations team structures its groups, designs a dual‑datacenter architecture, and creates comprehensive monitoring, automation, configuration, and analysis systems—including Zabbix, Cat, workflow, Button, and a custom radar platform—to achieve high‑availability, self‑service, and continuous improvement.

Automation · DevOps · Infrastructure
0 likes · 18 min read