How to Build a Scalable Automated MySQL Operations Platform
This article explains how to standardize and automate MySQL operations—including multi‑instance deployment, metadata collection, monitoring, backup, and high‑availability using Zookeeper—so that large‑scale database services can be provisioned, managed, and scaled with minimal human intervention.
Introduction
In the Internet era, databases must support agile development and delivery. Traditional, DBA‑driven manual processes cannot keep pace with the volume of business demands.
Database Usage Patterns
Two main patterns are described: mixed‑deployment (shared instance) and dedicated‑instance per application. The shared approach is quick but leads to interference, lack of performance metrics, difficult troubleshooting, and inefficient resource use, often requiring costly manual sharding.
Dedicated instances solve these problems by giving each application its own MySQL instance (or sharded tables), but this requires many instances and a shift from single‑instance to multi‑instance operations.
Problems to Address
How do we quickly provision databases, allocate resources sensibly, monitor many instances at once, and provide high availability for all of them?
MySQL Standardization and Automation
Our automated MySQL ops system handles large‑scale pain points such as instance creation, deployment, monitoring, backup, HA switching, migration, and scaling with minimal human intervention.
Standardization
Application‑dedicated instances
Master‑only data flow; replicas do not serve traffic (except for disaster recovery)
Consistent MySQL version and configuration (identical my.cnf except server_id, buffer_pool_size, etc.)
Uniform directory layout
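As a sketch, the shared my.cnf convention might look like the fragment below; the paths and values are illustrative, not the article's actual configuration:

```
# /data/mysql/3306/my.cnf — identical across instances except the marked lines
[mysqld]
port      = 3306                  # per-instance: unique port
server_id = 101                   # per-instance: unique server_id
datadir   = /data/mysql/3306/data
socket    = /data/mysql/3306/mysql.sock
innodb_buffer_pool_size = 4G      # per-instance: sized to the workload
# everything below is byte-identical on every instance
character_set_server = utf8mb4
innodb_flush_log_at_trx_commit = 1
sync_binlog = 1
```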
Automation Framework (DBManage)
A comprehensive platform for large‑scale operations that automates routine database tasks end to end without requiring trust relationships (e.g., SSH keys) between machines.
Multi‑Instance Creation
Run multiple MySQL processes on different ports of a single host, sharing the binary but using separate configuration files and data directories.
Identical my.cnf (except unique parameters)
Standardized data directory
Instance creation via rsync and custom my.cnf to avoid heavy I/O from mysql_install_db
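The creation flow above can be sketched as follows. This is a minimal illustration, not the platform's actual script: the seed-datadir path, config template, and directory layout are assumptions. The key idea from the article is copying a pre-initialized data directory with rsync instead of running mysql_install_db.

```python
import subprocess
from string import Template

# Assumed conventions: one directory tree per port, plus a "seed" datadir
# that was initialized once and is rsync'ed for every new instance.
SEED_DATADIR = "/data/mysql/seed/data/"
CNF_TEMPLATE = Template("""\
[mysqld]
port      = $port
server_id = $server_id
datadir   = /data/mysql/$port/data
socket    = /data/mysql/$port/mysql.sock
innodb_buffer_pool_size = $buffer_pool
""")

def create_instance(port: int, server_id: int, buffer_pool: str = "4G") -> str:
    base = f"/data/mysql/{port}"
    subprocess.run(["mkdir", "-p", f"{base}/data"], check=True)
    # Copy the seed datadir: far lighter on I/O than mysql_install_db.
    subprocess.run(["rsync", "-a", SEED_DATADIR, f"{base}/data/"], check=True)
    cnf = f"{base}/my.cnf"
    with open(cnf, "w") as f:
        f.write(CNF_TEMPLATE.substitute(port=port, server_id=server_id,
                                        buffer_pool=buffer_pool))
    # Start the instance against its own config file.
    subprocess.run(["mysqld_safe", f"--defaults-file={cnf}", "&"], check=False)
    return cnf
```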
Metadata and Monitoring
A custom agent, db_agent, runs on each server. It automatically discovers instances and collects metadata, performance metrics (QPS, TPS, RT), heartbeat, slow SQL, and host metrics, eliminating manual entry and external tools such as Zabbix.
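Two of db_agent's jobs can be sketched in a few lines, under assumptions the article doesn't spell out: instances are discovered by scanning mysqld command lines for the standardized config path, and QPS/TPS are derived from deltas of `SHOW GLOBAL STATUS` counters between samples.

```python
import re

def discover_ports(ps_output: str) -> list[int]:
    # Find per-instance ports from mysqld command lines; the
    # /data/mysql/<port>/my.cnf layout is the assumed standard.
    return sorted(int(p) for p in
                  re.findall(r"--defaults-file=/data/mysql/(\d+)/my\.cnf",
                             ps_output))

def qps_tps(prev: dict, curr: dict, interval: float) -> tuple[float, float]:
    # QPS from the Questions counter; TPS approximated here as write
    # statements per second (exact counter choice is an assumption).
    qps = (curr["Questions"] - prev["Questions"]) / interval
    writes = sum(curr[k] - prev[k]
                 for k in ("Com_insert", "Com_update", "Com_delete"))
    return qps, writes / interval
```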
Backup
Backup scripts deployed on every database machine back up the replicas, throttle concurrency across co‑located instances, and write results directly to HDFS or a backup server.
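The concurrency-control idea can be sketched with a bounded thread pool. The backup tool (xtrabackup here) and paths are assumptions; the article only states that multi-instance concurrency is controlled and results go to HDFS or a server.

```python
from concurrent.futures import ThreadPoolExecutor
import subprocess

MAX_PARALLEL = 2  # cap concurrent backups so one host's I/O isn't saturated

def backup_cmd(port: int, dest: str) -> list[str]:
    # Back up the replica using its own per-instance config file.
    return ["xtrabackup", "--backup",
            f"--defaults-file=/data/mysql/{port}/my.cnf",
            f"--target-dir={dest}/{port}"]

def run_backups(ports: list[int], dest: str) -> None:
    # The pool size bounds how many instances are backed up at once.
    with ThreadPoolExecutor(max_workers=MAX_PARALLEL) as pool:
        for port in ports:
            pool.submit(subprocess.run, backup_cmd(port, dest), check=True)
    # After the local backup, results would be streamed to HDFS
    # (e.g. via the `hdfs` PyPI client listed in the references).
```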
Remote Execution Agent
Allows remote creation, recovery, and other operations by invoking scripts on the target DB machine.
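A remote-execution agent of this kind can be sketched as a small JSON-over-socket service (in the spirit of the python-json-socket reference). The protocol, port, and script whitelist below are assumptions; restricting execution to known scripts is what lets the platform avoid inter-machine trust relationships.

```python
import json
import socket
import subprocess

# Only whitelisted scripts may be invoked remotely (names are illustrative).
ALLOWED = {
    "create_instance": "/opt/dbmanage/create_instance.sh",
    "restore_backup":  "/opt/dbmanage/restore_backup.sh",
}

def handle_request(raw: bytes) -> dict:
    req = json.loads(raw)
    script = ALLOWED.get(req.get("action"))
    if script is None:
        return {"ok": False, "error": "unknown action"}
    proc = subprocess.run([script] + req.get("args", []),
                          capture_output=True, text=True)
    return {"ok": proc.returncode == 0, "output": proc.stdout}

def serve(port: int = 9527) -> None:
    # One request per connection; reply with a JSON result.
    srv = socket.socket()
    srv.bind(("0.0.0.0", port))
    srv.listen()
    while True:
        conn, _ = srv.accept()
        with conn:
            conn.sendall(json.dumps(handle_request(conn.recv(65536))).encode())
```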
Monitoring and Alerting
Metrics collected by db_agent are visualized in Graphite and trigger alerts when thresholds are crossed.
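Feeding Graphite needs nothing more than its plaintext protocol (one `<path> <value> <timestamp>` line per metric over TCP, conventionally port 2003). The metric naming scheme and host below are assumptions.

```python
import socket
import time

def graphite_line(metric: str, value, ts=None) -> str:
    # Graphite's plaintext protocol: "<path> <value> <timestamp>\n"
    return f"{metric} {value} {ts or int(time.time())}\n"

def push(lines: list[str], host: str = "graphite.local", port: int = 2003) -> None:
    # Batch all lines into one TCP write.
    with socket.create_connection((host, port), timeout=5) as s:
        s.sendall("".join(lines).encode())

# e.g. push([graphite_line("mysql.192_168_1_12.3306.qps", 1250)])
```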
MySQL High Availability
Instead of traditional MHA with VIP, we use Zookeeper‑based HA with VDDS, removing the VIP and MHA components. Instances register ephemeral nodes in Zookeeper; ha_console builds switch metadata and performs failover based on replication delay and status.
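The registration step can be sketched with kazoo (the Zookeeper client listed in the references). The znode paths and payload are assumptions; the essential point is that an *ephemeral* node vanishes when the agent's session dies, which is what lets ha_console detect a failed instance.

```python
import json
import socket

def instance_path(ip: str, port: int) -> str:
    # Znode naming mirrors the "ip_port" instance keys seen in the metadata.
    return f"/mysql_ha/instances/{ip}_{port}"

def register(zk_hosts: str, ip: str, port: int):
    # kazoo is imported lazily so this sketch loads without it installed.
    from kazoo.client import KazooClient  # python-zk/kazoo
    zk = KazooClient(hosts=zk_hosts)
    zk.start()
    payload = json.dumps({"host": ip, "port": port,
                          "agent": socket.gethostname()}).encode()
    # Ephemeral: removed automatically when the agent's ZK session expires.
    zk.create(instance_path(ip, port), payload, ephemeral=True, makepath=True)
    return zk
```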
HA workflow includes VDDS configuration, ha_agent registration, ha_console metadata construction, node monitoring, and automatic switch with logging.
```json
{
  "source_db_role": "slave",
  "master_instance": "192.168.1.12_3306",
  "repl_status": "ok",
  "h_time_delay": 0,
  "repl_delay": 0,
  "last_change_time": "2016-10-15-01:00:45"
}
```
DBTask Automation
DBTask replaces manual steps, breaking complex actions (e.g., database migration) into discrete scripts that are chained together.
Database migration steps:
1. Request resources
2. Create the new instance
3. Restore replicas from backup
4. Configure VDDS
5. Run pre‑switch checks
6. Switch traffic
7. Clean up VDDS
8. Shut down the old instance
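The chaining idea is simple enough to sketch: a migration is a list of small steps run in order, stopping on the first failure. The script names and paths below are illustrative, not DBTask's actual scripts.

```python
import subprocess

# One discrete script per step; order matters, and a failure aborts the chain.
MIGRATION_STEPS = [
    "request_resources.sh",
    "create_instance.sh",
    "restore_replica.sh",
    "configure_vdds.sh",
    "preswitch_check.sh",
    "switch.sh",
    "clean_vdds.sh",
    "shutdown_old_instance.sh",
]

def run_chain(steps: list[str], dry_run: bool = False) -> list[str]:
    done = []
    for step in steps:
        if not dry_run:
            # check=True raises on a non-zero exit, halting the chain.
            subprocess.run([f"/opt/dbtask/{step}"], check=True)
        done.append(step)
    return done
```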
Results and Outlook
The full automation stack uses Python, Shell, and Go (for slow‑SQL parsing) on the backend and Laravel + AngularJS on the frontend. A single host runs over 100 instances; provisioning takes less than a minute; automated sharding is in progress, moving the database toward a private‑cloud platform.
References
Python socket communication: https://github.com/chris-piekarski/python-json-socket
Python HDFS client: https://pypi.python.org/pypi/hdfs/
RT measurement tool: https://github.com/Lowercases/tcprstat
Python Zookeeper client: https://github.com/python-zk/kazoo
Go TiDB SQL parser: https://github.com/pingcap/tidb
Weidian Tech Team
The Weidian Technology Platform is an open hub for consolidating technical knowledge. Guided by a spirit of sharing, we publish diverse tech insights and experiences to grow and look ahead together.