
How a Chinese Telecom Built an Automated MySQL Management Platform

This article outlines the evolution from manual MySQL DBA tasks to a fully automated, platform‑based solution at China Mobile’s "Mobile Cloud", detailing standardization, tooling, Ansible‑driven deployment, platform architecture, and key features such as backup, inspection, user management, and SQL review.

dbaplus Community

China Mobile’s "Mobile Cloud" serves roughly 150,000 customers with about ten cloud products. The number of database instances has doubled in two years while DBA headcount has stayed flat. To cope, the team moved from manual operations to standardized, tool‑based operations, and eventually to a fully automated platform called the EcloudDB Database Management Platform.

1. Standardization

Standardization was the foundation, covering OS‑level settings (kernel, RAID, NTP, SELinux, NUMA, etc.) and MySQL configurations (directory layout, user permissions, my.cnf parameters, account policies, logging, thread pools, backup tools, cron jobs). A unified directory structure enables scripts to perform batch operations across instances.
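A minimal sketch of why a unified layout helps batch scripting: if every instance follows the same convention, all of its paths can be derived from a single key. The base directory, the per-port subdirectories, and the function name below are assumptions for illustration; the article does not describe the actual EcloudDB layout.

```python
from pathlib import PurePosixPath

# Hypothetical convention: every instance lives under BASE/<port>/...
# The real directory layout is not given in the article.
BASE = PurePosixPath("/data/mysql")


def instance_paths(port: int) -> dict[str, PurePosixPath]:
    """Derive the well-known paths of an instance from its port alone."""
    root = BASE / str(port)
    return {
        "datadir": root / "data",
        "binlog": root / "binlog",
        "tmpdir": root / "tmp",
        "config": root / "my.cnf",
        "socket": root / "mysql.sock",
        "error_log": root / "log" / "error.log",
    }


if __name__ == "__main__":
    # A batch script can loop over ports without per-instance special cases.
    for port in (3306, 3307):
        print(port, instance_paths(port)["config"])
```

Because the mapping is a pure function of the port, the same loop works for backups, log collection, or configuration checks.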

2. Tooling and Scripting

Early automation leveraged Python and web technologies, then incorporated open‑source tools:

Zabbix for comprehensive monitoring and alerting.

ELK stack for log collection, analysis, and fault tracing.

Ansible for batch deployment and configuration.

Percona Toolkit and XtraBackup for log parsing, online schema changes, and physical backups.

Custom scripts were also created for periodic inspection, backup validation, data consistency checks, and standardization verification.

3. Automated Deployment Process

The deployment workflow uses Ansible roles and playbooks:

Allocate physical or virtual machines based on the deployment plan.

Install Ansible on the control host.

Perform OS initialization (OS version, filesystem, RAID, NTP, SELinux, etc.).

Upload the Ansible role package and the standardized bcrdb package to the appropriate directories.

Prepare an inventory file with cluster node addresses and business‑specific parameters (max_connections, innodb_buffer_pool_size, wait_timeout).

Execute the playbook, e.g., ansible-playbook -i host playbooks/bcrdb.yml, which pushes the database, runs standardization checks, and registers the instance.
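The last two steps above can be sketched as a small helper that renders an INI-style Ansible inventory and builds the playbook command. The group name, host addresses, and parameter values are hypothetical; only the three parameter names and the playbook path come from the article.

```python
import shlex


def render_inventory(group: str, hosts: list[str], params: dict[str, object]) -> str:
    """Render an INI-style Ansible inventory with group-level variables."""
    lines = [f"[{group}]", *hosts, "", f"[{group}:vars]"]
    lines += [f"{name}={value}" for name, value in params.items()]
    return "\n".join(lines) + "\n"


def playbook_command(inventory_path: str, playbook: str = "playbooks/bcrdb.yml") -> list[str]:
    """Build the argument list for ansible-playbook (suitable for subprocess.run)."""
    return ["ansible-playbook", "-i", inventory_path, playbook]


if __name__ == "__main__":
    # Hypothetical cluster nodes and business-specific values.
    inventory = render_inventory(
        "bcrdb",
        ["10.0.0.11", "10.0.0.12", "10.0.0.13"],
        {
            "max_connections": 2000,
            "innodb_buffer_pool_size": "8G",
            "wait_timeout": 3600,
        },
    )
    print(inventory)
    print(shlex.join(playbook_command("host")))
```

Generating the inventory from structured input keeps the business-specific parameters in one place, so the same playbook can serve every cluster.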

4. Platform Architecture

The platform follows a three‑layer design:

Interaction layer: Web UI and RESTful APIs for user actions.

Service layer: Django‑based backend handling business logic, asynchronous tasks via Celery, and routing to appropriate views.

Resource layer: Data services (MySQL, Redis), OS resources, and auxiliary components (logging, monitoring).

Key technologies include Vue.js for the frontend, Django for the backend, Redis for caching, MySQL for persistent storage, Django‑Celery for long‑running tasks, and Ansible for remote operations.

5. Core Platform Features

Index dashboard showing instance health, performance metrics (TPS, QPS, thread count) and backup status.

User management with role‑based permissions and a workflow for DB account requests.

Backup management supporting on‑demand snapshots, point‑in‑time recovery, and configurable retention policies.

Inspection module offering one‑click batch checks, a library of 40+ health items, and customizable checks.

SQL review using the open‑source YearningSQL engine with added custom rules.

Instance management for creating databases, tables, stored procedures, triggers, and handling permission changes.
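One way an inspection module could combine a built-in library of health items with customizable checks is a registry that new checks join through a decorator. The sketch below is an assumption about the design, not the platform's code; the check names, thresholds, and the shape of the input dictionary are hypothetical.

```python
from typing import Callable, Optional

# Registry of health checks: name -> function returning a finding or None.
CHECKS: dict[str, Callable[[dict], Optional[str]]] = {}


def check(name: str):
    """Decorator that registers a health check under the given name."""
    def register(func):
        CHECKS[name] = func
        return func
    return register


@check("connection_usage")
def connection_usage(metrics: dict) -> Optional[str]:
    # Hypothetical threshold: warn above 80% of max_connections.
    used = metrics["Threads_connected"] / metrics["max_connections"]
    return f"{used:.0%} of max_connections in use" if used > 0.8 else None


@check("replication_lag")
def replication_lag(metrics: dict) -> Optional[str]:
    # Hypothetical threshold: warn when the replica is over 60 seconds behind.
    lag = metrics.get("Seconds_Behind_Master")
    if lag is None:
        return "replication lag unknown"
    return f"replica is {lag}s behind" if lag > 60 else None


def run_inspection(metrics: dict) -> dict[str, str]:
    """Run every registered check and collect only the findings."""
    findings = {}
    for name, func in CHECKS.items():
        message = func(metrics)
        if message:
            findings[name] = message
    return findings


if __name__ == "__main__":
    # Assumed input: status and variable values already collected into one dict.
    sample = {"Threads_connected": 90, "max_connections": 100, "Seconds_Behind_Master": 0}
    print(run_inspection(sample))
```

Adding a custom item is then a matter of writing one decorated function, and a one-click batch check is a loop of run_inspection over all instances.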

6. Future Roadmap

Phase 2 will add log management, slow‑query analysis, lock‑conflict diagnostics, advanced information queries, and enhanced alerting.

7. Lessons Learned

Keep designs simple and avoid over‑engineering; adopt agile, incremental delivery; manage requirement changes carefully; and enforce rigorous testing before production rollout to ensure stability and performance.

Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contact admin@besthub.dev and we will review it promptly.
