Mastering MySQL Architecture: Standards, HA, Sharding, and Redis Integration
This article outlines comprehensive MySQL best practices, covering development and operational standards, high‑availability architecture choices such as Keepalived, MHA and Percona XtraDB Cluster, sharding strategies (vertical and horizontal), and how Redis can be leveraged to offload read pressure.
1. MySQL Development Standards
Good development standards are the foundation of reliable database operations. They help reduce the probability of bugs, ensure reasonable schema design, and facilitate later automation.
Limit the number of columns per table to 20‑50.
Keep tables made up of integer columns under about 15 million rows; avoid large CHAR columns.
Store IPv4 addresses as INT UNSIGNED (converting with INET_ATON() and INET_NTOA()) and dates as numeric timestamps when possible.
Define all columns as NOT NULL unless a NULL value is truly required.
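The storage advice above can be illustrated on the application side. The following is a minimal sketch, assuming the application converts values before writing them; the function names are hypothetical and not part of any MySQL driver.

```python
import ipaddress
from datetime import datetime, timezone


def ipv4_to_uint(ip: str) -> int:
    """Convert dotted-quad IPv4 text to the 32-bit integer that INET_ATON() yields."""
    return int(ipaddress.IPv4Address(ip))


def uint_to_ipv4(value: int) -> str:
    """Inverse conversion, equivalent to INET_NTOA()."""
    return str(ipaddress.IPv4Address(value))


def to_epoch_seconds(dt: datetime) -> int:
    """Convert a timezone-aware datetime to a Unix timestamp in whole seconds."""
    if dt.tzinfo is None:
        raise ValueError("expected a timezone-aware datetime")
    return int(dt.timestamp())


packed = ipv4_to_uint("192.168.0.1")      # fits in an INT UNSIGNED column
restored = uint_to_ipv4(packed)
one_day = to_epoch_seconds(datetime(1970, 1, 2, tzinfo=timezone.utc))
```

A 4-byte integer is smaller than the 15 characters a dotted-quad string can need, and range queries on it can use an ordinary index.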
Index design guidelines
Every table must have an explicit primary key (InnoDB stores rows sorted by the PK).
Prefer short, auto‑incrementing columns for indexes.
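Row-based replication (binlog_format=ROW) works best when every table has a primary key; without one, a replica may have to scan the table to locate each changed row.
Avoid TINYINT as a primary key: its range tops out at 255 even when unsigned, so it is exhausted quickly, and the author reports that it can cause crashes.
Prefer uuid_short() over uuid() and store the result as BIGINT UNSIGNED, since uuid_short() returns a 64-bit unsigned integer.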
When creating indexes, choose high‑cardinality columns, keep the number of indexed columns ≤ 5, and limit total indexes per table to ≤ 5. Aim for indexes that cover about 80 % of the most common queries.
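One rough way to compare index candidates is to estimate selectivity (distinct values divided by total rows) on a sample. The sketch below assumes the sample has already been fetched as a list of dictionaries; the column names and helper functions are illustrative only.

```python
def selectivity(rows, column):
    """Return distinct-value count divided by row count for one column (0.0 to 1.0)."""
    values = [row[column] for row in rows]
    if not values:
        return 0.0
    return len(set(values)) / len(values)


def rank_index_candidates(rows, columns):
    """Order columns from most to least selective on the given sample."""
    return sorted(columns, key=lambda c: selectivity(rows, c), reverse=True)


# Hypothetical sample of an orders table.
sample = [
    {"order_id": 1, "user_id": 10, "status": "paid"},
    {"order_id": 2, "user_id": 11, "status": "paid"},
    {"order_id": 3, "user_id": 10, "status": "new"},
    {"order_id": 4, "user_id": 12, "status": "paid"},
]
ranking = rank_index_candidates(sample, ["status", "user_id", "order_id"])
```

A small sample can mislead, so any ranking like this should still be confirmed against the real table and the query plans shown by EXPLAIN.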
SQL writing guidelines
Avoid heavy calculations inside the database; split large transactions into smaller batches.
Use cache for frequently accessed dictionary tables.
Limit joins to two tables when possible and let the smaller table drive the join.
Prefer high‑selectivity columns in WHERE clauses and verify plans with EXPLAIN.
When subqueries appear, check MySQL version compatibility and execution plan.
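Splitting a large transaction into smaller batches usually means walking the primary key in fixed-size ranges and committing after each one. This is a minimal sketch of the range generation only, assuming an integer primary key; the function name and the SQL in the comment are illustrative.

```python
def id_batches(min_id, max_id, batch_size):
    """Yield inclusive (start, end) primary-key ranges that cover [min_id, max_id]."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    start = min_id
    while start <= max_id:
        end = min(start + batch_size - 1, max_id)
        yield start, end
        start = end + 1


# Each range would drive one short transaction, for example (hypothetical table):
#   UPDATE orders SET archived = 1 WHERE id BETWEEN %s AND %s
# executed with (start, end) as parameters, followed by a commit.
ranges = list(id_batches(1, 10, 4))
```

Short batches hold locks briefly and produce small binlog events, which also keeps replicas from falling far behind.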
2. Operational Standards
SQL review
Manual SQL review is slow and error-prone. The author integrates the open-source tool Inception (originally from Qunar) and customizes it for internal needs. Related tools include pt-online-schema-change (pt-osc) for online DDL and various commercial solutions.
When performing online DDL, be aware of the differences between MySQL's native online DDL and Percona's pt-osc. pt-osc builds a full copy of the table for every change and keeps it in sync with triggers, which is slower but avoids long blocking; native online DDL can skip the copy for some operations but still takes metadata locks (MDL) and can stall behind long-running transactions.
Permission control
Since MySQL 5.6, the privilege system has been steadily enhanced (password-strength plugins, password expiration, account locking, SSL improvements), and MySQL 8.0 adds role support. Use tools like pt-show-grants to audit privileges. Application accounts should hold only SELECT, INSERT, and UPDATE; implement deletes as soft deletes via UPDATE, and enable sql_safe_updates so that UPDATE and DELETE statements without a key-based WHERE clause or a LIMIT are rejected.
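An audit of application accounts can be partly automated by scanning GRANT statements (for example, output collected from pt-show-grants) for privileges beyond the allowed set. The sketch below is a simplified assumption-laden parser: it handles only plain privilege lists, not column-level grants, and the function name and sample statements are hypothetical.

```python
import re

ALLOWED_APP_PRIVILEGES = frozenset({"SELECT", "INSERT", "UPDATE"})
_GRANT_RE = re.compile(r"^GRANT\s+(.+?)\s+ON\s+", re.IGNORECASE)


def excess_privileges(grant_statement, allowed=ALLOWED_APP_PRIVILEGES):
    """Return the privileges in a GRANT statement that are not in the allowed set."""
    match = _GRANT_RE.match(grant_statement.strip())
    if match is None:
        raise ValueError("not a GRANT statement: %r" % grant_statement)
    granted = {p.strip().upper() for p in match.group(1).split(",")}
    granted.discard("USAGE")  # USAGE means "no privileges"
    return granted - set(allowed)


ok = excess_privileges("GRANT SELECT, INSERT, UPDATE ON `shop`.* TO 'app'@'10.%'")
bad = excess_privileges("GRANT SELECT, UPDATE, DELETE, DROP ON `shop`.* TO 'app'@'10.%'")
```

A non-empty result flags an account for manual review; it does not replace checking the actual grants on the server.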
SQL firewalls (e.g., a customized MyWebSQL‑based gateway or a Python‑written firewall supporting MySQL, Oracle, Greenplum) provide fine‑grained access control, audit, and syntax checking.
MySQL version selection
Community Edition – largest user base.
Enterprise Edition – commercial support.
Percona Server – many new features, closest to community edition.
MariaDB – less popular in China.
Recommended priority: Community > Percona > MariaDB > Enterprise. Stable releases such as MySQL 5.6 are safe defaults; consider 5.7, PXC, TiDB, TokuDB for special needs.
3. High‑Availability Architecture Options
The industry still relies heavily on asynchronous replication combined with failover tooling such as Keepalived, MHA, and ZooKeeper. For strong consistency, solutions built on distributed replication protocols, such as Percona XtraDB Cluster (PXC), Group Replication, and TiDB, are gaining traction.
Keepalived
Simple to deploy and maintain, it saves server resources. After a failover, the former master can be restarted quickly. However, it has detection gaps, split‑brain risks, and weaker data consistency.
MHA
During failover, MHA automatically recovers missing binlog events and applies them to the new master, which gives higher data consistency. It works well in read-write split scenarios and is scriptable in Perl. It requires SSH trust between nodes and may be paired with a Binlog Server.
Percona XtraDB Cluster (PXC)
PXC implements the Galera protocol, sacrificing partition tolerance (P) while keeping consistency (C) and availability (A). It offers synchronous replication, multi‑master writes, parallel replication, and near‑zero downtime.
Solves replication lag and split‑brain issues.
Provides strong consistency.
Supports multi‑master writes.
Parallel replication improves throughput.
High availability – single node failure does not affect the cluster.
Automatic node provisioning.
Almost fully compatible with vanilla MySQL.
Considerations: avoid large transactions, performance limited by the slowest node, higher network requirements (10 GbE recommended), possible lock contention and deadlocks under heavy concurrent writes.
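Galera-based clusters avoid split-brain by letting only the partition that holds a majority of nodes keep accepting writes. The check below is a minimal sketch of that majority rule, assuming every node carries equal weight (real clusters can assign weights); the function name is hypothetical.

```python
def has_quorum(reachable_nodes, cluster_size):
    """Return True when this partition holds a strict majority of the cluster."""
    if cluster_size <= 0:
        raise ValueError("cluster_size must be positive")
    if not 0 <= reachable_nodes <= cluster_size:
        raise ValueError("reachable_nodes must be between 0 and cluster_size")
    # reachable_nodes counts the local node plus every peer it can still see.
    return 2 * reachable_nodes > cluster_size


three_node_after_one_failure = has_quorum(2, 3)
two_node_after_split = has_quorum(1, 2)
```

This is why an odd number of nodes, with three as the usual minimum, is the common recommendation: a two-node cluster that loses its link leaves neither side with a majority.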
Other HA solutions based on DNS or ZooKeeper exist but usually require custom development and are suited for very large clusters.
4. MySQL Sharding Strategies
When data volume and traffic grow, splitting databases becomes inevitable. Two main approaches are vertical and horizontal sharding.
Vertical sharding separates unrelated modules (e.g., activity logs vs. core business tables) into different databases. Advantages: simple rules, clear module boundaries, easier maintenance. Disadvantages: cross‑database joins must be handled in application code, transaction complexity increases, potential hotspot tables remain.
Horizontal sharding distributes rows of a large table across multiple physical tables or databases. Advantages: joins and transactions that stay within one shard are unaffected, massive tables and high load can be handled, and application changes are modest. Disadvantages: aggregated queries become harder, sharding rules are complex, and migration is more involved.
Decision between splitting by database first or table first depends on use‑case: database splitting is easier but cannot solve per‑table size limits; table splitting solves size issues but adds complexity.
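For horizontal sharding done in application code, the core piece is a stable function from sharding key to shard. This is a minimal sketch assuming modulo routing for integer keys and a CRC32 hash for other keys; the table naming scheme and function names are hypothetical.

```python
import zlib


def shard_for(key, shard_count):
    """Map a sharding key to a shard index in [0, shard_count)."""
    if shard_count <= 0:
        raise ValueError("shard_count must be positive")
    if isinstance(key, int):
        return key % shard_count
    # CRC32 is stable across processes, unlike Python's built-in hash() for str.
    return zlib.crc32(str(key).encode("utf-8")) % shard_count


def table_for(base_table, key, shard_count):
    """Build the physical table name, e.g. orders_01, for a given key."""
    return "%s_%02d" % (base_table, shard_for(key, shard_count))


physical_table = table_for("orders", 1001, 4)
```

Plain modulo routing is simple, but changing the shard count remaps most keys, which is one reason migration is listed as a drawback of horizontal sharding.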
Implementation can be done directly in application code at small scale, or via middleware for larger deployments. MySQL middleware options include Atlas, DBProxy, MyCAT, OneProxy, DRDS, and Vitess; not all of them are open source (DRDS, for example, is a commercial Alibaba Cloud service). Evaluate maturity and test thoroughly before adoption.
5. Using NoSQL (Redis) to Relieve MySQL
Redis is widely used to offload read pressure from MySQL. It stores data in memory for ultra‑fast access, supports bulk operations, and offers rich data structures.
Cache frequently accessed data; on a miss, read from MySQL and write back to Redis.
Use key/value pairs for user profiles, global rankings, statistics, etc.
Leverage hashes, sorted sets, and lists to implement counters, leaderboards, or lightweight message queues.
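The read path in the first item above is often called cache-aside. The sketch below shows the control flow only: a plain dict stands in for a Redis client and a function stands in for a MySQL query, so every name here is an assumption rather than a real client API.

```python
class CacheAside:
    """Read-through helper: check the cache first, fall back to the database on a miss."""

    def __init__(self, cache, load_from_db):
        self._cache = cache                # dict-like stand-in for a Redis client
        self._load_from_db = load_from_db  # callable standing in for a MySQL query

    def get(self, key):
        if key in self._cache:
            return self._cache[key]        # cache hit: MySQL is not touched
        value = self._load_from_db(key)    # cache miss: read from the database
        if value is not None:
            self._cache[key] = value       # write back so the next read is a hit
        return value


db_calls = []


def fake_user_query(user_id):
    """Hypothetical database lookup that records how often it is called."""
    db_calls.append(user_id)
    return {1: {"name": "alice"}}.get(user_id)


users = CacheAside({}, fake_user_query)
first = users.get(1)
second = users.get(1)
```

With a real Redis client the write-back would normally also set an expiry, and writes to MySQL would need to invalidate or refresh the cached entry.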
Important cautions:
Do not mix cache and persistent storage responsibilities; Redis persistence is not a replacement for a durable database.
Avoid using Redis as the sole storage for large datasets; memory exhaustion leads to service stalls.
Watch for cache‑penetration, cache‑avalanche, and hot‑key rebuild issues.
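Two common mitigations for the issues named above are caching "not found" results briefly (against cache penetration) and adding random jitter to expiry times (against cache avalanche). The following is an in-memory sketch of both ideas, not a Redis client; the class, its parameters, and the default durations are all hypothetical, and hot-key rebuild protection (for example a per-key lock) is not covered.

```python
import random
import time

_MISSING = object()  # sentinel marking a cached "row does not exist"


class TTLCache:
    def __init__(self, base_ttl=300.0, jitter=60.0, negative_ttl=30.0,
                 clock=time.monotonic, rng=random.random):
        self._store = {}  # key -> (value, expires_at)
        self._base_ttl = base_ttl
        self._jitter = jitter
        self._negative_ttl = negative_ttl
        self._clock = clock
        self._rng = rng

    def get_or_load(self, key, loader):
        now = self._clock()
        entry = self._store.get(key)
        if entry is not None and entry[1] > now:
            return None if entry[0] is _MISSING else entry[0]
        value = loader(key)
        if value is None:
            # Penetration: remember the miss briefly so repeated lookups
            # of a nonexistent key do not all reach the database.
            self._store[key] = (_MISSING, now + self._negative_ttl)
        else:
            # Avalanche: spread expiry times so keys loaded together
            # do not all expire at the same moment.
            ttl = self._base_ttl + self._rng() * self._jitter
            self._store[key] = (value, now + ttl)
        return value


loader_calls = []


def fake_mysql_lookup(key):
    """Hypothetical lookup for a row that does not exist."""
    loader_calls.append(key)
    return None


cache = TTLCache()
first_miss = cache.get_or_load("user:999", fake_mysql_lookup)
second_miss = cache.get_or_load("user:999", fake_mysql_lookup)
```

The negative TTL is kept short on purpose, so a row created later becomes visible without an explicit invalidation.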
Conclusion
Designing a large‑scale MySQL architecture is an iterative process that balances performance, reliability, and operational cost. By adhering to solid development and operational standards, selecting an appropriate HA solution, applying thoughtful sharding, and optionally integrating Redis for caching, teams can build robust systems that meet business demands while keeping maintenance manageable.
dbaplus Community
Enterprise-level professional community for Database, BigData, and AIOps. Daily original articles, weekly online tech talks, monthly offline salons, and quarterly XCOPS&DAMS conferences—delivered by industry experts.
