Why MyCat’s Pseudo‑Distributed MySQL Solution Fails and What to Do Instead
The article examines MyCat’s middleware‑based pseudo‑distributed MySQL architecture, outlines its storage, scalability, and reliability shortcomings, walks through common solutions like disk expansion, compression, and sharding, and finally offers practical steps and alternative technologies for building truly distributed database systems.
Background
Distributed databases are rapidly evolving to meet the ever‑growing data volume and transaction load of modern internet services. Traditional MySQL, being a single‑node database, inevitably hits storage bottlenecks when data scales beyond a few terabytes.
Three Basic Ways to Address Storage Limits
Increase Disk Capacity – Growing a node's storage (e.g., from 800 GB to 2 TB or 5 TB) is the simplest fix, but very large instances take longer to back up and restore, and they add to the DBA workload.
Data Compression – InnoDB’s native compression can reduce storage to one‑third or half of the original size, at the cost of some performance degradation, especially for latency‑sensitive workloads.
Data Sharding – Splitting data across multiple MySQL instances (or other stores like HBase, Redis) provides the most scalable solution, though it introduces complexity in routing, metadata management, and consistency.
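The routing complexity mentioned for sharding can be made concrete with a minimal sketch. It assumes hash-based routing on a string sharding key; the shard names and the helpers `shard_for` and `group_by_shard` are hypothetical and do not come from MyCat or any other product.

```python
import zlib

# Hypothetical shard names; a real deployment would hold connection details here.
SHARDS = ["mysql-shard-0", "mysql-shard-1", "mysql-shard-2", "mysql-shard-3"]


def shard_for(key: str, shards: list[str] = SHARDS) -> str:
    """Pick a shard using a stable hash of the sharding key.

    zlib.crc32 is used instead of the built-in hash() because hash() is
    randomized per process for strings, which would break routing.
    """
    return shards[zlib.crc32(key.encode("utf-8")) % len(shards)]


def group_by_shard(keys: list[str]) -> dict[str, list[str]]:
    """Group keys by target shard, as a router would before fanning out a query."""
    plan: dict[str, list[str]] = {}
    for key in keys:
        plan.setdefault(shard_for(key), []).append(key)
    return plan


if __name__ == "__main__":
    for shard, keys in group_by_shard(["user:1", "user:2", "user:3"]).items():
        print(shard, keys)
```

Even in this toy form, every query must first be mapped to one or more shards, and any query that touches several keys may need to be split and its results merged. That is the work a middleware layer takes on.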
Requirements for a Distributed Solution
Scalability – Ability to add nodes without affecting existing services.
Transactional Support – Transactions that span multiple shards must keep the same guarantees as single‑node transactions.
Full SQL Compatibility – Applications should continue to use standard MySQL statements.
Performance – Overhead of distribution should be minimal.
Metadata Change Transparency – Schema changes must propagate safely across shards.
Underlying DB High Availability – The base MySQL cluster must guarantee consistency and failover.
Popular Middleware: MyCat
MyCat is a widely discussed MySQL middleware that claims to provide automatic sharding, aggregation, and load balancing. (The architecture diagram from the original article is not reproduced here.)
Despite the appealing design, several critical issues arise:
Routing Logic – MyCat relies on a static schema.xml file to map tables to shards, making dynamic routing and schema evolution cumbersome.
Rebalancing – Adding nodes typically requires manual data export/import and a reload of the configuration, a process that can overwhelm DBAs.
Global Tables – Implemented by creating identical tables on every shard, which raises consistency and performance concerns as shard count grows.
Distributed Transactions – MyCat depends on MySQL XA, which is rarely used in production due to performance and reliability issues.
Failover – Each MyCat node decides on backend failover from its own view of backend health, so separate MyCat nodes can disagree about which backend is the master, risking split‑brain scenarios.
Backup & Recovery – Each shard must be backed up individually; restoring a consistent snapshot across all shards is non‑trivial.
Configuration Complexity – The XML configuration is verbose and error‑prone, discouraging adoption.
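To see why the rebalancing item above is so painful, consider a small worked example. It assumes plain modulo routing on an integer key (`key % node_count`), which is only one possible sharding rule and not necessarily the one a given MyCat deployment uses. The function `moved_fraction` is a hypothetical helper for illustration.

```python
def moved_fraction(num_keys: int, old_nodes: int, new_nodes: int) -> float:
    """Fraction of integer keys whose shard changes under key % node_count routing."""
    moved = sum(
        1 for key in range(num_keys) if key % old_nodes != key % new_nodes
    )
    return moved / num_keys


if __name__ == "__main__":
    # Growing from 4 to 5 shards relocates 80% of the keys.
    print(moved_fraction(20_000, 4, 5))  # 0.8
    # Doubling from 4 to 8 shards relocates 50% of the keys.
    print(moved_fraction(20_000, 4, 8))  # 0.5
```

Under this assumption, adding a single node forces most rows to be exported and re-imported, which matches the manual effort described above. Doubling the shard count is cheaper but still moves half of the data.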
Practical Migration Steps if MyCat Becomes Too Risky
Stop all write traffic.
Export all databases using logical tools such as mysqldump to generate .sql files.
Choose a robust MySQL architecture (e.g., a true distributed database or a shared‑nothing cluster) and import the dumps.
Migrate read traffic to the new system.
Finally, migrate write traffic and bring the new cluster online.
This process can take days for large datasets due to the inherent slowness of logical backups.
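The export step can be scripted. The sketch below only builds the `mysqldump` command lines and prints them rather than running them. It assumes dumps are taken directly from each backend MySQL instance and that credentials come from an option file such as `~/.my.cnf`. The host names, user, database names, and the helper `build_dump_command` are hypothetical.

```python
import shlex


def build_dump_command(
    host: str, user: str, databases: list[str], out_file: str
) -> list[str]:
    """Build a mysqldump argv list for one backend MySQL instance."""
    return [
        "mysqldump",
        f"--host={host}",
        f"--user={user}",
        "--single-transaction",  # consistent snapshot of InnoDB tables without locking them
        "--routines",            # include stored procedures and functions
        "--triggers",            # include triggers (on by default, stated for clarity)
        f"--result-file={out_file}",
        "--databases",
        *databases,
    ]


if __name__ == "__main__":
    # Hypothetical backends; write traffic is assumed to be stopped already.
    for index, host in enumerate(["db-shard-0.internal", "db-shard-1.internal"]):
        cmd = build_dump_command(host, "backup", ["orders"], f"shard{index}.sql")
        print(shlex.join(cmd))
```

Because write traffic is already stopped in the first step, the per-shard dumps line up with one another; `--single-transaction` additionally keeps each individual dump internally consistent for InnoDB tables.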
Alternative Distributed Database Solutions
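For truly distributed systems, consider MySQL‑compatible products such as TiDB or SequoiaDB, which provide native sharding, strong consistency, and distributed transactions with minimal application changes. Google Spanner and F1 are also truly distributed, but they are not drop‑in MySQL replacements: Spanner exposes its own SQL dialects, and F1 is an internal Google system.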
Conclusion
MyCat remains popular because it is open source and free, but its pseudo‑distributed nature introduces many operational pitfalls. Organizations should evaluate whether a genuine distributed database better fits their scalability, reliability, and performance requirements.
dbaplus Community
Enterprise-level professional community for Database, BigData, and AIOps. Daily original articles, weekly online tech talks, monthly offline salons, and quarterly XCOPS&DAMS conferences—delivered by industry experts.
