Understanding Distributed Architecture and Its Applications in MySQL and Large‑Scale Systems

The article explains the concept of distributed architecture, its key characteristics such as cohesion and transparency, showcases how MySQL and middleware like Mycat are used in e‑commerce platforms, and outlines the evolution, practical implementations, and challenges of building scalable distributed database systems.

Architecture Digest

Apr 18, 2018

MySQL is widely used by e-commerce and internet companies because it is free and open source, and because it can be scaled horizontally when deployed as part of a distributed system. As the number of mobile-internet users has grown rapidly, companies such as Taobao, Tmall, and Vipshop have adopted distributed architectures to handle high concurrency and massive data volumes.

1. What Is Distributed Architecture

A distributed system is a software system built on top of a network, in which multiple networked computers cooperate. It has two defining characteristics:

**Cohesion**: each database node is highly autonomous and runs its own DBMS.

**Transparency**: to the user, a node appears the same whether it is local or remote.

In a distributed data system, the user does not need to know whether the data is partitioned or replicated, where it resides, or on which node a transaction executes.

In simple terms, a group of independent computers presents itself as a single unified system to the user.

The distributed system provides services as a whole while its internal collaboration remains transparent to the user, who simply interacts with it as if it were a single MySQL instance.

Middleware such as Mycat is a typical distributed‑MySQL solution that handles massive concurrency and data volume.
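The idea of transparency can be sketched in a few lines of Python. This is a minimal illustration, not how Mycat is implemented: the class name `UnifiedStore`, the in-memory dictionaries standing in for database nodes, and the CRC32-based placement rule are all assumptions made for the example.

```python
import zlib


class UnifiedStore:
    """Facade that hides which node holds a given key (hypothetical name)."""

    def __init__(self, node_count):
        # Each dict stands in for one independent database node.
        self._nodes = [{} for _ in range(node_count)]

    def _node_for(self, key):
        # crc32 is stable across runs, unlike Python's built-in hash() for str.
        index = zlib.crc32(key.encode("utf-8")) % len(self._nodes)
        return self._nodes[index]

    def put(self, key, value):
        self._node_for(key)[key] = value

    def get(self, key):
        return self._node_for(key).get(key)

    def node_sizes(self):
        # Exposed only so the example can show that data really is spread out.
        return [len(node) for node in self._nodes]


store = UnifiedStore(node_count=3)
for user_id in ("u1", "u2", "u3", "u4"):
    store.put(user_id, {"id": user_id})

# The caller never names a node; placement stays an internal detail.
print(store.get("u3"))
print(store.node_sizes())
```

The caller only ever sees `put` and `get`, which mirrors how an application talks to the middleware as if it were one database.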

2. Applications of Distributed Architecture

1. Distributed file systems – e.g., Hadoop HDFS, Google GFS, Taobao TFS.

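2. Distributed caching and NoSQL storage – e.g., Memcached (cache), HBase and MongoDB (NoSQL data stores).

3. Distributed databases – e.g., MySQL, MariaDB, or PostgreSQL deployed with replication, sharding, or middleware.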

4. Distributed Web services.

5. Distributed computing.

Example: Mycat Middleware for Distributed MySQL

MySQL’s popularity in e‑commerce stems from its open‑source nature and its ability to scale horizontally in distributed environments. In real‑world cases, Mycat has processed 200 million records per day for China Mobile’s billing system and 2.6 billion records for an IoT project, providing real‑time query interfaces.

Studying Mycat deepens understanding of distributed architecture and of the technologies around it, such as ZooKeeper for coordination and consistency, and HAProxy with Keepalived for high availability.

Key topics include:

1. Cluster vs. distributed

2. Load balancing

3. High-availability and disaster-recovery concepts

4. Learning Mycat middleware

3. Evolution of Distributed Architecture

(1) Initial stage

Feature: All resources (applications, databases, files) reside on a single server.

(2) Separation of application, data, and file services

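Explanation: As traffic grows, a single server can no longer supply enough CPU, memory, and disk for everything, so the application, the database, and file storage are moved onto separate servers.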

Feature: Applications, databases, and files are deployed on independent resources.

(3) Introducing caching to improve performance

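Explanation: About 80% of accesses concentrate on 20% of the data (the Pareto principle), so caching that small hot set absorbs most reads.

Caching can be local (fast, but limited by the application server's memory) or remote and distributed (slower to reach, but with much larger capacity).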

Feature: Frequently accessed data is stored in cache servers, reducing database load.
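One common way to put a cache in front of a database is the cache-aside pattern: look in the cache first, and only on a miss query the database and store the result. The sketch below is an assumption-laden illustration; `CacheAside` and `fake_db_lookup` are hypothetical names, and a plain dictionary stands in for a cache server.

```python
class CacheAside:
    """Read-through helper: check the cache, fall back to the database on a miss."""

    def __init__(self, load_from_db):
        self._load_from_db = load_from_db  # callable: key -> value
        self._cache = {}                   # stands in for a cache server
        self.hits = 0
        self.misses = 0

    def get(self, key):
        if key in self._cache:
            self.hits += 1
            return self._cache[key]
        # Miss: pay the database cost once, then remember the answer.
        self.misses += 1
        value = self._load_from_db(key)
        self._cache[key] = value
        return value

    def invalidate(self, key):
        # Call after a write so the next read reloads fresh data.
        self._cache.pop(key, None)


db_calls = []


def fake_db_lookup(key):
    """Hypothetical stand-in for a real database query."""
    db_calls.append(key)
    return f"row-for-{key}"


products = CacheAside(fake_db_lookup)
for key in ["hot", "hot", "hot", "cold", "hot"]:
    products.get(key)

print(products.hits, products.misses, db_calls)
```

Five reads reach the database only twice here. A production cache would also need expiry and eviction, which this sketch leaves out.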

(4) Application‑server clustering

Explanation: Once caching (and, where needed, splitting the database) has relieved database pressure, the web servers become the bottleneck as request volume grows.

Feature: Multiple servers provide services simultaneously through load balancing, overcoming single‑machine limits.

Description: Clustering is a common method to handle high concurrency and massive data by adding resources.
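Round-robin is one of the simplest load-balancing policies a cluster can use: requests are handed to the servers in turn. The following is a minimal sketch under that assumption; the class name and the server names `app-1` to `app-3` are made up for illustration, and real load balancers also track health and weights.

```python
import itertools
from collections import Counter


class RoundRobinBalancer:
    """Hands out servers in a fixed rotation (hypothetical helper)."""

    def __init__(self, servers):
        servers = list(servers)
        if not servers:
            raise ValueError("at least one server is required")
        self._cycle = itertools.cycle(servers)

    def next_server(self):
        return next(self._cycle)


balancer = RoundRobinBalancer(["app-1", "app-2", "app-3"])
assignments = [balancer.next_server() for _ in range(6)]

print(assignments)
print(Counter(assignments))  # each server receives the same share
```

Adding a server to the list raises total capacity without changing the callers, which is what lets a cluster grow past the limits of a single machine.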

(5) Database read/write separation

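Explanation: After a period of rapid growth, reads and writes compete for the same database resources, and write-heavy periods slow the whole system down.

Feature: Writes go to a primary database and reads are served by one or more replicas kept in sync through replication, so the two workloads no longer contend on a single server.

Description: Web workloads typically read far more than they write, so adding read replicas relieves much of the database load without changing the data model.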
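Read/write separation comes down to a routing decision per statement. The sketch below assumes a simple rule (statements starting with `SELECT` or `SHOW` are reads) and hypothetical node names; it is not the routing logic of Mycat or any specific middleware, and it ignores replication lag.

```python
import itertools

READ_VERBS = {"SELECT", "SHOW"}


class ReadWriteRouter:
    """Sends reads to replicas in rotation and everything else to the primary."""

    def __init__(self, primary, replicas):
        self._primary = primary
        # With no replicas configured, reads fall back to the primary.
        self._replicas = itertools.cycle(list(replicas) or [primary])

    def route(self, sql):
        words = sql.lstrip().split(None, 1)
        verb = words[0].upper() if words else ""
        # Locking reads must see the latest data, so keep them on the primary.
        if verb in READ_VERBS and "FOR UPDATE" not in sql.upper():
            return next(self._replicas)
        return self._primary


router = ReadWriteRouter("primary-db", ["replica-1", "replica-2"])
first_read = router.route("SELECT * FROM orders WHERE id = 1")
write = router.route("INSERT INTO orders (id) VALUES (1)")
second_read = router.route("select count(*) from orders")
locking_read = router.route("SELECT * FROM orders WHERE id = 1 FOR UPDATE")

print(first_read, write, second_read, locking_read)
```

Because replicas apply changes slightly after the primary, a read issued right after a write may return stale data; applications that cannot tolerate that usually send such reads to the primary.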

(6) Reverse proxy and CDN acceleration

Feature: CDN and reverse proxy speed up access.

Description: Caching at the edge reduces latency and offloads backend servers.

(7) Distributed file and database systems

Explanation: As data volume grows, splitting databases by business is no longer enough; individual large tables must also be split.

Feature: Databases become distributed; file systems become distributed.

Description: Single servers can no longer meet business growth; distributed databases and file systems are needed.

A distributed database is a last resort, used only when a single table grows extremely large; the more common approach is to split databases by business and deploy each business's database on its own physical server.
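When a single table does have to be split, a routing rule decides which physical database and table hold each row. The sketch below uses plain modulo on a numeric ID; the function name, the `order_db_N` and `orders_NN` naming, and the shard counts are assumptions for illustration, not a Mycat rule definition.

```python
def locate_order(order_id, db_count=2, tables_per_db=4):
    """Map an order ID to a (database, table) pair using modulo sharding."""
    if order_id < 0:
        raise ValueError("order_id must be non-negative")
    # One global slot per physical table, numbered across all databases.
    slot = order_id % (db_count * tables_per_db)
    db_index = slot // tables_per_db
    table_index = slot % tables_per_db
    return f"order_db_{db_index}", f"orders_{table_index:02d}"


for order_id in (8, 10, 13):
    print(order_id, locate_order(order_id))
```

Modulo routing spreads sequential IDs evenly, but changing `db_count` or `tables_per_db` later moves most rows to a different slot, so shard counts are usually planned with growth in mind.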

(8) Using NoSQL and search engines

Feature: Introduction of NoSQL databases and search engines.

Description: Complex business scenarios demand flexible storage and retrieval, leading to adoption of non‑relational databases and search technologies.

Application servers access various data sources through a unified data‑access layer, simplifying management.

(9) Business‑level decomposition

Feature: System is refactored by business domains; application servers are deployed per business.

Description: To cope with increasing complexity, the system is split into independent services, communicating via hyperlinks, message queues, or shared storage.

Vertical split: Large applications are broken into smaller, relatively independent web apps.

Horizontal split: Reusable business functions are extracted as distributed services with defined interfaces.

(10) Distributed services

Feature: Common modules are extracted and deployed on distributed servers for reuse.

Description: As services proliferate, inter‑service dependencies become tangled, leading to resource exhaustion and service failures.

4. Problems Faced by Distributed Services

(1) As the number of services grows, managing service URLs by hand becomes difficult, and the hardware load balancer, being a single point that all traffic passes through, comes under increasing pressure.

(2) Service dependency graphs become so complex that it is hard to tell which service must start before which, and even architects struggle to describe the architecture completely.

(3) Increasing call volume raises capacity questions: how many machines are needed and when to scale out?

(4) Communication overhead rises; troubleshooting failures and understanding service contracts become challenging.

(5) Multiple consumers per service raise concerns about quality of service.

(6) Upgrades can cause unexpected issues such as cache corruption, memory leaks, or cascading failures; strategies like degradation or resource throttling are needed.
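Degradation, mentioned in item (6), is often implemented with a circuit breaker: after repeated failures, callers stop invoking the failing service and receive a fallback result instead. The sketch below is a simplified illustration; the class, the threshold, and the recommendation-service functions are hypothetical, and it omits the timed retry ("half-open" state) that real breakers use to recover.

```python
class CircuitBreaker:
    """Stops calling a failing dependency and serves a fallback instead."""

    def __init__(self, failure_threshold=3):
        self.failure_threshold = failure_threshold
        self.consecutive_failures = 0

    @property
    def is_open(self):
        return self.consecutive_failures >= self.failure_threshold

    def call(self, func, fallback):
        if self.is_open:
            # Degraded mode: skip the remote call entirely.
            return fallback()
        try:
            result = func()
        except Exception:
            self.consecutive_failures += 1
            return fallback()
        self.consecutive_failures = 0
        return result

    def reset(self):
        # In this sketch recovery is manual; real breakers retry after a delay.
        self.consecutive_failures = 0


attempts = []


def flaky_recommendations():
    """Hypothetical remote call that always times out."""
    attempts.append(1)
    raise TimeoutError("recommendation service timed out")


def default_recommendations():
    return ["bestsellers"]


breaker = CircuitBreaker(failure_threshold=2)
results = [breaker.call(flaky_recommendations, default_recommendations) for _ in range(4)]

print(results)
print(len(attempts), breaker.is_open)
```

Only the first two requests reach the failing service; the rest are answered from the fallback, which keeps threads and connections from piling up behind a dependency that is already down.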

Source: http://stor.51cto.com/art/201804/569635.htm
