Databases 26 min read

MySQL Single‑Table Optimization, Partitioning, Sharding and Scaling Strategies

When a MySQL table grows large, CRUD performance degrades, so this article presents comprehensive optimization techniques—including field design, indexing, query rewriting, engine selection, system parameters, hardware upgrades, read‑write separation, caching, partitioning, vertical and horizontal splitting, sharding architectures, and recommendations for compatible scalable databases—to help maintain high performance and scalability.

Java Captain
Java Captain
Java Captain
MySQL Single‑Table Optimization, Partitioning, Sharding and Scaling Strategies

When MySQL single‑table record counts become large, CRUD performance sharply declines; the article outlines steps to optimize.

Single Table Optimization

Do not split a table unless data will continuously grow; splitting adds logical, deployment and operational complexity. Integer‑based tables are generally fine up to 10 million rows, while string‑based tables are acceptable up to 5 million rows, though many tables can still perform well beyond these thresholds.

Fields

Prefer TINYINT, SMALLINT, MEDIUMINT over INT; add UNSIGNED when values are non‑negative.

Allocate only the necessary length for VARCHAR.

Use enums or integers instead of strings.

Prefer TIMESTAMP rather than DATETIME.

Keep the number of columns under 20.

Avoid NULL columns as they hinder index optimization and consume extra space.

Store IP addresses as integers.

Indexes

Create indexes only on columns used in WHERE or ORDER BY; verify usage with EXPLAIN.

Do not test NULL values in WHERE clauses, as this forces full table scans.

Do not index low‑cardinality columns (e.g., gender).

Use prefix indexes for character columns and avoid making them primary keys.

Avoid foreign keys; enforce constraints in application code.

Minimize use of UNIQUE constraints; enforce uniqueness in code.

When using composite indexes, keep column order consistent with query conditions and drop unnecessary single‑column indexes.

SQL Queries

Enable slow‑query log to identify expensive statements.

Avoid column operations (e.g., SELECT id WHERE age + 1 = 10) because they cause table scans.

Keep SQL simple; one CPU core per statement, split large statements, and avoid SELECT *.

Replace OR with IN for better performance (log‑scale vs linear).

Do not use functions or triggers; implement logic in the application.

Avoid %xxx pattern matching.

Minimize JOIN usage.

Compare values of the same type (e.g., string vs string, number vs number).

Do not use != or <> in WHERE clauses; they prevent index usage.

For continuous ranges, use BETWEEN instead of IN.

Paginate results with LIMIT and keep page size reasonable.

Engine

Two widely used storage engines are MyISAM and InnoDB.

MyISAM

No row locking; table‑level read lock and exclusive write lock.

No transaction support, foreign keys, or crash‑safe recovery.

Supports BLOB/TEXT prefix indexes, full‑text indexes, delayed index updates, and table compression.

InnoDB

Row locking with MVCC for high concurrency.

Supports transactions, foreign keys, and crash‑safe recovery.

No full‑text index support.

Overall, MyISAM suits SELECT ‑heavy workloads, while InnoDB is better for INSERT and UPDATE intensive tables.

System Tuning Parameters

Benchmark tools: sysbench, iibench‑mysql, tpcc‑mysql. back_log: increase from default 50 to 500 to allow more pending connections. wait_timeout: reduce idle connection time from 8 hours to 30 minutes. max_user_connection: set a reasonable upper limit. thread_concurrency: set to twice the CPU core count. skip_name_resolve: disable DNS lookup for client connections. key_buffer_size: enlarge for MyISAM index caching (e.g., 256 MB‑384 MB on a 4 GB server) and keep key_reads / key_read_requests below 0.1 %. innodb_buffer_pool_size: increase to improve InnoDB data and index caching; aim for a high

(Innodb_buffer_pool_read_requests – Innodb_buffer_pool_reads) / Innodb_buffer_pool_read_requests

ratio. innodb_additional_mem_pool_size, innodb_log_buffer_size, query_cache_size, read_buffer_size, sort_buffer_size, read_rnd_buffer_size, record_buffer, thread_cache_size, table_cache: adjust according to workload and monitor related status variables.

Hardware Upgrade

Scale‑up by adding CPU, memory, or SSDs, depending on whether MySQL is CPU‑ or I/O‑bound.

Read‑Write Separation

Use a master for writes and a slave for reads; avoid dual‑master setups to keep complexity low.

Cache

Caching can be applied at several layers:

MySQL internal caches (see system parameters above).

Data‑access layer (e.g., MyBatis cache, Hibernate Persistence Object cache).

Application service layer (caching Data Transfer Object).

Web layer (page caching).

Browser/client side.

Table Partition

MySQL 5.1 introduced partitioning as a simple form of horizontal sharding; it is transparent to applications but splits a logical table into multiple physical sub‑tables.

Benefits include larger logical tables, easier maintenance, faster queries that hit few partitions, and the ability to place partitions on different storage devices. Limitations: max 1024 partitions, primary/unique keys must include the partition column, no foreign keys, NULL values break partition pruning, and all partitions must use the same engine.

Partition Types

RANGE – based on continuous intervals.

LIST – based on discrete value sets.

HASH – based on a user‑defined expression.

KEY – similar to HASH but uses MySQL’s internal hash function on integer columns.

Suitable Scenarios

Time‑series data is a common use case; partition by year, month, etc.

CREATE TABLE members (
    firstname VARCHAR(25) NOT NULL,
    lastname VARCHAR(25) NOT NULL,
    username VARCHAR(16) NOT NULL,
    email VARCHAR(35),
    joined DATE NOT NULL
) PARTITION BY RANGE ( YEAR(joined) ) (
    PARTITION p0 VALUES LESS THAN (1960),
    PARTITION p1 VALUES LESS THAN (1970),
    PARTITION p2 VALUES LESS THAN (1980),
    PARTITION p3 VALUES LESS THAN (1990),
    PARTITION p4 VALUES LESS THAN MAXVALUE
);

Vertical Splitting

Separate tables by logical groups (e.g., user data vs. order data). Advantages: smaller rows per block, better cache utilization, simpler maintenance. Disadvantages: redundant primary keys, increased JOINs, still limited by single‑table size, and more complex transaction handling.

Horizontal Splitting

Distribute rows across multiple tables or databases (sharding). Advantages: eliminates single‑node bottlenecks, minimal application changes, higher stability and load capacity. Disadvantages: cross‑node transaction consistency, slower cross‑node JOINs, and higher operational complexity.

Sharding Principles

Split only when necessary; follow single‑table optimization first.

Keep the number of shards low and distribute them evenly.

Choose shard rules based on growth and access patterns; range, enum, and consistent‑hash are common.

Avoid cross‑shard transactions.

Write queries should be selective; avoid SELECT * on large result sets.

Use data redundancy and partitioning to reduce cross‑shard JOINs.

Solutions

Two architectural models: client‑side and proxy‑side.

Client Architecture

Modify the data‑access layer (JDBC, DataSource, MyBatis) to manage multiple data sources directly in the application. Example diagram (image omitted).

Pros: direct DB connection, lower external dependency risk, low integration cost.

Cons: limited to the data‑access layer, added load on application servers.

Proxy Architecture

Deploy an independent middleware that aggregates data sources; the application connects to the proxy as if it were a single DB. Example diagram (image omitted).

Pros: handles complex requirements, transparent to applications, strong extensibility.

Cons: requires separate deployment and maintenance, adds an extra network hop.

Comparison Table

Product

Vendor

Model

Supported DB

Sharding

Read/Write Separation

Open‑Source

Language

MySQL Fabric

MySQL

Proxy

MySQL

Yes

Yes

Yes

Python

Cobar

Alibaba

Proxy

MySQL

Yes

No

Yes

Java

ShardingJDBC

Dangdang

Client

MySQL

Yes

Yes

Yes

Java

Choosing a Solution

Decide between proxy and client architecture (client for small/medium, proxy for large/complex).

Check required features (e.g., ORDER BY support).

Avoid products with no updates in the past year.

Prefer solutions from large companies → community → small firms → individuals.

Consider popularity (GitHub stars, user feedback).

Prefer open‑source for custom modifications.

Client architecture recommendation: ShardingJDBC .

Proxy architecture recommendation: MyCat or Atlas .

MySQL‑Compatible Horizontally Scalable Databases

TiDB

Cubrid

Cloud services: Alibaba PetaData, Alibaba OceanBase, Tencent DCDB.

NoSQL

For workloads that do not require strict ACID guarantees, consider moving large, weakly‑structured tables to NoSQL solutions (e.g., logs, monitoring data, unstructured data).

PS: If you find this sharing useful, feel free to like and watch.

END

Original Source

Signed-in readers can open the original source through BestHub's protected redirect.

Sign in to view source
Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactadmin@besthub.devand we will review it promptly.

performance tuningmysqlDatabase OptimizationPartitioning
Java Captain
Written by

Java Captain

Focused on Java technologies: SSM, the Spring ecosystem, microservices, MySQL, MyCat, clustering, distributed systems, middleware, Linux, networking, multithreading; occasionally covers DevOps tools like Jenkins, Nexus, Docker, ELK; shares practical tech insights and is dedicated to full‑stack Java development.

0 followers
Reader feedback

How this landed with the community

Sign in to like

Rate this article

Was this worth your time?

Sign in to rate
Discussion

0 Comments

Thoughtful readers leave field notes, pushback, and hard-won operational detail here.