Essential MySQL Optimization Techniques for High-Performance Databases
This guide outlines practical MySQL optimization strategies—including table schema design, index creation, SQL tuning, connection‑pool configuration, and historical data archiving—to help developers maintain fast, scalable databases as data volume grows.
1. Table Structure Optimization
Designing the database schema early influences later performance, especially as user volume increases.
1.1 Character Set
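Prefer UTF-8 for universal language support. In MySQL this means the utf8mb4 character set, because the legacy utf8 alias (utf8mb3) stores at most three bytes per character and cannot hold emoji or other supplementary characters. GBK saves space for Chinese text, but migrating away from it later is costly, whereas storage is cheap to expand.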
1.2 Primary Key
In InnoDB, the primary key serves as the clustered index. Use an auto‑increment integer so inserts follow the B+‑tree order, minimizing page splits and maximizing insert speed.
1.3 Field Recommendations
Fields with indexes should be NOT NULL and have a default value.
Avoid FLOAT / DOUBLE for precise decimals; use DECIMAL instead.
Do not store large text or binary data in TEXT / BLOB; keep such data in external storage and store only the file path.
Limit VARCHAR length to under 8 KB.
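Prefer DATETIME over TIMESTAMP for time columns: TIMESTAMP values are converted to and from UTC using the session time zone, have special NULL and default-value handling in some configurations, and cannot represent times after 2038-01-19.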
Add gmt_create and gmt_modified columns for easier troubleshooting.
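The DECIMAL recommendation above follows from binary floating point being unable to represent most decimal fractions exactly. A minimal Python sketch of the effect, using the standard decimal module as a stand-in for MySQL's DECIMAL type (the amounts are made-up values, not from the original article):

```python
from decimal import Decimal

# Binary floats (the model behind FLOAT/DOUBLE) cannot store 0.1 or 0.2
# exactly, so their sum drifts away from 0.3.
float_total = 0.1 + 0.2

# Decimal keeps base-10 digits (the model behind DECIMAL), so the sum is exact.
decimal_total = Decimal("0.10") + Decimal("0.20")

float_is_exact = float_total == 0.3
decimal_is_exact = decimal_total == Decimal("0.30")
```

The same drift accumulates across SUM() aggregations over FLOAT columns, which is why monetary amounts are usually declared with an explicit precision and scale.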
1.4 Index Creation
Only add indexes to fields you know will be queried.
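With the COMPACT or REDUNDANT row format, an InnoDB index key on a single column is limited to 767 bytes; the DYNAMIC and COMPRESSED formats raise the limit to 3072 bytes. Depending on the SQL mode, a longer key is either rejected or truncated to a prefix.
A composite index must keep its total key length within 3072 bytes.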
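Because these limits are in bytes while VARCHAR lengths are in characters, the character set decides how long an indexed column can be. The sketch below is a rough estimate only: the helper names are hypothetical, and it ignores the small per-column overhead for length and NULL flags.

```python
# Maximum bytes per character for some common MySQL character sets.
BYTES_PER_CHAR = {"latin1": 1, "gbk": 2, "utf8mb3": 3, "utf8mb4": 4}


def index_key_bytes(char_lengths, charset="utf8mb4"):
    """Worst-case key size for an index over VARCHAR columns of these lengths."""
    return sum(n * BYTES_PER_CHAR[charset] for n in char_lengths)


def max_indexable_chars(limit_bytes, charset="utf8mb4"):
    """Longest VARCHAR (in characters) that fits fully within limit_bytes."""
    return limit_bytes // BYTES_PER_CHAR[charset]


# A full VARCHAR(255) in utf8mb4 needs 1020 bytes, which exceeds 767.
single_column = index_key_bytes([255])
# Under the 767-byte limit only 191 utf8mb4 characters fit.
longest_utf8mb4 = max_indexable_chars(767)
# Three VARCHAR(100) columns together stay well under 3072 bytes.
composite = index_key_bytes([100, 100, 100])
```

This is the arithmetic behind the common habit of declaring indexed utf8mb4 columns as VARCHAR(191) on older row formats.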
2. SQL Optimization
Common query types include CRUD, pagination, range queries, fuzzy searches, and multi‑table joins.
2.1 Basic Queries
Ensure queries use indexes; avoid SELECT * and retrieve only needed columns. For high‑traffic queries (e.g., >10 K calls/day), add appropriate indexes.
2.2 Efficient Pagination
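With LIMIT m,n, MySQL reads the first m+n rows and discards the first m, so deep pages become progressively slower. Replace it with a sub-query that finds the start ID from the primary key index alone, and add ORDER BY so that the page boundaries are deterministic:
SELECT id, name, age FROM A WHERE id >= (SELECT id FROM A ORDER BY id LIMIT 100000,1) ORDER BY id LIMIT 10;
2.3 Range Queries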
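Range conditions (BETWEEN, >, <) and IN may or may not use an index, depending on the optimizer's cost estimate. In a composite index, place equality-matched columns before range-matched ones, because columns that follow the first range condition cannot be used to narrow the lookup.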
2.4 Fuzzy Search (LIKE)
Patterns like LIKE '%name%' bypass indexes and cause full table scans. For large datasets, use a dedicated search engine or add a prefix condition that can use an index.
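A B-tree index can only help a LIKE pattern through the literal text that precedes its first wildcard. The helper below is a hypothetical illustration of that rule (it ignores escaped wildcards for brevity): an empty result means the pattern gives the index nothing to seek on.

```python
def indexable_prefix(pattern):
    """Return the literal text before the first LIKE wildcard (% or _)."""
    for position, char in enumerate(pattern):
        if char in "%_":
            return pattern[:position]
    return pattern  # no wildcard at all: the whole pattern is literal


leading_wildcard = indexable_prefix("%name%")   # nothing to seek on
trailing_wildcard = indexable_prefix("name%")   # index can seek to 'name'
mixed = indexable_prefix("na_e%")               # only 'na' narrows the scan
```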
2.5 Multi‑Table Joins
Prefer JOIN over sub‑queries. Use the smallest result set as the driving table, index join columns, and limit the number of tables joined (ideally ≤3). If necessary, split a complex join into multiple simpler queries.
3. Connection‑Pool Optimization
Connection pools cache database connections to handle high concurrency. Below are key DBCP parameters and recommended settings.
3.1 initialSize
Initial number of connections created on first getConnection; set to the historical average concurrency.
3.2 minIdle
Minimum idle connections retained; typical value is 5 (or 1 for very low load).
3.3 maxIdle
Maximum idle connections; set according to peak concurrency (e.g., 20).
3.4 maxActive
Maximum active connections (renamed maxTotal in DBCP 2); match the highest acceptable concurrent request count (e.g., 100).
3.5 maxWait
Maximum time (ms) a request waits for a connection (renamed maxWaitMillis in DBCP 2); a short value like 3000 ms helps fail fast under overload.
3.6 minEvictableIdleTimeMillis
Idle time before a connection is eligible for eviction; default is 30 minutes.
3.7 validationQuery
Simple SQL (e.g., SELECT 1) to test connection health.
3.8 testOnBorrow / testOnReturn
Validating a connection on every borrow or return adds a round trip per request, so both are generally disabled in favor of idle-time validation.
3.9 testWhileIdle
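Enables periodic validation of idle connections by the evictor thread; recommended to keep enabled. It takes effect only when timeBetweenEvictionRunsMillis is positive and a validationQuery is set.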
3.10 numTestsPerEvictionRun
Number of connections examined per eviction run; set equal to maxActive for thorough checks.
3.11 Pre‑warming
Run a lightweight query at application startup to fill the pool before serving traffic.
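The parameters in this section interact, so it can help to sanity-check them together before deploying. The sketch below is not DBCP itself: it holds the DBCP 1.x parameter names in a plain Python dict, checks a few consistency rules, and renders them as properties text. The initialSize and timeBetweenEvictionRunsMillis values are assumptions added for illustration; the other values are the examples given above.

```python
# Example values from the text, keyed by DBCP 1.x parameter names.
POOL_SETTINGS = {
    "initialSize": 10,                            # assumption: average concurrency
    "minIdle": 5,
    "maxIdle": 20,
    "maxActive": 100,
    "maxWait": 3000,
    "minEvictableIdleTimeMillis": 30 * 60 * 1000,
    "timeBetweenEvictionRunsMillis": 60 * 1000,   # assumption: not in the text
    "validationQuery": "SELECT 1",
    "testOnBorrow": False,
    "testOnReturn": False,
    "testWhileIdle": True,
    "numTestsPerEvictionRun": 100,
}


def check_pool_settings(settings):
    """Return a list of human-readable problems; empty means consistent."""
    problems = []
    if not settings["minIdle"] <= settings["maxIdle"] <= settings["maxActive"]:
        problems.append("expected minIdle <= maxIdle <= maxActive")
    if settings["initialSize"] > settings["maxActive"]:
        problems.append("initialSize exceeds maxActive")
    if settings["testWhileIdle"] and not settings["validationQuery"]:
        problems.append("testWhileIdle needs a validationQuery")
    if settings["testWhileIdle"] and settings["timeBetweenEvictionRunsMillis"] <= 0:
        problems.append("testWhileIdle needs timeBetweenEvictionRunsMillis > 0")
    return problems


def to_properties(settings):
    """Render the settings as key=value lines, with lowercase booleans."""
    def fmt(value):
        return str(value).lower() if isinstance(value, bool) else str(value)
    return "\n".join(f"{key}={fmt(value)}" for key, value in settings.items())


problems = check_pool_settings(POOL_SETTINGS)
properties_text = to_properties(POOL_SETTINGS)
```

A check like this could run in a build step so that, for example, a maxIdle larger than maxActive is caught before it reaches production.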
4. Index Optimization
When data grows, SQL tuning alone may not suffice; proper indexing becomes critical.
4.1 Primary (Level‑1) Index
Index columns used in WHERE clauses; use single‑column indexes or composite indexes respecting the left‑most prefix rule.
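The left-most prefix rule can be stated mechanically: a composite index helps only through the unbroken run of leading columns that the query filters on. A simplified sketch, considering equality filters only (the function name and column names are hypothetical):

```python
def usable_index_prefix(index_columns, equality_columns):
    """Return the leading index columns that the equality filters can use."""
    filtered = set(equality_columns)
    prefix = []
    for column in index_columns:
        if column not in filtered:
            break  # the first gap ends the usable prefix
        prefix.append(column)
    return prefix


index_abc = ["a", "b", "c"]
full_prefix = usable_index_prefix(index_abc, ["a", "b"])   # uses a and b
no_prefix = usable_index_prefix(index_abc, ["b", "c"])     # a missing: unusable
gap_prefix = usable_index_prefix(index_abc, ["a", "c"])    # only a is usable
```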
4.2 Secondary (Level‑2) Index
Index columns involved in ORDER BY or GROUP BY to avoid extra sorting.
4.3 Tertiary (Level‑3) Index
Covering indexes that include all columns needed by a query, eliminating the need to read the base table.
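Whether an index covers a query comes down to a set comparison. In InnoDB, a secondary index also stores the primary key columns, so those count as available. The sketch below assumes a hypothetical index on (name, age) and a primary key id:

```python
def is_covering(index_columns, primary_key, needed_columns):
    """True if every column the query needs is stored in the index itself."""
    # InnoDB secondary indexes carry the primary key columns as well.
    available = set(index_columns) | set(primary_key)
    return set(needed_columns) <= available


# SELECT id, name, age ... WHERE name = ?  -> answered from the index alone
covered = is_covering(["name", "age"], ["id"], ["id", "name", "age"])
# SELECT name, email ... WHERE name = ?    -> email forces a base-table read
not_covered = is_covering(["name", "age"], ["id"], ["name", "email"])
```

In MySQL, EXPLAIN reports a covered query with "Using index" in the Extra column.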
4.4 Index Selectivity
Prefer high‑selectivity columns (few rows per distinct value) for indexes; low‑selectivity columns (e.g., gender) provide little benefit.
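Selectivity can be measured as the number of distinct values divided by the number of rows. The sketch below runs that query against an in-memory SQLite table purely as a runnable stand-in; the table, columns, and data are made up, and the same COUNT(DISTINCT ...) / COUNT(*) expression can be run in MySQL.

```python
import sqlite3


def column_selectivity(conn, table, column):
    """Distinct values divided by total rows; 1.0 means every row is unique."""
    # Identifiers are interpolated directly, so pass trusted names only.
    sql = f"SELECT COUNT(DISTINCT {column}) * 1.0 / COUNT(*) FROM {table}"
    return conn.execute(sql).fetchone()[0]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT, gender TEXT)")
conn.executemany(
    "INSERT INTO users (email, gender) VALUES (?, ?)",
    [(f"user{i}@example.com", "F" if i % 2 else "M") for i in range(1000)],
)

email_selectivity = column_selectivity(conn, "users", "email")    # unique values
gender_selectivity = column_selectivity(conn, "users", "gender")  # two values
```

A value near 1.0 suggests a good index candidate, while a value near 0 means each index lookup would still return a large share of the table.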
5. Historical Data Archiving
If a table reaches ~5 million rows per year, consider sharding; otherwise, archive data older than six months to a separate store (e.g., HBase) via a scheduled Quartz job, while still providing an API for occasional queries.
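The archiving job itself is usually a loop that moves old rows in small batches so that no single transaction holds locks for long. The sketch below shows only that loop. It uses in-memory SQLite tables as a stand-in for both the live table and the archive store (the text suggests HBase and a Quartz schedule), and the orders and orders_archive names are hypothetical.

```python
import sqlite3


def archive_batch(conn, cutoff, batch_size=500):
    """Move up to batch_size rows older than cutoff into orders_archive."""
    with conn:  # one transaction: the copy and the delete commit together
        ids = [row[0] for row in conn.execute(
            "SELECT id FROM orders WHERE gmt_create < ? ORDER BY id LIMIT ?",
            (cutoff, batch_size),
        )]
        if not ids:
            return 0
        marks = ",".join("?" * len(ids))
        conn.execute(
            f"INSERT INTO orders_archive SELECT * FROM orders WHERE id IN ({marks})",
            ids,
        )
        conn.execute(f"DELETE FROM orders WHERE id IN ({marks})", ids)
    return len(ids)


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, gmt_create TEXT)")
conn.execute("CREATE TABLE orders_archive (id INTEGER PRIMARY KEY, gmt_create TEXT)")
conn.executemany("INSERT INTO orders (id, gmt_create) VALUES (?, ?)", [
    (1, "2023-01-05 10:00:00"),
    (2, "2023-02-10 10:00:00"),
    (3, "2023-03-15 10:00:00"),
    (4, "2024-01-01 10:00:00"),
])
conn.commit()

# Keep calling until a batch comes back empty.
cutoff = "2023-07-01 00:00:00"
moved = 0
while True:
    count = archive_batch(conn, cutoff, batch_size=2)
    if count == 0:
        break
    moved += count
```

Filtering on an indexed gmt_create column keeps each batch cheap, and a short pause between batches is a common way to limit the impact on live traffic.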
ITPUB
Official ITPUB account sharing technical insights, community news, and exciting events.
