Master MySQL Performance: Practical Table, Index, and Connection Pool Optimizations
This article presents practical MySQL optimization techniques covering table schema design, character set choices, primary key strategies, field definitions, index creation, SQL query tuning, pagination tricks, join handling, connection‑pool parameter tuning, and historical data archiving for large‑scale databases.
Table Structure Optimization
When designing an application, the database schema heavily influences later performance, especially as user volume grows, making table structure optimization a critical step.
Character Set
Prefer UTF‑8 over GBK despite slightly larger storage; UTF‑8 supports all languages and avoids costly future migrations.
Primary Key
In InnoDB, the primary key serves as the clustered index. Use an auto‑increment integer primary key to keep inserts ordered, minimizing page splits and maximizing insert speed.
Field Recommendations
All indexed columns should be NOT NULL and have default values.
Avoid FLOAT / DOUBLE for precise decimals; use DECIMAL instead.
Do not store large text or binary data in TEXT / BLOB; keep such data in external storage and store only the file path.
Limit VARCHAR length to 8 KB.
Prefer DATETIME over TIMESTAMP for timestamps.
Add gmt_create and gmt_modified columns to track record creation and modification times.
Index Creation
Only add indexes to columns that are known to be used in queries; avoid blind index addition.
For InnoDB single‑column indexes, keep length ≤ 767 bytes (excess bytes become a prefix index).
For composite indexes, total length must stay ≤ 3072 bytes.
SQL Optimization
Basic Queries
Ensure queries use indexes; add missing indexes on frequently queried columns. Avoid SELECT *; specify only required columns to reduce I/O and memory usage.
Efficient Pagination
Using LIMIT m,n scans rows up to m+n, which becomes slow for large offsets. Replace with a subquery that selects the start ID:
SELECT id, name, age FROM A WHERE id >= (SELECT id FROM A LIMIT 100000,1) LIMIT 10Range Queries
Range conditions ( BETWEEN, >, <) and large IN lists may bypass indexes; place indexed columns first in the condition order.
Fuzzy LIKE Queries
Patterns like LIKE '%name%' cannot use indexes and cause full table scans; for large datasets consider a search engine or prepend an indexed prefix condition.
Multi‑Table Joins
Prefer joins over subqueries. MySQL uses a nested‑loop join; keep the driving table small and index join columns. Limit joins to three tables; otherwise split into multiple simpler queries. Increase join_buffer_size if necessary.
Connection Pool Optimization (DBCP)
initialSize : Initial connections created on first getConnection; set to average historical concurrency.
minIdle : Minimum idle connections to retain; typical value 5 (or 1 for very low load).
maxIdle : Maximum idle connections; set based on peak concurrency.
maxActive : Maximum active connections; set to the highest acceptable concurrent request count.
maxWait : Maximum time (e.g., 3 seconds) a request waits for a connection before failing.
minEvictableIdleTimeMillis : Time an idle connection can stay before eviction (default 30 minutes).
validationQuery : Simple SQL to test connection health.
testOnBorrow and testOnReturn : Disable; they add significant overhead.
testWhileIdle : Enable; background thread validates idle connections without hurting performance.
numTestsPerEvictionRun : Number of connections checked per eviction run; set equal to maxActive for thorough checking.
Pre‑warm Pool : Run a lightweight query at application startup to fill the pool before serving traffic.
Index Optimization Levels
When data grows beyond what query tuning can handle, introduce indexes:
Level 1 – Simple Indexes
Create single‑column indexes on WHERE conditions; use composite indexes following the left‑most prefix rule.
Level 2 – Order/Group Indexes
Index columns used in ORDER BY or GROUP BY to avoid extra sorting.
Level 3 – Covering Indexes
Include all selected columns in the index so MySQL can satisfy the query from the index alone, eliminating the need to read the table rows.
Index Selectivity
Prefer indexes on high‑selectivity columns (e.g., name) where the predicate returns few rows; low‑selectivity columns (e.g., gender) are less useful.
Historical Data Archiving
If a table grows by ~5 million rows per year and indexes no longer suffice, consider sharding; otherwise, archive data older than a certain period (e.g., six months) using a scheduled Quartz job that moves the data to an external HBase store while providing a query interface for occasional access.
Signed-in readers can open the original source through BestHub's protected redirect.
This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactand we will review it promptly.
ITPUB
Official ITPUB account sharing technical insights, community news, and exciting events.
How this landed with the community
Was this worth your time?
0 Comments
Thoughtful readers leave field notes, pushback, and hard-won operational detail here.
