Essential MySQL Optimization Techniques for High-Performance Databases

This guide outlines practical MySQL optimization strategies (table schema design, index creation, SQL tuning, connection-pool configuration, and historical data archiving) to help developers keep databases fast and scalable as data volume grows.

ITPUB

Apr 14, 2017

1. Table Structure Optimization

Designing the database schema early influences later performance, especially as user volume increases.

1.1 Character Set

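Prefer UTF-8 for universal language support. In MySQL this means utf8mb4, because the legacy utf8 character set stores at most 3 bytes per character and cannot hold 4-byte characters such as emoji. GBK saves space for Chinese characters, but migrating character sets later is costly, while extra storage is cheap.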

1.2 Primary Key

In InnoDB, the primary key serves as the clustered index. Use an auto‑increment integer so inserts follow the B+‑tree order, minimizing page splits and maximizing insert speed.

1.3 Field Recommendations

Fields with indexes should be NOT NULL and have a default value.

Avoid FLOAT / DOUBLE for precise decimals; use DECIMAL instead.

Do not store large text or binary data in TEXT / BLOB; keep such data in external storage and store only the file path.

Limit VARCHAR length to under 8 KB.

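Prefer DATETIME over TIMESTAMP for timestamps. TIMESTAMP values are converted between UTC and the session time zone and only cover 1970 to 2038, and TIMESTAMP columns can pick up implicit NOT NULL and auto-update behavior depending on the explicit_defaults_for_timestamp setting.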

Add gmt_create and gmt_modified columns for easier troubleshooting.
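
The FLOAT / DOUBLE recommendation above comes down to binary rounding. The following is a minimal sketch in Python rather than MySQL: Python's float behaves like a DOUBLE column, and Decimal stands in for a DECIMAL(10,2) column. The variable names are illustrative only.

```python
from decimal import Decimal

# Binary floating point (what FLOAT / DOUBLE store) cannot represent 0.1 exactly,
# so small errors appear as soon as values are added.
float_total = 0.1 + 0.2

# Exact decimal arithmetic, analogous to summing values in a DECIMAL(10,2) column.
decimal_total = Decimal("0.10") + Decimal("0.20")

print(float_total == 0.3)                # False
print(decimal_total == Decimal("0.30"))  # True
```

For money and other values that must compare exactly, this is why DECIMAL is the safer column type.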

1.4 Index Creation

Only add indexes to fields you know will be queried.

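With the COMPACT or REDUNDANT row format, an InnoDB single-column index key should not exceed 767 bytes (the DYNAMIC and COMPRESSED row formats raise this limit to 3072 bytes). Depending on the SQL mode, a longer key is either rejected with an error or truncated to a prefix.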

Combined indexes must keep total length under 3072 bytes.
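
To see what these byte limits mean in characters, the arithmetic can be sketched as below. The helper name max_indexable_chars is hypothetical. The per-character maxima are the worst-case widths of each MySQL character set.

```python
# Worst-case bytes per character for some MySQL character sets.
MAX_BYTES_PER_CHAR = {"latin1": 1, "gbk": 2, "utf8": 3, "utf8mb4": 4}

def max_indexable_chars(limit_bytes, charset):
    """Longest character prefix whose worst-case encoding fits in limit_bytes."""
    return limit_bytes // MAX_BYTES_PER_CHAR[charset]

for charset in ("utf8", "utf8mb4"):
    print(charset,
          max_indexable_chars(767, charset),
          max_indexable_chars(3072, charset))
```

Under these assumptions a 767-byte limit allows 255 characters in utf8 but only 191 in utf8mb4, which is why VARCHAR(255) columns often need a prefix index after a switch to utf8mb4.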

2. SQL Optimization

Common query types include CRUD, pagination, range queries, fuzzy searches, and multi‑table joins.

2.1 Basic Queries

Ensure queries use indexes; avoid SELECT * and retrieve only needed columns. For high‑traffic queries (e.g., >10 K calls/day), add appropriate indexes.

2.2 Efficient Pagination

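Using LIMIT m,n forces MySQL to read and discard the first m rows before returning n, so deep pages get slower as m grows. Instead, use a sub-query to find the starting ID through the primary key, then read forward from it (ordering by id so that page boundaries are deterministic):

SELECT id, name, age FROM A WHERE id >= (SELECT id FROM A ORDER BY id LIMIT 100000,1) ORDER BY id LIMIT 10;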
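
The sketch below compares plain offset paging with the sub-query form on a small table, with an explicit ORDER BY id added so results are deterministic. It uses Python's built-in sqlite3 as a stand-in for MySQL, so it only checks that the queries return the same page and says nothing about MySQL timings. The function names are hypothetical. page_after is a related keyset variant that is not in the original text, included for comparison.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE A (id INTEGER PRIMARY KEY, name TEXT, age INTEGER)")
conn.executemany(
    "INSERT INTO A (id, name, age) VALUES (?, ?, ?)",
    [(i, f"user{i}", 20 + i % 50) for i in range(1, 1001)],
)

def page_by_offset(offset, size):
    # Baseline: the engine walks past `offset` rows before returning any.
    sql = "SELECT id, name, age FROM A ORDER BY id LIMIT ? OFFSET ?"
    return conn.execute(sql, (size, offset)).fetchall()

def page_by_start_id(offset, size):
    # Sub-query form: find the first id of the page, then range-scan from it.
    sql = """
        SELECT id, name, age FROM A
        WHERE id >= (SELECT id FROM A ORDER BY id LIMIT 1 OFFSET ?)
        ORDER BY id LIMIT ?
    """
    return conn.execute(sql, (offset, size)).fetchall()

def page_after(last_id, size):
    # Keyset variant: the caller remembers the last id seen, so no offset is needed.
    sql = "SELECT id, name, age FROM A WHERE id > ? ORDER BY id LIMIT ?"
    return conn.execute(sql, (last_id, size)).fetchall()

print(page_by_offset(500, 10) == page_by_start_id(500, 10))
```

All three functions return rows 501 to 510 here. The keyset variant only works when the client pages forward sequentially, whereas the sub-query form still supports jumping to an arbitrary page.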

2.3 Range Queries

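Range conditions (BETWEEN, >, <) and IN may or may not use an index, depending on how many rows the optimizer expects to match. In a composite index, place equality columns before range columns, because columns after the first range condition cannot be used to narrow the lookup.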

2.4 Fuzzy Search (LIKE)

Patterns like LIKE '%name%' bypass indexes and cause full table scans. For large datasets, use a dedicated search engine or add a prefix condition that can use an index.
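
A prefix pattern can use an index because its matches form one contiguous key range, while an infix pattern can match anywhere in the key order. The sketch below illustrates this with Python's sqlite3 as a stand-in for MySQL. The helper prefix_bounds and the sample table are hypothetical, and the helper assumes a non-empty prefix and plain code-point ordering.

```python
import sqlite3

def prefix_bounds(prefix):
    """Return (low, high) so that low <= s < high matches strings starting with prefix."""
    # Bump the last character by one code point to get an exclusive upper bound.
    return prefix, prefix[:-1] + chr(ord(prefix[-1]) + 1)

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("CREATE INDEX idx_users_name ON users (name)")
conn.executemany(
    "INSERT INTO users (name) VALUES (?)",
    [("alice",), ("alan",), ("bob",), ("malcolm",), ("alba",)],
)

low, high = prefix_bounds("al")  # ("al", "am")
by_range = conn.execute(
    "SELECT name FROM users WHERE name >= ? AND name < ? ORDER BY name", (low, high)
).fetchall()
by_prefix_like = conn.execute(
    "SELECT name FROM users WHERE name LIKE 'al%' ORDER BY name"
).fetchall()
by_infix_like = conn.execute(
    "SELECT name FROM users WHERE name LIKE '%al%' ORDER BY name"
).fetchall()

print(by_range == by_prefix_like)  # the prefix pattern is just a key range
print(by_infix_like)               # also contains 'malcolm', which lies outside that range
```

In MySQL there is no need to rewrite LIKE 'al%' by hand. The range form is shown only to make clear why the prefix case is index-friendly and the '%al%' case is not.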

2.5 Multi‑Table Joins

Prefer JOIN over sub‑queries. Use the smallest result set as the driving table, index join columns, and limit the number of tables joined (ideally ≤3). If necessary, split a complex join into multiple simpler queries.
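
Splitting a join into simpler queries might look like the following sketch, which fetches the driving rows first and then looks up the related rows by primary key. It uses Python's sqlite3 as a stand-in for MySQL. The users and orders tables are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users  (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, user_id INTEGER, amount INTEGER);
    CREATE INDEX idx_orders_user ON orders (user_id);  -- index the join column
    INSERT INTO users  VALUES (1, 'ann'), (2, 'bo'), (3, 'cy');
    INSERT INTO orders VALUES (10, 1, 50), (11, 1, 70), (12, 3, 20);
""")

# Single statement: the filtered orders drive the join.
joined = conn.execute("""
    SELECT u.name, o.amount
    FROM orders o JOIN users u ON u.id = o.user_id
    WHERE o.amount >= 50
    ORDER BY o.id
""").fetchall()

# Split version, step 1: fetch the small driving result set.
orders = conn.execute(
    "SELECT id, user_id, amount FROM orders WHERE amount >= 50 ORDER BY id"
).fetchall()

# Step 2: one primary-key lookup for the users those orders reference.
# (A real caller should skip this query when user_ids is empty.)
user_ids = sorted({user_id for _, user_id, _ in orders})
placeholders = ",".join("?" * len(user_ids))
names = dict(conn.execute(
    f"SELECT id, name FROM users WHERE id IN ({placeholders})", user_ids
).fetchall())

# Step 3: combine in the application.
split = [(names[user_id], amount) for _, user_id, amount in orders]
print(joined == split)
```

The split version costs an extra round trip, but each query is simple enough to cache or to route to a different database.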

3. Connection‑Pool Optimization

Connection pools cache database connections to handle high concurrency. Below are key DBCP parameters and recommended settings.

3.1 initialSize

Initial number of connections created on first getConnection; set to the historical average concurrency.

3.2 minIdle

Minimum idle connections retained; typical value is 5 (or 1 for very low load).

3.3 maxIdle

Maximum idle connections; set according to peak concurrency (e.g., 20).

3.4 maxActive

Maximum active connections (renamed maxTotal in DBCP 2); match the highest acceptable concurrent request count (e.g., 100).

3.5 maxWait

Maximum time (ms) a request waits for a connection (renamed maxWaitMillis in DBCP 2); a short value like 3000 ms helps fail fast under overload.

3.6 minEvictableIdleTimeMillis

Idle time before a connection is eligible for eviction; default is 30 minutes.

3.7 validationQuery

Simple SQL (e.g., SELECT 1) to test connection health.

3.8 testOnBorrow / testOnReturn

Validating on every borrow or return adds a database round trip to each operation, so both are generally disabled.

3.9 testWhileIdle

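Enables periodic validation of idle connections; recommended to keep enabled. It only takes effect when the evictor thread runs (timeBetweenEvictionRunsMillis greater than 0) and a validationQuery is set.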

3.10 numTestsPerEvictionRun

Number of connections examined per eviction run; set equal to maxActive for thorough checks.

3.11 Pre‑warming

Run a lightweight query at application startup to fill the pool before serving traffic.
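
To make the interplay of initialSize, maxActive, and maxWait concrete, here is a toy pool. It is not DBCP and not thread-safe. The TinyPool class and its parameter names are hypothetical and only mirror the settings above, and in-memory sqlite3 connections stand in for MySQL connections.

```python
import queue
import sqlite3

class TinyPool:
    """Toy connection pool; parameter names mirror DBCP settings but this is not DBCP."""

    def __init__(self, initial_size, max_active, max_wait_ms):
        self._idle = queue.Queue(maxsize=max_active)
        self._max_active = max_active
        self._max_wait = max_wait_ms / 1000.0
        self._created = 0
        for _ in range(initial_size):  # pre-warm: open connections before serving traffic
            self._idle.put(self._connect())

    def _connect(self):
        self._created += 1
        return sqlite3.connect(":memory:")

    def borrow(self):
        try:
            return self._idle.get_nowait()  # reuse an idle connection if there is one
        except queue.Empty:
            pass
        if self._created < self._max_active:
            return self._connect()          # grow until the maxActive cap
        try:
            # Cap reached: wait at most maxWait for a connection to be returned.
            return self._idle.get(timeout=self._max_wait)
        except queue.Empty:
            raise TimeoutError("no connection available within maxWait") from None

    def give_back(self, conn):
        self._idle.put(conn)

pool = TinyPool(initial_size=2, max_active=3, max_wait_ms=50)
held = [pool.borrow() for _ in range(3)]  # two pre-warmed, one created on demand
try:
    pool.borrow()                         # a fourth caller exceeds maxActive
    overloaded = False
except TimeoutError:
    overloaded = True                     # fails fast after about 50 ms
pool.give_back(held.pop())
reused = pool.borrow()                    # succeeds again once a connection is returned
print(overloaded)
```

A short maxWait turns overload into a quick, visible error rather than a growing queue of blocked requests.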

4. Index Optimization

As data grows, SQL tuning alone may not suffice; proper indexing becomes critical. The three levels below are priorities for deciding what to index, not MySQL index types.

4.1 Primary (Level‑1) Index

Index columns used in WHERE clauses; use single‑column indexes or composite indexes respecting the left‑most prefix rule.
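
The left-most prefix rule can be modeled in a few lines. This is a deliberately simplified sketch: it only considers equality conditions and ignores range conditions and optimizer refinements. The function name and the (city, age, name) index are hypothetical.

```python
def usable_index_columns(index_columns, equality_columns):
    """Leading columns of a composite index that equality conditions can use."""
    usable = []
    for column in index_columns:
        if column not in equality_columns:
            break  # the first gap ends the usable prefix
        usable.append(column)
    return usable

index = ("city", "age", "name")
print(usable_index_columns(index, {"city", "age"}))  # ['city', 'age']
print(usable_index_columns(index, {"city", "name"})) # ['city']
print(usable_index_columns(index, {"age", "name"}))  # []
```

A query that filters only on age and name skips the leading column, so under this model the composite index does not help it at all.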

4.2 Secondary (Level‑2) Index

Index columns involved in ORDER BY or GROUP BY to avoid extra sorting.

4.3 Tertiary (Level‑3) Index

Use covering indexes, which include every column a query needs, so MySQL can answer the query from the index without reading the base table.
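
The effect of a covering index shows up in the query plan. The sketch below uses Python's sqlite3 as a stand-in, where the plan text says COVERING INDEX; in MySQL the equivalent signal is "Using index" in the Extra column of EXPLAIN. The table, the index, and the plan helper are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, age INTEGER, bio TEXT);
    CREATE INDEX idx_age_name ON users (age, name);
""")

def plan(sql):
    # The last field of each EXPLAIN QUERY PLAN row is the human-readable detail.
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[-1] for row in rows)

# Both age and name live in the index, so the table is never touched.
covered = plan("SELECT name FROM users WHERE age = 30")

# bio is not in the index, so each match needs an extra lookup in the table.
not_covered = plan("SELECT name, bio FROM users WHERE age = 30")

print(covered)
print(not_covered)
```

This is also a practical reason to avoid SELECT *: a single extra column can turn an index-only read into one table lookup per row.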

4.4 Index Selectivity

Prefer high‑selectivity columns (few rows per distinct value) for indexes; low‑selectivity columns (e.g., gender) provide little benefit.
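
Selectivity can be estimated as the number of distinct values divided by the number of rows. A minimal sketch follows, using Python's sqlite3 as a stand-in for MySQL; the same COUNT(DISTINCT ...) query works in MySQL. The users table and the selectivity helper are hypothetical, and the helper interpolates a trusted column name for brevity.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, gender TEXT, email TEXT)")
conn.executemany(
    "INSERT INTO users (gender, email) VALUES (?, ?)",
    [("F" if i % 2 else "M", f"user{i}@example.com") for i in range(1000)],
)

def selectivity(column):
    """Distinct values divided by total rows; 1.0 means every value is unique."""
    # `column` is a trusted, hard-coded identifier here, never user input.
    distinct, total = conn.execute(
        f"SELECT COUNT(DISTINCT {column}), COUNT(*) FROM users"
    ).fetchone()
    return distinct / total

print(selectivity("gender"))  # 0.002
print(selectivity("email"))   # 1.0
```

With a selectivity of 0.002, an index on gender would still leave half the table to read for each value, while an index on email narrows a lookup to a single row.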

5. Historical Data Archiving

If a table grows by roughly 5 million rows per year, consider sharding. Otherwise, archive data older than six months to a separate store (e.g., HBase) with a scheduled job (e.g., Quartz), and keep an API available for occasional queries against the archived data.
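
The core of such an archiving job is a loop that moves old rows in small batches, so that no single transaction holds locks for long. The sketch below is only an outline under stated assumptions: it uses Python's sqlite3 in place of MySQL, a second table in place of HBase, and a direct call in place of a Quartz schedule. The table names, the cutoff date, and the archive_older_than function are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders         (id INTEGER PRIMARY KEY, gmt_create TEXT);
    CREATE TABLE orders_archive (id INTEGER PRIMARY KEY, gmt_create TEXT);
""")
conn.executemany(
    "INSERT INTO orders (id, gmt_create) VALUES (?, ?)",
    [(i, "2016-01-15 00:00:00" if i <= 250 else "2017-03-01 00:00:00")
     for i in range(1, 301)],
)

def archive_older_than(cutoff, batch_size=100):
    """Move rows created before `cutoff` to the archive table, one batch at a time."""
    moved = 0
    while True:
        ids = [row[0] for row in conn.execute(
            "SELECT id FROM orders WHERE gmt_create < ? ORDER BY id LIMIT ?",
            (cutoff, batch_size),
        )]
        if not ids:
            return moved
        marks = ",".join("?" * len(ids))
        with conn:  # one transaction per batch: copy, then delete
            conn.execute(
                f"INSERT INTO orders_archive SELECT * FROM orders WHERE id IN ({marks})", ids
            )
            conn.execute(f"DELETE FROM orders WHERE id IN ({marks})", ids)
        moved += len(ids)

moved = archive_older_than("2016-10-14 00:00:00")  # six months before a run on 2017-04-14
print(moved)
```

Copying and deleting inside the same transaction keeps each batch atomic, so an interrupted run can simply be restarted.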
