
MySQL Optimization: Concepts, Schema Design, Indexes, and Query Tuning

This article provides a comprehensive overview of MySQL fundamentals—including logical architecture, locks, transactions, and storage engines—and offers practical optimization strategies for schema design, data types, indexing, and query execution, supplemented with code examples and performance‑analysis tips.


Preface

This article focuses on the relational database MySQL. For key‑value databases, refer to the linked article.

First, we briefly review MySQL basic concepts, then discuss optimization during creation and query phases.

1 Basic Concepts Overview

1.1 Logical Architecture

Client layer: connects to the server and sends SQL commands.

Server layer: parses, optimizes SQL, generates the execution plan, and executes it.

Storage engine: stores and retrieves data.

1.2 Locks

MySQL uses lock mechanisms to handle concurrency: shared (read) locks and exclusive (write) locks. Read locks are non‑blocking, allowing multiple clients to read the same resource simultaneously; write locks are exclusive and block other reads and writes.

Optimistic lock: typically used when reads dominate writes; instead of locking, each update checks a version number or timestamp and retries if the row has changed.

Pessimistic lock: used in high‑contention environments; the data is locked for the duration of the operation.
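As a sketch of the optimistic approach, a hypothetical account table with a version column might be updated like this (table and column names are illustrative):

```sql
-- Read the current row, remembering its version.
SELECT balance, version FROM account WHERE id = 1;

-- Attempt the update only if no other session changed the row in the meantime.
UPDATE account
SET    balance = balance - 100,
       version = version + 1
WHERE  id = 1
  AND  version = 42;   -- the version read above
-- If the affected-row count is 0, another session won the race; retry or abort.
```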

Lock strategies include:

Table lock: locks the whole table; minimal overhead but increases lock contention.

Row lock: locks individual rows; higher overhead but maximizes concurrency.

InnoDB implements Multi‑Version Concurrency Control (MVCC), a variant of row locking that avoids actual lock operations in most cases by keeping snapshots.
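In InnoDB, shared and exclusive row locks can also be requested explicitly inside a transaction; a minimal sketch (the shared form shown is the MySQL 5.7 syntax, `LOCK IN SHARE MODE`):

```sql
START TRANSACTION;
-- Shared (read) lock: other sessions may also read, but not modify, the row.
SELECT * FROM student WHERE id = 1 LOCK IN SHARE MODE;
-- Exclusive (write) lock: blocks other sessions' locking reads and writes.
SELECT * FROM student WHERE id = 1 FOR UPDATE;
COMMIT;
```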

1.3 Transactions

Transactions guarantee atomicity; they either fully succeed or fully roll back. MySQL uses autocommit by default, so each statement is a transaction unless an explicit transaction is started.

Isolation levels control visibility of changes:

Read Uncommitted: uncommitted changes are visible to other transactions, so dirty reads are possible.

Read Committed: only committed changes are visible; may cause non‑repeatable reads.

Repeatable Read: repeated reads of the same rows within a transaction return the same data; this is MySQL's default level.

Serializable: highest isolation; forces transactions to execute serially.
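The isolation level can be inspected and changed per session; a short sketch (the account table is illustrative, and the variable name shown is the MySQL 5.7 one, renamed to transaction_isolation in 8.0):

```sql
-- Inspect and change the isolation level for the current session.
SELECT @@tx_isolation;
SET SESSION TRANSACTION ISOLATION LEVEL READ COMMITTED;

START TRANSACTION;
SELECT balance FROM account WHERE id = 1;
-- Under READ COMMITTED, repeating this SELECT may return different data
-- if another session commits in between (a non-repeatable read).
COMMIT;
```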

1.4 Storage Engines

InnoDB – the default and most widely used engine; supports transactions (designed for many short‑lived ones), crash recovery, and row‑level locking.

MyISAM – does not support transactions or row‑level locks; data is not safe after a crash.

2 Optimization During Table Creation

2.1 Schema and Data Type Optimization

Integers : TinyInt, SmallInt, MediumInt, Int, BigInt use 8‑, 16‑, 24‑, 32‑, 64‑bit storage respectively. Unsigned doubles the positive range.

Floating‑point numbers :

Float, Double – approximate calculations.

Decimal – exact fixed‑point numbers.

Strings :

VarChar – variable‑length, requires 1‑2 extra bytes for length.

Char – fixed‑length, suitable for values like MD5 hashes.

Blob / Text – for large binary or textual data.

Temporal types :

DateTime – 8 bytes; covers a wide date range (years 1000–9999), stored independent of time zone.

Timestamp – 4 bytes; stores seconds since the UNIX epoch, so its range is limited to 1970–2038, and values are converted according to the session time zone.

Optimization tips:

Choose the most appropriate type (e.g., store timestamps as integers, not strings).

Prefer smaller types (use TinyInt instead of Int when possible).

Use integer identifier columns rather than strings.

Avoid ORM‑generated schemas that default to large VarChar columns.

Balance normalization and denormalization based on read/write patterns.

Build summary or cache tables for heavy‑weight reporting.

During schema migrations, use shadow tables to preserve data while swapping names.
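The shadow‑table swap can be sketched as follows (the _new/_old suffixes and the added column are illustrative):

```sql
-- Build the new schema alongside the live table.
CREATE TABLE student_new LIKE student;
ALTER TABLE student_new ADD COLUMN email VARCHAR(64);

-- Backfill, then swap both names in a single atomic statement.
INSERT INTO student_new (id, name, age) SELECT id, name, age FROM student;
RENAME TABLE student TO student_old, student_new TO student;
-- student_old is kept as a fallback and can be dropped once verified.
```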

2.2 Indexes

Indexes consist of one or more column values; MySQL can only use the leftmost prefix of a composite index.

Benefits:

Reduce scanned rows.

Avoid sorting and temporary tables.

Convert random I/O to sequential I/O.

B‑Tree – most common index type; supports full key, range, and prefix lookups.

Limitations:

Query must start with the leftmost indexed column.

Skipping indexed columns invalidates the index.

Range conditions disable use of columns to the right of the range.
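Assuming a composite index on (last_name, first_name, age), the leftmost‑prefix rules above play out like this (table and column names are illustrative):

```sql
ALTER TABLE person ADD INDEX idx_name_age (last_name, first_name, age);

-- Uses the index: the query starts at the leftmost column.
SELECT * FROM person WHERE last_name = 'Smith' AND first_name = 'Anna';

-- Cannot use the index: last_name is skipped.
SELECT * FROM person WHERE first_name = 'Anna';

-- Uses the index for last_name and the range on first_name, but age
-- (to the right of the range condition) cannot be used for filtering.
SELECT * FROM person
WHERE last_name = 'Smith' AND first_name LIKE 'A%' AND age = 20;
```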

Hash index – works only for equality matches; cannot be used for sorting or partial matches.

Optimization advice:

Understand each index’s applicable scenarios and constraints.

Avoid indexing expressions or function arguments.

Use prefix indexes for long strings.

Combine multiple columns with AND/OR as needed.

Eliminate redundant indexes.

Place range conditions at the end of composite indexes.

Avoid overly long string indexes and nullable indexed columns.
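For the prefix‑index advice above, a suitable prefix length can be chosen by measuring selectivity first (the page table, url column, and lengths are illustrative):

```sql
-- Compare prefix selectivity against full-column selectivity.
SELECT COUNT(DISTINCT LEFT(url, 10)) / COUNT(*) AS sel10,
       COUNT(DISTINCT LEFT(url, 20)) / COUNT(*) AS sel20,
       COUNT(DISTINCT url) / COUNT(*)           AS sel_full
FROM   page;

-- Once a prefix length approaches full selectivity, index only that much.
ALTER TABLE page ADD INDEX idx_url (url(20));
```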

3 Query‑time Optimization

3.1 Three Key Query Quality Metrics

Response time (service time + queue time).

Rows scanned.

Rows returned.

3.2 Query Optimization Points

Avoid SELECT * ; fetch only needed columns.

Avoid scanning irrelevant rows.

Chunk large operations (e.g., delete 10,000 rows in batches).

Decompose joins into separate queries when appropriate.

Use COUNT(*) for total row count.

Group by primary key for efficient aggregation.

Defer joins: filter and page on a covering index first, then join back for the full rows.

Optimize LIMIT pagination using covering indexes.

Prefer UNION ALL over UNION when duplicate removal is unnecessary.

For example, LIMIT pagination can be optimized with a deferred join: the subquery pages through only id values ordered by age, and the outer join fetches full rows for just the five ids returned:

SELECT id, NAME, age
FROM student s1
INNER JOIN (
    SELECT id
    FROM student
    ORDER BY age
    LIMIT 50, 5
) AS s2 ON s1.id = s2.id;
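Chunking a large delete, as recommended above, could be sketched like this (the message table and cutoff date are illustrative):

```sql
-- Delete old rows 10,000 at a time, pausing between batches so each
-- statement holds locks briefly and replication lag stays small.
DELETE FROM message WHERE created < '2015-01-01' LIMIT 10000;
-- Repeat (from application code or a loop) until ROW_COUNT() returns 0.
```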

Additional Tips

Data type mismatches cause implicit conversion, disabling index usage.

Leading % in LIKE prevents index usage.
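Assuming indexes on the phone and name columns (both illustrative), the two points above look like this in practice:

```sql
-- Implicit conversion: if phone is VARCHAR, comparing it to a number
-- forces a cast on every row, so the index on phone is not used.
SELECT * FROM student WHERE phone = 13800000000;   -- index ignored
SELECT * FROM student WHERE phone = '13800000000'; -- index used

-- A leading wildcard defeats the B-Tree's left-to-right ordering.
SELECT * FROM student WHERE name LIKE '%son';  -- full scan
SELECT * FROM student WHERE name LIKE 'Jo%';   -- index range scan
```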

MySQL 5.7 new features: generated columns and JSON support.

CREATE TABLE triangle (
    sidea DOUBLE,
    sideb DOUBLE,
    area DOUBLE AS (sidea * sideb / 2)
);
INSERT INTO triangle(sidea, sideb) VALUES (3, 4);
SELECT * FROM triangle;
+-------+-------+------+
| sidea | sideb | area |
+-------+-------+------+
|   3   |   4   |  6   |
+-------+-------+------+
CREATE TABLE json_test (name JSON);
INSERT INTO json_test VALUES ('{"name1": "value1", "name2": "value2"}');
SELECT * FROM json_test WHERE JSON_CONTAINS_PATH(name, 'one', '$.name1');

EXPLAIN Usage

EXPLAIN SELECT settleId FROM Settle WHERE settleId = "3679";

Key fields in the EXPLAIN output: select_type (kind of query, e.g. SIMPLE or SUBQUERY), type (access method, from const down to a full scan, ALL), possible_keys (candidate indexes), key (the index actually chosen), key_len (bytes of the index used), rows (estimated rows examined), and Extra (notes such as Using index or Using filesort).
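As a quick sketch of reading the output, access types in the type column rank roughly from best to worst as const > eq_ref > ref > range > index > ALL (the amount column below is illustrative):

```sql
-- Point lookup on an indexed column: expect type = const or ref.
EXPLAIN SELECT settleId FROM Settle WHERE settleId = '3679';

-- Filtering on an unindexed column: expect type = ALL (full table scan)
-- and "Using where" in the Extra column.
EXPLAIN SELECT * FROM Settle WHERE amount > 100;
```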


Tags: MySQL, Database Optimization, Indexes, Transactions, Schema Design, Query Tuning
Written by Java Architect Essentials

Committed to sharing quality articles and tutorials to help Java programmers progress from junior to mid-level to senior architect. We curate high-quality learning resources, interview questions, videos, and projects from across the internet to help you systematically improve your Java architecture skills. Follow and reply '1024' to get Java programming resources. Learn together, grow together.
