
Unlock MySQL Performance: Deep Dive into Query Optimization and Index Design

This article explains MySQL's three‑layer logical architecture, the end‑to‑end query execution flow, and the inner workings of the optimizer, cache, and storage engine, then offers practical schema, index, and specific‑query tuning techniques to boost database performance.

Code Ape Tech Column

MySQL Logical Architecture

MySQL is organized into three layers: the client layer (connection handling, authentication, security), the server layer (SQL parsing, analysis, optimization, built‑in functions, stored procedures, triggers, views, and, in MySQL 5.7 and earlier, the query cache), and the storage‑engine layer (actual data storage and retrieval). The server talks to every storage engine through a common handler API that hides engine‑specific details.

MySQL Query Process

When a client sends a query, a server with the query cache enabled (MySQL 5.7 and earlier) first checks the cache; on a hit, the cached result is returned without parsing or planning. Otherwise the server parses the SQL, validates it, and runs the optimizer to generate an execution plan. The execution engine then retrieves data through the storage‑engine handler API and streams the results back to the client.

Client/Server Communication Protocol

The MySQL client/server protocol is half‑duplex: at any moment only one side transmits data. The client sends the whole query in a single packet, whose size is limited by max_allowed_packet. The server's response is usually split into many packets, and the client must receive the full result set rather than fetch only the first few rows. This is why using SELECT * and omitting LIMIT can cause excessive network traffic.
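
As a small illustration of the two points above, the statements below inspect the packet limit and then request only what the caller needs. The columns id and created_at are assumed for illustration; only the table name t_message appears in this article.

```sql
-- Largest single packet (in bytes) the server will accept from a client.
SHOW VARIABLES LIKE 'max_allowed_packet';

-- Name the needed columns and cap the row count instead of SELECT * without LIMIT.
-- id and created_at are hypothetical column names.
SELECT id, created_at
FROM t_message
ORDER BY id
LIMIT 10;
```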

Query Cache

Before parsing, MySQL checks whether the query cache is enabled and whether the statement exactly matches a cached entry (based on the query text, database, and protocol version). Cache entries are invalidated whenever any referenced table changes. Cache look‑ups and writes add overhead, so enabling the cache is only beneficial when the saved work outweighs the extra cost. Note that the query cache was deprecated in MySQL 5.7.20 and removed in MySQL 8.0, so this step applies only to older servers.
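
On a MySQL 5.7 or earlier server, one way to judge whether the cache is paying for itself is to look at its configuration and counters. This is a sketch only; on MySQL 8.0 the query_cache_* variables and Qcache_* counters no longer exist, so these statements return empty sets there.

```sql
-- Is the cache available, and how is it configured?
SHOW VARIABLES LIKE 'have_query_cache';
SHOW VARIABLES LIKE 'query_cache_type';
SHOW VARIABLES LIKE 'query_cache_size';

-- Hit, insert, and prune counters (Qcache_hits, Qcache_inserts,
-- Qcache_lowmem_prunes, Qcache_not_cached, ...).
SHOW STATUS LIKE 'Qcache%';
```

A low ratio of Qcache_hits to Qcache_inserts, or a steadily growing Qcache_lowmem_prunes, suggests the cache is costing more than it saves.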

Syntax Parsing and Preprocessing

The parser builds a parse tree, validates syntax rules, and the preprocessor checks object existence (tables, columns) and other semantic constraints.

Query Optimization

MySQL uses a cost‑based optimizer that estimates the cost of each candidate execution plan from table and index statistics, then picks the cheapest. The session status variable Last_query_cost shows the estimated cost of the most recently compiled query. Inaccurate statistics, user‑defined functions (whose cost the optimizer cannot estimate), or optimizer heuristics can lead to sub‑optimal plans.

mysql> select * from t_message limit 10;
mysql> show status like 'Last_query_cost';
+-----------------+-------------+
| Variable_name   | Value       |
+-----------------+-------------+
| Last_query_cost | 6391.799000 |
+-----------------+-------------+
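
Since stale statistics are one cause of poor plans, a reasonable first check is to refresh them and look at the plan the optimizer actually chose. The sketch below reuses the t_message table from the example above and assumes MySQL 5.7 or later for the JSON cost fields.

```sql
-- Recompute the index statistics the optimizer relies on.
ANALYZE TABLE t_message;

-- Show the chosen plan; in MySQL 5.7+ the JSON output includes a
-- cost_info section with the estimated query_cost.
EXPLAIN FORMAT=JSON
SELECT * FROM t_message LIMIT 10;
```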

Execution Engine

After the optimizer produces a plan, the execution engine invokes the storage‑engine handler API for each table. Each table gets a handler instance that supplies column metadata, index information, and row data.
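
The handler calls made for a statement can be observed through the session's Handler_* status counters. The following is a minimal sketch; FLUSH STATUS needs the RELOAD privilege, and the query is the same t_message example used earlier.

```sql
-- Reset this session's status counters.
FLUSH STATUS;

SELECT * FROM t_message LIMIT 10;

-- Handler_read_key counts index lookups; Handler_read_rnd_next counts
-- rows read while scanning the table sequentially.
SHOW SESSION STATUS LIKE 'Handler_read%';
```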

Result Return

Results are streamed back to the client row by row. Each row is sent as a packet; if the query cache is enabled and the query is cacheable, the result is also stored in the cache for future reuse.

Performance Optimization Recommendations

Schema and Data‑Type Design

Avoid unnecessary NULL columns; use NOT NULL when the column will be indexed.

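Column width specifications such as INT(11) only set a display width; they have no effect on storage size, value range, or performance, and integer display widths are deprecated as of MySQL 8.0.17.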

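Use DECIMAL only when exact fractional arithmetic is required; otherwise prefer the smallest integer type that fits (e.g., INT, BIGINT), and consider storing amounts as integers in the smallest unit. Choose between TIMESTAMP and DATETIME deliberately: TIMESTAMP takes 4 bytes and DATETIME 5 bytes (8 bytes before MySQL 5.6.4), each plus up to 3 bytes for fractional seconds. TIMESTAMP is stored in UTC and covers only 1970 to 2038.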

Keep the number of columns reasonable; very wide tables increase CPU usage because every row is converted between the storage engine's row format and the server's row buffer.

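Altering large tables can be costly: with the table‑copy algorithm, MySQL creates a new table, copies the data, then drops the old one. Many InnoDB changes can instead run in place, depending on the operation and server version; where they cannot, consider online schema‑change tools for massive tables.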
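
The recommendations above can be sketched as a table definition. The table demo_message and all of its columns are hypothetical and chosen only for illustration.

```sql
-- Hypothetical table illustrating the data-type guidance.
CREATE TABLE demo_message (
  id           BIGINT UNSIGNED  NOT NULL AUTO_INCREMENT,
  sender_id    INT UNSIGNED     NOT NULL,            -- indexed later, so NOT NULL
  status       TINYINT UNSIGNED NOT NULL DEFAULT 0,  -- smallest type that fits
  amount_cents BIGINT           NOT NULL DEFAULT 0,  -- money as an integer, not DECIMAL
  body         VARCHAR(1000)    NOT NULL,
  read_at      DATETIME         NULL,                -- NULL is meaningful: not read yet
  created_at   DATETIME         NOT NULL,
  PRIMARY KEY (id)
) ENGINE=InnoDB;

-- Request an in-place, non-blocking change. If InnoDB cannot honor
-- ALGORITHM/LOCK for this operation, the statement fails instead of
-- silently falling back to a full table copy.
ALTER TABLE demo_message
  ADD COLUMN edited_at DATETIME NULL,
  ALGORITHM=INPLACE, LOCK=NONE;
```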

High‑Performance Index Creation

Too many indexes increase disk usage and memory consumption, and every extra index must be maintained on each INSERT, UPDATE, and DELETE; create indexes deliberately based on query patterns.

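InnoDB, the default storage engine, implements indexes as B+Trees. Internal pages store keys and pointers to child pages. Leaf pages store the indexed keys together with either the full row (for the clustered primary key) or the primary‑key value (for a secondary index).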

B+Tree nodes are sized to a page (16 KB by default in InnoDB), so a node can be read with a single I/O operation. Because each node holds many keys, the tree stays shallow, which minimizes disk seeks.

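When a leaf page becomes full, an insert splits it into two pages and may push a new key up into the parent. In textbook B+Tree algorithms, moving keys into a sibling page that still has free space (a rotation, loosely comparable to AVL rotations) can avoid some splits and thus reduce I/O.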
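
As a sketch of creating an index from a query pattern, the example below assumes a hypothetical InnoDB table demo_message with columns id (primary key), sender_id, and created_at. The index name is also an assumption.

```sql
-- Composite index ordered to match the WHERE and ORDER BY of the query below.
CREATE INDEX idx_sender_created ON demo_message (sender_id, created_at);

-- If the optimizer picks the index, the `key` column of the EXPLAIN output
-- names idx_sender_created and `Extra` does not report "Using filesort".
EXPLAIN
SELECT id, created_at
FROM demo_message
WHERE sender_id = 42
ORDER BY created_at DESC
LIMIT 20;
```

Because InnoDB secondary indexes also carry the primary key, this query can be answered from the index alone, without touching the table rows.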

Specific Query Optimizations

COUNT() Optimization

COUNT(*) counts rows directly and is usually faster than COUNT(col), which must examine each value and skip NULLs. When an approximate row count is enough, read the rows estimate from EXPLAIN output, which does not execute the query.
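
The difference between the two forms, and the EXPLAIN estimate, might look like this on a hypothetical table demo_message whose read_at column is nullable:

```sql
SELECT COUNT(*)       AS total_rows,  -- every row
       COUNT(read_at) AS read_rows    -- only rows where read_at IS NOT NULL
FROM demo_message;

-- Estimate only: the `rows` column is the optimizer's guess, and the
-- SELECT itself is not executed.
EXPLAIN SELECT * FROM demo_message;
```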

JOIN Optimization

Ensure that the column used in the ON or USING clause of the second table in the join order has an index, since that table is probed once for each row of the first. Keep GROUP BY and ORDER BY expressions limited to columns from a single table so the optimizer can use an index for them.
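
A minimal sketch of that rule, assuming two hypothetical tables: demo_user(id primary key, name) and demo_message(id primary key, sender_id, created_at).

```sql
-- If demo_user is read first, demo_message is probed once per user row,
-- so the probe column needs an index. Skip this step if an existing
-- index already starts with sender_id.
ALTER TABLE demo_message ADD INDEX idx_msg_sender (sender_id);

-- The `key` column for table m should show an index on sender_id.
EXPLAIN
SELECT u.name, m.id, m.created_at
FROM demo_user AS u
JOIN demo_message AS m ON m.sender_id = u.id
WHERE u.id = 42
ORDER BY m.created_at;  -- sort column comes from one table only
```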

LIMIT Pagination

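Large offsets (e.g., LIMIT 10000, 20) force MySQL to read 10,020 rows and discard the first 10,000. To avoid the costly offset, use keyset pagination (WHERE id > last_id ORDER BY id LIMIT 20), or locate the page's ids through a covering index first and then join back to fetch the full rows.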
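
The three variants can be compared side by side. This sketch assumes a hypothetical InnoDB table demo_message(id primary key, sender_id, created_at) with an index on created_at; the literal 10020 stands in for the last id seen on the previous page.

```sql
-- Offset pagination: full rows are read and thrown away for the first 10,000.
SELECT id, sender_id, created_at
FROM demo_message
ORDER BY created_at, id
LIMIT 10000, 20;

-- Keyset pagination: start right after the last id already shown.
SELECT id, sender_id, created_at
FROM demo_message
WHERE id > 10020
ORDER BY id
LIMIT 20;

-- Deferred join: the subquery walks only the index to find 20 ids,
-- then the outer query fetches full rows for just those ids.
SELECT m.id, m.sender_id, m.created_at
FROM demo_message AS m
JOIN (
  SELECT id
  FROM demo_message
  ORDER BY created_at, id
  LIMIT 10000, 20
) AS page ON page.id = m.id
ORDER BY m.created_at, m.id;
```

Keyset pagination only supports moving to the next or previous page, so the deferred join is the usual fallback when users can jump to an arbitrary page number.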

UNION Optimization

Prefer UNION ALL unless duplicate elimination is required; UNION forces a temporary table with a distinct check, which is expensive. Push WHERE, LIMIT, and ORDER BY into each subquery so the optimizer can apply indexes.
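
For example, fetching the 20 newest messages across a live table and an archive might be written as follows. Both tables (demo_message and demo_message_archive) and their columns are hypothetical.

```sql
-- Each branch filters, sorts, and limits on its own, so at most 40 rows
-- reach the final sort instead of every matching row from both tables.
(SELECT id, created_at
 FROM demo_message
 WHERE sender_id = 42
 ORDER BY created_at DESC
 LIMIT 20)
UNION ALL
(SELECT id, created_at
 FROM demo_message_archive
 WHERE sender_id = 42
 ORDER BY created_at DESC
 LIMIT 20)
ORDER BY created_at DESC
LIMIT 20;
```

UNION ALL is safe here on the assumption that a message never exists in both tables at once; if it could, plain UNION would be needed to remove the duplicates.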

Conclusion

Understanding MySQL's internal execution steps, the cost of each phase, and the behavior of the optimizer and cache lets developers apply targeted tuning techniques (schema design, appropriate data types, well‑designed indexes, and query‑level adjustments) to achieve measurable performance gains.
