Databases 36 min read

Unlock MySQL Performance: Deep Dive into Query Execution, Indexes, and Optimization Strategies

This article explains MySQL's logical architecture, query processing steps, client/server protocol, query cache behavior, parsing, optimizer cost model, execution engine, result delivery, and provides practical performance‑tuning advice on schema design, index creation, and specific query patterns.

dbaplus Community
dbaplus Community
dbaplus Community
Unlock MySQL Performance: Deep Dive into Query Execution, Indexes, and Optimization Strategies

Introduction

The article revisits common MySQL tuning tips (e.g., avoiding SELECT *, proper data types, indexing) and stresses the importance of understanding the underlying mechanisms to decide when those tips truly improve performance.

MySQL Logical Architecture

MySQL is divided into three logical layers:

Client layer – handles connection, authentication, and security.

Server layer – parses, analyzes, optimizes, caches queries, and provides built‑in functions; it also implements cross‑engine features such as stored procedures, triggers, and views.

Storage‑engine layer – responsible for actual data storage and retrieval; the server communicates with engines via a stable API that abstracts engine differences.

Query Process

When a client sends a query, MySQL follows a six‑step pipeline: receive request, check query cache, parse and preprocess, generate an execution plan via the optimizer, execute the plan using storage‑engine handler APIs, and finally return results (optionally caching them).

Client/Server Communication Protocol

The protocol is half‑duplex: at any moment only one side transmits data. Large queries require increasing max_allowed_packet; the server may reject overly large packets. Responses are sent in one or more packets, and the client must read the entire result set before the server can send more data, which is why keeping result sets small (e.g., avoiding SELECT * and using LIMIT) is recommended.

Query Cache

If enabled, MySQL first checks whether the exact query string (including whitespace and comments) matches a cached entry. The cache is a hash‑based structure keyed by the query text, database, and protocol version. Queries involving user‑defined functions, temporary tables, or system tables are never cached. Cache entries are invalidated whenever any referenced table changes, which adds overhead on writes and even on reads for large or fragmented caches.

Every query is examined for a cache hit, even if it will never be cached.

If a result is cached, storing it after execution also incurs cost.

Therefore, enabling the cache only helps when the saved I/O outweighs these extra costs; otherwise it may degrade performance, especially for write‑intensive workloads.

Syntax Parsing and Pre‑processing

The parser builds a syntax tree from the SQL text, validates keywords and order, and the pre‑processor checks object existence (tables, columns) and other MySQL‑specific rules.

Query Optimization

The optimizer transforms the validated syntax tree into one or more execution plans and selects the plan with the lowest estimated cost using a cost‑based model. The current session variable last_query_cost shows the estimated number of page reads.

mysql> select * from t_message limit 10;
mysql> show status like 'last_query_cost';
+-----------------+-------------+
| Variable_name   | Value       |
| Last_query_cost | 6391.799000|
+-----------------+-------------+

Factors that can cause a sub‑optimal plan include inaccurate statistics, user‑defined functions, or storage‑engine‑specific costs. The optimizer employs many strategies, such as reordering joins, optimizing MIN()/MAX(), early termination with LIMIT, and more efficient sorting.

Execution Engine

After the plan is chosen, MySQL invokes the handler API of the relevant storage engine for each table. Each table is represented by a handler instance that provides column metadata, index statistics, and row access methods. Although the handler API set is small, it is used extensively to fetch rows, evaluate expressions, and manage locks.

Result Return

The final stage streams rows back to the client packet by packet. Even if a query returns no rows, MySQL still sends metadata such as affected‑row count and execution time. When the query cache is active and the result is cacheable, the result is stored for future reuse.

Performance Tips

The article groups practical advice into three areas.

1. Schema and Data‑Type Design

Prefer small, simple types (e.g., INT over VARCHAR for IP addresses, DATETIME for timestamps).

NULL vs NOT NULL matters mainly for index creation.

INT(M) display width has no effect on storage.

UNSIGNED doubles the positive range.

Avoid DECIMAL for most cases; use BIGINT with scaling when exact precision is needed.

TIMESTAMP uses 4 bytes and is limited to 1970‑2038; DATETIME uses 8 bytes and is timezone‑agnostic.

Limit the number of columns to reduce row‑buffer copying overhead.

Large tables should be altered by creating a new table and swapping, not by in‑place ALTER.

2. High‑Performance Index Creation

Too many indexes increase disk and memory usage.

Plan indexes before writing code; adding them later requires extensive monitoring.

Understand B‑Tree/B+Tree fundamentals: leaf pages store keys, internal pages store pointers, and node size is aligned to the OS page size to minimize I/O.

Index splits and rotations are expensive; keep node fill factors balanced.

3. Specific Query Optimizations

COUNT() – use COUNT(*) for row counts; it is faster than counting a specific column.

JOINs – MySQL uses nested‑loop joins; ensure the second table’s join column is indexed.

LIMIT pagination – large offsets cause MySQL to read and discard many rows; use covering indexes or “keyset pagination” (e.g., WHERE id > last_id LIMIT n).

UNION – prefers UNION ALL unless duplicate elimination is required; push down WHERE/LIMIT/ORDER BY into subqueries to let the optimizer use indexes.

Conclusion

Understanding MySQL’s internal execution flow, cost model, and storage structures enables developers to evaluate whether a given optimization truly benefits their workload, choose appropriate data types, design effective indexes, and apply query‑level tweaks such as covering indexes, keyset pagination, and proper use of UNION.

Original Source

Signed-in readers can open the original source through BestHub's protected redirect.

Sign in to view source
Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactadmin@besthub.devand we will review it promptly.

query optimizationperformance tuningDatabase Architecturemysqlindexes
dbaplus Community
Written by

dbaplus Community

Enterprise-level professional community for Database, BigData, and AIOps. Daily original articles, weekly online tech talks, monthly offline salons, and quarterly XCOPS&DAMS conferences—delivered by industry experts.

0 followers
Reader feedback

How this landed with the community

Sign in to like

Rate this article

Was this worth your time?

Sign in to rate
Discussion

0 Comments

Thoughtful readers leave field notes, pushback, and hard-won operational detail here.