Understanding MySQL Communication Protocols, Parsing, Optimizer, Storage Engines, and Execution Engine
This article explains how MySQL establishes connections, the supported communication protocols and message formats, the lexical and syntactic parsing process, query optimization and execution plan generation, the role of different storage engines, and how the execution engine uses the plan to operate on the storage layer.
1. Communication Protocol
Before a program can operate a database, it must first establish a connection to the database server, which typically runs a MySQL service listening on the default port 3306.
Two key aspects to understand when integrating with a third‑party system are:
The communication protocol (e.g., HTTP, WebService, TCP).
The message format (e.g., XML, JSON, fixed‑length) and the structure of the packet header.
MySQL supports multiple communication protocols and can work in synchronous or asynchronous modes, with either long or short connections.
Synchronous Communication
Synchronous communication blocks the client thread until the database returns a result, limiting concurrency to a one‑to‑one interaction.
Asynchronous Communication
Asynchronous communication avoids blocking but does not reduce the time spent executing SQL; each concurrent query typically requires its own connection, which can increase server load, so a connection pool is recommended.
Long and Short Connections
MySQL supports both short connections (closed immediately after use) and long connections (kept open to reduce connection overhead). Long connections are usually managed via a connection pool, but idle connections may be terminated by the server.
show global variables like 'wait_timeout'; -- non‑interactive timeout, e.g., JDBC
show global variables like 'interactive_timeout'; -- interactive timeout, e.g., client toolsThe default timeout is 28800 seconds (8 hours). You can also query status variables:
show status; Threads_cached: number of cached thread connections.
Threads_connected: current open connections.
Threads_created: threads created to handle connections.
Threads_running: threads not in sleep state (concurrent connections).Supported MySQL Communication Protocols
UnixSocket
On Linux, if no host is specified, MySQL connects via a Unix socket file (e.g., /var/lib/mysql/mysql.sock) without using the network stack.
select @@socket;TCP/IP
Most client libraries (e.g., mysql‑connector‑java) use TCP/IP to reach the MySQL server.
2. Syntax Parsing and Preprocessing
When an SQL statement is sent, MySQL's parser and preprocessor analyze the statement's lexical tokens, syntax, and semantics, producing a parse tree (select_lex).
Lexical Analysis
Lexical analysis breaks a full SQL statement into individual symbols. For example:
select name from user where id = 1;This statement is tokenized into eight symbols, each with a type and position.
Syntactic Analysis
The syntax analyzer checks for errors such as unmatched quotes and builds a parse tree according to MySQL's grammar.
Preprocessor
The preprocessor validates semantic aspects like the existence of tables and columns, producing a refined parse tree before execution.
3. Query Optimizer and Execution Plan
After parsing, the optimizer generates multiple possible execution plans and selects the one with the lowest estimated cost.
You can view the cost with:
show status like 'Last_query_cost';What the Optimizer Does
Examples include choosing the base table for multi‑table joins and selecting the best index when several are available.
Generating an Execution Plan
Enable optimizer tracing (note the performance overhead):
SHOW VARIABLES LIKE 'optimizer_trace';
SET optimizer_trace='enabled=on';Run a query, then inspect the trace:
SELECT * FROM information_schema.optimizer_trace\GThe trace JSON contains preparation, optimization, and execution phases, listing expanded queries and considered execution plans.
Viewing the Chosen Plan
Use EXPLAIN to see the plan MySQL intends to use (the actual plan may differ at runtime):
EXPLAIN SELECT name FROM user WHERE id=1;4. Storage Engine
Each MySQL table is stored using a specific storage engine, which determines the on‑disk structure.
Inspecting Storage Engines
Show the engine of existing tables:
SHOW TABLE STATUS FROM `xxx`;Common engines include MyISAM, InnoDB, Memory, CSV, and Archive.
Engine Comparison
MyISAM : table‑level locking, no transactions, fast inserts/selects, stores row count.
InnoDB : ACID‑compliant, row‑level locking, supports transactions and foreign keys, MVCC for concurrent reads, stores data in clustered indexes.
Memory : stores data in RAM for fast access, data lost on restart, suitable for temporary tables.
CSV : stores data as comma‑separated text files, no indexes, useful for import/export.
Archive : compact, unindexed tables for large, rarely accessed historical data.
Choosing an Engine
If strong data consistency and transactions are required, use InnoDB.
If read‑heavy workloads with minimal updates are needed, MyISAM may be suitable.
For temporary, fast‑access tables, use Memory.
5. Execution Engine
The execution engine consumes the chosen execution plan and invokes the appropriate storage‑engine APIs to perform the actual data operations.
Because all storage engines expose a common API, switching an engine does not require changes in application code.
After execution, results (or an empty set) are returned to the client.
Top Architecture Tech Stack
Sharing Java and Python tech insights, with occasional practical development tool tips.
How this landed with the community
Was this worth your time?
0 Comments
Thoughtful readers leave field notes, pushback, and hard-won operational detail here.