Comprehensive Overview of MySQL Architecture, Logs, Indexes, Transactions, Locks, MVCC, Buffer Pool, and Optimization
This article provides an in‑depth guide to MySQL internals, covering the SQL execution process, server and storage engine layers, binlog/redo/undo logs, index structures, transaction isolation levels, lock types, MVCC implementation, buffer pool management, and practical optimization techniques.
It begins with MySQL's internal architecture: the Server layer (connector, query cache, parser, optimizer, executor) and the Storage Engine layer, which uses a plug-in model and defaults to InnoDB. Note that the query cache was removed in MySQL 8.0, so that step applies only to older versions.
It then describes the three key MySQL logs: the BinLog (the Server-layer binary log, which records DDL and DML for replication and recovery), the RedoLog (InnoDB's write-ahead log, which provides crash safety), and the UndoLog (used for transaction rollback and snapshot reads). Only the BinLog is a "binary log" in MySQL's terminology; the redo and undo logs belong to InnoDB. The three BinLog formats (STATEMENT, ROW, and MIXED) are compared with their advantages and disadvantages, and the asynchronous master-slave replication workflow is detailed.
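The difference between the STATEMENT and ROW formats can be illustrated with a toy simulation. This is only a sketch: the dictionaries stand in for tables, and `run_statement`, `primary`, `replica_statement`, and `replica_row` are hypothetical names, not MySQL APIs. It assumes a nondeterministic statement such as one calling `UUID()`, which MySQL treats as unsafe for statement-based logging.

```python
import uuid


def run_statement(table):
    """Stand-in for a nondeterministic statement such as
    UPDATE t SET token = UUID() WHERE id = 1."""
    table[1] = uuid.uuid4().hex


# The primary executes the statement once.
primary = {1: None}
run_statement(primary)

# STATEMENT format: the replica re-executes the statement text,
# so the UUID is generated again and differs from the primary's.
replica_statement = {1: None}
run_statement(replica_statement)

# ROW format: the replica applies the logged row image as-is.
row_event = {1: primary[1]}
replica_row = {1: None}
replica_row.update(row_event)

print(replica_statement == primary)  # False: the replica has diverged
print(replica_row == primary)        # True: the replica matches
```

The trade-off is that row images are usually larger than statement text, which is why MIXED format logs statements by default and switches to rows only for statements it considers unsafe.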
The article covers index fundamentals, including hash tables, ordered arrays, and B+ trees, and explains why B+ trees are preferred in InnoDB. It lists index types (primary, unique, secondary), design principles, and common scenarios that cause index invalidation such as leading‑wildcard searches and implicit type conversion.
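Why a leading wildcard defeats an ordered index can be sketched with a sorted list, which behaves like the leaf level of a B+ tree for this purpose. The function names below are hypothetical, and the `"\uffff"` upper bound assumes keys contain no characters at or above that code point.

```python
import bisect


def prefix_search(sorted_keys, prefix):
    """Like WHERE name LIKE 'prefix%': binary-search to the range."""
    lo = bisect.bisect_left(sorted_keys, prefix)
    hi = bisect.bisect_left(sorted_keys, prefix + "\uffff")
    # Only the matching slice is read.
    return sorted_keys[lo:hi], hi - lo


def suffix_search(sorted_keys, suffix):
    """Like WHERE name LIKE '%suffix': sort order gives no help,
    so every key has to be examined."""
    matches = [k for k in sorted_keys if k.endswith(suffix)]
    return matches, len(sorted_keys)


keys = sorted(["alice", "alan", "bob", "carol", "malice"])

print(prefix_search(keys, "al"))    # (['alan', 'alice'], 2)
print(suffix_search(keys, "lice"))  # (['alice', 'malice'], 5)
```

The second number in each result is the count of keys examined: the prefix search touches only its matches, while the suffix search degrades to a full scan.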
Transactions are discussed through the ACID properties, the four isolation levels (READ UNCOMMITTED, READ COMMITTED, REPEATABLE READ, SERIALIZABLE), and the phenomena they prevent (dirty read, non-repeatable read, phantom read). InnoDB's default isolation level, REPEATABLE READ (RR), and its effect on phantom reads are explained.
Locking mechanisms are examined, contrasting pessimistic and optimistic locks, and detailing MySQL lock granularity: table‑level, page‑level, and row‑level locks. InnoDB lock types—including shared, exclusive, intention, gap, next‑key, and auto‑inc locks—are listed with brief usage notes.
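Optimistic locking is commonly implemented with a version column rather than a database lock. The sketch below imitates that pattern in memory; the `row` dictionary and `optimistic_update` are hypothetical stand-ins for a table row and an `UPDATE ... WHERE id = ? AND version = ?` statement.

```python
def optimistic_update(row, expected_version, new_value):
    """Mimics:
        UPDATE t SET value = ?, version = version + 1
        WHERE id = ? AND version = ?
    Returns True if the row was updated, False if zero rows matched."""
    if row["version"] != expected_version:
        return False  # another writer changed the row first
    row["value"] = new_value
    row["version"] += 1
    return True


row = {"id": 1, "value": 100, "version": 1}

# Two clients read the row at the same version.
seen_by_a = row["version"]
seen_by_b = row["version"]

ok_a = optimistic_update(row, seen_by_a, 90)
ok_b = optimistic_update(row, seen_by_b, 80)

print(ok_a, ok_b)  # True False
print(row)         # {'id': 1, 'value': 90, 'version': 2}
```

The losing client is expected to re-read the row and retry. A pessimistic approach would instead take the row lock up front, for example with `SELECT ... FOR UPDATE`.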
MVCC (Multi-Version Concurrency Control) is introduced, describing the hidden per-row fields (DB_TRX_ID, DB_ROLL_PTR, and DB_ROW_ID, plus a delete-mark flag kept in the record header), the role of undo logs in chaining older row versions, and how a consistent read view is built to enable snapshot reads without blocking.
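The visibility rule behind a snapshot read can be sketched as follows. This is a simplified model, not InnoDB's actual code: `ReadView`, `Version`, and `snapshot_read` are hypothetical names, and the version chain stands in for the undo-log records reached through DB_ROLL_PTR.

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional


@dataclass(frozen=True)
class ReadView:
    creator_trx_id: int
    active_trx_ids: FrozenSet[int]  # uncommitted when the view was created
    next_trx_id: int                # id the next new transaction would get

    def is_visible(self, trx_id):
        if trx_id == self.creator_trx_id:
            return True   # a transaction always sees its own changes
        if trx_id >= self.next_trx_id:
            return False  # started after the view was created
        return trx_id not in self.active_trx_ids  # committed before the view


@dataclass
class Version:
    value: str
    trx_id: int                        # plays the role of DB_TRX_ID
    prev: Optional["Version"] = None   # plays the role of DB_ROLL_PTR


def snapshot_read(latest, view):
    """Walk from the newest version back until one is visible."""
    v = latest
    while v is not None and not view.is_visible(v.trx_id):
        v = v.prev
    return v.value if v is not None else None


# Version chain: trx 10 wrote "a", trx 20 wrote "b", trx 30 wrote "c".
chain = Version("c", 30, Version("b", 20, Version("a", 10)))

# View taken by trx 25 while trx 20 was still uncommitted.
old_view = ReadView(creator_trx_id=25,
                    active_trx_ids=frozenset({20, 25}),
                    next_trx_id=26)
# View taken later by trx 40, after 20 and 30 committed.
new_view = ReadView(creator_trx_id=40,
                    active_trx_ids=frozenset({40}),
                    next_trx_id=41)

print(snapshot_read(chain, old_view))  # a
print(snapshot_read(chain, new_view))  # c
```

In this model, REPEATABLE READ corresponds to reusing one view for the whole transaction, while READ COMMITTED corresponds to building a fresh view for each statement.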
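The buffer pool is explained, highlighting its division into new (young) and old generations, its LRU replacement with newly read pages inserted at the head of the old generation rather than the head of the whole list, and issues such as read-ahead failure and buffer pool pollution, with mitigation strategies.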
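How the old/new split protects hot pages from a large scan can be sketched with a small simulation. It is a simplified model under stated assumptions: `MidpointLRU` is a hypothetical class, the old generation gets roughly 3/8 of the pool (similar to InnoDB's default `innodb_old_blocks_pct`), and a page is promoted only if it is touched again at least `old_block_time` seconds after it was loaded (in the spirit of `innodb_old_blocks_time`).

```python
from collections import OrderedDict


class MidpointLRU:
    def __init__(self, capacity, old_ratio=0.375, old_block_time=1.0):
        self.capacity = capacity
        self.young_capacity = capacity - max(1, int(capacity * old_ratio))
        self.old_block_time = old_block_time
        self.young = OrderedDict()  # hot pages, most recently used last
        self.old = OrderedDict()    # page_id -> load time, newest last

    def access(self, page_id, now):
        if page_id in self.young:
            self.young.move_to_end(page_id)
            return "hit-young"
        if page_id in self.old:
            # Promote only if the page survived long enough in the old list.
            if now - self.old[page_id] >= self.old_block_time:
                del self.old[page_id]
                self.young[page_id] = None
                self._demote_overflow()
                return "promoted"
            return "hit-old"
        # Miss: new pages enter at the head of the OLD generation.
        self.old[page_id] = now
        while len(self.young) + len(self.old) > self.capacity:
            self.old.popitem(last=False)  # evict from the old tail
        return "miss"

    def _demote_overflow(self):
        # The coldest young pages fall back to the head of the old list.
        while len(self.young) > self.young_capacity:
            page_id, _ = self.young.popitem(last=False)
            self.old[page_id] = float("-inf")  # eligible for re-promotion


pool = MidpointLRU(capacity=8)

# Load five hot pages, then touch them again later so they get promoted.
for page in range(1, 6):
    pool.access(page, now=0.0)
for page in range(1, 6):
    pool.access(page, now=2.0)

# A one-pass scan of 20 pages churns only the old generation.
for page in range(100, 120):
    pool.access(page, now=3.0)

print(sorted(pool.young))  # [1, 2, 3, 4, 5]
print(list(pool.old))      # [117, 118, 119]
```

With a plain LRU of the same size, the scan would have evicted all five hot pages. Here the scanned pages evict only each other.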
Table “slimming” techniques are presented, noting that DELETE leaves reusable space (holes) and showing how to rebuild a table or use ALTER TABLE … ENGINE=InnoDB or tools like gh‑ost to reclaim space.
Additional topics include the seven join types, count performance tips, random row selection methods, and the performance differences between EXISTS and IN subqueries.
MySQL optimization is broken into four pillars: SQL & indexes, schema design, system configuration, and hardware. Practical advice covers using covering indexes, avoiding full‑table scans, minimizing data transfer, batching DML, and reducing CPU‑intensive operations such as sorting.
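Batching DML typically means sending one multi-row statement per chunk instead of one statement per row. The helper below is a minimal sketch that only builds the SQL text and parameter list. `chunked` and `build_batch_insert` are hypothetical helpers, the `%s` placeholder style is an assumption that matches common Python MySQL drivers, and the table and column names are interpolated directly, so they must come from trusted code rather than user input.

```python
def chunked(rows, size):
    """Yield successive slices of at most `size` rows."""
    for i in range(0, len(rows), size):
        yield rows[i:i + size]


def build_batch_insert(table, columns, rows):
    """Build one multi-row INSERT plus a flat parameter list."""
    row_placeholder = "(" + ", ".join(["%s"] * len(columns)) + ")"
    sql = "INSERT INTO {} ({}) VALUES {}".format(
        table,
        ", ".join(columns),
        ", ".join([row_placeholder] * len(rows)),
    )
    params = [value for row in rows for value in row]
    return sql, params


rows = [(1, "a"), (2, "b"), (3, "c")]
batches = [build_batch_insert("users", ["id", "name"], chunk)
           for chunk in chunked(rows, 2)]

for sql, params in batches:
    print(sql, params)
# INSERT INTO users (id, name) VALUES (%s, %s), (%s, %s) [1, 'a', 2, 'b']
# INSERT INTO users (id, name) VALUES (%s, %s) [3, 'c']
```

Each `(sql, params)` pair would then be passed to a driver's `execute` call inside a single transaction. The chunk size is a tuning choice that has to stay within the server's packet size limit.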
Higher‑level strategies such as read/write splitting (application‑level or middleware), vertical and horizontal sharding, and the use of distributed ID generators are outlined.
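One widely used design for distributed IDs is a Snowflake-style generator, which packs a timestamp, a worker number, and a per-millisecond sequence into one integer. The source does not name a specific generator, so the sketch below is an assumption throughout: the class name, the bit widths (10 worker bits, 12 sequence bits), and the custom epoch are all illustrative choices.

```python
import time


class SnowflakeLikeId:
    EPOCH_MS = 1_600_000_000_000  # arbitrary custom epoch (assumption)
    WORKER_BITS = 10
    SEQ_BITS = 12

    def __init__(self, worker_id, clock=lambda: int(time.time() * 1000)):
        if not 0 <= worker_id < (1 << self.WORKER_BITS):
            raise ValueError("worker_id out of range")
        self.worker_id = worker_id
        self.clock = clock  # injectable so tests can control time
        self.last_ms = -1
        self.seq = 0

    def next_id(self):
        now = self.clock()
        if now < self.last_ms:
            raise RuntimeError("clock moved backwards")
        if now == self.last_ms:
            self.seq = (self.seq + 1) & ((1 << self.SEQ_BITS) - 1)
            if self.seq == 0:
                # Sequence exhausted in this millisecond: wait for the next.
                while now <= self.last_ms:
                    now = self.clock()
        else:
            self.seq = 0
        self.last_ms = now
        return (((now - self.EPOCH_MS) << (self.WORKER_BITS + self.SEQ_BITS))
                | (self.worker_id << self.SEQ_BITS)
                | self.seq)


gen = SnowflakeLikeId(worker_id=3)
first, second = gen.next_id(), gen.next_id()
print(first < second)  # True: ids from one worker are increasing
```

Because the timestamp occupies the high bits, the IDs are roughly time-ordered, which tends to suit B+ tree primary keys better than random values. This sketch is not thread-safe; a real deployment would guard `next_id` with a lock and assign each node a unique worker number.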
The article concludes with an introduction to TiDB, a MySQL‑compatible distributed database, describing its strengths (horizontal scalability, strong consistency, MySQL protocol compatibility) and scenarios where it is appropriate or unnecessary.
Full-Stack Internet Architecture
Introducing full-stack Internet architecture technologies centered on Java
