Mastering InnoDB: Table & Index Design Strategies for High‑Performance MySQL
This article explains InnoDB's record storage mechanics, compares auto‑increment and business primary keys, details primary, secondary, and composite index structures, and shares practical table‑design guidelines and real‑world case studies from ZhaiZhai's MySQL usage.
Introduction
ZhaiZhai uses MySQL with the InnoDB engine for its backend storage. Based on the characteristics of both the business and MySQL, the article demonstrates how to design tables and indexes for specific scenarios.
Basic Knowledge
InnoDB Record Storage
InnoDB stores records in primary‑key order, creating a clustered primary‑key index. Records are organized by page, the basic unit of disk‑memory exchange.
Page‑level order : Pages are linked by a doubly‑linked list, often logically ordered.
Within‑page order : Records on a page are linked by a singly‑linked list, enabling near‑binary‑search efficiency via a slot structure.
Different primary‑key choices affect storage:
Auto‑increment primary key : Values increase monotonically, leading to sequential inserts, high disk‑space utilization, low random I/O, and excellent insert performance.
Business primary key : Meaningful keys (e.g., uid, infoId) may cause page splits and fragmentation, reducing write performance and disk utilization.
However, conclusions about primary‑key choice must consider index behavior, which is discussed later.
Primary‑Key Index
InnoDB automatically creates a clustered B+Tree index on the primary key; the leaf nodes store the actual records.
Secondary Index
Non‑primary indexes (secondary indexes) first locate the primary‑key value, then use the clustered index to fetch the row. Hence they are also called “secondary indexes”.
Characteristics of secondary indexes:
All indexes except the primary key.
Leaf nodes store the primary‑key value.
A query must traverse both the secondary index and the primary‑key index.
Composite (Multi‑Column) Index
Composite indexes contain multiple columns; sorting compares the first column, then the second, and so on.
Usage rules:
The leftmost prefix must be used; otherwise the index cannot be applied.
Skipping a middle column prevents later columns from being used.
If a column uses a range condition, columns to its right cannot be used.
For a composite index (a,b,c), queries can use (a), (a,b), or (a,b,c).
Summary of Primary‑Key Choices
Auto‑increment keys offer high write/read efficiency and disk utilization but require two‑level index lookups because most queries do not directly use the primary key.
Business keys have lower write/read efficiency and disk utilization but can be satisfied with a single‑level index when covered indexes are employed.
In ZhaiZhai’s production, business primary keys are preferred despite the SSD‑level I/O advantage of auto‑increment keys being negligible.
Table Design Principles at ZhaiZhai
Primary‑key selection : Business primary keys are used for all online services because they better match read‑heavy workloads.
Number of indexes : No more than five indexes per table to avoid oversized index files.
Column type selection : Prefer small, simple types (e.g., BOOL → TINYINT, enums → TINYINT, monetary values stored as LONG after converting decimals to integers).
Partitioning strategy : Keep table size around tens of millions; use either key‑modulo sharding for uniform load or time‑based sharding for hot/cold data separation.
Case Study 1 – User Table
Fields: uid, nickname, mobile, addr, image…, switch. uid is the primary key; an index on mobile supports the second query pattern.
The switch column is a BIGINT used as a bitmask for up to 64 boolean attributes. High query frequency on this column can be optimized by creating a composite index on (uid, switch), allowing the query to be satisfied entirely from the secondary index without accessing the clustered record.
Case Study 2 – IM Subsystem Partitioning
The IM subsystem has four main tables. Except for system messages, other tables are sharded by uid modulo 128. System messages have a 30‑day TTL, so they are partitioned monthly, each month further split into 128 tables.
To avoid two‑round queries across month partitions when fetching recent messages, ZhaiZhai writes each message to both the current and previous month tables, enabling a single‑month read that covers the 30‑day window.
Key Takeaways
Auto‑increment primary keys are not universally superior; choose based on business query patterns.
Prefer simple, small column types for better performance and storage efficiency.
Limit the number of indexes; excessive indexes inflate index files and degrade write performance.
When a query can be satisfied from an index alone, the storage engine avoids accessing the clustered record, greatly improving speed.
Signed-in readers can open the original source through BestHub's protected redirect.
This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactand we will review it promptly.
21CTO
21CTO (21CTO.com) offers developers community, training, and services, making it your go‑to learning and service platform.
How this landed with the community
Was this worth your time?
0 Comments
Thoughtful readers leave field notes, pushback, and hard-won operational detail here.
