Master MySQL: Essential Tips for Data Types, Indexes, and Query Optimization
This guide compiles essential MySQL 5.5 techniques for teams without a DBA, covering storage engine characteristics, data type selection, index design—including composite, prefix, and covering indexes—and practical query optimizations such as reducing scanned rows, avoiding redundant indexes, and improving execution plans.
The article summarizes common MySQL usage tips for teams without a DBA, based on MySQL 5.5 and using InnoDB unless otherwise noted.
MySQL Features
Understanding MySQL's features helps use it better. Unlike other databases, MySQL separates storage engines, each responsible for storing and retrieving data. Users can choose an engine that fits their workload or even develop a new one. The logical architecture is roughly as follows:
InnoDB is the default storage engine. Its main characteristics are:
Supports transactions
Supports row‑level locking
Data stored in tablespaces composed of multiple data files
Uses MVCC for high concurrency
Primary‑key‑based clustered index
Supports hot backup
Other common storage engines:
MyISAM – default in older versions, no transactions or row‑level lock, supports full‑text index, cannot recover safely after crash, supports compressed tables
Archive – only INSERT and SELECT, fast bulk inserts, full‑table scans for queries
CSV – treats a CSV file as a table
Memory – stores data in memory
Data Type Optimization
Principles for choosing data types:
Prefer types that occupy less space
Prefer simple types
Avoid unnecessary nullable columns
Smaller types save hardware resources (disk, memory, CPU). Use simple types; for example, use int instead of char because the latter involves character‑set sorting. Also, nullable columns consume extra storage and indexing overhead; set columns to NOT NULL when possible.
Integer Types
Integer types include:
tinyint
smallint
mediumint
int
bigint
They use 8, 16, 24, 32, and 64 bits respectively. Adding unsigned doubles the positive range but removes the ability to store negative numbers. Specifying a display width for integers has no practical effect.
Decimal Types
float
double
decimal float (32‑bit) and double (64‑bit) are the usual floating‑point types; specifying a display width is meaningless. decimal provides exact arithmetic but uses more space (e.g., decimal(18,9) stores 18 digits, 9 of which are fractional, using 9 bytes). For large datasets, consider using bigint with scaling instead of decimal.
String Types
varchar
char
varbinary
binary
blob
text
enum varchar stores the actual length plus 1 or 2 bytes for length metadata, making it space‑efficient for variable‑length strings. char uses fixed space and is suitable when values have similar length. varbinary and binary store binary strings; they are case‑sensitive and faster for comparisons because no character‑set conversion is needed. blob stores large binary data, text stores large textual data; both are kept off‑page with a pointer in the row and should not be indexed except on prefixes.
When a column has many repeated short values, using enum can save space because MySQL stores the enum as an integer internally.
Temporal Types
year
date
time
datetime
timestamp datetime ranges from 1001‑01‑01 to 9999‑12‑31, precise to seconds. timestamp stores seconds since 1970‑01‑01, ranges up to 2038, occupies 4 bytes (half the space of datetime), and updates automatically on INSERT/UPDATE for the first timestamp column.
Choosing Primary Key Types
Prefer integer primary keys (auto‑increment) because they are compact and efficient. Avoid GUIDs or MD5 hashes as primary keys; they are large, random, and cause index fragmentation.
Special Data Types
Store IP addresses as 32‑bit unsigned integers and use INET_ATON() / INET_NTOA() for conversion.
Index Optimization
InnoDB implements indexes with B+ trees. Example table:
CREATE TABLE `people` (
`Id` int(11) NOT NULL AUTO_INCREMENT,
`Name` varchar(5) NOT NULL,
`Age` tinyint(4) NOT NULL,
`Number` char(5) NOT NULL COMMENT '编号',
PRIMARY KEY (`Id`),
KEY `i_name_age_number` (`Name`,`Age`,`Number`)
) ENGINE=InnoDB AUTO_INCREMENT=14 DEFAULT CHARSET=utf8;Insertion example:
Index structure illustration:
The order of indexed columns matters: MySQL compares columns left‑to‑right. Queries should match the leftmost indexed columns first; otherwise later columns cannot be used.
Good examples:
SELECT * FROM people WHERE Name='Abel' AND Age=2 AND Number=12312
SELECT * FROM people WHERE Name='Abel'
SELECT * FROM people WHERE Name='Abel' LIKE 'Abel%'
SELECT * FROM people WHERE Name='Andy' AND Age BETWEEN 11 AND 20
SELECT * FROM people ORDER BY Name
SELECT * FROM people ORDER BY Name, Age
SELECT * FROM people GROUP BY Name
Bad examples:
SELECT * FROM people WHERE Age=2
SELECT * FROM people WHERE Name LIKE '%B'
SELECT * FROM people WHERE age=2
SELECT * FROM people WHERE Name='ABC' AND number=3
SELECT * FROM people WHERE Name LIKE 'B%' AND age=22
Hash Index Trick
If a column stores long strings (e.g., URLs), create a numeric hash column (e.g., URL_CRC) and index it:
SELECT * FROM t WHERE URL_CRC = 387695885 AND URL = 'www.baidu.com'To reduce hash collisions, use a custom hash function or a portion of MD5():
SELECT CONV(RIGHT(MD5('www.baidu.com'),16), 16, 10)Prefix Index
For long string columns, create a prefix index on the first few characters to reduce index size. Note that prefix indexes cannot be used for ORDER BY or GROUP BY operations.
Multi‑Column Index
Multi‑column indexes are often better than multiple single‑column indexes. Recommendations:
When querying multiple indexed columns together, create a composite index.
Place the most selective column first, unless query patterns dictate otherwise.
Example: SELECT * FROM people WHERE name='xxx' AND age=xx If age=20 is highly selective, putting age first may be more efficient.
Clustered Index
InnoDB stores the primary key's data in the leaf nodes of the clustered index. A table can have only one clustered index. If no primary key is defined, InnoDB creates a hidden clustered index.
Covering Index
When a query can be satisfied entirely from the index (covering index), the engine avoids accessing the table rows.
Duplicate and Redundant Indexes
Duplicate indexes (same columns, same order) provide no benefit and waste space. Redundant indexes (e.g., (A,B) and (A)) are unnecessary because the former can serve the latter's purpose.
Extending an existing index (adding columns) is preferable to creating a new one. However, in some cases a redundant index may be useful if the added column is long and degrades performance of the original index.
Unused Indexes
Indexes that are never used degrade INSERT/UPDATE/DELETE performance and should be removed.
Index Usage Summary
Three‑star principle for indexes:
Index should order rows in the same sequence as the WHERE clause (left‑to‑right).
Index order should match the query's ORDER BY clause.
Index should contain all columns needed by the query (covering index).
Indexes are not a cure‑all; maintaining them on massive tables can be costly, so consider partitioning or sharding.
Query Optimization
Reasons for Slow Queries
Requesting unnecessary rows
Fetching more rows than needed (e.g., requesting all rows when only ten are displayed) wastes resources.
Requesting unnecessary columns
Using SELECT * when only a few columns are needed adds overhead.
Repeatedly executing the same query
Cache results when possible to avoid duplicate work.
MySQL scanning extra records
Examine the execution plan; if many rows are scanned, add indexes, rewrite SQL, or redesign the schema (e.g., summary tables).
Refactoring Queries
Break a complex query into several simple queries.
Split large queries into smaller ones that each handle a portion of the work.
Decompose joins by fetching related tables separately and combining results in application code.
Miscellaneous
Optimizing count() SELECT COUNT(name LIKE 'B%') FROM people counts rows where the expression is true. Alternatives include using approximate row counts from the execution plan, covering index scans, summary tables, or caching row counts.
Optimizing joins
Ensure the column used for joining has an index.
Optimizing subqueries
For MySQL 5.5 and earlier, replace subqueries with joins when possible.
Optimizing GROUP BY / DISTINCT
Apply these operations on primary keys when feasible.
Optimizing LIMIT
Example of an inefficient LIMIT:
SELECT * FROM sa_stockinfo ORDER BY StockAcc LIMIT 400, 5MySQL scans 405 rows then discards the first 400. Using a covering index improves performance:
SELECT * FROM sa_stockinfo i JOIN (
SELECT StockInfoID FROM sa_stockinfo ORDER BY StockAcc LIMIT 400,5
) t ON i.StockInfoID = t.StockInfoIDOptimizing UNION
Prefer UNION ALL when duplicate elimination is not required.
Checking existence
Instead of SELECT COUNT(*) FROM t WHERE condition, use:
SELECT IFNULL((SELECT 1 FROM tableName WHERE condition LIMIT 1),0)References
《High Performance MySQL》
Author: 会长 Source: https://www.cnblogs.com/zzy0471/p/OptimizeMySQL.html
Signed-in readers can open the original source through BestHub's protected redirect.
This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactand we will review it promptly.
Efficient Ops
This public account is maintained by Xiaotianguo and friends, regularly publishing widely-read original technical articles. We focus on operations transformation and accompany you throughout your operations career, growing together happily.
How this landed with the community
Was this worth your time?
0 Comments
Thoughtful readers leave field notes, pushback, and hard-won operational detail here.
