Databases 22 min read

Master MySQL: Essential Tips for Data Types, Indexes, and Query Optimization

This guide compiles essential MySQL 5.5 techniques for teams without a DBA, covering storage engine characteristics, data type selection, index design—including composite, prefix, and covering indexes—and practical query optimizations such as reducing scanned rows, avoiding redundant indexes, and improving execution plans.

Efficient Ops

May 7, 2019

Master MySQL: Essential Tips for Data Types, Indexes, and Query Optimization

The article summarizes common MySQL usage tips for teams without a DBA, based on MySQL 5.5 and using InnoDB unless otherwise noted.

MySQL Features

Understanding MySQL's features helps use it better. Unlike other databases, MySQL separates storage engines, each responsible for storing and retrieving data. Users can choose an engine that fits their workload or even develop a new one. The logical architecture is roughly as follows:

InnoDB is the default storage engine. Its main characteristics are:

Supports transactions

Supports row‑level locking

Data stored in tablespaces composed of multiple data files

Uses MVCC for high concurrency

Primary‑key‑based clustered index

Supports hot backup

Other common storage engines:

MyISAM – default in older versions, no transactions or row‑level lock, supports full‑text index, cannot recover safely after crash, supports compressed tables

Archive – only INSERT and SELECT, fast bulk inserts, full‑table scans for queries

CSV – treats a CSV file as a table

Memory – stores data in memory

Data Type Optimization

Principles for choosing data types:

Prefer types that occupy less space

Prefer simple types

Avoid unnecessary nullable columns

Smaller types save hardware resources (disk, memory, CPU). Use simple types; for example, use int instead of char because the latter involves character‑set sorting. Also, nullable columns consume extra storage and indexing overhead; set columns to NOT NULL when possible.

Integer Types

Integer types include:

tinyint

smallint

mediumint

int

bigint

They use 8, 16, 24, 32, and 64 bits respectively. Adding unsigned doubles the positive range but removes the ability to store negative numbers. Specifying a display width for integers has no practical effect.

Decimal Types

float

double

decimal float (32‑bit) and double (64‑bit) are the usual floating‑point types; specifying a display width is meaningless. decimal provides exact arithmetic but uses more space (e.g., decimal(18,9) stores 18 digits, 9 of which are fractional, using 9 bytes). For large datasets, consider using bigint with scaling instead of decimal.

String Types

varchar

char

varbinary

binary

blob

text

enum varchar stores the actual length plus 1 or 2 bytes for length metadata, making it space‑efficient for variable‑length strings. char uses fixed space and is suitable when values have similar length. varbinary and binary store binary strings; they are case‑sensitive and faster for comparisons because no character‑set conversion is needed. blob stores large binary data, text stores large textual data; both are kept off‑page with a pointer in the row and should not be indexed except on prefixes.

When a column has many repeated short values, using enum can save space because MySQL stores the enum as an integer internally.

Temporal Types

year

date

time

datetime

timestamp datetime ranges from 1001‑01‑01 to 9999‑12‑31, precise to seconds. timestamp stores seconds since 1970‑01‑01, ranges up to 2038, occupies 4 bytes (half the space of datetime), and updates automatically on INSERT/UPDATE for the first timestamp column.

Choosing Primary Key Types

Prefer integer primary keys (auto‑increment) because they are compact and efficient. Avoid GUIDs or MD5 hashes as primary keys; they are large, random, and cause index fragmentation.

Special Data Types

Store IP addresses as 32‑bit unsigned integers and use INET_ATON() / INET_NTOA() for conversion.

Index Optimization

InnoDB implements indexes with B+ trees. Example table:

CREATE TABLE `people` (
  `Id` int(11) NOT NULL AUTO_INCREMENT,
  `Name` varchar(5) NOT NULL,
  `Age` tinyint(4) NOT NULL,
  `Number` char(5) NOT NULL COMMENT '编号',
  PRIMARY KEY (`Id`),
  KEY `i_name_age_number` (`Name`,`Age`,`Number`)
) ENGINE=InnoDB AUTO_INCREMENT=14 DEFAULT CHARSET=utf8;

Insertion example:

Index structure illustration:

The order of indexed columns matters: MySQL compares columns left‑to‑right. Queries should match the leftmost indexed columns first; otherwise later columns cannot be used.

Good examples:

SELECT * FROM people WHERE Name='Abel' AND Age=2 AND Number=12312

SELECT * FROM people WHERE Name='Abel'

SELECT * FROM people WHERE Name='Abel' LIKE 'Abel%'

SELECT * FROM people WHERE Name='Andy' AND Age BETWEEN 11 AND 20

SELECT * FROM people ORDER BY Name

SELECT * FROM people ORDER BY Name, Age

SELECT * FROM people GROUP BY Name

Bad examples:

SELECT * FROM people WHERE Age=2

SELECT * FROM people WHERE Name LIKE '%B'

SELECT * FROM people WHERE age=2

SELECT * FROM people WHERE Name='ABC' AND number=3

SELECT * FROM people WHERE Name LIKE 'B%' AND age=22

Hash Index Trick

If a column stores long strings (e.g., URLs), create a numeric hash column (e.g., URL_CRC) and index it:

SELECT * FROM t WHERE URL_CRC = 387695885 AND URL = 'www.baidu.com'

To reduce hash collisions, use a custom hash function or a portion of MD5():

SELECT CONV(RIGHT(MD5('www.baidu.com'),16), 16, 10)

Prefix Index

For long string columns, create a prefix index on the first few characters to reduce index size. Note that prefix indexes cannot be used for ORDER BY or GROUP BY operations.

Multi‑Column Index

Multi‑column indexes are often better than multiple single‑column indexes. Recommendations:

When querying multiple indexed columns together, create a composite index.

Place the most selective column first, unless query patterns dictate otherwise.

Example: SELECT * FROM people WHERE name='xxx' AND age=xx If age=20 is highly selective, putting age first may be more efficient.

Clustered Index

InnoDB stores the primary key's data in the leaf nodes of the clustered index. A table can have only one clustered index. If no primary key is defined, InnoDB creates a hidden clustered index.

Covering Index

When a query can be satisfied entirely from the index (covering index), the engine avoids accessing the table rows.

Duplicate and Redundant Indexes

Duplicate indexes (same columns, same order) provide no benefit and waste space. Redundant indexes (e.g., (A,B) and (A)) are unnecessary because the former can serve the latter's purpose.

Extending an existing index (adding columns) is preferable to creating a new one. However, in some cases a redundant index may be useful if the added column is long and degrades performance of the original index.

Unused Indexes

Indexes that are never used degrade INSERT/UPDATE/DELETE performance and should be removed.

Index Usage Summary

Three‑star principle for indexes:

Index should order rows in the same sequence as the WHERE clause (left‑to‑right).

Index order should match the query's ORDER BY clause.

Index should contain all columns needed by the query (covering index).

Indexes are not a cure‑all; maintaining them on massive tables can be costly, so consider partitioning or sharding.

Query Optimization

Reasons for Slow Queries

Requesting unnecessary rows

Fetching more rows than needed (e.g., requesting all rows when only ten are displayed) wastes resources.

Requesting unnecessary columns

Using SELECT * when only a few columns are needed adds overhead.

Repeatedly executing the same query

Cache results when possible to avoid duplicate work.

MySQL scanning extra records

Examine the execution plan; if many rows are scanned, add indexes, rewrite SQL, or redesign the schema (e.g., summary tables).

Refactoring Queries

Break a complex query into several simple queries.

Split large queries into smaller ones that each handle a portion of the work.

Decompose joins by fetching related tables separately and combining results in application code.

Miscellaneous

Optimizing count() SELECT COUNT(name LIKE 'B%') FROM people counts rows where the expression is true. Alternatives include using approximate row counts from the execution plan, covering index scans, summary tables, or caching row counts.

Optimizing joins

Ensure the column used for joining has an index.

Optimizing subqueries

For MySQL 5.5 and earlier, replace subqueries with joins when possible.

Optimizing GROUP BY / DISTINCT

Apply these operations on primary keys when feasible.

Optimizing LIMIT

Example of an inefficient LIMIT:

SELECT * FROM sa_stockinfo ORDER BY StockAcc LIMIT 400, 5

MySQL scans 405 rows then discards the first 400. Using a covering index improves performance:

SELECT * FROM sa_stockinfo i JOIN (
  SELECT StockInfoID FROM sa_stockinfo ORDER BY StockAcc LIMIT 400,5
) t ON i.StockInfoID = t.StockInfoID

Optimizing UNION

Prefer UNION ALL when duplicate elimination is not required.

Checking existence

Instead of SELECT COUNT(*) FROM t WHERE condition, use:

SELECT IFNULL((SELECT 1 FROM tableName WHERE condition LIMIT 1),0)

References

《High Performance MySQL》

Author: 会长 Source: https://www.cnblogs.com/zzy0471/p/OptimizeMySQL.html

Original Source

Signed-in readers can open the original source through BestHub's protected redirect.

Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactand we will review it promptly.

MySQL database optimization Indexes Data Types SQL Performance

Written by

Efficient Ops

This public account is maintained by Xiaotianguo and friends, regularly publishing widely-read original technical articles. We focus on operations transformation and accompany you throughout your operations career, growing together happily.

0 followers

Reader feedback

How this landed with the community

Rate this article

Was this worth your time?

Discussion

0 Comments

Thoughtful readers leave field notes, pushback, and hard-won operational detail here.