Boost MySQL Performance: Data Types, Indexes, and SQL Optimization Techniques
This article walks through practical MySQL performance tuning from a developer's perspective, covering a real‑world example, data‑type selection, index design, and SQL query refinements, while providing concrete commands, visual illustrations, and expert Q&A for deeper understanding.
MySQL often hits performance bottlenecks, requiring joint effort from DBAs and developers. While DBAs handle server‑level parameters, developers can improve performance by optimizing data types, indexes, and SQL statements. The following sections present a developer‑focused guide.
1. A Real‑World Example
An initial query had to examine an estimated 189,444 × 1,877 × 13,482 ≈ 4.79 trillion row combinations and took over half an hour. After appropriate indexes were added on the join columns, the estimate dropped to 368,006 × 1 × 3 × 1 ≈ 1.1 million, and execution time fell to seconds.
Key takeaway: a single poorly designed SQL statement can cause severe service degradation.
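As a quick check on the figures in this example, the workload can be treated as the product of the per-table row estimates (presumably the rows values from the execution plan). The sketch below multiplies the numbers quoted in this section and then shows the general shape of the fix. The tables and columns (orders, customers, customer_id) are hypothetical, since the original schema is not shown.

```sql
-- Row combinations examined = product of the per-table row estimates.
SELECT 189444 * 1877 * 13482 AS combinations_before,  -- 4794015683016
       368006 * 1 * 3 * 1    AS combinations_after;   -- 1104018

-- Hypothetical tables: index the join column, then re-check the plan.
ALTER TABLE orders ADD INDEX idx_orders_customer_id (customer_id);

EXPLAIN
SELECT c.name, o.total
FROM customers AS c
JOIN orders AS o ON o.customer_id = c.id;
```

After the index is added, the rows estimate for the joined table in the EXPLAIN output should fall sharply, which is what shrinks the product.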
Developers should master execution plan inspection and profiling tools:
EXPLAIN SELECT …
EXPLAIN EXTENDED SELECT …
(EXPLAIN EXTENDED applies to MySQL 5.7 and earlier; MySQL 8.0 removed the keyword, and plain EXPLAIN returns the same information.)
Use the profiling tool:
SET profiling = 1;
SHOW PROFILES;
SHOW PROFILE;
SHOW PROFILE ALL FOR QUERY 61;
2. Data‑Type Optimization
Selection Steps
Step 1: Identify the broad category (numeric, string, datetime, etc.).
Step 2: Choose the specific type based on storage length, range, precision, and special behavior.
General Principles
Use the smallest simple type that fits the data.
For variable‑length VARCHAR, allocate only the needed space.
Use ENUM cautiously.
Prefer integer identifiers.
Store related columns with the same type, especially for join conditions.
Practical Cases
tinyint unsigned for values 0‑200 saves disk, memory, and CPU cycles.
Store IPv4 addresses as INT UNSIGNED (converting with INET_ATON() and INET_NTOA()) instead of varchar(15) to avoid costly string comparisons.
Use native date/time types (date, time, datetime) rather than strings.
Store MD5 password hashes in CHAR(32), because the hex digest always has a fixed length of 32 characters.
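The cases above can be combined into one table definition. This is a minimal sketch: the table login_event, its column names, and the sample values are all hypothetical and only illustrate the type choices.

```sql
-- Hypothetical table illustrating the practical cases above.
CREATE TABLE login_event (
  id            INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
  user_age      TINYINT UNSIGNED NOT NULL,  -- 0 to 255 in a single byte
  client_ip     INT UNSIGNED NOT NULL,      -- IPv4 address as a 4-byte integer
  logged_in_at  DATETIME NOT NULL,          -- native date/time type, not a string
  password_md5  CHAR(32) NOT NULL           -- MD5 hex digest is always 32 characters
);

INSERT INTO login_event (user_age, client_ip, logged_in_at, password_md5)
VALUES (34, INET_ATON('192.168.1.10'), '2024-01-15 09:30:00', MD5('example'));

-- The filter compares integers (192.168.1.10 is 3232235786);
-- INET_NTOA converts back to dotted notation for display.
SELECT INET_NTOA(client_ip) AS ip, logged_in_at
FROM login_event
WHERE client_ip = INET_ATON('192.168.1.10');
```

INT UNSIGNED is used rather than plain INT because addresses above 127.255.255.255 do not fit in a signed 32-bit integer.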
3. Index Optimization
Index Types
Clustered index: data rows are stored in the leaf pages of the index.
Secondary (non‑clustered) index: in InnoDB, leaf nodes store the row's primary‑key value rather than the row itself.
Data‑Structure Variants
B‑Tree
Hash
R‑Tree (spatial)
Full‑text
Index Strategy
Create indexes on columns frequently used in JOINs.
Index columns that appear often in WHERE clauses, especially on large tables.
Avoid excessive indexes on tables with heavy INSERT/UPDATE/DELETE workloads.
Prefer high‑cardinality columns; check cardinality with SHOW INDEX.
Index small columns; for long text/varchar use prefix indexes, e.g., CREATE INDEX idx_name ON tbl(col(10));
Remove duplicate or redundant indexes.
Design composite indexes with the most selective column first.
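One way to apply the cardinality and column-order advice above is to measure selectivity before creating a composite index. The sketch assumes a payment table with staff_id and customer_id columns, as in the Sakila sample database; the index name is made up for illustration.

```sql
-- The Cardinality column is an estimate of the distinct values per index.
SHOW INDEX FROM payment;

-- Selectivity = distinct values / total rows; values closer to 1 are more selective.
SELECT COUNT(DISTINCT staff_id)    / COUNT(*) AS staff_id_selectivity,
       COUNT(DISTINCT customer_id) / COUNT(*) AS customer_id_selectivity
FROM payment;

-- If customer_id turns out to be the more selective column, lead with it.
ALTER TABLE payment ADD INDEX idx_customer_staff (customer_id, staff_id);
```

Selectivity is only one input. The queries that will use the index still have to filter on its leading column for the index to help.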
Prefix Indexes
They index the leading part of a column to save space, but may reduce selectivity. For BLOB/TEXT columns, a prefix length is mandatory.
In the source's example, selectivity gains flatten once the prefix length reaches about 7 characters; the right length depends on the data and should be measured rather than assumed.
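A common way to choose a prefix length is to compare the selectivity of several prefixes against the full column. The following sketch assumes a customer table with a last_name column (the same names used in the LIKE examples later in this article); the index name and the chosen lengths are illustrative.

```sql
-- Compare prefix selectivity with full-column selectivity.
SELECT COUNT(DISTINCT last_name)          / COUNT(*) AS full_column,
       COUNT(DISTINCT LEFT(last_name, 3)) / COUNT(*) AS prefix_3,
       COUNT(DISTINCT LEFT(last_name, 5)) / COUNT(*) AS prefix_5,
       COUNT(DISTINCT LEFT(last_name, 7)) / COUNT(*) AS prefix_7
FROM customer;

-- Pick the shortest prefix whose selectivity is close to full_column.
-- The length 7 here is only an example value.
ALTER TABLE customer ADD INDEX idx_last_name_prefix (last_name(7));
```

A prefix index cannot serve as a covering index, and MySQL cannot use it for ORDER BY or GROUP BY, so the space saving comes with trade-offs.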
Duplicate and Redundant Indexes
Duplicate indexes repeat the same columns in the same order, causing extra maintenance overhead. A redundant index is a leftmost prefix of an existing composite index, for example an index on (a) when an index on (a, b) already exists. Both kinds should be removed.
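To make the distinction concrete, the sketch below defines a hypothetical table (demo_orders and all index names are invented for illustration) that contains one duplicate and one redundant index, then drops both.

```sql
-- Hypothetical table with one duplicate and one redundant index.
CREATE TABLE demo_orders (
  id          INT UNSIGNED NOT NULL PRIMARY KEY,
  customer_id INT UNSIGNED NOT NULL,
  created_at  DATETIME NOT NULL,
  UNIQUE KEY uk_id (id),                              -- duplicate: same column as the primary key
  KEY idx_customer (customer_id),                     -- redundant: leftmost prefix of the index below
  KEY idx_customer_created (customer_id, created_at)
);

-- Drop both; idx_customer_created still serves lookups on customer_id alone.
ALTER TABLE demo_orders
  DROP INDEX uk_id,
  DROP INDEX idx_customer;
```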
Covering Indexes
An index that contains all columns needed by a query allows MySQL to satisfy the query using only the index (visible as “Using index” in the EXPLAIN extra column), reducing I/O.
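A covering index can be checked with EXPLAIN. This sketch assumes a customer table with last_name and first_name columns; the index name and the search value are hypothetical.

```sql
-- Hypothetical composite index containing every column the first query needs.
ALTER TABLE customer ADD INDEX idx_last_first (last_name, first_name);

-- Only indexed columns are referenced, so the Extra column
-- would be expected to show "Using index".
EXPLAIN SELECT last_name, first_name
FROM customer
WHERE last_name = 'SMITH';

-- SELECT * needs columns outside the index, so the same index
-- can locate the rows but cannot cover the query.
EXPLAIN SELECT *
FROM customer
WHERE last_name = 'SMITH';
```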
4. SQL Optimization
WHERE Clause
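Keep the indexed column alone on one side of the comparison; if the column is wrapped in an expression or passed to a function, MySQL cannot use the index on it for the lookup.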
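The difference is easiest to see by comparing equivalent conditions. The sketch assumes actor and payment tables like those in the Sakila sample database, with payment_date indexed (an assumption for illustration); the date literals are arbitrary.

```sql
-- Column inside an expression: the index on actor_id cannot drive the lookup.
SELECT actor_id FROM actor WHERE actor_id + 1 = 5;

-- Same condition with the column isolated: an index lookup is possible.
SELECT actor_id FROM actor WHERE actor_id = 4;

-- Function applied to the column: the index on payment_date is bypassed.
SELECT payment_id FROM payment
WHERE DATE(payment_date) = '2005-08-01';

-- Rewritten as a range on the bare column: the index can be used.
SELECT payment_id FROM payment
WHERE payment_date >= '2005-08-01'
  AND payment_date <  '2005-08-02';
```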
JOIN Optimization
Make sure the columns in ON/USING clauses are indexed; for a given join order, the index is typically needed only on the second table's join column.
Keep join column data types consistent.
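A mismatch in join column types can force a conversion on every comparison and keep the index from being used. The following sketch uses hypothetical tables account and invoice, in which invoice.account_id was originally declared as VARCHAR while account.id is INT UNSIGNED.

```sql
-- Align the type of the join column with the column it references.
-- (Existing values must be valid integers for this change to succeed cleanly.)
ALTER TABLE invoice MODIFY account_id INT UNSIGNED NOT NULL;

-- Index the join column on the table that is read second.
ALTER TABLE invoice ADD INDEX idx_invoice_account (account_id);

-- Re-check the plan: the invoice row should now show an index lookup
-- on idx_invoice_account instead of a full scan.
EXPLAIN
SELECT a.id, i.amount
FROM account AS a
JOIN invoice AS i ON i.account_id = a.id;
```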
LIKE Optimization
If the pattern starts with a fixed string (e.g., WHERE name LIKE 'MA%'), MySQL can use the index. Patterns beginning with a wildcard (e.g., '%MA%') cannot use the index.
SELECT * FROM customer WHERE last_name LIKE 'MA%';
SELECT * FROM customer WHERE last_name LIKE '%MA%';
Select Specific Columns
Avoid SELECT * when unnecessary; it consumes more resources and may prevent use of covering indexes.
GROUP BY Optimization
Grouping by a primary‑key column (e.g., GROUP BY actor_id) is faster than grouping by non‑key columns.
SELECT actor.first_name, actor.last_name, COUNT(*)
FROM film_actor
JOIN actor USING (actor_id)
GROUP BY actor.actor_id;
UNION Optimization
Prefer UNION ALL unless duplicate elimination is required, as UNION forces a costly distinct operation.
Repeat the WHERE, ORDER BY, and LIMIT clauses inside each subquery so that each branch sends fewer rows to the temporary table.
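The effect of pushing clauses into each branch can be sketched with two queries that return the same result. The actor and customer tables, each assumed to have first_name and last_name columns, follow the Sakila-style names used elsewhere in this article.

```sql
-- Without inner limits, every row of both tables goes into the
-- temporary table before the outer ORDER BY and LIMIT are applied.
(SELECT first_name, last_name FROM actor)
UNION ALL
(SELECT first_name, last_name FROM customer)
ORDER BY last_name
LIMIT 20;

-- With ORDER BY and LIMIT repeated in each branch, the temporary
-- table holds at most 40 rows.
(SELECT first_name, last_name FROM actor ORDER BY last_name LIMIT 20)
UNION ALL
(SELECT first_name, last_name FROM customer ORDER BY last_name LIMIT 20)
ORDER BY last_name
LIMIT 20;
```

The outer ORDER BY and LIMIT are still required, because the combined rows from the two branches are not in a guaranteed order.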
5. Q&A
Q1: Is the shown product a Cartesian product?
A1: It is a product, but not a Cartesian product. A Cartesian product multiplies the tables' total row counts, whereas this figure multiplies the number of rows examined at each level of the nested join.
Q2: At what table size should MySQL consider sharding?
A2: When a single table reaches hundreds of millions of rows, performance degrades; consider offloading hot data to NoSQL.
Q3: Why do some SELECTs take thousands of seconds online but only a second via direct connection?
A3: Check max_allowed_packet, possible blocking queries, and monitor CPU, I/O, and memory bandwidth.
Q4: How does FLUSH TABLES work and what other operations trigger an implicit flush?
A4: FLUSH TABLES acquires a shared lock; backup tools use it to ensure data consistency. Other operations that lock tables may also cause implicit flushes.
dbaplus Community
Enterprise-level professional community for Database, BigData, and AIOps. Daily original articles, weekly online tech talks, monthly offline salons, and quarterly XCOPS&DAMS conferences—delivered by industry experts.
