Databases 13 min read

MySQL Index Optimization and Code Review Practices

This article explains MySQL index fundamentals, the B+Tree structure, various index types, the concept of covering indexes to avoid row look‑ups, the left‑most prefix rule, and presents three real‑world optimization cases that illustrate how to design effective composite indexes, reduce index size, and improve query performance while avoiding costly table scans and sorts.

Wukong Talks Architecture

Oct 25, 2021

MySQL Index

MySQL indexes are data structures that accelerate query speed, similar to a book's table of contents. Internally they are implemented using B+Tree structures, where each data page is 16 KB, allowing billions of rows to be accessed with only 1–3 disk I/O operations.

Other index structures such as hash, ordered arrays, and binary trees are discussed, explaining why they are unsuitable for MySQL indexing.

Types of Indexes

Indexes can be classified by data structure (B+Tree, hash, R‑Tree, FULLTEXT), by physical storage (clustered and non‑clustered), and by logical role (primary, ordinary, unique, composite, spatial).

What Is a “Covering” (Back‑Table) Lookup?

In InnoDB, the primary key index stores the full row data, while secondary indexes store the indexed columns plus the primary key value. A covering index contains all columns required by a query, eliminating the need to fetch the row from the primary index (i.e., avoiding a back‑table lookup).

Solving Back‑Table Lookups

Creating a composite index that includes all queried columns (e.g.,

CREATE TABLE user (id INT PRIMARY KEY, name VARCHAR(20), sex VARCHAR(5), INDEX(name, sex));

) allows the query to be satisfied directly from the secondary index, removing the extra lookup.

Left‑Most Prefix Principle

The left‑most prefix rule states that a composite index can be used for queries that reference the leftmost N columns of the index, or for string indexes the leftmost M characters. Examples illustrate how ordering of columns affects index usage and when a pattern like LIKE '%se' disables the index.

Optimization Case 1

A query filtering by STATUS = '1' AND shop_code = 'XXX' used an index idx_status_shop_code. The recommendation was to place the high‑cardinality shop_code first (e.g., idx_shop_code_status_act_id) and include act_id and store_code to avoid both back‑table lookups and filesort operations.

Optimization Case 2

For encrypted phone numbers stored as MD5, a composite index on mobile_md5 and type was created. By evaluating distinct prefix cardinality with

SELECT COUNT(DISTINCT LEFT(mobile_md5,5))/COUNT(*) FROM XXX.users

, it was shown that indexing only the first five characters provides ~80% selectivity, saving space.

Optimization Case 3

When dealing with 20‑character user IDs that share a common prefix, the suggestion is to index the reversed string’s leading characters (e.g., LEFT(REVERSE(userId),7)) to capture the high‑cardinality suffix. Additionally, reordering WHERE conditions to place high‑selectivity columns first and adding city_code to the composite index eliminates back‑table lookups.

Additional Code Review Findings

Common issues include violating the single‑responsibility principle by using generic APIs that return *, leading to unnecessary data retrieval and back‑table lookups. Design patterns such as Strategy and Builder were recommended to improve extensibility.

Heavy use of Java 8 streams was noted; while streams can be parallelized for performance gains, they may reduce readability. An example demonstrates measuring single‑threaded versus parallel stream execution time.

public class StreamParallelDemo {
    public static void main(String[] args) {
        System.out.println(String.format("本计算机的核数：%d", Runtime.getRuntime().availableProcessors()));

        // 产生100w个随机数(1 ~ 100)，组成列表
        Random random = new Random();
        List<Integer> list = new ArrayList<>(1000_0000);

        for (int i = 0; i < 1000_0000; i++) {
            list.add(random.nextInt(100));
        }

        long prevTime = getCurrentTime();
        list.stream().reduce((a, b) -> a + b).ifPresent(System.out::println);
        System.out.println(String.format("单线程计算耗时：%d", getCurrentTime() - prevTime));

        prevTime = getCurrentTime();
        list.stream().parallel().reduce((a, b) -> a + b).ifPresent(System.out::println);
        System.out.println(String.format("多线程计算耗时：%d", getCurrentTime() - prevTime));
    }

    private static long getCurrentTime() {
        return System.currentTimeMillis();
    }
}

The article concludes with a reminder that many of these issues stem from inexperience, and systematic code reviews combined with solid indexing principles can dramatically improve database performance and code quality.

Original Source

Signed-in readers can open the original source through BestHub's protected redirect.

Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactand we will review it promptly.

SQL code review mysql Index Optimization Covering Index Database Performance Composite Index

Written by

Wukong Talks Architecture

Explaining distributed systems and architecture through stories. Author of the "JVM Performance Tuning in Practice" column, open-source author of "Spring Cloud in Practice PassJava", and independently developed a PMP practice quiz mini-program.

0 followers

Reader feedback

How this landed with the community

Rate this article

Was this worth your time?

Discussion

0 Comments

Thoughtful readers leave field notes, pushback, and hard-won operational detail here.