Databases 20 min read

Master SQL Optimization: Index Design, EXPLAIN, and Safe Online DDL for Interview Success

This guide walks you through the fundamentals of MySQL performance tuning—covering the optimization pyramid, how to read EXPLAIN plans, practical index design, safe online DDL for large tables, and interview‑ready case studies that turn SQL expertise into a powerful hiring advantage.

Su San Talks Tech
Su San Talks Tech
Su San Talks Tech
Master SQL Optimization: Index Design, EXPLAIN, and Safe Online DDL for Interview Success

1. Core Points of Database Performance Optimization

Before tweaking a specific SQL statement, you need a global view of where SQL optimization fits in the overall database performance hierarchy. Think of the optimization process as a pyramid:

Hardware layer : upgrade CPU, add memory, replace spinning disks with NVMe SSDs.

System layer : tune OS kernel parameters such as network buffers and file‑handle limits.

Database software layer : adjust DBMS settings (e.g., MySQL innodb_flush_log_at_trx_commit to balance durability and speed).

Application‑layer SQL optimization : the top of the pyramid where developers can directly rewrite queries.

Lower layers affect a broader range of queries but require deeper expertise and higher privileges; the top layer offers the highest return on effort for most developers.

1.1 Understanding EXPLAIN Execution Plans

The EXPLAIN command is the primary diagnostic tool. It returns columns such as type, possible_keys, key, rows, and filtered. The type column ranks access methods from best to worst: system > const > eq_ref > ref > range > index > ALL. ALL indicates a full‑table scan and is the first target for optimization.

-- Example: Analyze a sorting pagination query
EXPLAIN SELECT * FROM tx_user.user ORDER BY uid DESC LIMIT 5;

Key columns to focus on: type: access type, with system and const being ideal. possible_keys: candidate indexes the optimizer considered. key: the index actually used (NULL means no index). rows: estimated number of rows examined; smaller is better. filtered: percentage of rows that satisfy the WHERE clause after the scan.

Optimization follows an iterative "analyze‑hypothesize‑validate" cycle, repeatedly using EXPLAIN to verify improvements.

1.2 Index Design

Create indexes on columns frequently used in WHERE predicates.

Index columns involved in ORDER BY to avoid costly filesort.

Index foreign‑key and join columns to prevent nested‑loop join blow‑ups.

Prefer high‑cardinality columns; when building composite indexes, place the most selective column first.

1.3 Risks of DDL on Large Tables and Mitigation Strategies

Running ALTER TABLE on a massive table acquires an exclusive metadata lock (MDL), blocking all reads and writes until the operation finishes. For tables with tens of millions of rows, this can take minutes or hours, which is unacceptable in production.

Three common mitigation approaches:

Planned downtime : pause the service during a maintenance window and run the DDL.

Low‑traffic window : execute the change during off‑peak hours, accepting the risk of overrunning into peak time.

Online DDL via a shadow table ("golden‑silkworm" technique):

Create a new table with the desired schema and indexes.

Copy the full data set from the original table to the shadow table.

Synchronize incremental changes (e.g., via triggers or binlog listeners) while copying.

Atomically rename the original table to a backup name and rename the shadow table to the original name, then switch traffic.

Performance optimization pyramid
Performance optimization pyramid

2. Interview‑Ready Scenarios

Catalog project experience : list core business tables, evaluate existing indexes, and run EXPLAIN on the most frequent queries.

Retrospect real cases : describe problem discovery, analysis, solution, and quantitative results (e.g., response time, CPU usage).

Anticipate deep follow‑up questions : prepare answers for why the optimizer chose a certain plan, concurrency implications, and data‑consistency concerns.

2.1 Using Covering Indexes to Eliminate Table Lookups

Problem : a high‑frequency query used SELECT * but only needed columns A, B, C, causing many “Table Access by Index Rowid” operations.

Solution : create a composite index that includes A, B, C and rewrite the query to SELECT A, B, C. The index becomes a covering index, allowing MySQL to satisfy the query entirely from the index leaf nodes. Execution time dropped from hundreds of milliseconds to under 1 ms.

2.2 Leveraging Index Order to Optimize Sorting

Problem : a query with WHERE uid = ? ORDER BY create_time DESC triggered a costly filesort as the user’s data grew.

Solution : create a composite index on (uid, create_time). Because the index stores rows ordered by create_time within each uid, MySQL can read rows in the required order without an extra sort, reducing latency from seconds to tens of milliseconds.

2.3 Optimizing COUNT(*)

In InnoDB, SELECT COUNT(*) forces a full‑table scan. Two strategies:

Approximate count : run EXPLAIN on a similar query and use the rows estimate as an approximation.

Exact count via external store : maintain a counter in Redis (or another fast store) that is updated alongside INSERT and DELETE operations. Ensure consistency with techniques such as binlog‑driven asynchronous updates.

2.4 Cautious Use of Index Hints

When the optimizer picks a sub‑optimal index and rewriting the query cannot help, FORCE INDEX can force the desired index. Use this as a last resort because it reduces maintainability—future data distribution changes may render the hint harmful.

2.5 Correct Division of WHERE and HAVING

Filters on raw table columns belong in WHERE, which executes before GROUP BY. Moving such conditions from HAVING to WHERE can cut the processed row set dramatically; an example showed a 40 % performance gain.

2.6 Breaking the Deep‑Pagination Performance Trap

Offset‑based pagination ( LIMIT offset, count) scans offset + count rows, making large offsets prohibitively slow. Replace it with keyset pagination: use the last row’s primary key (or a unique ordered column) as a cursor, e.g., WHERE id > last_id ORDER BY id ASC LIMIT page_size. This keeps the offset at zero and maintains millisecond‑level latency even on deep pages.

3. Summary

SQL optimization revolves around three pillars: mastering EXPLAIN to diagnose bottlenecks, designing high‑selectivity covering indexes, and applying safe online DDL techniques for large tables. Armed with concrete interview case studies, you can demonstrate deep database expertise and turn performance improvements into compelling interview narratives.

Original Source

Signed-in readers can open the original source through BestHub's protected redirect.

Sign in to view source
Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactadmin@besthub.devand we will review it promptly.

SQLindex designmysqlOnline DDLInterview Preparationexplain
Su San Talks Tech
Written by

Su San Talks Tech

Su San, former staff at several leading tech companies, is a top creator on Juejin and a premium creator on CSDN, and runs the free coding practice site www.susan.net.cn.

0 followers
Reader feedback

How this landed with the community

Sign in to like

Rate this article

Was this worth your time?

Sign in to rate
Discussion

0 Comments

Thoughtful readers leave field notes, pushback, and hard-won operational detail here.