Optimizing a Slow MySQL 5.7 Query for Detecting Tables Without Primary or Unique Keys
This article examines why a MySQL 5.7 query that finds tables lacking primary or unique keys runs extremely slowly, demonstrates an experiment reproducing the issue, analyzes the optimizer's subquery rewrite, and shows how adding hints and materialization reduces execution time from over 16 seconds to under one second.
Problem: A SQL query that finds tables without primary or unique keys runs extremely slowly on MySQL 5.7.
Experiment: Built a MySQL 5.7 test environment, created a mix of tables with and without primary keys, and executed the query, observing a runtime of about 16.8 seconds.
Analysis: The optimizer rewrites the non‑correlated subquery select ... where A.x not in (select x from B) into a correlated form select ... where not exists (select 1 from B where B.x = a.x) . Because information_schema.columns is a metadata table without statistics, the optimizer cannot estimate costs correctly, leading to a costly row‑by‑row scan.
Solution: Added optimizer hints to force materialization (caching) of the subquery result, guiding MySQL to treat it as a non‑correlated query. After applying the hint, execution time dropped to 0.67 seconds.
Key points:
Metadata tables in information_schema provide no useful statistics for the optimizer.
The optimizer may mis‑judge subquery types, turning a cheap non‑correlated query into an expensive correlated one.
Using hints to enforce materialization can dramatically improve performance.
Future work: Use the optimizer_trace feature to further investigate the optimizer’s mis‑prediction.
Aikesheng Open Source Community
The Aikesheng Open Source Community provides stable, enterprise‑grade MySQL open‑source tools and services, releases a premium open‑source component each year (1024), and continuously operates and maintains them.
How this landed with the community
Was this worth your time?
0 Comments
Thoughtful readers leave field notes, pushback, and hard-won operational detail here.