Why Using SELECT * Is Inefficient in MySQL and How Indexes Improve Performance
Although many developers hear the advice to avoid using SELECT * in SQL, this article delves into the underlying reasons—such as increased parsing cost, unnecessary data transfer, extra I/O, and loss of covering-index optimization—while also explaining index concepts and best practices for MySQL performance.
In a mock interview, the interviewer asks about common SQL optimization techniques and why developers should avoid SELECT *. The answer hints at the inefficiency of selecting all columns, prompting a deeper technical discussion.
1. Reasons Why SELECT * Is Inefficient
The latest Alibaba Java Development Manual (Taishan edition) explicitly mandates never using * in queries; required columns must be listed explicitly.
Increases the cost of query parsing.
May cause mismatches between resultMap configurations and actual columns.
Unnecessary columns, especially large text fields, increase network traffic.
Additional detailed reasons include:
1.1 Unneeded columns increase data transmission time and network overhead
The database must parse more objects, fields, permissions, and attributes, adding load to the parser.
Extra columns (e.g., logs, large text blobs) enlarge the data size; when the DB and application are on different machines, the network cost becomes noticeable.
Even on the same machine, MySQL still uses the TCP protocol, incurring extra latency.
1.2 Large unused fields (VARCHAR, BLOB, TEXT) increase I/O operations
When a column exceeds 728 bytes, MySQL InnoDB stores the overflow elsewhere, causing an additional I/O operation when the row is read.
1.3 Using SELECT * prevents the MySQL optimizer from applying the "covering index" strategy
A covering index allows the optimizer to satisfy the query using only the index, avoiding a table lookup and dramatically improving speed.
Consider a table t(a,b,c,d,e,f) with a as the primary key and an index on b. Two B+ trees exist: one for the clustered index (all columns) and one for the secondary index (a, b). If the WHERE clause can use the b index, the optimizer can retrieve only a and b directly from the secondary index. Selecting * forces a second lookup on the clustered index, adding an extra B‑tree traversal and slowing the query.
Because the secondary index holds far fewer columns, many queries can be satisfied entirely from memory, while the clustered index may require disk access, leading to orders‑of‑magnitude performance differences.
2. Extending Index Knowledge
Beyond single‑column indexes, MySQL supports composite (multi‑column) indexes. A composite index (a,b,c) implicitly creates three indexes: (a), (a,b), and (a,b,c).
Think of a composite index like a book’s hierarchy: a is the first‑level chapter, b the second‑level section under a, and c the third‑level subsection. To use a lower‑level entry, the higher‑level entries must be traversed first.
2.1 Advantages of Composite Indexes
1) Reduced Overhead
Creating a composite index (a,b,c) effectively creates three indexes, but it reduces write‑time overhead compared to creating three separate indexes because the engine stores them in a shared structure.
2) Covering Index
For a query like:
SELECT a, b, c FROM table WHERE a='xx' AND b='xx';MySQL can satisfy the query directly from the composite index without accessing the table rows, eliminating a costly “back‑table” lookup and reducing random I/O.
3) Higher Efficiency
With a large table (e.g., 10 million rows), a query using a single‑column index may filter 10% (1 million rows) and then need to read each row from the table. A composite index (col1,col2,col3) can filter the same conditions down to 1% of 10% (10 000 rows), dramatically improving performance.
SELECT col1, col2, col3 FROM table WHERE col1=1 AND col2=2 AND col3=3;2.2 Should You Create More Indexes?
The answer is no. Over‑indexing brings several drawbacks:
Small tables don’t benefit from indexes; they add unnecessary maintenance cost.
Columns rarely queried should not be indexed, as the index provides little value.
Frequently updated columns cause extra write overhead.
Low‑cardinality columns (e.g., gender) are poor index candidates.
Each index consumes storage space and must be maintained during data changes.
3. Personal Reflections
If you’re reading this, you either have a passion for MySQL or enjoy scrolling through code. Hopefully you learned something useful; feel free to give a like if the article helped you.
When asked why I’m so strict about SQL standards, the answer is simple: I use SELECT * every day in small projects, but I also understand its pitfalls and want to share a concise, well‑structured guide.
Thank you for reading, and good luck in your next interview!
Signed-in readers can open the original source through BestHub's protected redirect.
This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactand we will review it promptly.
Java Captain
Focused on Java technologies: SSM, the Spring ecosystem, microservices, MySQL, MyCat, clustering, distributed systems, middleware, Linux, networking, multithreading; occasionally covers DevOps tools like Jenkins, Nexus, Docker, ELK; shares practical tech insights and is dedicated to full‑stack Java development.
How this landed with the community
Was this worth your time?
0 Comments
Thoughtful readers leave field notes, pushback, and hard-won operational detail here.
