Why SELECT * Slows Down Your Database and How to Avoid It

The article recounts a 2012 incident in which a normally fast backend API became sluggish after BLOB columns were quietly added to a table it read with SELECT *. It explains how SELECT * rules out index-only scans and adds deserialization cost, network overhead, and unpredictable performance, and it advises selecting only the columns you need.

21CTO

Dec 16, 2024

Story from 2012

A developer recounts a real case from 2012‑2013 where a backend API that normally responded in a few milliseconds suddenly became slow for users.

Code reviews showed no abnormal changes, and even after rolling back all commits the slowdown persisted.

Diagnosing the slowdown

API response times occasionally rose to 500 ms–2 s, whereas they used to be a few milliseconds.

The team then examined the database queries and found that another application had added three BLOB columns to a table that originally held only two integer columns. Because the API read that table with SELECT *, every request now pulled the BLOBs as well.

How database reads work

In row‑store engines, rows are stored in pages, each page containing a header and multiple rows with column data.

When a page is loaded into the shared buffer pool, all rows and columns become accessible.

Even though the API never passed the extra BLOB columns on to its own callers, its SELECT * query still fetched them from the database, increasing database, network, and serialization overhead.

Losing index-only scans

Using SELECT * prevents the optimizer from using an index-only scan. For example, suppose you need the IDs of students who scored above 90 and there is an index on the score column that also carries the student ID. That index contains everything the query asks for, so the database can answer it without touching the heap.

Because SELECT * requests all columns, the database must also read the heap pages for the remaining columns, causing many random I/O operations.
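The difference shows up directly in the query plan. The sketch below uses Python's built-in sqlite3 module as a stand-in so that it runs anywhere; the students table, its columns, and the index name are hypothetical. SQLite reports a "COVERING INDEX" where PostgreSQL would report an index-only scan, so treat the exact wording of the output as engine-specific.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE students (
        student_id INTEGER,
        name       TEXT,
        score      INTEGER,
        essay      BLOB
    );
    -- The index stores score plus student_id, so a query that needs
    -- only those two columns never has to visit the table itself.
    CREATE INDEX idx_students_score ON students (score, student_id);
""")

def plan(sql):
    """Return the planner's description of how it would run sql."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    # The human-readable detail is the fourth column of each plan row.
    return " | ".join(row[3] for row in rows)

narrow_plan = plan("SELECT student_id FROM students WHERE score > 90")
wide_plan = plan("SELECT * FROM students WHERE score > 90")

print("explicit column:", narrow_plan)
print("SELECT *       :", wide_plan)
```

Only the explicit-column query should be reported as using a covering index. The SELECT * query also asks for name and essay, which the index does not hold, so the engine has to go back to the table for every matching row.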

Deserialization cost

Deserialization converts raw bytes into data types, a process that adds CPU work.

When executing SELECT *, the database must deserialize every column, even those not needed by the application, increasing computational overhead and reducing query performance.

Not all columns are inline

Large columns such as text or BLOBs are often stored out‑of‑line (e.g., PostgreSQL TOAST tables) and fetched only on demand.

Fetching many such columns forces the database to retrieve, decompress, and serialize additional data, adding load.

Network cost

Result rows are serialized according to the database protocol before being sent over TCP/IP; more data means more CPU work and larger packets, increasing latency.

Returning all columns can force clients to handle unnecessary large fields, further slowing deserialization on the client side.
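To get a feel for the volume involved, the following sketch rebuilds a table shaped like the one in the story (two integer columns plus three BLOB columns) and counts how many BLOB bytes each query hands to the client. It uses an in-memory SQLite database, so no real network is involved; the table name, column names, row count, and the 10,000-byte BLOB size are all assumptions chosen for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE events (
        id INTEGER, status INTEGER,
        payload1 BLOB, payload2 BLOB, payload3 BLOB
    )
""")

# 100 rows, each carrying three 10,000-byte BLOBs the application never uses.
blob = b"x" * 10_000
conn.executemany(
    "INSERT INTO events VALUES (?, ?, ?, ?, ?)",
    [(i, i % 3, blob, blob, blob) for i in range(100)],
)

def blob_bytes(sql):
    """Total size of all BLOB values in the result set of sql."""
    return sum(
        len(value)
        for row in conn.execute(sql)
        for value in row
        if isinstance(value, bytes)
    )

star_bytes = blob_bytes("SELECT * FROM events")
narrow_bytes = blob_bytes("SELECT id, status FROM events")

print("SELECT *        :", star_bytes, "BLOB bytes")
print("explicit columns:", narrow_bytes, "BLOB bytes")
```

With these numbers, SELECT * delivers 3,000,000 bytes of BLOB data for 100 rows, while the explicit column list delivers none. On a real client/server database those bytes would also have to be encoded, sent over the wire, and decoded on the other side.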

Unpredictability

Even a table with one or two simple columns can become slow if administrators later add XML, JSON, or BLOB columns that the application never uses.

The query remains fast until those extra columns are added, at which point SELECT * starts pulling unnecessary data.
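A small experiment makes the drift visible. In this sketch (again SQLite through Python's sqlite3 module, with a hypothetical accounts table), the same two queries are inspected before and after someone else widens the table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (id INTEGER, balance INTEGER)")

def columns(sql):
    """Names of the columns a query returns."""
    return [col[0] for col in conn.execute(sql).description]

star_before = columns("SELECT * FROM accounts")
explicit_before = columns("SELECT id, balance FROM accounts")

# Later, another team adds columns the application never reads.
conn.execute("ALTER TABLE accounts ADD COLUMN profile_json TEXT")
conn.execute("ALTER TABLE accounts ADD COLUMN avatar BLOB")

star_after = columns("SELECT * FROM accounts")
explicit_after = columns("SELECT id, balance FROM accounts")

print("SELECT * before :", star_before)
print("SELECT * after  :", star_after)
print("explicit after  :", explicit_after)
```

The application code did not change, yet the SELECT * query now returns four columns instead of two. The explicit query returns the same two columns it always did, so its cost stays tied to what the application actually uses.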

Grepping the code

Explicit column lists make it easy to grep the codebase for column usage, simplifying schema refactoring and DDL changes.
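As a rough illustration, the sketch below searches a snippet of application code for a column name, much as grep would. The snippet, the avatar column, and the db.query and make_thumbnail calls inside it are hypothetical; they exist only as text to search.

```python
import re

def lines_mentioning(column, source):
    """Return 1-based line numbers in source that name column as a whole word."""
    pattern = re.compile(r"\b" + re.escape(column) + r"\b")
    return [
        number
        for number, line in enumerate(source.splitlines(), start=1)
        if pattern.search(line)
    ]

# Hypothetical application code: line 1 names the column explicitly,
# lines 2 and 3 read the same column through SELECT * and a positional index.
source = '''\
rows = db.query("SELECT id, avatar FROM accounts WHERE id = ?")
rows = db.query("SELECT * FROM accounts WHERE id = ?")
thumb = make_thumbnail(rows[0][3])
'''

hits = lines_mentioning("avatar", source)
print(hits)
```

The search finds only line 1. The SELECT * call and the positional access that depends on it also use the column, but nothing in their text says so, which is exactly what makes renaming or dropping a column risky in a codebase that relies on SELECT *.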

Conclusion

In summary, SELECT * incurs many hidden costs—extra I/O, deserialization, network overhead, and unpredictability—so it is best to select only the columns you truly need, unless the table is tiny with simple data types.

