Databases 13 min read

Why MySQL count(*) Slows Down on Large Tables and How to Optimize It

This article explains how MySQL's count() works across different storage engines, compares the performance of various count() forms, and offers practical strategies—such as using EXPLAIN rows, auxiliary count tables, and batch processing—to obtain accurate or approximate row counts efficiently even on massive tables.

macrozheng

Sep 30, 2022

Why MySQL count(*) Slows Down on Large Tables and How to Optimize It

Creating an SMS Table

Assume a table sms that stores messages to be sent, with a state column where 0 means unsent.

An asynchronous thread repeatedly fetches rows with state = 0, sends the SMS, and updates state to 1, gradually reducing the number of unsent messages.

Why count() Becomes Slow on Large Tables

To monitor how many unsent messages remain, we typically run: select count(*) from sms where state = 0; When the table is small this works fine, but with millions of rows the query may time out.

How count() Works

MySQL consists of a server layer and a storage‑engine layer. The storage engine (e.g., InnoDB or MyISAM) determines how the row count is obtained.

MyISAM keeps a hidden counter that can be read directly, making count(*) extremely fast. InnoDB, however, must traverse the smallest index tree and count leaf nodes, which requires scanning many rows.

Why InnoDB Cannot Use a Simple Counter

InnoDB supports transactions and MVCC with the default REPEATABLE READ isolation level. Because each transaction sees a consistent snapshot, the visible row count can differ between concurrent transactions, so a single stored counter would break isolation guarantees.

Performance of Different count() Forms

The server layer always receives rows from InnoDB and then decides whether to count them based on the expression inside count():

count(*) and count(1) simply increment for every row because the value can never be NULL.

count(primary_key) also skips NULL checks because primary keys are NOT NULL.

count(indexed_column) may need to read the column value and check for NULL unless the column is defined NOT NULL.

count(non_indexed_column) usually triggers a full‑table scan and NULL checks.

count(*) ≈ count(1) > count(primary_key) > count(indexed_column) > count(non_indexed_column)

Thus count(*) is already the fastest option, but it still requires scanning rows in InnoDB.

When Approximate Counts Are Sufficient

If you only need to know the order of magnitude of unsent messages, you can use the rows estimate from EXPLAIN, which samples data and provides a quick approximation.

When Exact Counts Are Required

For precise monitoring you can maintain a separate table that stores pre‑computed counts:

CREATE TABLE `count_table` (
  `id` int NOT NULL AUTO_INCREMENT COMMENT 'Primary key',
  `cnt_what` char(20) NOT NULL DEFAULT '' COMMENT 'Metric name',
  `cnt` tinyint NOT NULL COMMENT 'Count value',
  PRIMARY KEY (`id`),
  KEY `idx_cnt_what` (`cnt_what`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;

Insert or update the count whenever the number of unsent SMS changes, then read it instantly:

select cnt from count_table where cnt_what = "未发送的短信数量";

High‑Real‑Time Scenario

Include the count‑updating SQL in the same transaction that inserts or deletes SMS rows. This keeps the count consistent with transaction isolation, but may cause contention under heavy write load.

Low‑Real‑Time Scenario

If updates can be delayed (e.g., once a day), you can periodically scan the SMS table in batches, count rows with state = 0, and store the result in count_table. Alternatively, stream binlog changes to Hive and compute the count there.

Summary

MySQL count() reads full‑table data; MyISAM can read a stored row‑count field, while InnoDB must scan rows.

Performance order:

count(*) ≈ count(1) > count(primary_key) > count(indexed_column) > count(non_indexed_column)

, but even the fastest still scans rows in InnoDB.

For most monitoring use EXPLAIN 's rows estimate to get a quick, approximate count.

When exact numbers are needed, maintain a dedicated count table and update it either within the same transaction (high real‑time) or via periodic batch jobs (low real‑time).

Original Source

Signed-in readers can open the original source through BestHub's protected redirect.

Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactand we will review it promptly.

InnoDB mysql sql-optimization MyISAM COUNT

Written by

macrozheng

Dedicated to Java tech sharing and dissecting top open-source projects. Topics include Spring Boot, Spring Cloud, Docker, Kubernetes and more. Author’s GitHub project “mall” has 50K+ stars.

0 followers

Reader feedback

How this landed with the community

Rate this article

Was this worth your time?

Discussion

0 Comments

Thoughtful readers leave field notes, pushback, and hard-won operational detail here.