Can Disk Space Be Saved in MySQL by Adding a Primary Key?
The article demonstrates that adding an explicit primary key to a MySQL InnoDB table can dramatically reduce its on‑disk size by eliminating the hidden clustered index, and explains how invisible primary keys can be used when schema changes are not possible.
MySQL tables do not require a primary key, but in MySQL Group Replication (MGR) and Percona XtraDB Cluster (PXC) a primary key is mandatory because the lack of one severely degrades replication performance.
This post quickly shows another reason to define a primary key: saving disk space.
First, a simple test table test1 is created with two indexed columns a (int NOT NULL) and b (bigint). Inserting 10 million rows occupies about 748 MiB.
mysql> show create table test1\G
... CREATE TABLE `test1` (
`a` int NOT NULL,
`b` bigint DEFAULT NULL,
KEY `a` (`a`),
KEY `b` (`b`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ciAfter converting the index on column a to a primary key and dropping the secondary key, the table is rebuilt and its size drops to 588 MiB – a very significant reduction.
The reason is that InnoDB automatically creates an internal clustered index called GEN_CLUST_INDEX when no explicit primary key exists. This hidden index stores the entire row, so its size overhead is large.
When an explicit primary key is added, the hidden index disappears and only the two defined indexes remain, reducing both CLUST_INDEX_SIZE and OTHER_INDEX_SIZE as shown by queries against information_schema.INNODB_INDEXES .
mysql> select SPACE,INDEX_ID,i.NAME as index_name, t.NAME as table_name,
CLUST_INDEX_SIZE,OTHER_INDEX_SIZE
from information_schema.INNODB_INDEXES i
join information_schema.INNODB_TABLESPACES t using (space)
join information_schema.INNODB_TABLESTATS ts
where t.NAME=ts.NAME and t.NAME='db1/test1'\G
... SPACE: 54
INDEX_ID: 237
index_name: b
CLUST_INDEX_SIZE: 23733
OTHER_INDEX_SIZE: 13041
... SPACE: 54
INDEX_ID: 236
index_name: PRIMARY
CLUST_INDEX_SIZE: 23733
OTHER_INDEX_SIZE: 13041If adding a new column as a primary key is impossible due to legacy applications, MySQL 8.0 offers an invisible primary key (GIPK). Enabling sql_generate_invisible_primary_key=1 and then creating the table adds an auto‑increment column marked INVISIBLE , which acts as the clustered key without being visible to the application.
mysql> set sql_generate_invisible_primary_key=1;
Query OK, 0 rows affected (0.00 sec)
mysql> create table nopk (a int);
Query OK, 0 rows affected (0.02 sec)
mysql> show create table nopk\G
... CREATE TABLE `nopk` (
`my_row_id` bigint unsigned NOT NULL AUTO_INCREMENT /*!80023 INVISIBLE */,
`a` int DEFAULT NULL,
PRIMARY KEY (`my_row_id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ciThe invisible column can still be used in queries when needed, providing the performance benefits of a primary key while remaining hidden from existing code.
Before enforcing sql_require_primary_key , it is advisable to enable sql_generate_invisible_primary_key and recreate the data using logical backup/restore, as simple table optimizations will not add an invisible key.
Aikesheng Open Source Community
The Aikesheng Open Source Community provides stable, enterprise‑grade MySQL open‑source tools and services, releases a premium open‑source component each year (1024), and continuously operates and maintains them.
How this landed with the community
Was this worth your time?
0 Comments
Thoughtful readers leave field notes, pushback, and hard-won operational detail here.