Choosing the Best InnoDB Primary Key: B‑Tree Fundamentals and Performance Implications

This article explains how InnoDB stores data using B‑Tree primary keys, why explicit primary keys are essential, and compares ordered versus random insert patterns, illustrating their impact on page fill factor, index size, and query performance with practical sysbench examples.

Aikesheng Open Source Community

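InnoDB is an index‑organized storage engine: each table's rows are stored in the leaf pages of a B‑Tree ordered by the primary key (the clustered index). InnoDB always needs such a key. If no primary key is defined, it uses the first UNIQUE index whose columns are all NOT NULL; failing that, it creates a hidden 6‑byte, monotonically increasing row ID.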

The B‑Tree is made of pages, 16 KB each by default. Leaf pages contain the actual rows, while internal pages hold only keys and pointers to child pages. The number of records that fit in a leaf page depends on the row size and on how full the page is; the fan‑out of internal pages depends on the key size.
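To make the page arithmetic concrete, here is a rough Python sketch that estimates how many records fit in a 16 KB page and how many levels the resulting tree has. The 128 bytes of page overhead, the 5‑byte record header, the 4‑byte child pointer, the 15/16 fill, and the 200‑byte row are illustrative assumptions rather than figures from the article; real pages also carry a page directory and per‑row transaction fields.

```python
import math

PAGE_SIZE = 16 * 1024  # default InnoDB page size in bytes


def records_per_page(record_bytes, fill=15 / 16, page_overhead=128):
    """Estimate how many records of a given size fit in one page.

    fill and page_overhead are assumed values for illustration.
    """
    usable = (PAGE_SIZE - page_overhead) * fill
    return int(usable // record_bytes)


def btree_height(row_count, row_bytes, key_bytes, pointer_bytes=4, header_bytes=5):
    """Estimate the number of B-Tree levels, leaf level included."""
    pages = math.ceil(row_count / records_per_page(row_bytes))
    fanout = records_per_page(key_bytes + pointer_bytes + header_bytes)
    height = 1
    while pages > 1:
        pages = math.ceil(pages / fanout)  # pages needed one level up
        height += 1
    return height


rows_per_leaf = records_per_page(200)               # hypothetical 200-byte row
levels = btree_height(3_000_000, 200, key_bytes=4)  # hypothetical 3M-row table
print(rows_per_leaf, levels)
```

Under these assumptions a three‑million‑row table with a 4‑byte integer key needs only three levels, which is why a small key and well‑filled pages keep primary key lookups cheap.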

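A sysbench table with an auto‑increment integer primary key demonstrates ordered inserts. Data_length (≈644 MB) is the size of the primary key B‑Tree, which holds the rows, and Index_length is the size of the secondary index k_1. Because rows arrive in primary key order, pages are filled one after another without splits and end up about 15/16 (~94 %) full. Queries that read ranges of ids on such a table benefit from sequential page access.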

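When rows are inserted in random order, new rows land in the middle of pages that are already full and force page splits. Each split leaves two roughly half‑full pages, so the average page fill settles around 65‑75 % and the table grows: in the example to ≈1 GB, about 60 % larger than the ordered load. The example also looks at the innodb_fill_factor setting, which applies to sorted index builds rather than to ordinary inserts.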
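The difference between the two insert patterns can be reproduced with a toy model of the leaf level. The sketch below is not InnoDB's actual split algorithm: it assumes a hypothetical capacity of 50 keys per page, splits a full page in half on a mid‑page insert, and simply opens a new page when a key is appended past the end of the last page. It also ignores the free space InnoDB reserves in sequentially filled pages.

```python
import bisect
import random


def average_fill(keys, capacity=50):
    """Insert unique keys into a toy leaf level and return the average page fill (0 to 1)."""
    pages = [[]]              # each page is a sorted list of keys
    lows = [float("-inf")]    # lows[i] is the smallest key routed to pages[i]
    for key in keys:
        i = bisect.bisect_right(lows, key) - 1
        page = pages[i]
        bisect.insort(page, key)
        if len(page) <= capacity:
            continue
        if i == len(pages) - 1 and page[-1] == key:
            # Appending past the last key: keep the full page, open a new one.
            page.pop()
            pages.append([key])
            lows.append(key)
        else:
            # Landing inside a full page: split it into two halves.
            mid = len(page) // 2
            right = page[mid:]
            del page[mid:]
            pages.insert(i + 1, right)
            lows.insert(i + 1, right[0])
    return sum(len(p) for p in pages) / (len(pages) * capacity)


keys = list(range(1, 20001))
ordered_fill = average_fill(keys)      # keys arrive in ascending order

random.Random(1).shuffle(keys)
random_fill = average_fill(keys)       # same keys, shuffled

print(f"ordered: {ordered_fill:.0%}, random: {random_fill:.0%}")
```

In this model the ordered load fills every page completely, while the shuffled load should land near 70 %, in line with the range quoted above.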

mysql> show create table sbtest1\G
CREATE TABLE `sbtest1` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `k` int(11) NOT NULL DEFAULT '0',
  `c` char(120) NOT NULL DEFAULT '',
  `pad` char(60) NOT NULL DEFAULT '',
  PRIMARY KEY (`id`),
  KEY `k_1` (`k`)
) ENGINE=InnoDB AUTO_INCREMENT=3000001 DEFAULT CHARSET=latin1
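The sizes quoted for this table can be cross‑checked with simple arithmetic. This sketch assumes about 3 million rows, inferred from AUTO_INCREMENT=3000001 in the definition above, and treats Data_length as if it consisted only of 16 KB leaf pages. Both are simplifications, so the results are approximate.

```python
MIB = 1024 * 1024
PAGE_SIZE = 16 * 1024

data_length = 644 * MIB    # ordered load, figure quoted in the article
row_count = 3_000_000      # assumption: inferred from AUTO_INCREMENT=3000001
growth = 0.60              # random load is about 60 % larger, per the article

pages = data_length // PAGE_SIZE
rows_per_page = row_count / pages
bytes_per_row = data_length / row_count
payload = 4 + 4 + 120 + 60                      # id + k + c + pad, latin1 is 1 byte per char
random_size_mib = data_length * (1 + growth) / MIB
relative_density = 1 / (1 + growth)             # rows per page relative to the ordered load

print(pages, round(rows_per_page), round(bytes_per_row), payload)
print(round(random_size_mib), relative_density)
```

This gives roughly 73 rows per page and about 225 bytes of page space per row, against 188 bytes of declared column data; the difference is per‑row overhead and free space. A table that is 60 % larger holds the same rows at about 62.5 % of the original density, which comes to a little over 1,000 MB.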

Workload identification is crucial: insert‑heavy workloads should use sequential primary keys, while read‑heavy workloads benefit from primary keys that group related rows (e.g., user_id, friend_user_id) and appropriate secondary indexes.

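For a read‑intensive “friends” table, the primary key (user_id, friend_user_id) stores all rows of a given user next to each other, so a query for one user's friends reads only a few pages. The auto‑increment id column is kept but demoted to the secondary index idx_id. The table definition and the pt‑query‑digest output below illustrate the improvement.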

CREATE TABLE `friends` (
  `id` int(10) unsigned NOT NULL AUTO_INCREMENT,
  `user_id` int(10) unsigned NOT NULL,
  `friend_user_id` int(10) unsigned NOT NULL,
  `created` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
  `active` tinyint(4) NOT NULL DEFAULT '1',
  PRIMARY KEY (`user_id`,`friend_user_id`),
  KEY `idx_friend` (`friend_user_id`),
  KEY `idx_id` (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=144002 DEFAULT CHARSET=latin1

Analyzing the queries with pt‑query‑digest shows fewer distinct pages read (the “pages distin” attribute) and fewer rows examined when the primary key aligns with the access pattern, which lowers the IOPS required.

# Attribute    pct   total     min     max     avg     95%  stddev  median
# pages distin 100     111       2       5    2.09    1.96    0.44    1.96
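The effect on distinct pages can be imitated with a toy model. The sketch below lays the same friendship rows out in primary key order under two candidate keys and counts how many leaf pages hold one user's rows. The data set (2,000 users with 50 friends each, inserted in random order), the 400 rows per page, and the function name pages_touched are all hypothetical.

```python
import random


def pages_touched(rows, pk, user_id, rows_per_page=400):
    """Count distinct leaf pages holding one user's rows when rows are stored in pk order."""
    ordered = sorted(rows, key=pk)
    return len({
        pos // rows_per_page
        for pos, row in enumerate(ordered)
        if row["user_id"] == user_id
    })


rng = random.Random(42)
n_users, per_user = 2000, 50

# Build (user, friend) pairs, never pairing a user with themselves.
pairs = []
for u in range(n_users):
    for f in rng.sample(range(n_users - 1), per_user):
        pairs.append((u, f if f < u else f + 1))

# Shuffle so the auto-increment id is unrelated to user_id.
rng.shuffle(pairs)
rows = [
    {"id": i + 1, "user_id": u, "friend_user_id": f}
    for i, (u, f) in enumerate(pairs)
]

by_id = pages_touched(rows, lambda r: r["id"], user_id=7)
by_user = pages_touched(rows, lambda r: (r["user_id"], r["friend_user_id"]), user_id=7)
print(by_id, by_user)
```

With the id key, one user's 50 rows are scattered over dozens of pages. With the composite key they sit together on one or two pages, which matches the low "pages distin" average in the report above.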

When no clear access pattern exists, covering indexes or partitioning can help. The article concludes that choosing an appropriate InnoDB primary key dramatically improves performance for both insert‑ and read‑heavy workloads.

Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contact admin@besthub.dev and we will review it promptly.

