Databases 13 min read

Understanding MySQL Hash Index Implementation and Use Cases

This article explains how MySQL implements hash indexes, illustrates their data distribution on memory tables, and evaluates their suitable scenarios and limitations compared with B‑tree indexes through detailed SQL examples and performance analysis.

Aikesheng Open Source Community

Dec 30, 2020

Understanding MySQL Hash Index Implementation and Use Cases

This article explains how MySQL implements hash indexes, where they are applied in storage engines and internal structures, and evaluates their performance characteristics and appropriate use cases.

1. Hash Data Distribution

Hash indexes are mainly used in memory tables, such as the Memory engine or MySQL 8.0 TempTable engine. The size of a memory table is controlled by max_heap_table_size, which includes both data and index storage.

Example of creating a memory table with a primary key that defaults to a hash index, inserting several rows, and visualizing the hash distribution:

# MySQL 内存表主键默认哈希索引
mysql> create table t1(id int , name varchar(64), gender char(1), status char(2),primary key (id)) engine memory;
Query OK, 0 rows affected (0.02 sec)

mysql> insert into t1 values(101,'张三','男','未婚');
Query OK, 1 row affected (0.00 sec)
...
mysql> insert into t1 values(106,'余欢水','男','二婚');
Query OK, 1 row affected (0.01 sec)

Adding a hash index on the name column demonstrates potential hash collisions:

mysql> alter table t1 add key idx_name(name) using hash;
Query OK, 6 rows affected (0.04 sec)

mysql> insert into t1 values(107,'杨阳','男','二婚');
Query OK, 1 row affected (0.00 sec)

2. Usage Scenarios

To compare hash indexes with B‑tree indexes, a clone table t2 is created using B‑tree indexes.

# 省略表 t1 造数据过程
mysql> create table t2 like t1;
Query OK, 0 rows affected (0.02 sec)

mysql> alter table t2 drop primary key, drop key idx_name;
Query OK, 0 rows affected (0.04 sec)

mysql> alter table t2 add primary key (id) using btree, add key idx_name (name) using btree;
Query OK, 0 rows affected (0.04 sec)

mysql> insert into t2 select * from t1;
Query OK, 50000 rows affected (0.18 sec)

2.1 Hash indexes are ideal for equality lookups

They work with the operators "=", "IN" and "<=>". Example queries show fast retrieval using hash indexes for primary‑key equality and small IN lists, while range queries cannot use hash indexes.

# SQL 1
mysql> select * from t1 where id = 2000;
...
# SQL 2
mysql> select * from t2 where id in (1000,2000);
...
# SQL 3 (range query – cannot use hash index)
mysql> select count(*) from t1 where id between 1000 and 2000;
...

Explain plans confirm that the range query performs a full table scan on the hash‑indexed table but uses a B‑tree range scan on the B‑tree table.

mysql> explain select count(*) from t1 where id between 1000 and 2000\G
... type: ALL ...

mysql> explain select count(*) from t2 where id between 1000 and 2000\G
... type: range ...

2.2 Hash indexes cannot support ordered retrieval

Because hash indexes are unordered, ORDER BY requires a full scan and filesort, whereas the B‑tree index can perform a backward index scan.

# SQL 5 (hash index)
mysql> select id from t1 where 1 order by id desc limit 1;
...
# SQL 6 (B‑tree index)
mysql> select id from t2 where 1 order by id desc limit 1;
...

mysql> explain select id from t1 where 1 order by id desc limit 1\G
... type: ALL ... Extra: Using filesort

mysql> explain select id from t2 where 1 order by id desc limit 1\G
... type: index ... Extra: Backward index scan

2.3 Hash indexes do not provide covering indexes

Since the index stores only the hash value, any lookup must read the base table row, leading to many handler reads compared with a B‑tree index.

# SQL 7 (hash index) – many handler reads
mysql> select id from t1 limit 1;
...
show status like 'handler%';
... Handler_read_rnd_next 849 ...

# SQL 8 (B‑tree index) – single handler read
mysql> select id from t2 limit 1;
...
show status like 'handler%';
... Handler_read_rnd_next 1 ...

2.4 Hash indexes cannot be used for left‑most prefix queries

For a composite hash index (e.g., on columns (id1,id2,id3)), the index can only be used when all indexed columns are specified; single‑column predicates cannot use the hash index.

mysql> create table t3(id1 int,id2 int,r1 int, primary key(id1,id2)) engine memory;
...
mysql> create table t4(id1 int,id2 int,r1 int, primary key(id1,id2) using btree) engine memory;
...
mysql> select * from t3 where id1 = 44;  -- full scan
mysql> select * from t4 where id1 = 44;  -- uses primary B‑tree index

mysql> explain select * from t3 where id1 = 44\G
... type: ALL ...

mysql> explain select * from t4 where id1 = 44\G
... type: ref ... key: PRIMARY ...

The article concludes that MySQL hash indexes are suitable for fast equality lookups on memory tables but have limitations such as lack of ordering, no covering capability, and inability to handle range or prefix queries, making B‑tree indexes preferable in many scenarios.

Original Source

Signed-in readers can open the original source through BestHub's protected redirect.

Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactand we will review it promptly.

Performance MySQL database indexing Hash Index memory engine

Written by

Aikesheng Open Source Community

The Aikesheng Open Source Community provides stable, enterprise‑grade MySQL open‑source tools and services, releases a premium open‑source component each year (1024), and continuously operates and maintains them.

0 followers

Reader feedback

How this landed with the community

Rate this article

Was this worth your time?

Discussion

0 Comments

Thoughtful readers leave field notes, pushback, and hard-won operational detail here.