Databases 13 min read

Postgres vs MySQL: Deep Dive into Indexes, Query Costs, and MVCC

An in‑depth comparison of PostgreSQL and MySQL examines how primary and secondary indexes are implemented, contrasts query execution costs, explores data type impacts, explains MVCC undo logs, and discusses process versus thread architectures, highlighting performance trade‑offs for each system.

dbaplus Community
dbaplus Community
dbaplus Community
Postgres vs MySQL: Deep Dive into Indexes, Query Costs, and MVCC

1. Indexes

Both databases use B+‑tree structures for indexes, but they differ in how primary keys and secondary keys store values. In MySQL, a primary‑key index stores the entire row (clustered index), while secondary indexes store a pointer to the primary key. PostgreSQL treats every index as a secondary index that points to a tuple ID (tid) in the heap; there is no separate primary‑key index concept.

MySQL requires every table to have a primary key, and secondary indexes ultimately reference that primary key. PostgreSQL stores only the fixed‑size 4‑byte tuple ID, so even a UUID primary key does not bloat secondary indexes.

2. Query Cost

Consider a table T with a primary key PK, a secondary indexed column C2, and an unindexed column C1:

# 表 T;
# 主键列 PK 上有主索引,C2 列上有二级索引,C1 没有索引;
# C1 和 C2 是文本类型,PK 是整数。
| PK | C1 | C2 |
|----|----|----|
| 1  | x1 | x2 |
| 2  | y1 | y2 |
| 3  | z1 | z1 |

Query SELECT * FROM T WHERE C2 = 'x2'; in MySQL triggers two B+‑tree lookups: one on the secondary index to obtain the primary key, then a second on the primary‑key index to fetch the full row. PostgreSQL performs a secondary‑index lookup to get the tuple ID, followed by a single heap page access to retrieve the row.

When the predicate matches multiple rows, MySQL may repeat primary‑key lookups for each match, while PostgreSQL can use a Bitmap Index Scan to aggregate matching tuple IDs by page and load those pages in bulk, reducing random I/O.

Range queries on the primary key ( SELECT * FROM T WHERE PK BETWEEN 1 AND 3;) are faster in MySQL because the clustered index allows a single lookup and sequential leaf‑node traversal. PostgreSQL must fetch each matching tuple from the heap, incurring additional random reads.

3. Data Types

In MySQL, the primary‑key data type directly influences secondary‑index size because each secondary entry stores the full primary‑key value (e.g., a UUID makes secondary indexes larger). PostgreSQL’s secondary indexes always store the fixed‑size tuple ID, so primary‑key type has little impact on index size.

4. Undo Logs (MVCC)

Both systems implement Multi‑Version Concurrency Control. MySQL writes the new row version to the shared buffer page and records how to undo the change in an Undo Log. During recovery, the Undo Log rolls back uncommitted changes.

PostgreSQL creates a new tuple for every UPDATE/DELETE, assigning a new tuple ID and recording the creating and deleting transaction IDs. This design enables safe concurrent reads of older versions without separate undo logs, but it adds write‑amplification for secondary indexes. Vacuum later cleans up dead tuples.

5. Processes vs Threads

MySQL uses a thread‑based architecture, while PostgreSQL employs a process‑based model. Threads share the parent process’s virtual memory, making them lighter weight; processes have independent memory spaces and larger control blocks, incurring higher resource overhead.

Query OptimizationMySQLPostgreSQLIndexesMVCC
dbaplus Community
Written by

dbaplus Community

Enterprise-level professional community for Database, BigData, and AIOps. Daily original articles, weekly online tech talks, monthly offline salons, and quarterly XCOPS&DAMS conferences—delivered by industry experts.

0 followers
Reader feedback

How this landed with the community

Sign in to like

Rate this article

Was this worth your time?

Sign in to rate
Discussion

0 Comments

Thoughtful readers leave field notes, pushback, and hard-won operational detail here.