Databases 39 min read

Master MySQL Interview Questions: ACID, Indexes, Transactions, and More

This comprehensive guide covers MySQL fundamentals such as ACID transaction properties, isolation levels, index types and design principles, storage engines, MVCC mechanics, query execution flow, replication, locking strategies, and performance‑tuning techniques for large tables and complex queries.

JavaEdge

Dec 10, 2021

Master MySQL Interview Questions: ACID, Indexes, Transactions, and More

ACID Properties

The four ACID properties of a transaction are Atomicity (all operations succeed or all are rolled back), Consistency (the database remains in a valid state before and after the transaction), Isolation (concurrent transactions do not interfere; e.g., READ COMMITTED), and Durability (committed changes survive crashes).

Atomicity : All statements in a transaction are treated as a single unit.

Consistency : Total balances remain unchanged after a transfer.

Isolation : Controlled by isolation levels.

Durability : Changes are permanent after commit.

Transaction Isolation Levels

MySQL provides four isolation levels to prevent dirty reads, non‑repeatable reads, and phantom reads:

Serializable : Forces a total order, eliminates phantom reads.

REPEATABLE READ (default): Guarantees repeatable reads, prevents non‑repeatable reads.

READ COMMITTED : Prevents dirty reads; each SELECT sees only committed data.

READ UNCOMMITTED : Allows reading uncommitted changes (dirty reads).

Check the current level: SELECT @@transaction_isolation; Set a level:

SET SESSION TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;

Indexes

What Is an Index?

An index is a data structure used by the storage engine to speed up row retrieval.

Advantages

Accelerates data lookup.

Speeds up sorting and grouping.

Improves join performance.

Disadvantages

Consumes additional disk space.

Slows down INSERT/UPDATE/DELETE because the index must be maintained.

When to Create an Index?

Columns frequently used in WHERE clauses.

Columns used for JOIN conditions.

Columns involved in ORDER BY.

When Not to Create an Index?

Columns not referenced in queries (e.g., not in WHERE).

Very small tables.

Columns that are frequently updated.

Columns used only in calculations.

Low‑cardinality columns such as gender.

Index Data Structures

MySQL mainly uses B+ Tree and Hash indexes. InnoDB defaults to B+ Tree.

B+ Tree Index

B+ Tree stores only keys in internal nodes and full records in leaf nodes, enabling efficient range queries. The tree height is typically 2–4, so a lookup requires only a few disk reads.

SHOW INDEX FROM blog\G;

Hash Index

Hash indexes provide O(1) lookups for exact matches but do not support range scans, ordering, or prefix matching.

Hash vs B+ Tree

Hash indexes do not support sorting or range queries.

Hash indexes cannot be used for prefix matching.

Hash collisions can cause unstable performance.

B+ Tree offers stable performance for all query types.

Why B+ Tree Is Preferred for Databases

All data resides in leaf nodes, making range scans fast.

Only keys are stored in internal nodes, allowing more keys per page and reducing I/O.

Search paths have uniform length, giving predictable performance.

Index Types

Primary Key Index : Unique, non‑null, named PRIMARY.

Unique Index : Enforces uniqueness, allows NULLs.

Composite Index : Indexes multiple columns; must follow the left‑most prefix rule.

Full‑Text Index : Available only on MyISAM; works on CHAR, VARCHAR, TEXT.

Left‑most Prefix Principle

If a composite index is defined on (a,b,c), queries can use the index for a, a,b, or a,b,c. Once a range condition (e.g., a > 5) appears, columns to the right are ignored.

Clustered Index

In InnoDB, the primary key forms a clustered index; leaf nodes store the full row data. If no primary key exists, InnoDB creates a hidden 6‑byte key.

Covering Index

A covering index contains all columns required by the SELECT, eliminating the need for a table lookup (no "back‑table" operation). Only B+ Tree indexes can serve as covering indexes.

EXPLAIN SELECT blog_id FROM user_like WHERE user_id = 13;

The Extra column shows Using index, indicating a covering index.

Index Design Principles

High cardinality columns yield better selectivity.

Prefer short indexes; use prefix lengths for long strings.

Avoid excessive indexes – each adds storage and maintenance overhead.

Leverage the left‑most prefix rule.

When Indexes Fail

Queries that do not start with the leftmost column of a composite index.

LIKE patterns beginning with %.

Implicit type conversion on string columns.

OR conditions.

Functions or arithmetic on indexed columns.

Prefix Index

Creates an index on the first N characters of a column to reduce size while retaining selectivity.

// Create a prefix index on the email column
ALTER TABLE table_name ADD KEY (email(10));

Storage Engines

InnoDB (default since 5.5): Transactional, row‑level locking, MVCC, clustered indexes.

MyISAM : Fast reads, no transactions or row locks, stores data in .MYD and indexes in .MYI.

MEMORY : Stores data in RAM, uses hash indexes, data lost on crash.

ARCHIVE : Highly compressed, write‑optimized, no indexes.

MyISAM vs InnoDB

Locking: MyISAM uses table‑level locks; InnoDB supports row‑level locks.

Transactions: MyISAM lacks transaction support; InnoDB provides ACID transactions.

Foreign Keys: Only InnoDB enforces them.

MVCC: Supported by InnoDB, not by MyISAM.

Clustered Index: InnoDB supports it; MyISAM does not.

MVCC Implementation

MVCC (Multiversion Concurrency Control) keeps multiple versions of a row using three hidden fields: DB_TRX_ID: Transaction ID that created the version. DB_ROLL_PTR: Pointer to the previous version (undo log). DB_ROLL_ID: Primary key (or generated hidden key).

When a row is updated:

Acquire an exclusive lock on the row.

Copy the old version to the undo log.

Write the new version, update DB_TRX_ID, and link DB_ROLL_PTR to the old version.

Read View

A read view captures the state of committed transactions at a point in time. Different isolation levels create read views at different moments (e.g., each SELECT for READ COMMITTED, once per transaction for REPEATABLE READ).

Snapshot Read vs Current Read

Snapshot Read : Uses MVCC, reads a consistent snapshot without locking.

Current Read : Reads the latest version, requires locks (e.g., SELECT ... FOR UPDATE).

Snapshot reads avoid phantom reads; current reads can encounter them.

Shared and Exclusive Locks

Examples:

SELECT * FROM table WHERE id < 6 LOCK IN SHARE MODE;  -- shared lock
SELECT * FROM table WHERE id < 6 FOR UPDATE;       -- exclusive lock

Shared locks allow concurrent reads; exclusive locks block other modifications until commit.

Optimizing Large Tables

Limit query ranges (e.g., by date).

Read‑write splitting with master‑slave architecture.

Vertical or horizontal sharding.

Binary Log, Redo Log, Undo Log

Key log types:

Binary Log (binlog) : Logical log of committed SQL statements; used for replication and point‑in‑time recovery.

Redo Log : Physical InnoDB log of page changes; written continuously, flushed on commit.

Undo Log : Stores before‑image of rows for rollback and MVCC.

Binlog vs Redo Log Differences

Scope: binlog records all engines; redo log records only InnoDB.

Write frequency: binlog writes once per transaction commit; redo log writes continuously.

Content: binlog stores SQL statements (logical); redo log stores physical page modifications.

MySQL Architecture

The server layer handles connections, query cache, parser, optimizer, and executor. Storage engines (InnoDB, MyISAM, etc.) manage actual data storage and retrieval.

Sharding (Database/Table Partitioning)

When a single table exceeds ~10 million rows or 100 GB, horizontal or vertical sharding reduces load.

Vertical Sharding

Separate tables by functional domains (e.g., product, order, user) or split a wide table into logical groups.

Horizontal Sharding

Distribute rows across multiple databases based on a rule (e.g., date, ID range).

Partition Tables

A partitioned table appears as a single logical table but stores data in multiple physical sub‑tables.

Range Partition

CREATE TABLE test_range_partition (
  id INT AUTO_INCREMENT,
  createdate DATETIME,
  PRIMARY KEY (id, createdate)
) PARTITION BY RANGE (TO_DAYS(createdate)) (
  PARTITION p201801 VALUES LESS THAN (TO_DAYS('20180201')),
  PARTITION p201802 VALUES LESS THAN (TO_DAYS('20180301')),
  ...
);

List Partition

CREATE TABLE test_list_partition (
  id INT AUTO_INCREMENT,
  data_type TINYINT,
  PRIMARY KEY (id, data_type)
) PARTITION BY LIST (data_type) (
  PARTITION p0 VALUES IN (0,1,2,3,4,5,6),
  PARTITION p1 VALUES IN (7,8,9,10,11,12),
  PARTITION p2 VALUES IN (13,14,15,16,17)
);

Hash Partition

CREATE TABLE test_hash_partition (
  id INT AUTO_INCREMENT,
  create_date DATETIME,
  PRIMARY KEY (id, create_date)
) PARTITION BY HASH (YEAR(create_date)) PARTITIONS 10;

Query Execution Flow

Steps: privilege check → query cache → parser → optimizer → privilege re‑check → executor → storage engine.

SELECT * FROM user WHERE id > 1 AND name = '大彬';

Validate user privileges.

Check query cache.

Lexical and syntactic analysis.

Optimizer chooses the best access path (e.g., index on id vs name).

Executor invokes the engine and returns results.

Update Execution Process

UPDATE user SET name = '大彬' WHERE id = 1;

Locate the target row (may use cache).

Write the change to the redo log in PREPARE state.

Record the change in the binlog.

Commit the redo log, moving it to COMMIT state.

Transaction completes.

EXISTS vs IN

EXISTS

evaluates the inner query row‑by‑row and stops on the first match, making it efficient for large inner tables. IN materializes the inner result set first, which can be faster when the outer table is large.

SELECT a.* FROM A WHERE EXISTS (SELECT 1 FROM B WHERE a.id = b.id);
SELECT * FROM A WHERE id IN (SELECT id FROM B);

TRUNCATE, DELETE, DROP Differences

All remove data; TRUNCATE and DROP are DDL and auto‑commit. TRUNCATE and DROP keep (or remove) the table structure; DELETE only removes rows.

Performance order: DROP > TRUNCATE > DELETE.

HAVING vs WHERE

WHERE

filters rows before grouping. HAVING filters groups after aggregation.

MySQL Master‑Slave Replication

The master writes changes to the binary log; slaves read the log and replay statements, providing read‑write separation, load balancing, and backup.

Why Use Replication?

Read‑write separation for higher concurrency.

Offload analytical workloads to slaves.

Data redundancy and disaster recovery.

Optimistic vs Pessimistic Locks

Pessimistic : Acquire a lock during the read phase; other transactions must wait.

Optimistic : No lock on read; on commit, verify a version column or use CAS to detect conflicts.

Processlist

Use SHOW PROCESSLIST or SHOW FULL PROCESSLIST to view active threads, their state, and the SQL they are executing. Important columns include Id, User, Host, Command, Time, State, and Info.

SHOW FULL PROCESSLIST;

Original Source

Signed-in readers can open the original source through BestHub's protected redirect.

Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactand we will review it promptly.

mysql interview indexes

Written by

JavaEdge

First‑line development experience at multiple leading tech firms; now a software architect at a Shanghai state‑owned enterprise and founder of Programming Yanxuan. Nearly 300k followers online; expertise in distributed system design, AIGC application development, and quantitative finance investing.

0 followers

Reader feedback

How this landed with the community

Rate this article

Was this worth your time?

Discussion

0 Comments

Thoughtful readers leave field notes, pushback, and hard-won operational detail here.