Common MySQL Interview Questions and Answers
This article provides a comprehensive collection of typical MySQL interview questions covering storage engines, lock types, gap locks, deadlock avoidance, isolation levels, index types, covering indexes, left‑most prefix rule, replication, distributed transactions, optimization techniques, sharding challenges and solutions, and global unique ID generation.
Hello, I'm Tom. To help you quickly locate and understand common MySQL interview questions, I have organized a catalog of topics.
What are the differences between MyISAM and InnoDB? InnoDB supports transactions, foreign keys, and clustered indexes (index and row data are stored together), and uses MVCC for high concurrency; MyISAM supports none of these. InnoDB does not store an exact row count, so select count(*) from table without a WHERE clause has to scan an index, whereas MyISAM keeps the total row count in a variable and returns it immediately. InnoDB uses row‑level locks (the smallest granularity), while MyISAM uses table‑level locks, resulting in lower concurrency. InnoDB has been the default storage engine since MySQL 5.5.
What lock types does MySQL have? MySQL provides shared (S) locks and exclusive (X) locks, also known as read and write locks. Locks can be at table, page, or row granularity.
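What is a gap lock? A gap lock is taken by InnoDB under the REPEATABLE READ isolation level. It locks the gap between index records so that other transactions cannot insert into it. Combined with the record lock on the row that ends the gap, it forms a next‑key lock covering a left‑open, right‑closed interval, which together with MVCC prevents phantom reads.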
How to avoid deadlocks? Follow these practices: design high‑selectivity indexes, place high‑cardinality columns first in composite indexes, order SQL statements to avoid long‑running updates before other statements, split large transactions, access tables and rows in a fixed order, avoid explicit locking inside transactions, use primary key/index lookups, and simplify complex SQL by breaking joins into smaller queries.
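The "access tables and rows in a fixed order" practice can be illustrated outside the database. The following is a minimal sketch, assuming two hypothetical in-memory rows guarded by ordinary Python locks rather than InnoDB row locks; the names row_locks, balances, and transfer are illustrative only.

```python
import threading

# Hypothetical stand-ins for two rows; each row has its own lock.
row_locks = {1: threading.Lock(), 2: threading.Lock()}
balances = {1: 100, 2: 100}


def transfer(src, dst, amount):
    # Always lock the lower row id first, regardless of transfer direction.
    # Two opposite transfers therefore never hold one lock each while
    # waiting for the other, which is the classic deadlock cycle.
    first, second = sorted((src, dst))
    with row_locks[first], row_locks[second]:
        balances[src] -= amount
        balances[dst] += amount


def run_demo():
    threads = [
        threading.Thread(target=transfer, args=(1, 2, 10)),
        threading.Thread(target=transfer, args=(2, 1, 30)),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return dict(balances)
```

In SQL the same idea usually means sorting the primary keys before issuing the UPDATE statements of a transaction, so that every transaction acquires its row locks in the same order.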
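What are MySQL isolation levels? READ UNCOMMITTED, READ COMMITTED, REPEATABLE READ (the InnoDB default; the SQL standard permits phantom reads at this level, although InnoDB largely prevents them with MVCC and next‑key locks), and SERIALIZABLE.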
What index types does MySQL support? Ordinary index (the basic type, with no uniqueness constraint), unique index (unique values, allows NULL), composite index (multiple columns for combined searches), clustered index (primary key, data stored in leaf nodes, one per table), and non‑clustered index (leaf nodes store only the index columns and the primary key, so a query may require a back‑table lookup; a covering index can avoid this).
What are covering indexes and back‑table lookups? A covering index contains all columns needed for a query, eliminating the need for a back‑table lookup. Example: select buyer_id from `order` where money>100 (order is a reserved word in MySQL, so the table name must be quoted with backticks). If a composite index on (money, buyer_id) exists, the leaf nodes store buyer_id, so the query can be satisfied without a back‑table lookup. A back‑table lookup occurs when some required fields are not in the index and must be fetched from the primary key B+ tree.
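The effect can be observed in a query plan. The sketch below uses Python's built-in sqlite3 module as a stand-in, because it needs no server; SQLite is not MySQL, and the table name orders, the note column, and the index name idx_money_buyer are assumptions made for illustration. In MySQL the equivalent signal is "Using index" in the Extra column of EXPLAIN.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (
        id INTEGER PRIMARY KEY,
        buyer_id INTEGER,
        money INTEGER,
        note TEXT
    );
    CREATE INDEX idx_money_buyer ON orders (money, buyer_id);
    INSERT INTO orders (buyer_id, money, note)
    VALUES (7, 50, 'a'), (8, 150, 'b'), (9, 300, 'c');
""")


def plan(sql):
    # The last column of each EXPLAIN QUERY PLAN row is a readable description.
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql)
    return " ".join(row[-1] for row in rows)


# buyer_id and money both live in the index, so no table lookup is needed.
covered = plan("SELECT buyer_id FROM orders WHERE money > 100")
# note is not in the index, so the row itself must be fetched.
not_covered = plan("SELECT note FROM orders WHERE money > 100")

buyers = [
    r[0]
    for r in conn.execute(
        "SELECT buyer_id FROM orders WHERE money > 100 ORDER BY money"
    )
]
```

Printing covered and not_covered side by side shows whether the planner reports a covering index for each query.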
What is the leftmost prefix principle? When using a composite index, MySQL matches columns from left to right and stops after the first column that has a range condition (e.g., >, <, BETWEEN, or a prefix LIKE). The range column itself can still use the index, but the columns after it cannot. For example, with an index on (a,b,c,d), the query where a=1 and b=2 and c>3 and d=4 can use columns a, b, and c but not d.
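The rule can be written down as a few lines of code. This is a simplified model, not the MySQL optimizer: the function name usable_index_columns and the 'eq'/'range' labels are hypothetical, and features such as index condition pushdown are ignored.

```python
def usable_index_columns(index_columns, predicates):
    """Return the index columns a query can use for lookup.

    predicates maps a column name to 'eq' (equality) or 'range'.
    """
    used = []
    for col in index_columns:
        kind = predicates.get(col)
        if kind is None:
            # A gap in the prefix: later columns cannot be used.
            break
        used.append(col)
        if kind == "range":
            # The range column is used, but matching stops after it.
            break
    return used


example = usable_index_columns(
    ["a", "b", "c", "d"],
    {"a": "eq", "b": "eq", "c": "range", "d": "eq"},
)
```

For the query in the text, example holds a, b, and c. A query that filters only on b and c yields an empty list, because the leftmost column a is missing.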
Online SQL tuning experience: Use the slow query log and explain to check index usage, reduce rows scanned, create appropriate composite indexes following the leftmost prefix rule, and consider virtual columns to improve complex query performance.
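Why does MySQL recommend using an auto‑increment ID as the primary key? Auto‑increment IDs are sequential, so inserts always occur at the end of the clustered index, reducing page splits and data movement. Avoid using random string keys such as UUIDs for primary keys: they are inserted at random positions, and because every secondary index stores the primary key, long keys also enlarge the secondary indexes.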
Why are B+ trees used for indexes instead of B‑trees or red‑black trees? B+ trees store all data in leaf nodes and only keep keys and pointers in internal nodes, resulting in a flatter tree with lower height and fewer I/O operations. This structure also links leaf nodes for fast range scans.
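A rough calculation shows how flat such a tree is. The numbers below are assumptions for a back-of-the-envelope estimate: a 16 KB page (the InnoDB default), an 8-byte BIGINT key, a 6-byte child pointer, and rows of about 1 KB; page headers and fill factor are ignored.

```python
import math

PAGE_SIZE = 16 * 1024   # bytes per page (InnoDB default)
KEY_BYTES = 8           # assumed BIGINT primary key
POINTER_BYTES = 6       # assumed size of a child-page pointer
ROW_BYTES = 1024        # assumed average row size

# Keys per internal page and rows per leaf page.
fanout = PAGE_SIZE // (KEY_BYTES + POINTER_BYTES)
rows_per_leaf = PAGE_SIZE // ROW_BYTES


def max_rows(height):
    # height - 1 levels of internal pages, then one level of leaf pages.
    return fanout ** (height - 1) * rows_per_leaf


three_level_rows = max_rows(3)
# Levels a perfectly balanced binary tree needs for the same number of keys.
binary_levels = math.ceil(math.log2(three_level_rows + 1))
```

Under these assumptions a three-level B+ tree addresses about 21.9 million rows, so a lookup touches three pages, while a balanced binary tree holding the same keys would be 25 levels deep.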
What are the properties of a transaction? ACID: Atomicity, Consistency, Isolation, and Durability.
How to implement distributed transactions? Common approaches include: 1) asynchronous tasks with eventual consistency (requiring idempotent APIs), 2) transactional messages, 3) two‑phase commit, 4) three‑phase commit, 5) TCC (Try‑Confirm‑Cancel), 6) using the Seata framework, and others.
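Approach 1 depends on idempotent APIs, because a retried or redelivered message must not be applied twice. The following is a minimal in-memory sketch of that requirement; the class IdempotentAccount and the request_id parameter are hypothetical, and a real service would typically record the request ID in the same local database transaction as the balance change, for example through a unique key.

```python
class IdempotentAccount:
    """Applies each credit request at most once, keyed by request_id."""

    def __init__(self):
        self.balance = 0
        self._seen = set()  # request ids that were already applied

    def credit(self, request_id, amount):
        if request_id in self._seen:
            # Duplicate delivery: report it and leave the balance unchanged.
            return False
        self._seen.add(request_id)
        self.balance += amount
        return True


account = IdempotentAccount()
first = account.credit("req-1", 50)
retry = account.credit("req-1", 50)  # simulated retry of the same request
```

After the retry the balance is still 50, which is what lets the caller retry until it receives an acknowledgement.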
Daily MySQL optimization practices: Use pagination optimization (e.g., replace limit 100000,10 with a primary‑key range query id>#{value}), prefer covering indexes to avoid back‑table lookups, perform SQL optimizations (index tuning, small‑table driving large tables, virtual columns, adding redundant fields to reduce joins, composite indexes, sorting optimization, slow‑query analysis with explain), design optimizations (avoid NULL, use simple data types, limit TEXT columns, sharding), and hardware optimizations (SSD, sufficient network bandwidth, ample memory).
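The pagination rewrite can be tried with a small experiment. This sketch uses Python's built-in sqlite3 module as a stand-in for MySQL, with a hypothetical table t of 1,000 rows and contiguous IDs; the helper names page_by_offset and page_after are illustrative.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, payload TEXT)")
conn.executemany(
    "INSERT INTO t (id, payload) VALUES (?, ?)",
    [(i, f"row-{i}") for i in range(1, 1001)],
)


def page_by_offset(offset, size):
    # The engine has to walk past `offset` rows before returning anything.
    return conn.execute(
        "SELECT id, payload FROM t ORDER BY id LIMIT ? OFFSET ?",
        (size, offset),
    ).fetchall()


def page_after(last_id, size):
    # Seeks directly to last_id through the primary key, then reads `size` rows.
    return conn.execute(
        "SELECT id, payload FROM t WHERE id > ? ORDER BY id LIMIT ?",
        (last_id, size),
    ).fetchall()


offset_page = page_by_offset(900, 10)
keyset_page = page_after(900, 10)
```

Both calls return the same rows here because the IDs are contiguous. The keyset form requires the client to remember the last ID it saw, so it suits "next page" navigation better than jumping to an arbitrary page number.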
How does MySQL master‑slave replication work? The master writes data changes to the binary log (binlog). The slave's I/O thread connects to the master and requests the binlog from a specific position; the master's log‑dump thread sends the events, and the I/O thread stores them in the relay log. The slave's SQL thread then reads the relay log and replays the events to keep the data synchronized.
What is master‑slave delay? It is the time difference between a write completing on the master and the data being fully replicated to the slave. It can be calculated as t2‑t1, where t1 is the timestamp in the binlog and t2 is the execution time on the slave.
How to check master‑slave delay? Run show slave status and examine the Seconds_Behind_Master value: 0 means healthy replication; a positive number indicates delay, with larger values meaning more severe lag.
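A monitoring script might interpret the value along these lines. This is a sketch only: the function name classify_lag and the 30-second threshold are assumptions, and the value is assumed to have already been read from the status output. A NULL value (None in Python) is treated separately, since MySQL reports NULL when the replication SQL thread is not running.

```python
def classify_lag(seconds_behind_master, warn_after=30):
    """Map a Seconds_Behind_Master value to a coarse health label."""
    if seconds_behind_master is None:
        # NULL in the status output: replication is not applying events.
        return "not replicating"
    if seconds_behind_master == 0:
        return "in sync"
    if seconds_behind_master < warn_after:
        return "lagging"
    return "severely lagging"


labels = [classify_lag(v) for v in (None, 0, 5, 120)]
```

A router could use such a label to decide when reads should be sent to the master instead of the slave.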
How to resolve master‑slave delay? Strategies include: forcing reads to the master if delay is unacceptable, using caches with immediate updates, upgrading slave hardware, reducing network latency, using Canal for incremental subscription, minimizing large transactions, enabling multi‑threaded replication in MySQL 5.7+ (setting slave_parallel_type to LOGICAL_CLOCK and slave_parallel_workers to a value greater than 0), and employing floating IP failover based on delay thresholds.
What to do when data volume is too large? Consider sharding, caching, read/write separation, vertical splitting, cold‑hot data separation, using Elasticsearch for complex searches, NoSQL or NewSQL solutions.
How to guarantee globally unique IDs after sharding? Options include UUIDs, database auto‑increment IDs, database segment allocation (pre‑allocating a range of IDs per service), Redis atomic incr, Snowflake algorithm, or open‑source generators such as Baidu uid‑generator, Meituan Leaf, or Didi Tinyid.
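Of these options, the Snowflake layout is compact enough to sketch. The version below assumes the commonly described split of 41 timestamp bits, 10 worker bits, and 12 sequence bits; the class name, the custom epoch value, and the injectable clock parameter are hypothetical choices for illustration, not a reference implementation.

```python
import threading
import time


class Snowflake:
    EPOCH_MS = 1_600_000_000_000  # hypothetical custom epoch in milliseconds
    WORKER_BITS = 10
    SEQ_BITS = 12

    def __init__(self, worker_id, clock=lambda: int(time.time() * 1000)):
        if not 0 <= worker_id < (1 << self.WORKER_BITS):
            raise ValueError("worker_id out of range")
        self.worker_id = worker_id
        self._clock = clock
        self._last_ms = -1
        self._seq = 0
        self._lock = threading.Lock()

    def next_id(self):
        with self._lock:
            now = self._clock()
            if now < self._last_ms:
                # Refuse to issue ids that could collide with earlier ones.
                raise RuntimeError("clock moved backwards")
            if now == self._last_ms:
                self._seq = (self._seq + 1) & ((1 << self.SEQ_BITS) - 1)
                if self._seq == 0:
                    # Sequence exhausted: wait for the next millisecond.
                    while now <= self._last_ms:
                        now = self._clock()
            else:
                self._seq = 0
            self._last_ms = now
            timestamp = now - self.EPOCH_MS
            return (
                (timestamp << (self.WORKER_BITS + self.SEQ_BITS))
                | (self.worker_id << self.SEQ_BITS)
                | self._seq
            )


generator = Snowflake(worker_id=3)
ids = [generator.next_id() for _ in range(5000)]
```

Each process needs a distinct worker_id, which is the part that generators such as Leaf or uid‑generator manage on the caller's behalf. IDs from one generator increase over time, which keeps inserts close to the end of the clustered index.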
What problems may arise after sharding? Queries that do not include the sharding key (e.g., sharding_key) cannot be routed to a single shard, which limits query flexibility. Solutions include separating buyer and seller databases with appropriate routing keys, multi‑threaded scanning of all shards with result aggregation, or syncing data to Elasticsearch for multi‑dimensional queries.
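The difference between a routed query and a scatter-gather query can be sketched with in-memory lists standing in for shards. Everything here is an assumption for illustration: four shards, buyer_id as the sharding key, a CRC32-based routing function, and the helper names.

```python
import zlib
from concurrent.futures import ThreadPoolExecutor

SHARD_COUNT = 4
shards = [[] for _ in range(SHARD_COUNT)]  # each list stands in for a database


def shard_for(buyer_id):
    # Stable hash of the sharding key, reduced to a shard index.
    return zlib.crc32(str(buyer_id).encode()) % SHARD_COUNT


def insert_order(order):
    shards[shard_for(order["buyer_id"])].append(order)


def orders_by_buyer(buyer_id):
    # The query carries the sharding key, so only one shard is touched.
    shard = shards[shard_for(buyer_id)]
    return [o for o in shard if o["buyer_id"] == buyer_id]


def orders_by_seller(seller_id):
    # No sharding key: scan every shard in parallel and merge the results.
    def scan(shard):
        return [o for o in shard if o["seller_id"] == seller_id]

    with ThreadPoolExecutor(max_workers=SHARD_COUNT) as pool:
        parts = list(pool.map(scan, shards))
    merged = [o for part in parts for o in part]
    return sorted(merged, key=lambda o: o["order_id"])


for i in range(1, 21):
    insert_order({"order_id": i, "buyer_id": i % 5, "seller_id": i % 3})
```

The seller query costs one request per shard, which is why the text also suggests a separate seller-keyed database or an Elasticsearch copy when that access pattern is frequent.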
Full-Stack Internet Architecture
Introducing full-stack Internet architecture technologies centered on Java
