Mastering MySQL: Keys, Indexes, Transactions, Storage Engines, and Optimization Explained

This comprehensive guide covers MySQL fundamentals—including primary, foreign, candidate, and super keys—auto‑increment primary keys, triggers, stored procedures, views, cursor usage, index types and design, transaction properties and isolation levels, storage engine differences, query execution order, EXPLAIN analysis, lock mechanisms, replication strategies, high‑concurrency solutions, and crash‑recovery via REDO and UNDO logs, providing practical examples and code snippets for each concept.

ITPUB

Aug 31, 2020

Basic Concepts

Primary, Foreign, Super, and Candidate Keys

Super key: Any set of attributes that uniquely identifies a tuple in a relation. Every candidate key and primary key is also a super key.

Candidate key: A minimal super key, meaning that removing any attribute would break uniqueness.

Primary key: The candidate key chosen to identify each row. Its columns must be unique and cannot be NULL, and a table has at most one primary key.

Foreign key: A column or column combination whose values reference the primary key (or another unique key) of a parent table, usually a different table.
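
The four key types can be seen side by side in a small schema. This is a minimal sketch; the department and employee tables and all column names are assumptions made up for illustration.

```sql
-- Hypothetical schema: table and column names are illustrative only.
CREATE TABLE department (
    dept_id   INT UNSIGNED NOT NULL,
    dept_name VARCHAR(64)  NOT NULL,
    PRIMARY KEY (dept_id)
) ENGINE = InnoDB;

CREATE TABLE employee (
    emp_id  INT UNSIGNED NOT NULL,
    email   VARCHAR(128) NOT NULL,
    name    VARCHAR(64)  NOT NULL,
    dept_id INT UNSIGNED NOT NULL,
    -- emp_id is the candidate key chosen as the primary key.
    PRIMARY KEY (emp_id),
    -- email is a second candidate key, enforced with a unique index.
    UNIQUE KEY uk_employee_email (email),
    -- dept_id is a foreign key referencing the parent table's primary key.
    CONSTRAINT fk_employee_department
        FOREIGN KEY (dept_id) REFERENCES department (dept_id)
) ENGINE = InnoDB;

-- (emp_id, name) is a super key but not a candidate key:
-- emp_id alone is already unique, so name is redundant.
```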

Why Use Auto‑Increment Columns as Primary Keys

When a PRIMARY KEY is defined, InnoDB uses it as the clustered index. If there is no explicit primary key, InnoDB uses the first UNIQUE index whose columns are all NOT NULL; if no such index exists, it generates a hidden clustered index on a 6‑byte row ID.

Data rows are stored in the leaf nodes of a B+Tree ordered by the primary key. Inserting a row with an auto‑increment key appends it at the end of the index, so pages fill sequentially with few page splits and little fragmentation. Keys that do not grow monotonically (for example, random UUIDs) land in the middle of existing pages, which causes page splits and data movement; the resulting fragmentation may eventually call for OPTIMIZE TABLE to rebuild the table.
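
A minimal sketch of the recommended layout, assuming a hypothetical order_log table: the auto‑increment column is the clustered primary key, and the random business identifier is kept in a secondary unique index instead.

```sql
-- Hypothetical table: names are illustrative only.
CREATE TABLE order_log (
    id         BIGINT UNSIGNED NOT NULL AUTO_INCREMENT,
    order_no   CHAR(36)        NOT NULL,
    created_at DATETIME        NOT NULL DEFAULT CURRENT_TIMESTAMP,
    PRIMARY KEY (id),                           -- clustered, grows sequentially
    UNIQUE KEY uk_order_log_order_no (order_no) -- random values stay in a secondary index
) ENGINE = InnoDB;

-- Each insert receives the next id and is appended at the end of the clustered index.
INSERT INTO order_log (order_no) VALUES (UUID());
INSERT INTO order_log (order_no) VALUES (UUID());

-- Returns the id generated by the most recent insert in this session.
SELECT LAST_INSERT_ID();
```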

Triggers

A trigger is a stored program attached to a table that runs automatically before or after an INSERT, UPDATE, or DELETE on that table. Triggers can enforce constraints, maintain data integrity, audit changes, and propagate changes to other tables.
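
The sketch below shows two of these uses, constraint enforcement and change tracking. The account and account_audit tables and the trigger names are hypothetical. DELIMITER is a command of the mysql command‑line client, not SQL, so other clients may need a different way to send multi‑statement bodies.

```sql
-- Hypothetical tables: names are illustrative only.
CREATE TABLE account (
    id      INT UNSIGNED  NOT NULL AUTO_INCREMENT PRIMARY KEY,
    balance DECIMAL(12,2) NOT NULL
) ENGINE = InnoDB;

CREATE TABLE account_audit (
    id          BIGINT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
    account_id  INT UNSIGNED    NOT NULL,
    old_balance DECIMAL(12,2)   NOT NULL,
    new_balance DECIMAL(12,2)   NOT NULL,
    changed_at  DATETIME        NOT NULL DEFAULT CURRENT_TIMESTAMP
) ENGINE = InnoDB;

DELIMITER //

-- Enforce a constraint: reject updates that would make the balance negative.
CREATE TRIGGER trg_account_before_update
BEFORE UPDATE ON account
FOR EACH ROW
BEGIN
    IF NEW.balance < 0 THEN
        SIGNAL SQLSTATE '45000' SET MESSAGE_TEXT = 'balance cannot be negative';
    END IF;
END//

-- Track operations: record every balance change in the audit table.
CREATE TRIGGER trg_account_after_update
AFTER UPDATE ON account
FOR EACH ROW
BEGIN
    IF NEW.balance <> OLD.balance THEN
        INSERT INTO account_audit (account_id, old_balance, new_balance)
        VALUES (OLD.id, OLD.balance, NEW.balance);
    END IF;
END//

DELIMITER ;
```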

Stored Procedures

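A stored procedure is a named set of SQL statements stored on the server that can be invoked repeatedly, which improves modularity and can reduce client round trips. Note that MySQL does not precompile procedures ahead of time the way some other DBMSs do; a procedure is parsed when a session first calls it and is then cached for that session.

Calling methods:

From SQL, with the CALL statement.

From an external program through its database API, for example a JDBC CallableStatement in Java or a command object in ADO.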
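
A minimal sketch of defining and calling a procedure from SQL. The product table, the count_low_stock procedure, and its parameters are hypothetical; DELIMITER is specific to the mysql command‑line client.

```sql
-- Hypothetical table: names are illustrative only.
CREATE TABLE product (
    id    INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
    name  VARCHAR(64)  NOT NULL,
    stock INT          NOT NULL
) ENGINE = InnoDB;

DELIMITER //

-- Counts products whose stock is below the given threshold.
CREATE PROCEDURE count_low_stock(IN p_threshold INT, OUT p_count INT)
BEGIN
    SELECT COUNT(*) INTO p_count
    FROM product
    WHERE stock < p_threshold;
END//

DELIMITER ;

-- Invoke the procedure and read the OUT parameter through a user variable.
CALL count_low_stock(10, @low_stock);
SELECT @low_stock;
```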

Procedure Advantages and Disadvantages

Advantages:

Parsed once per session and then cached, so repeated calls avoid re‑parsing.

Reduces network traffic by executing on the server.

Provides security through permission control.

Reusable, reducing development effort.

Disadvantages:

Poor portability across different DBMS.

Views and Cursors

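View: A virtual table defined by a query over one or more base tables. Updates made through a view change the underlying tables, but only if the view is updatable (for example, it contains no aggregates, DISTINCT, GROUP BY, or UNION).

Cursor: Allows row‑by‑row processing of a result set when set‑based operations are insufficient. In MySQL, cursors exist only inside stored programs and are read‑only and forward‑only.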
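
The following sketch combines both features, assuming a hypothetical staff table. The cursor loop is for illustration only; a plain SELECT SUM(salary) would do the same job more efficiently.

```sql
-- Hypothetical table: names are illustrative only.
CREATE TABLE staff (
    id     INT UNSIGNED  NOT NULL AUTO_INCREMENT PRIMARY KEY,
    name   VARCHAR(64)   NOT NULL,
    salary DECIMAL(10,2) NOT NULL,
    active TINYINT(1)    NOT NULL DEFAULT 1
) ENGINE = InnoDB;

-- A simple single-table view without aggregates is updatable.
CREATE VIEW active_staff AS
SELECT id, name, salary FROM staff WHERE active = 1;

-- This update is applied to the underlying staff table.
UPDATE active_staff SET salary = salary * 1.05 WHERE id = 1;

DELIMITER //

-- Walks the view row by row with a cursor and sums the salaries.
CREATE PROCEDURE total_active_salary(OUT p_total DECIMAL(14,2))
BEGIN
    -- Declaration order matters: variables, then cursors, then handlers.
    DECLARE v_salary DECIMAL(10,2);
    DECLARE v_done   INT DEFAULT 0;
    DECLARE cur CURSOR FOR SELECT salary FROM active_staff;
    DECLARE CONTINUE HANDLER FOR NOT FOUND SET v_done = 1;

    SET p_total = 0;
    OPEN cur;
    read_loop: LOOP
        FETCH cur INTO v_salary;
        IF v_done = 1 THEN
            LEAVE read_loop;
        END IF;
        SET p_total = p_total + v_salary;
    END LOOP;
    CLOSE cur;
END//

DELIMITER ;

CALL total_active_salary(@total);
SELECT @total;
```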

Indexes

What Is an Index?

An index is a sorted data structure, typically a B‑Tree or B+Tree, that speeds up data retrieval by providing a fast lookup path.

Benefits and Drawbacks

Benefits: Faster queries, unique constraints, quicker joins, efficient grouping and sorting, and optimizer assistance.

Drawbacks: Additional storage space, slower inserts/updates/deletes, and maintenance overhead.

When to Create Indexes

Columns frequently used in search conditions.

Primary key columns.

Foreign key columns used in joins.

Columns used in range queries, sorting, or GROUP BY.

Columns appearing often in WHERE clauses.

Do not index: Rarely queried columns, low‑cardinality columns (e.g., gender), large TEXT/BLOB columns (which can only be indexed by prefix), or tables where write performance matters more than read performance.
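
A short sketch of these guidelines, assuming a hypothetical customer_order table. The index names and the prefix length of 20 are illustrative choices, not recommendations.

```sql
-- Hypothetical table: names are illustrative only.
CREATE TABLE customer_order (
    id          BIGINT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
    customer_id INT UNSIGNED    NOT NULL,
    status      TINYINT         NOT NULL,
    created_at  DATETIME        NOT NULL,
    note        TEXT
) ENGINE = InnoDB;

-- Composite index for a frequent filter plus sort:
--   WHERE customer_id = ? ORDER BY created_at
CREATE INDEX idx_order_customer_created
    ON customer_order (customer_id, created_at);

-- A TEXT column can only be indexed by prefix; here the first 20 characters.
CREATE INDEX idx_order_note_prefix
    ON customer_order (note(20));

-- status has very few distinct values, so it is deliberately left unindexed.

-- List the indexes that now exist on the table.
SHOW INDEX FROM customer_order;
```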

Index Types

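B‑Tree vs. B+Tree: A B‑Tree stores data in both internal and leaf nodes. A B+Tree keeps only keys in internal nodes, stores all records in the leaf nodes, and links the leaves together, which makes range scans and sequential access more efficient.

Clustered vs. Non‑Clustered Indexes: A clustered index stores the rows themselves in primary key order. A non‑clustered (secondary) index stores a reference to the row: in InnoDB this is the primary key value, which is then looked up in the clustered index; in MyISAM it is a pointer to the row's physical location.

Hash vs. B+Tree: Hash indexes provide O(1) equality lookups but cannot support range queries, ordering, or leftmost‑prefix matching. B+Tree indexes support all of these operations and are generally preferred. InnoDB does not let users create hash indexes (it only builds an internal adaptive hash index); the MEMORY engine supports them explicitly.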
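
A minimal sketch of the trade‑off, using the MEMORY engine because it allows both index types to be declared explicitly. The session_cache table and its columns are hypothetical.

```sql
-- Hypothetical table: names are illustrative only.
CREATE TABLE session_cache (
    session_id CHAR(32)     NOT NULL,
    user_id    INT UNSIGNED NOT NULL,
    expires_at DATETIME     NOT NULL,
    PRIMARY KEY (session_id) USING HASH,              -- equality lookups only
    KEY idx_session_expires (expires_at) USING BTREE  -- supports ranges and ordering
) ENGINE = MEMORY;

-- Equality lookup: a good fit for the hash index.
SELECT user_id FROM session_cache WHERE session_id = 'abc';

-- Range condition: needs the tree index; a hash index could not serve it.
SELECT session_id FROM session_cache WHERE expires_at < NOW();
```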

Transactions

Definition and ACID Properties

A transaction groups multiple operations into a single unit that can be committed or rolled back, ensuring Atomicity, Consistency, Isolation, and Durability.

Isolation Levels and Concurrency Issues

Read Uncommitted: Allows dirty reads.

Read Committed: Prevents dirty reads but allows non‑repeatable reads.

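Repeatable Read (InnoDB default): Prevents non‑repeatable reads. The SQL standard still permits phantom reads at this level; InnoDB avoids most of them through MVCC snapshots and next‑key locks.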

Serializable: Highest isolation, eliminates all concurrency anomalies.

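Common concurrency problems include dirty reads (reading another transaction's uncommitted data), non‑repeatable reads (the same row returns different values within one transaction), and phantom reads (the same query returns a different set of rows within one transaction).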
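
The difference between Read Committed and Repeatable Read can be observed with two sessions. This is a sketch under stated assumptions: the wallet table is hypothetical, and the statements marked "Session B" are meant to be run from a second connection at the indicated point.

```sql
-- Hypothetical table: names are illustrative only.
CREATE TABLE wallet (
    id      INT UNSIGNED  NOT NULL PRIMARY KEY,
    balance DECIMAL(12,2) NOT NULL
) ENGINE = InnoDB;
INSERT INTO wallet (id, balance) VALUES (1, 100.00);

-- Inspect the current level (the variable is named tx_isolation on older versions).
SELECT @@transaction_isolation;

-- Session A: choose the level for subsequent transactions in this session.
SET SESSION TRANSACTION ISOLATION LEVEL READ COMMITTED;

START TRANSACTION;
SELECT balance FROM wallet WHERE id = 1;   -- first read

-- Session B (second connection), run now:
--   UPDATE wallet SET balance = balance - 10 WHERE id = 1;
--   COMMIT;

SELECT balance FROM wallet WHERE id = 1;   -- second read
-- READ COMMITTED:  the second read sees Session B's committed change
--                  (a non-repeatable read).
-- REPEATABLE READ: the second read returns the same value as the first,
--                  because both use the transaction's snapshot.
COMMIT;
```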

Transaction Propagation Behaviors (Spring‑style)

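PROPAGATION_REQUIRED: Join the current transaction, or start a new one if none exists (the default).

PROPAGATION_SUPPORTS: Join the current transaction if one exists; otherwise run without a transaction.

PROPAGATION_MANDATORY: Join the current transaction; throw an exception if none exists.

PROPAGATION_REQUIRES_NEW: Always start a new transaction, suspending the current one if it exists.

PROPAGATION_NOT_SUPPORTED: Run without a transaction, suspending the current one if it exists.

PROPAGATION_NEVER: Run without a transaction; throw an exception if one exists.

PROPAGATION_NESTED: Run in a nested transaction (a savepoint) inside the current one, or behave like PROPAGATION_REQUIRED if none exists.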

Nested Transactions

MySQL has no true nested transactions; they are emulated with savepoints. Rolling back the inner unit returns to the savepoint without affecting the outer transaction, while rolling back the outer transaction undoes everything, including the inner work.
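
The savepoint behaviour can be sketched directly in SQL. The ledger table and the savepoint name inner_step are hypothetical.

```sql
-- Hypothetical table: names are illustrative only.
CREATE TABLE ledger (
    id    INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
    entry VARCHAR(64)  NOT NULL
) ENGINE = InnoDB;

START TRANSACTION;                                 -- outer transaction
INSERT INTO ledger (entry) VALUES ('outer step 1');

SAVEPOINT inner_step;                              -- start of the "nested" unit
INSERT INTO ledger (entry) VALUES ('inner step');
ROLLBACK TO SAVEPOINT inner_step;                  -- undoes only the inner insert

INSERT INTO ledger (entry) VALUES ('outer step 2');
COMMIT;                                            -- persists both outer steps

-- Only 'outer step 1' and 'outer step 2' remain.
SELECT entry FROM ledger ORDER BY id;
```

Had the final statement been ROLLBACK instead of COMMIT, all three inserts would have been discarded.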

Storage Engines

InnoDB, MyISAM, and MEMORY

InnoDB: Supports transactions, row‑level locking, foreign keys, and crash recovery; default engine since MySQL 5.5.

MyISAM: Table‑level locking, with no transactions, foreign keys, or crash recovery. It can be fast for read‑mostly workloads. It supports full‑text indexes, as does InnoDB since MySQL 5.6.

MEMORY: Stores data in RAM for ultra‑fast access; data is lost on restart; uses hash indexes by default.

Choosing Between InnoDB and MyISAM

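Use InnoDB for most workloads, and especially for write‑intensive or high‑integrity applications that need transactions, concurrent writes, or crash safety. Consider MyISAM only for read‑mostly data with simple queries where transaction support is unnecessary.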
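
A short sketch of how the engine is inspected, chosen, and changed. The page_view_archive table is hypothetical.

```sql
-- List the engines this server supports and which one is the default.
SHOW ENGINES;

-- Hypothetical read-mostly table created explicitly with MyISAM.
CREATE TABLE page_view_archive (
    id        BIGINT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
    url       VARCHAR(255)    NOT NULL,
    viewed_at DATETIME        NOT NULL
) ENGINE = MyISAM;

-- Check which engine each table in the current database uses.
SELECT table_name, engine
FROM information_schema.tables
WHERE table_schema = DATABASE();

-- Convert the table to InnoDB; this rebuilds the table, so it can take
-- a while on large data sets.
ALTER TABLE page_view_archive ENGINE = InnoDB;
```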

SQL Optimization

Query Execution Order

The logical order is FROM (including JOIN and ON) → WHERE → GROUP BY → HAVING → SELECT → DISTINCT → ORDER BY → LIMIT, which differs from the written order.
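
The annotated query below marks each clause with its logical position. The worker table is a hypothetical example.

```sql
-- Hypothetical table: names are illustrative only.
CREATE TABLE worker (
    id       INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
    dept_id  INT UNSIGNED NOT NULL,
    hired_at DATE         NOT NULL
) ENGINE = InnoDB;

SELECT   dept_id, COUNT(*) AS headcount   -- 5. compute the output columns
FROM     worker                           -- 1. choose the source rows
WHERE    hired_at >= '2020-01-01'         -- 2. filter individual rows
GROUP BY dept_id                          -- 3. form groups
HAVING   COUNT(*) > 5                     -- 4. filter groups
ORDER BY headcount DESC                   -- 6. sort the result
LIMIT    10;                              -- applied last: keep the first rows
```

Because SELECT is evaluated after WHERE, the alias headcount cannot be used in the WHERE clause, while ORDER BY, which runs later, can refer to it.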

Using EXPLAIN

EXPLAIN shows how MySQL plans to execute a statement. Key output columns are table, type (the access method), possible_keys, key (the index actually chosen), key_len, rows (an estimate of the rows to examine), and Extra (e.g., Using index, Using filesort, Using temporary).
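
A minimal sketch of typical EXPLAIN usage, assuming a hypothetical article table. The plans described in the comments are what one would usually expect; actual output depends on the MySQL version, the data, and table statistics.

```sql
-- Hypothetical table: names are illustrative only.
CREATE TABLE article (
    id           INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
    author_id    INT UNSIGNED NOT NULL,
    title        VARCHAR(200) NOT NULL,
    published_at DATETIME     NOT NULL,
    KEY idx_article_author_published (author_id, published_at)
) ENGINE = InnoDB;

-- Equality on the leading index column plus ordering on the second column:
-- the composite index can usually serve both the filter and the sort.
EXPLAIN
SELECT id, title
FROM article
WHERE author_id = 42
ORDER BY published_at DESC
LIMIT 10;

-- Every selected column is in the index, so Extra typically shows "Using index"
-- (a covering index, no lookup into the clustered index needed).
EXPLAIN
SELECT author_id, published_at
FROM article
WHERE author_id = 42;

-- A leading wildcard cannot use a B+Tree index on title; expect a full scan
-- (type = ALL) on this table.
EXPLAIN
SELECT id
FROM article
WHERE title LIKE '%mysql%';
```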

Slow Query Diagnosis

Enable slow_query_log, set slow_query_log_file to the log location, and set long_query_time (in seconds) so that statements running longer than the threshold are recorded.
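
These settings can be changed at runtime by an account with sufficient privileges. The file path and the one‑second threshold below are assumptions for illustration; to survive a restart, the same settings would also go into the server configuration file.

```sql
-- Turn the slow query log on.
SET GLOBAL slow_query_log = 'ON';

-- Hypothetical path: adjust to a location the server can write to.
SET GLOBAL slow_query_log_file = '/var/lib/mysql/slow.log';

-- Log statements that run longer than 1 second.
-- The global value applies to connections opened after the change.
SET GLOBAL long_query_time = 1;

-- Verify the settings.
SHOW VARIABLES LIKE 'slow_query%';
SHOW VARIABLES LIKE 'long_query_time';
```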

Locking Mechanisms

Lock Granularity

Table‑level lock: low overhead, no deadlocks, but poor concurrency.

Row‑level lock: higher overhead, possible deadlocks, best concurrency.

Page‑level lock: intermediate overhead and concurrency.

Deadlock Detection and Resolution

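Deadlocks occur when sessions acquire the same locks in different orders, so that each waits for a lock the other holds. InnoDB detects deadlocks automatically and rolls back one transaction as the victim. For sessions stuck in long lock waits, find the blocking thread with

SELECT trx_mysql_thread_id FROM information_schema.innodb_trx;

and KILL it, or bound the wait with a lock timeout (innodb_lock_wait_timeout).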
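
A sketch of a typical diagnosis sequence. The thread id 12345 and the 10‑second timeout are placeholders, not values taken from a real system.

```sql
-- The output includes a "LATEST DETECTED DEADLOCK" section with the
-- statements and locks involved in the most recent deadlock.
SHOW ENGINE INNODB STATUS;

-- List open InnoDB transactions, oldest first, with their connection ids.
SELECT trx_id, trx_state, trx_started, trx_mysql_thread_id, trx_query
FROM information_schema.innodb_trx
ORDER BY trx_started;

-- Terminate the offending connection (placeholder id taken from the
-- trx_mysql_thread_id column above); its transaction is rolled back.
KILL 12345;

-- Give up waiting for a row lock after 10 seconds in this session.
SET SESSION innodb_lock_wait_timeout = 10;
```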

Pessimistic vs. Optimistic Locks

Pessimistic lock: Use SELECT ... FOR UPDATE within a transaction to acquire row locks before updating.

Optimistic lock: Add a version or timestamp column; update only if the version matches, otherwise retry.
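
Both approaches can be sketched against the same hypothetical inventory table. The version value 7 stands for whatever the preceding read returned.

```sql
-- Hypothetical table: names are illustrative only.
CREATE TABLE inventory (
    id      INT UNSIGNED NOT NULL PRIMARY KEY,
    stock   INT          NOT NULL,
    version INT UNSIGNED NOT NULL DEFAULT 0
) ENGINE = InnoDB;

-- Pessimistic lock: lock the row first, then modify it.
START TRANSACTION;
SELECT stock FROM inventory WHERE id = 1 FOR UPDATE;  -- other writers now wait
UPDATE inventory SET stock = stock - 1 WHERE id = 1 AND stock > 0;
COMMIT;                                               -- releases the row lock

-- Optimistic lock: read without locking and remember the version.
SELECT stock, version FROM inventory WHERE id = 1;    -- suppose version = 7

-- Update only if nobody changed the row since it was read.
UPDATE inventory
SET stock = stock - 1, version = version + 1
WHERE id = 1 AND version = 7;

-- 1 means the update succeeded; 0 means the version no longer matched,
-- so the application should re-read the row and retry.
SELECT ROW_COUNT();
```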

Replication and High Availability

Master‑Slave Replication Modes

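Asynchronous: The master commits and returns to the client without waiting for any slave; this is the default, and a slave may lag behind.

Semi‑synchronous: The master waits until at least one slave acknowledges that it has received the transaction's events before returning to the client.

Synchronous: The master waits for all slaves to apply the transaction (rarely used because of its latency cost).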
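
A minimal sketch of enabling semi‑synchronous replication on an existing master‑slave pair. It assumes the plugin and variable names used by MySQL 5.7 and early 8.0 on Linux (newer releases also provide source/replica‑named equivalents, and the library suffix differs on other platforms); the 1000 ms timeout is an illustrative value.

```sql
-- On the master:
INSTALL PLUGIN rpl_semi_sync_master SONAME 'semisync_master.so';
SET GLOBAL rpl_semi_sync_master_enabled = 1;
-- Milliseconds to wait for an acknowledgement before falling back
-- to asynchronous replication.
SET GLOBAL rpl_semi_sync_master_timeout = 1000;

-- On each slave:
INSTALL PLUGIN rpl_semi_sync_slave SONAME 'semisync_slave.so';
SET GLOBAL rpl_semi_sync_slave_enabled = 1;
-- Restart the I/O thread so the slave registers as semi-synchronous.
STOP SLAVE IO_THREAD;
START SLAVE IO_THREAD;

-- Back on the master: ON means semi-synchronous mode is currently active.
SHOW STATUS LIKE 'Rpl_semi_sync_master_status';
```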

Read‑Write Splitting

Writes go to the master; reads are distributed among slaves using a proxy or load balancer (e.g., HAProxy).

Scaling Strategies

Vertical scaling: upgrade hardware.

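Horizontal scaling: split data across tables or servers. Vertical splitting moves the tables or column groups of different business domains into separate databases; horizontal splitting (sharding) distributes the rows of one large table across several tables or databases by a shard key.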

Use caching layers (e.g., Memcached) to reduce database load.

Crash Recovery

UNDO Log

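Records the before‑image of modified rows so that a transaction can be rolled back, and provides the old row versions that MVCC reads. The undo record is written before the data change itself.

REDO Log

Records the changes made to data pages. Redo records are flushed to disk before the commit returns (with the default setting innodb_flush_log_at_trx_commit = 1), so committed changes can be replayed after a crash even if the modified pages had not yet been written to the data files.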
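
The two logs can be related to everyday statements as follows. This is a sketch, assuming a hypothetical balance_sheet table; the logs themselves are internal and are not queried directly here.

```sql
-- Hypothetical table: names are illustrative only.
CREATE TABLE balance_sheet (
    id     INT UNSIGNED  NOT NULL PRIMARY KEY,
    amount DECIMAL(12,2) NOT NULL
) ENGINE = InnoDB;
INSERT INTO balance_sheet (id, amount) VALUES (1, 500.00);

-- Redo durability: 1 (the default) flushes the redo log to disk at every
-- commit; 0 and 2 trade some durability for speed.
SHOW VARIABLES LIKE 'innodb_flush_log_at_trx_commit';

-- Undo in action: the before-image of the row lets ROLLBACK restore it.
START TRANSACTION;
UPDATE balance_sheet SET amount = 0 WHERE id = 1;
ROLLBACK;

-- The original amount, 500.00, is back.
SELECT amount FROM balance_sheet WHERE id = 1;

-- Redo in action: once COMMIT returns, the change is in the redo log and
-- survives a crash even if the data page has not been flushed yet.
START TRANSACTION;
UPDATE balance_sheet SET amount = 450.00 WHERE id = 1;
COMMIT;
```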
