Why Your MySQL Queries Are Slow and How ElasticSearch & HBase Can Help
This article analyzes common causes of slow MySQL queries such as index misuse, MDL locks, and large‑table bottlenecks, then presents practical solutions like proper indexing, sharding, read/write splitting, and evaluates when to complement MySQL with ElasticSearch or HBase for better performance.
MySQL Slow Query Causes and Mitigation
Read‑heavy internet applications require fast query response. Common sources of slow queries include index misuse, metadata locks (MDL), flush waits, row locks, and inefficiencies on very large tables.
Index Issues
MySQL indexes are B+‑tree structures. The optimizer can only use an index when the query predicates match the left‑most prefix of a composite index and do not break the ordered nature of the tree. The following patterns commonly prevent MySQL from using an index efficiently:
Using !=, <>, OR, or functions on indexed columns (e.g., WHERE LENGTH(col)=6)
LIKE patterns that start with %
Missing quotes around string literals, which forces an implicit type conversion on the column
Low‑cardinality columns (e.g., gender) where a single value covers ~30% of rows
Not matching the left‑most prefix of a composite index
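The left‑most prefix rule follows from how a composite index is ordered. The sketch below is a toy model only: it assumes a hypothetical composite index on (city, age) and represents it as a sorted Python list rather than a real B+‑tree. All names and data are made up for illustration.

```python
from bisect import bisect_left, bisect_right

# Hypothetical composite index on (city, age): entries are sorted by city,
# then age, then row id, like the keys along a B+-tree leaf chain.
index = sorted([
    ("berlin", 31, 4), ("berlin", 45, 1),
    ("lima", 22, 3), ("lima", 31, 5),
    ("oslo", 31, 2),
])

def seek_by_city(city):
    """Left-most column given: binary search narrows to one contiguous range."""
    lo = bisect_left(index, (city,))
    hi = bisect_right(index, (city, float("inf")))
    return [row_id for _, _, row_id in index[lo:hi]]

def scan_by_age(age):
    """Only the second column given: matches are scattered, so every entry is checked."""
    return [row_id for _, a, row_id in index if a == age]
```

Rows for one city sit next to each other, so `seek_by_city` touches only that range. Rows with the same age are spread across all cities, which is why a predicate on age alone cannot seek and degrades to a scan.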
Index Best Practices
Index condition pushdown (ICP): conditions on columns of a composite index are evaluated inside the index, so rows that fail them are never fetched from the table.
Covering index: include all columns needed by the query in the index to avoid a table lookup.
Prefix index: index only the first N characters of long strings.
Avoid functions or implicit type/charset conversions on indexed columns.
Consider maintenance cost for columns that are frequently updated.
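Choosing N for a prefix index is usually a question of selectivity. A minimal sketch, assuming the column values are available as a Python list; in MySQL the same ratio can be computed with COUNT(DISTINCT LEFT(col, n)) / COUNT(*). The 0.95 target and the function names are assumptions, not values from the article.

```python
def prefix_selectivity(values, n):
    """Fraction of distinct n-character prefixes among all values."""
    return len({v[:n] for v in values}) / len(values)

def shortest_prefix(values, target=0.95, max_len=32):
    """Smallest prefix length whose selectivity reaches `target` times the
    full-column selectivity; falls back to max_len if none does."""
    full = len(set(values)) / len(values)
    for n in range(1, max_len + 1):
        if prefix_selectivity(values, n) >= target * full:
            return n
    return max_len

# Made-up sample data.
emails = ["alice@example.com", "alina@example.com",
          "bob@example.org", "bobby@example.org"]
```

For this sample, three characters still collapse the values into two groups, while four characters separate all of them.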
Diagnosing Wrong Index Choice
Run EXPLAIN to view the chosen execution plan. If the plan uses an unexpected index, refresh statistics with ANALYZE TABLE tbl_name or force a specific index using FORCE INDEX (idx_name).
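When reading EXPLAIN output, a few columns carry most of the signal. The helper below is a sketch, assuming one EXPLAIN row has already been fetched as a dict keyed by MySQL's column names (type, possible_keys, key, Extra). The warning texts and the sample row are made up for illustration.

```python
def plan_warnings(row):
    """Return human-readable warnings for one EXPLAIN row."""
    warnings = []
    if row.get("type") == "ALL":
        warnings.append("full table scan")
    if row.get("key") is None and row.get("possible_keys"):
        warnings.append("candidate indexes exist but none was chosen")
    extra = row.get("Extra") or ""
    if "Using filesort" in extra:
        warnings.append("sort not served by an index")
    if "Using temporary" in extra:
        warnings.append("temporary table required")
    return warnings

# Hand-written sample row, not output from a real server.
sample = {"table": "orders", "type": "ALL", "possible_keys": "idx_user",
          "key": None, "rows": 1200000, "Extra": "Using where; Using filesort"}
```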
Metadata Locks (MDL)
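Since MySQL 5.5, every CRUD statement acquires a read MDL lock, while DDL statements that change table structure acquire a write MDL lock. Read locks are compatible with each other, but a write lock must wait for every open transaction that touches the table, and statements arriving behind the pending DDL queue up in turn. Use SHOW PROCESSLIST to find sessions in the state Waiting for table metadata lock.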
Flush Waits
Flush commands can be blocked by other statements. SHOW PROCESSLIST shows Waiting for table flush when this occurs.
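Both wait states can be picked out of SHOW PROCESSLIST output mechanically. A small sketch, assuming the rows have been fetched as dicts keyed by the processlist column names (Id, Time, State, Info); the sample rows are made up. Note that the session causing the block is often not in a waiting state itself.

```python
BLOCKED_STATES = ("Waiting for table metadata lock", "Waiting for table flush")

def blocked_sessions(processlist):
    """Return (Id, State, Time) for sessions stuck in a lock-wait state,
    longest wait first."""
    blocked = [(p["Id"], p["State"], p["Time"])
               for p in processlist if p.get("State") in BLOCKED_STATES]
    return sorted(blocked, key=lambda item: item[2], reverse=True)

# Hand-written sample rows, not output from a real server.
rows = [
    {"Id": 11, "Time": 120, "State": "User sleep", "Info": "SELECT SLEEP(600)"},
    {"Id": 12, "Time": 90, "State": "Waiting for table metadata lock",
     "Info": "ALTER TABLE t ADD COLUMN c INT"},
    {"Id": 13, "Time": 95, "State": "Waiting for table flush",
     "Info": "SELECT * FROM t"},
]
```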
Row Locks
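An uncommitted transaction that holds a write (exclusive) lock on a row blocks other sessions that try to update, delete, or take a locking read on the same rows until it commits or rolls back. Plain consistent (snapshot) reads are not blocked.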
Current Read (Repeatable‑Read Isolation)
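Under InnoDB's default isolation level (REPEATABLE READ), a plain SELECT is a consistent read: it rebuilds the row as of the transaction's snapshot by applying undo log records to the latest version. If the row has been updated many times since the snapshot was taken, that reconstruction becomes slow. A current read (SELECT ... LOCK IN SHARE MODE or SELECT ... FOR UPDATE) returns the latest committed version directly and skips the undo traversal.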
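To see why a consistent read can get slow on a heavily updated row, here is a toy model of a version chain. It is a deliberate simplification: real InnoDB decides visibility with a read view that tracks active transactions, not a single id comparison. The Version class and the function names are made up for illustration.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Version:
    value: int
    trx_id: int                  # transaction that wrote this version
    prev: Optional["Version"]    # older version, reachable through the undo log

def consistent_read(latest, snapshot_trx_id):
    """Walk back to the first version written at or before the snapshot.
    Returns (value, number_of_undo_steps)."""
    steps, v = 0, latest
    while v is not None and v.trx_id > snapshot_trx_id:
        v, steps = v.prev, steps + 1
    return (v.value if v else None), steps

def current_read(latest):
    """A locking read returns the newest version without undo traversal."""
    return latest.value

# One row updated 1000 times (transactions 2..1001) after a reader
# took its snapshot at transaction id 1.
row = Version(value=0, trx_id=1, prev=None)
for trx in range(2, 1002):
    row = Version(value=row.value + 1, trx_id=trx, prev=row)
```

In this model the snapshot reader has to step back through every intermediate version, while the current read returns immediately.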
Large‑Table Strategies
When tables contain billions of rows, even well‑indexed queries may hit I/O or CPU bottlenecks. Two common remedies are sharding and read/write splitting.
Sharding
Vertical sharding: split tables or databases by functional domains to alleviate I/O pressure.
Horizontal sharding: split a single logical table into multiple physical tables based on a sharding key to reduce CPU load.
Tools such as ShardingSphere, TDDL, and Mycat can implement these patterns. Key steps include choosing a sharding key, defining routing rules, migrating data, and planning for future scaling.
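A routing rule for horizontal sharding can be as small as a stable hash of the sharding key. The sketch below is one possible rule, not how any of the tools above implement it; the database and table name patterns (order_db_N, t_order_N) and the shard counts are hypothetical.

```python
import zlib

def route(sharding_key, db_count=4, tables_per_db=8):
    """Map a sharding key to (database, table) using a stable hash.
    crc32 is used instead of hash() so the result is identical across processes."""
    total = db_count * tables_per_db
    slot = zlib.crc32(str(sharding_key).encode("utf-8")) % total
    return f"order_db_{slot // tables_per_db}", f"t_order_{slot % tables_per_db}"
```

Plain modulo routing is easy to reason about, but changing the shard count remaps most keys, which is why the scaling plan needs to be settled before data is migrated.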
Read/Write Splitting
When reads far exceed writes, a master‑slave (primary‑replica) topology offloads read traffic to replicas, improving scalability and availability. Be aware of replication lag (stale reads) and routing logic (application‑side or middleware).
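Application‑side routing that accounts for replication lag might look like the sketch below. It assumes a fixed lag window during which a session that has just written keeps reading from the primary. The Router class, the one‑second window, and the SELECT prefix check are illustrative simplifications (a locking SELECT, for example, would also need to go to the primary).

```python
import itertools
import time

class Router:
    """Send writes to the primary and reads to replicas, except that a
    session that wrote recently keeps reading from the primary."""

    def __init__(self, primary, replicas, lag_window=1.0, clock=time.monotonic):
        self.primary = primary
        self._replicas = itertools.cycle(replicas)  # round-robin over replicas
        self.lag_window = lag_window                # assumed max replica lag, in seconds
        self.clock = clock
        self._last_write = {}                       # session id -> time of last write

    def pick(self, session_id, sql):
        is_read = sql.lstrip().upper().startswith("SELECT")
        now = self.clock()
        if not is_read:
            self._last_write[session_id] = now
            return self.primary
        wrote_at = self._last_write.get(session_id)
        if wrote_at is not None and now - wrote_at < self.lag_window:
            return self.primary                     # avoid a stale read of own write
        return next(self._replicas)
```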
ElasticSearch Overview and Performance Characteristics
Basic Query Example
GET yourIndex/_search
{
  "from": 0,
  "size": 10,
  "query": {
    "match_phrase": {"log": "xxx"}
  }
}
The match_phrase query returns documents that contain the exact phrase.
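The same request body can be built programmatically. A minimal sketch that only constructs the JSON body and does not call any client library; the function name and the page parameter are made up for illustration.

```python
import json

def phrase_search_body(field, phrase, page=1, size=10):
    """Build a match_phrase request body; `from` is the zero-based offset."""
    if page < 1 or size < 1:
        raise ValueError("page and size must be positive")
    return {
        "from": (page - 1) * size,
        "size": size,
        "query": {"match_phrase": {field: phrase}},
    }

body = json.dumps(phrase_search_body("log", "xxx"))
```

Paging with from and size suits shallow result pages. By default an index rejects requests where from + size exceeds index.max_result_window (10000).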
Architecture and Mapping
Before ES 7.0 the hierarchy was index → type → document. Mapping types were deprecated in 7.0 and removed in 8.0, which makes an index roughly analogous to a relational table. Mapping defines field types (e.g., text vs keyword), while settings control the number of primary shards and replicas.
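As an illustration of how mapping and settings fit together, here is a hypothetical index definition expressed as a Python dict. The field names and shard counts are assumptions chosen for the example; only the body is built, no cluster is contacted.

```python
import json

# Hypothetical index body: `log` is analyzed for full-text search, while
# `level` is a keyword (exact match, aggregations, sorting).
index_body = {
    "settings": {"number_of_shards": 3, "number_of_replicas": 1},
    "mappings": {
        "properties": {
            "log": {"type": "text"},
            "level": {"type": "keyword"},
            "created_at": {"type": "date"},
        }
    },
}

# Request body for `PUT yourIndex` on a typeless (7.x or later) cluster.
print(json.dumps(index_body, indent=2))
```

The number of primary shards is fixed when the index is created, so it deserves more thought up front than the replica count, which can be changed later.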
Why ElasticSearch Is Fast
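ES builds an inverted index in which each term maps to a posting list of document IDs. Terms live in a Term Dictionary, which is fronted by an in‑memory Term Index (implemented as a Finite State Transducer, FST). The Term Index stores term prefixes compactly in RAM, so locating a term's offset in the on‑disk dictionary takes time proportional to the term's length rather than to the number of terms, which dramatically reduces random disk I/O.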
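The core idea of an inverted index fits in a few lines. This toy version uses a plain dict where Lucene uses a term dictionary plus FST, and whitespace splitting where ES would run an analyzer, so it shows the data structure, not the real implementation.

```python
from collections import defaultdict

def build_inverted_index(docs):
    """Map each lowercase term to a sorted list of document ids (its postings)."""
    postings = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            postings[term].add(doc_id)
    return {term: sorted(ids) for term, ids in postings.items()}

def search_all(index, query):
    """Return ids of documents that contain every term in the query."""
    lists = [set(index.get(t, ())) for t in query.lower().split()]
    return sorted(set.intersection(*lists)) if lists else []

# Made-up sample documents.
docs = {1: "disk full on node a", 2: "node b restarted", 3: "disk replaced on node b"}
index = build_inverted_index(docs)
```

A lookup touches only the posting lists of the query terms, which is why the cost depends on how many documents match rather than on how many documents exist.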
When to Use ElasticSearch
Full‑text search where MySQL LIKE '%term%' would require a full table scan.
Hybrid architectures: store searchable fields in ES and the full record in MySQL, joining on the document ID.
Write‑heavy workloads that benefit from a separate NoSQL store (e.g., ES + HBase) for massive ingestion.
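The hybrid pattern (search for IDs in one store, then load full records from another) can be sketched with in‑memory stand‑ins. The two callables below are placeholders for a search query and a primary‑key lookup; all names and data are made up for illustration.

```python
def hybrid_search(term, search_ids, fetch_rows):
    """search_ids(term) -> ranked ids; fetch_rows(ids) -> {id: row}.
    Returns full rows in search order, skipping ids missing from the store."""
    ids = search_ids(term)
    rows = fetch_rows(ids)  # e.g. one SELECT ... WHERE id IN (...) round trip
    return [rows[i] for i in ids if i in rows]

# In-memory stand-ins for the search index and the relational table.
search_hits = {"timeout": [7, 3, 9]}
table = {
    3: {"id": 3, "msg": "read timeout"},
    7: {"id": 7, "msg": "connect timeout"},
}

result = hybrid_search(
    "timeout",
    search_ids=lambda term: search_hits.get(term, []),
    fetch_rows=lambda ids: {i: table[i] for i in ids if i in table},
)
```

Id 9 appears in the search results but not in the table, which models a record deleted from the source of truth before the search index caught up. Skipping such ids keeps the two stores loosely coupled.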
HBase Storage Model
Column‑Family Layout
HBase stores data by column families. Each row is identified by a row key (sorted lexicographically). Cells are versioned by timestamp, and columns within a family can be added dynamically.
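The logical model described above can be pictured as nested maps. The toy class below is only a mental model under that assumption; it ignores regions, storage files, and everything else about how HBase physically stores data, and its names are made up.

```python
from collections import defaultdict

class TinyTable:
    """Toy model: row key -> "family:qualifier" -> {timestamp: value}."""

    def __init__(self):
        self.rows = defaultdict(lambda: defaultdict(dict))

    def put(self, row_key, column, value, ts):
        self.rows[row_key][column][ts] = value

    def get(self, row_key, column, max_ts=None):
        """Newest value at or before max_ts (newest overall if max_ts is None)."""
        versions = self.rows.get(row_key, {}).get(column, {})
        visible = [ts for ts in versions if max_ts is None or ts <= max_ts]
        return versions[max(visible)] if visible else None

    def scan(self, start, stop):
        """Row keys in [start, stop), in lexicographic order."""
        return [k for k in sorted(self.rows) if start <= k < stop]

t = TinyTable()
t.put("user#001", "info:name", "Ann", ts=1)
t.put("user#001", "info:name", "Anna", ts=5)   # newer version of the same cell
t.put("user#002", "stats:logins", 3, ts=2)     # a column the first row lacks
```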
RowKey Design
HBase supports only three query patterns: get by row key, range scan by row key, and full table scan. The row key must therefore be designed around the required access patterns (e.g., prefixing with a date or region code) while avoiding hotspotting: keys that increase monotonically, such as a raw timestamp prefix, send all new writes to a single region.
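One common way to balance scan‑friendliness against hotspotting is a salted key with a reversed timestamp. The layout below (salt | device id | reversed timestamp) is an assumed example, not a scheme from the article. It also assumes device ids never contain the "|" separator and that timestamps are in milliseconds.

```python
import zlib

MAX_TS = 10**13 - 1  # assumed upper bound for millisecond timestamps (13 digits)

def _prefix(device_id, buckets):
    # The salt spreads devices across `buckets` key ranges; it is derived from
    # the device id, so all rows of one device stay contiguous.
    salt = zlib.crc32(device_id.encode("utf-8")) % buckets
    return f"{salt:02d}|{device_id}|"

def row_key(device_id, ts_ms, buckets=16):
    """Fixed-width reversed timestamp makes newer readings sort first."""
    return _prefix(device_id, buckets) + f"{MAX_TS - ts_ms:013d}"

def device_scan_range(device_id, buckets=16):
    """Start/stop keys covering every reading of one device."""
    start = _prefix(device_id, buckets)
    return start, start[:-1] + "}"  # "}" is the character right after "|"
```

Because the salt is a function of the device id, a reader can still compute the exact scan range for one device. The trade‑off is that a scan across all devices has to visit every bucket.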
Use Cases
Write‑intensive scenarios where low‑latency inserts are critical.
Sparse, wide tables where many columns are optional.
Applications that can tolerate the lack of multi‑row transactions and complex joins.
Conclusion
Effective performance tuning for data‑intensive applications combines proper MySQL indexing, sharding, and read/write splitting with specialized search engines such as ElasticSearch for full‑text queries, and column‑family stores like HBase for write‑heavy workloads. Understanding the underlying mechanisms (B+‑tree index behavior, MDL locking, inverted index structures, and row‑key access patterns) allows engineers to choose the right tool and configuration for each workload.
Code Ape Tech Column
Former Ant Group P8 engineer, pure technologist, sharing full‑stack Java, job interview and career advice through a column. Site: java-family.cn