Why MySQL Queries Slow Down and How ES & HBase Can Help Optimize
This article explores common causes of MySQL slow queries such as index misuse and lock contention, explains indexing strategies like index pushdown and covering indexes, and then compares Elasticsearch and HBase as complementary solutions for large‑scale search and write‑intensive workloads, offering practical tips for performance optimization.
1. MySQL Slow Query Experience
Most internet applications are read‑heavy and write‑light, so fast reads are essential. Various factors can cause slow queries.
1.1 Index
When data volume is modest, many slow queries can be solved with proper indexes, but improper indexes are also a major cause.
MySQL (InnoDB) indexes are B+ trees; interview discussions usually centre on the left‑most prefix rule, the B+ tree itself, and how it compares with other tree structures.
Left‑most prefix refers to the usage rule of composite indexes: an index on (a,b) can serve a condition on a alone or on a and b together, but generally not a condition on b alone. Composite indexes also enable index condition pushdown (ICP): with an index on (a,b), after locating entries by a, MySQL can evaluate the condition on b inside the index and skip the table lookup for entries that fail it.
If the queried columns are fully contained in a composite index, it becomes a covering index, eliminating the need for a table lookup.
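The left‑most prefix rule and the benefit of filtering inside the index can be sketched with a sorted list standing in for the B+ tree leaf level. This is a toy model, not InnoDB's implementation; index_ab, seek_by_a, lookup_pks, and the sample rows are all made up for illustration.

```python
from bisect import bisect_left

# Hypothetical composite index on (a, b): entries are kept sorted by a, then b.
# Each entry also carries the primary key, as a secondary index entry would.
index_ab = sorted([
    (1, 5, 101), (1, 9, 102), (2, 3, 103), (2, 7, 104), (3, 1, 105),
])

def seek_by_a(a):
    """Left-most prefix: all entries with a given `a` form one contiguous run."""
    lo = bisect_left(index_ab, (a,))
    hi = bisect_left(index_ab, (a + 1,))
    return index_ab[lo:hi]

def lookup_pks(a, min_b):
    """Filter on b inside the index run, so only matching keys reach the table."""
    return [pk for (_, b, pk) in seek_by_a(a) if b >= min_b]

def b_values_sorted():
    """Without `a`, the b values are not globally ordered, so no binary search."""
    bs = [b for (_, b, _) in index_ab]
    return bs == sorted(bs)
```

Because a, b, and the primary key all live in the index entries, a query that needs only those columns never touches the table rows, which is the covering‑index case.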
1.1.1 Reasons for Index Failure
An index can exist and the query can still be slow because the optimizer does not use it (index failure). Common reasons include:
Using !=, <>, OR, or functions in the WHERE clause
LIKE patterns that start with %
Comparing a string column with an unquoted (numeric) literal, which forces an implicit conversion
Low‑cardinality fields (e.g., gender)
Not matching the left‑most prefix
1.1.2 Why These Cause Index Failure
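Functions on indexed columns (e.g., WHERE LENGTH(a)=6) prevent index usage because the B+ tree is ordered by the stored value of a, not by the function's result, so MySQL cannot seek to the matching entries and must evaluate the function row by row.
Implicit type or character‑set conversions have the same effect: MySQL applies a conversion to the column itself (for example, comparing a VARCHAR column with the number 123 converts every stored value to a number), so the index order no longer helps and MySQL skips the index.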
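Why a function on the column defeats the index can be seen in a small sketch. A sorted Python list stands in for the index here; names, index_seek, and scan_by_length are hypothetical helpers, not MySQL internals.

```python
from bisect import bisect_left

# Hypothetical indexed column values, kept in sorted order like index entries.
names = sorted(["al", "bobby", "cy", "daniel", "eve"])

def index_seek(value):
    """Equality on the raw column: binary search works because `names` is sorted."""
    i = bisect_left(names, value)
    return i < len(names) and names[i] == value

def scan_by_length(n):
    """LENGTH(col) = n: lengths are not in index order, so every entry is checked."""
    return [s for s in names if len(s) == n]

# The function results, taken in index order, are not sorted.
lengths = [len(s) for s in names]
```

The values are ordered but their lengths are not, so a condition on the length gives a binary search nothing to work with.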
1.1.3 Why Low‑Cardinality Fields Like Gender Should Not Be Indexed
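For a non‑clustered (secondary) index, a low‑cardinality field like gender matches a large share of the rows, and each match needs its own lookup into the clustered index, which is often slower than a single full table scan. As a rule of thumb, the optimizer tends to abandon such an index once the condition matches roughly 30% of the rows; the exact threshold depends on its cost estimate.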
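The trade‑off can be illustrated with a toy cost model. The function name and the lookup_cost weight of 3 are assumptions chosen for illustration; they are not values from MySQL's optimizer.

```python
def cheaper_to_use_index(total_rows, matching_rows, lookup_cost=3.0):
    """Toy cost model (not MySQL's): every index match costs one random
    primary-key lookup, weighted by `lookup_cost` relative to reading
    one row sequentially during a full table scan."""
    index_cost = matching_rows * lookup_cost
    full_scan_cost = total_rows * 1.0
    return index_cost < full_scan_cost
```

With these assumed weights, a condition matching half of a million‑row table loses to the full scan, while one matching a thousand rows clearly wins.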
1.1.4 Simple and Effective Indexing Practices
Index pushdown: use composite indexes for multi‑condition queries.
Covering index: keep all needed columns in the index to avoid table lookups.
Prefix index: index only the first N characters of a string.
Avoid functions on indexed columns.
1.1.5 Evaluating Wrong Index Choices
Sometimes an index looks correct but MySQL picks a low‑selectivity one, causing excessive scans. Causes include inaccurate statistics (use ANALYZE TABLE) and optimizer mis‑prediction (use FORCE INDEX or rewrite queries).
1.2 MDL Locks
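MySQL 5.5 introduced metadata locks (MDL). CRUD operations acquire a read MDL; schema changes acquire a write MDL. Read and write MDLs are mutually exclusive, so a schema change waits for any open transaction that has touched the table, and queries arriving after the schema change queue behind it.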
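The queueing effect of a pending write MDL can be sketched with a simplified FIFO model. This is an assumption‑laden simplification of MySQL's lock manager; mdl_schedule and the session names are hypothetical.

```python
from collections import deque

def mdl_schedule(requests):
    """Toy FIFO model of metadata-lock granting (a simplification).

    `requests` is a list of (session, mode) in arrival order, where mode is
    "read" or "write". Returns (granted, waiting), assuming no session has
    released its lock yet.
    """
    granted, waiting = [], deque()
    for session, mode in requests:
        holders = {m for _, m in granted}
        if waiting:
            # Anything arriving behind a queued request must wait too.
            waiting.append((session, mode))
        elif mode == "read" and "write" not in holders:
            granted.append((session, mode))
        elif mode == "write" and not holders:
            granted.append((session, mode))
        else:
            waiting.append((session, mode))
    return granted, list(waiting)
```

Feeding it a long transaction's read, then an ALTER's write, then an ordinary SELECT's read leaves both the ALTER and the SELECT waiting, which is how one uncommitted transaction can stall a whole table.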
1.3 Flush Waits
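A FLUSH TABLES command normally finishes quickly, but when it is blocked by another long‑running statement on the table, later queries on that table wait behind it (they appear as "Waiting for table flush" in SHOW PROCESSLIST).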
1.4 Row Locks
A transaction that holds a row write lock and has not yet committed makes other transactions that need the same row wait.
1.5 Current Read
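InnoDB's default isolation level is REPEATABLE READ. To see a consistent snapshot, a transaction may need to apply undo logs to rebuild the row version visible to it; if other transactions have updated the row many times since the snapshot was taken, this can make a simple read slow, while a current read (a locking read of the latest version) returns quickly.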
1.6 Large Table Scenarios
Tables with billions of rows face I/O or CPU bottlenecks even with good indexing. InnoDB buffer pool size and LRU eviction affect cache hit rates.
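How cache size and eviction shape the hit rate can be sketched with a plain LRU cache. InnoDB's real buffer pool uses a variant of LRU with a midpoint insertion strategy, which this sketch does not model; LruCache, hit_rate, and the access patterns are hypothetical.

```python
from collections import OrderedDict

class LruCache:
    """Plain LRU page cache; InnoDB's real buffer pool uses a refined variant."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.pages = OrderedDict()
        self.hits = 0
        self.misses = 0

    def read(self, page_id):
        if page_id in self.pages:
            self.pages.move_to_end(page_id)      # mark as most recently used
            self.hits += 1
        else:
            self.misses += 1                     # would be a disk read
            self.pages[page_id] = True
            if len(self.pages) > self.capacity:
                self.pages.popitem(last=False)   # evict least recently used

def hit_rate(capacity, accesses):
    cache = LruCache(capacity)
    for page_id in accesses:
        cache.read(page_id)
    return cache.hits / len(accesses)

hot = [1, 2, 3] * 4            # a small hot set, read repeatedly
scan = list(range(100, 110))   # a one-off large scan
```

Comparing hit_rate(3, hot + scan + hot) with hit_rate(13, hot + scan + hot) shows how a large scan evicts the hot pages from a small cache, while a larger one keeps them.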
1.6.1 Sharding (Database & Table Partitioning)
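Vertical sharding splits tables by business domain into different databases; horizontal sharding splits the rows of one large table across several tables by a sharding key. As a rough guide, consider vertical sharding for I/O bottlenecks and horizontal sharding for CPU bottlenecks. Tools include ShardingSphere, TDDL, and Mycat.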
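A minimal sketch of horizontal routing follows, assuming a hash of the sharding key decides the target. The shard counts and the order_db_ / orders_ naming are hypothetical, and middleware such as the tools listed above would normally do this for you.

```python
import zlib

DB_COUNT = 4        # hypothetical number of databases
TABLES_PER_DB = 8   # hypothetical number of tables per database

def route(user_id: str):
    """Map a sharding key to (database, table) with a stable hash."""
    slot = zlib.crc32(user_id.encode("utf-8")) % (DB_COUNT * TABLES_PER_DB)
    db_index, table_index = divmod(slot, TABLES_PER_DB)
    return f"order_db_{db_index}", f"orders_{table_index}"
```

A stable hash such as CRC32 is used instead of Python's built‑in hash(), which is randomized per process for strings and would send the same key to different shards after a restart.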
1.6.2 Read‑Write Splitting
When read traffic far exceeds write traffic, use master‑slave replication to distribute reads, improving load balance and availability.
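The routing side of read‑write splitting can be sketched as follows. Router and the host names are hypothetical, and the statement check is deliberately naive.

```python
from itertools import cycle

class Router:
    """Send writes to the primary and spread reads across replicas (round robin)."""

    def __init__(self, primary, replicas):
        self.primary = primary
        self._replicas = cycle(replicas)

    def pick(self, sql: str) -> str:
        # Naive classification: anything that starts with SELECT is a read.
        is_read = sql.lstrip().upper().startswith("SELECT")
        return next(self._replicas) if is_read else self.primary

router = Router("primary:3306", ["replica-1:3306", "replica-2:3306"])
```

A production router also has to account for replication lag and for locking reads such as SELECT ... FOR UPDATE, which belong on the primary.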
1.7 Summary of MySQL Section
The above lists common MySQL slow‑query causes and mitigation methods, and introduces scaling techniques for large‑data scenarios.
2. How to Evaluate Elasticsearch
Elasticsearch (ES) is a near‑real‑time distributed search engine built on Lucene, suitable for full‑text search, JSON document storage, log monitoring, and data analytics.
2.1 What ES Can Do
Typical use cases include full‑text search, log analysis (often with the ELK stack), and NoSQL document storage.
2.2 ES Structure
Before ES 7.0 the hierarchy was Index → Type → Document. Mapping types were deprecated in 7.0 and removed in 8.0, which makes an Index roughly analogous to a table.
Key components are mapping (schema) and settings (shard and replica configuration).
2.3 Why ES Queries Are Fast
ES uses inverted indexes: terms are indexed, and document IDs are stored in posting lists. A term dictionary and an in‑memory term index (FST) enable rapid term lookup.
For tokenized searches, ES can locate matching documents without scanning the entire dataset, unlike a MySQL LIKE '%pattern%' query, which cannot use a B+ tree index and has to check every row.
2.3.1 Tokenized Search
ES stores tokenized terms, allowing fast lookup of words like "Ada" without full table scans.
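The term‑to‑posting‑list idea can be sketched in a few lines. This toy version splits on whitespace and uses a plain dict as the term dictionary; real ES analyzers and the FST‑based term index are far more elaborate, and the sample documents are made up.

```python
from collections import defaultdict

def build_inverted_index(docs):
    """Map each lowercase term to a sorted posting list of document IDs."""
    postings = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            postings[term].add(doc_id)
    return {term: sorted(ids) for term, ids in postings.items()}

def search(index, query):
    """AND query: intersect the posting lists of every query term."""
    lists = [set(index.get(t, [])) for t in query.lower().split()]
    return sorted(set.intersection(*lists)) if lists else []

docs = {
    1: "Ada wrote the first program",
    2: "the program crashed",
    3: "Ada fixed the crash",
}
index = build_inverted_index(docs)
```

Looking up "ada" touches one dictionary entry and returns its posting list directly; no document text is scanned at query time.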
2.3.2 Exact Search
When exact match is needed, ES may not have a clear advantage over MySQL covering indexes.
2.4 When to Use ES
2.4.1 Full‑Text Search
For fuzzy or phrase searches on large text fields (e.g., chat logs), ES excels compared to MySQL's LIKE queries.
2.4.2 Combined Queries
Large datasets may benefit from a hybrid approach: store searchable fields in ES (with IDs) and keep full records in MySQL, or combine ES with HBase for massive write‑heavy workloads.
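The hybrid pattern is a two‑step lookup, sketched here with in‑memory stand‑ins. search_index plays the role of ES and records the role of MySQL; both structures and the sample data are hypothetical.

```python
# In-memory stand-ins: a real system would query ES and MySQL here.
search_index = {"refund": [7, 12], "late": [12, 31]}   # term -> record IDs ("ES" side)
records = {                                             # ID -> full row ("MySQL" side)
    7: {"id": 7, "user": "ann", "text": "refund please"},
    12: {"id": 12, "user": "bo", "text": "late refund"},
    31: {"id": 31, "user": "cy", "text": "late delivery"},
}

def search_then_fetch(term):
    """Step 1: get matching IDs from the search side.
    Step 2: fetch the full rows by primary key."""
    ids = search_index.get(term, [])
    return [records[i] for i in ids if i in records]
```

The search side stores only the searchable fields plus the ID, so the second step is a cheap primary‑key lookup against the system of record.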
2.5 ES Summary
ES achieves speed through tokenization, inverted indexes, and in‑memory term indexes, making it ideal for full‑text and log search scenarios.
3. HBase Overview
HBase is a column‑family NoSQL store built on Hadoop, optimized for write‑intensive workloads.
3.1 Storage Model
Data is stored by row key (sorted lexicographically) with timestamps as versions. Column families group related columns, and cells hold the actual values.
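The storage model can be pictured as a nested, sorted map. The sketch below is a toy in‑memory model, not the HBase client API; ToyTable and its methods are hypothetical.

```python
from collections import defaultdict

class ToyTable:
    """Cells addressed by (row key, 'family:qualifier', timestamp)."""

    def __init__(self):
        # row key -> column -> {timestamp: value}
        self.rows = defaultdict(lambda: defaultdict(dict))

    def put(self, row_key, column, value, ts):
        self.rows[row_key][column][ts] = value

    def get(self, row_key, column):
        """Return the value with the newest timestamp, or None."""
        versions = self.rows.get(row_key, {}).get(column, {})
        return versions[max(versions)] if versions else None

    def scan(self):
        """Return row keys in lexicographic order."""
        return sorted(self.rows)
```

Note that lexicographic order puts "user#10" before "user#2", which is one reason numeric parts of a row key are usually zero‑padded.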
3.2 OLTP vs OLAP
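HBase is not designed for OLAP: despite the name, a column‑family store is not a columnar analytical engine. It also lacks multi‑row transactions (atomicity is per row), so it is not a general‑purpose OLTP database either; it fits OLTP‑like workloads that are write‑heavy and accessed by row key.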
3.3 RowKey Design
HBase supports only three query patterns: single row lookup by rowkey, range scans by rowkey, and full table scans. Good rowkey design is critical.
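One common rowkey pattern, assumed here for illustration, is an entity prefix followed by a reversed timestamp, so that a prefix range scan returns the newest rows first. make_row_key, prefix_scan, and MAX_TS are hypothetical names, and a sorted list stands in for the table.

```python
from bisect import bisect_left

MAX_TS = 10**13  # hypothetical upper bound for millisecond timestamps

def make_row_key(user_id: str, ts_ms: int) -> str:
    """user_id + reversed, zero-padded timestamp: newest events sort first per user."""
    return f"{user_id}#{MAX_TS - ts_ms:013d}"

def prefix_scan(sorted_keys, prefix):
    """Range scan: seek to the first key >= prefix, then read while the prefix matches."""
    out = []
    for key in sorted_keys[bisect_left(sorted_keys, prefix):]:
        if not key.startswith(prefix):
            break
        out.append(key)
    return out

keys = sorted([
    make_row_key("u1", 1_700_000_000_000),
    make_row_key("u1", 1_700_000_005_000),
    make_row_key("u2", 1_700_000_001_000),
])
```

Scanning the prefix "u1#" reads only that user's contiguous key range; a query on any attribute that is not part of the row key would fall back to a full table scan.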
3.4 Use Cases
HBase shines in write‑dense scenarios where fast ingestion is required, while reads are efficient for single‑row or small range queries.
4. Overall Summary
Software development should prioritize appropriate, incremental solutions over flashy complexity. To achieve fast queries, first identify and fix bugs, then consider architectural enhancements such as proper indexing, sharding, read‑write splitting, or integrating specialized systems like Elasticsearch and HBase.
MaGe Linux Operations
