Why Your MySQL Queries Are Slow and How ElasticSearch & HBase Can Help
This article examines common causes of slow MySQL queries, explains index mechanics and failures, then compares ElasticSearch’s fast tokenized search and HBase’s column‑oriented storage, offering practical guidance on when and how to use each technology.
1. MySQL Slow Query Experience
Most internet applications are read‑heavy, so query speed matters. Slow queries often stem from index issues, MDL locks, flush waits, row locks, or large‑table bottlenecks.
1.1 Index
MySQL indexes are B+ trees. Proper use of left‑most prefix and composite indexes can dramatically improve performance through index push‑down and covering indexes.
1.1.1 Reasons for Index Failure
Using !=, <>, OR, functions, or expressions on indexed columns.
LIKE patterns that start with %.
String literals without quotes.
Low‑cardinality columns (e.g., gender).
Not matching the left‑most prefix of a composite index.
1.1.2 Why These Cause Failure
MySQL may deem such operations as breaking index order, so the optimizer skips the index. Implicit type conversion and charset conversion also lead to index loss.
Function Operations
When a function is applied to an indexed column, e.g., WHERE length(a) = 6, the optimizer cannot use the index because the value is transformed before lookup.
Implicit Conversion
Implicit type and charset conversions can also cause index loss.
JOOQ rarely encounters implicit type conversion.
Implicit charset conversion may appear in join queries when column types match but charsets differ.
Breaking Order
LIKE patterns with a leading % or unquoted strings can disrupt index ordering, causing the optimizer to ignore the index.
1.1.3 Why Not Index Low‑Selectivity Fields (e.g., gender)
Indexing low‑cardinality fields yields little benefit and may require a full table scan.
For non‑clustered indexes, scanning a large number of rows after using such an index can be slower than a full scan.
1.1.4 Simple Index Techniques
Index push‑down: use composite indexes to filter rows early.
Covering index: keep all needed columns in the index to avoid table lookups.
Prefix index: index only the first N characters of a string.
Avoid functions on indexed columns.
Consider maintenance cost for write‑heavy tables.
1.1.5 Evaluating Wrong Index Choice
If a query uses an unexpected low‑selectivity index, run ANALYZE TABLE to refresh statistics or use FORCE INDEX to guide the optimizer.
1.2 MDL Locks
MySQL 5.5 introduced metadata locks (MDL). DML acquires a read MDL lock; DDL acquires a write MDL lock. Write locks block read locks. Use SHOW PROCESSLIST to see sessions waiting for "Waiting for table metadata lock".
1.3 Flush
Flush commands can be blocked by other statements. Use SHOW PROCESSLIST to detect sessions in "Waiting for table flush" state.
1.4 Row Locks
Row locks occur when a transaction holds a write lock without committing.
1.5 Current Read
InnoDB's default isolation is REPEATABLE READ. A transaction may need to apply undo logs to see a consistent snapshot.
1.6 Large‑Table Scenarios
For tables with billions of rows, even optimized indexes may hit I/O or CPU bottlenecks. Common solutions are sharding (horizontal or vertical) and read/write separation.
1.6.1 Sharding
Solution
If I/O is the bottleneck, use vertical sharding (split by database or column family). If CPU is the bottleneck, use horizontal sharding (split rows across tables).
Horizontal sharding distributes data across many tables; vertical sharding splits by business domain or column family.
Problems
Challenges include unique ID generation, non‑partition‑key queries, and scaling the shard key.
IDs can be auto‑increment, Snowflake, segment, or GUID.
Non‑partition‑key queries often require mapping tables or secondary indexes.
Scaling may require range‑plus‑modulo sharding to minimize data migration.
1.6.2 Read/Write Separation
Why
When read traffic far exceeds write traffic, a master‑slave setup can distribute reads, improve availability, and balance load.
Problems
Stale reads due to replication lag.
Routing logic to decide whether a query goes to master or slave.
1.7 Summary
The above lists common MySQL slow‑query causes and mitigation methods, and introduces typical solutions for large‑data scenarios.
2. How to Evaluate ElasticSearch
ElasticSearch (ES) is a Lucene‑based near‑real‑time distributed search engine used for full‑text search, NoSQL JSON storage, log monitoring, and data analytics.
2.1 What ES Can Do
Typical use cases include full‑text search, log analysis (often with Logstash and Kibana as the ELK stack), and serving as a secondary index for MySQL.
2.2 ES Structure
Before ES 7.0 the hierarchy was Index → Type → Document (similar to database → table → row). Types were removed in 7.0; now an index is analogous to a table.
Key components are mapping (schema) and settings (shard and replica counts). Example mapping shows a text field with a sub‑field keyword for exact matches.
2.3 Why ES Queries Are Fast
ES uses inverted indexes. Terms are indexed, and document IDs are stored in posting lists. A memory‑resident Term Index (FST) points to the on‑disk Term Dictionary, enabling rapid term lookup.
2.3.1 Tokenized Search
Because terms are tokenized, a query like "Ada" can be located directly without scanning the entire dataset.
2.3.2 Exact Search
For exact matches, the advantage of the Term Index disappears, and MySQL's covering index may be faster.
2.4 When to Use ES
2.4.1 Full‑Text Search
MySQL fuzzy string queries are costly; ES handles them efficiently, e.g., searching chat logs.
Tokenization
Chinese requires a dedicated analyzer (e.g., IK) because the default analyzer tokenizes by whitespace.
POST yourindex/_analyze
{
"field": "yourfield",
"text": "我可真是个机灵鬼"
}2.4.2 Combined Queries
ES + MySQL
Store searchable fields and IDs in ES for fast full‑text search, while keeping the full record in MySQL for transactional integrity.
ES + HBASE
For massive write‑heavy workloads, replace MySQL with a distributed store like HBASE, using ES as the search layer.
2.5 Summary
ES is fast because of its in‑memory Term Index and inverted index design, making it ideal for full‑text and tokenized searches, while relational databases excel at exact lookups and transactional workloads.
3. HBASE
3.1 Storage Structure
Unlike row‑oriented relational databases, HBASE stores data by column family. Example tables illustrate the difference.
Name
School
Li
XX Elementary
Li
YY Middle School
Each row has a row key (sorted lexicographically), a timestamp version, column families (e.g., info, area), and dynamic columns within families.
3.2 OLTP and OLAP
Row‑oriented databases suit OLTP (transactional) workloads; column‑oriented stores suit OLAP (analytical) workloads. HBASE is column‑oriented but does not provide full OLAP capabilities.
3.3 RowKey
HBASE only supports three query patterns: single row by RowKey, range scan by RowKey, and full table scan. Good RowKey design is crucial.
3.4 Use Cases
HBASE excels at write‑intensive scenarios with high throughput and strong reliability, while reads are efficient for single‑row or small‑range queries.
4. Conclusion
Software development should be incremental; technology must serve the project, and suitability outweighs novelty. To speed up queries, first eliminate bugs, then apply appropriate indexing, sharding, or search‑engine solutions.
References
Designing high‑concurrency architectures for ten‑thousand‑queries‑per‑second systems.
Building a centralized log analysis platform with ELK.
Comparative analysis of MySQL and Lucene indexes.
In‑depth introduction to HBASE.
Signed-in readers can open the original source through BestHub's protected redirect.
This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactand we will review it promptly.
21CTO
21CTO (21CTO.com) offers developers community, training, and services, making it your go‑to learning and service platform.
How this landed with the community
Was this worth your time?
0 Comments
Thoughtful readers leave field notes, pushback, and hard-won operational detail here.
