Understanding Slow Queries and Index Optimization in MySQL, Elasticsearch, and HBase
This article examines common causes of slow queries in MySQL, presents practical indexing techniques, and extends the discussion to performance considerations in Elasticsearch (ES) and HBase, offering combined solutions such as sharding, read/write splitting, and using ES as a search layer for large‑scale systems.
1. MySQL Slow Query Experience
Most internet applications are read‑heavy; slow queries often stem from improper indexes.
1.1 Index
Indexes in MySQL are B+ trees; the left‑most prefix rule, index push‑down, and covering indexes are essential for fast lookups.
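The leftmost‑prefix rule can be illustrated with a small model. This is only a sketch: a sorted Python list stands in for the B+ tree, and the table, its columns, and the function names are hypothetical, not MySQL internals.

```python
from bisect import bisect_left, bisect_right

# Hypothetical table rows: (id, city, age, name).
ROWS = [
    (1, "Berlin", 30, "Ada"),
    (2, "Berlin", 25, "Bob"),
    (3, "Paris", 41, "Cleo"),
    (4, "Oslo", 30, "Dan"),
]

# Composite index on (city, age): entries sorted by city, then age, then id.
INDEX_CITY_AGE = sorted((city, age, row_id) for row_id, city, age, _name in ROWS)
CITY_KEYS = [entry[0] for entry in INDEX_CITY_AGE]


def lookup_by_city(city):
    """Uses the leftmost column: binary search finds the matching range."""
    lo = bisect_left(CITY_KEYS, city)
    hi = bisect_right(CITY_KEYS, city)
    return [row_id for _city, _age, row_id in INDEX_CITY_AGE[lo:hi]]


def lookup_by_age(age):
    """Skips the leftmost column: ages are ordered only within one city,
    so every index entry has to be examined."""
    return [row_id for _city, entry_age, row_id in INDEX_CITY_AGE if entry_age == age]
```

The same model hints at why a covering index helps: a query that needs only city, age, and id can be answered from the index entries alone, without reading the full rows.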
1.1.1 Causes of Index Failure
An index is commonly bypassed, in whole or in part, when:
The WHERE clause applies !=, <>, OR, or a function to the indexed column
LIKE has a leading % wildcard
A string column is compared with an unquoted literal, forcing an implicit conversion
The column has low selectivity (e.g., gender)
The query does not match the leftmost prefix of a composite index
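A function applied to an indexed column, such as WHERE length(a) = 6, prevents an index seek: the B+ tree is ordered by a, not by length(a), so the optimizer cannot navigate the tree by the computed value.
Implicit type or charset conversion has the same effect, because the converted values no longer follow the index order.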
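A short sketch of why order matters, assuming a sorted Python list as a stand‑in for the index (the sample values are made up): once a function or a conversion is applied to the stored values, the results are no longer sorted, so a tree ordered by the original values cannot be searched by the computed ones.

```python
def is_sorted(values):
    """True when each element is <= the next one."""
    return all(a <= b for a, b in zip(values, values[1:]))


# Index order for a hypothetical string column `a`.
INDEXED_NAMES = sorted(["pear", "fig", "banana", "kiwi", "apple"])
# Applying a function (here len) to each indexed value breaks the order.
LENGTHS = [len(value) for value in INDEXED_NAMES]

# Index order for a hypothetical string column that holds digits.
CODES = sorted(["9", "10", "2", "100"])
# An implicit string-to-number conversion breaks the order as well.
AS_NUMBERS = [int(code) for code in CODES]

print(is_sorted(INDEXED_NAMES), is_sorted(LENGTHS))
print(is_sorted(CODES), is_sorted(AS_NUMBERS))
```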
1.1.3 Why Not Index Low‑Cardinality Fields
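An index on a low‑cardinality field like gender filters out few rows, so it provides little benefit and the optimizer may choose a full table scan anyway.
As a rule of thumb, the optimizer tends to skip such an index when a condition matches roughly 30% or more of the rows; the actual decision is cost‑based.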
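The idea can be approximated with two ratios. This is a rough sketch, not the optimizer's cost model: the 0.3 threshold simply mirrors the "roughly 30%" figure above, and the function names are hypothetical.

```python
def selectivity(values):
    """Distinct values divided by total rows; 1.0 means every value is unique."""
    if not values:
        return 0.0
    return len(set(values)) / len(values)


def matched_fraction(values, target):
    """Fraction of rows that an equality condition on `target` would match."""
    if not values:
        return 0.0
    return sum(1 for value in values if value == target) / len(values)


def index_probably_useful(values, target, threshold=0.3):
    """Assumed rule of thumb: an index helps only when few rows match."""
    return matched_fraction(values, target) < threshold
```

For a gender column with two evenly distributed values, the matched fraction is 0.5 and the sketch reports the index as not useful; for a unique id column it is close to zero.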
1.1.4 Simple Index Practices
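Index push‑down: use composite indexes for multi‑condition queries, so conditions on the later index columns can be checked inside the index before the full row is read.
Covering index: keep all columns the query needs in the index, so no lookup back to the primary key is required.
Prefix index: index only the leading characters of long strings.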
Avoid functions on indexed columns.
Consider maintenance cost for frequently updated columns.
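For prefix indexes, one common way to pick a length is to compare the selectivity of each prefix with that of the full column. The sketch below assumes this heuristic; the 0.95 target ratio and the function names are illustrative choices, not values from MySQL.

```python
def prefix_selectivity(values, length):
    """Distinct prefixes of the given length divided by total rows."""
    if not values:
        raise ValueError("values must not be empty")
    return len({value[:length] for value in values}) / len(values)


def shortest_good_prefix(values, target_ratio=0.95):
    """Smallest prefix length whose selectivity reaches target_ratio
    of the full-column selectivity."""
    if not values:
        raise ValueError("values must not be empty")
    full = len(set(values)) / len(values)
    max_len = max(len(value) for value in values)
    for length in range(1, max_len + 1):
        if prefix_selectivity(values, length) >= target_ratio * full:
            return length
    return max_len
```

With four sample addresses such as alice@example.com, alina@example.com, bob@example.com, and bobby@example.com, three characters still collide, while four characters already distinguish every row.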
1.1.5 Evaluating Wrong Index Choice
When the optimizer picks the wrong index, run ANALYZE TABLE x to refresh the index statistics, or use FORCE INDEX to override its choice.
1.2 MDL Locks
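MySQL 5.5 introduced metadata locks (MDL). DDL acquires write locks; DML acquires read locks. A long‑running transaction that holds a read lock blocks DDL on the table, and the pending DDL in turn blocks later queries. Use SHOW PROCESSLIST to see “Waiting for table metadata lock”.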
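The queueing effect of metadata locks can be modeled in a few lines. This is a deliberately simplified sketch, assuming a strict first‑in‑first‑out wait queue with "R" for read locks and "W" for write locks; the real MDL subsystem has more lock types and rules.

```python
def grantable(held, waiting):
    """Return the sessions whose lock requests can be granted right now.

    held:    list of lock modes currently held on the table ("R" or "W").
    waiting: list of (session, mode) requests in arrival order.
    """
    held = list(held)
    granted = []
    for session, mode in waiting:
        if mode == "R" and "W" not in held:
            held.append("R")
            granted.append(session)
        elif mode == "W" and not held:
            held.append("W")
            granted.append(session)
        else:
            # Assumed FIFO rule: a blocked request also blocks everything behind it.
            break
    return granted
```

With one read lock held by an open transaction, a queued ALTER (write) cannot proceed, and a SELECT that arrives after it waits as well, which is how a single forgotten transaction can stall a whole table.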
1.3 Flush
A FLUSH TABLES statement can itself be blocked by a long‑running query, and it then blocks later statements on the same tables; check SHOW PROCESSLIST for “Waiting for table flush”.
1.4 Row Locks
Row locks held by an uncommitted transaction block other transactions that need the same rows until it commits or rolls back.
1.5 Current Read
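InnoDB's default isolation level is REPEATABLE READ. A consistent (snapshot) read applies undo logs to reconstruct the row version visible to the transaction, so a row that has been updated many times since the snapshot was taken can be slow to read, whereas a current read (for example, SELECT ... LOCK IN SHARE MODE) returns the latest committed version directly.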
1.6 Large Table Scenarios
For tables with billions of rows, even optimized indexes may hit I/O or CPU bottlenecks. InnoDB buffer pool size and LRU eviction affect cache hit rate.
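The effect of cache size and eviction on hit rate can be sketched with a plain LRU cache. This is a simplification: InnoDB's buffer pool uses a midpoint‑insertion variant of LRU, and the page numbers below are made up.

```python
from collections import OrderedDict


def lru_hit_rate(page_accesses, capacity):
    """Replay a sequence of page reads against an LRU cache and
    return the fraction of reads served from the cache."""
    if not page_accesses:
        return 0.0
    cache = OrderedDict()
    hits = 0
    for page in page_accesses:
        if page in cache:
            hits += 1
            cache.move_to_end(page)  # mark as most recently used
        else:
            cache[page] = True
            if len(cache) > capacity:
                cache.popitem(last=False)  # evict the least recently used page
    return hits / len(page_accesses)
```

Replaying a hot working set such as [1, 2, 1, 2, 1, 2] with capacity 2 gives a high hit rate, while inserting a one‑off scan of pages 10, 11, 12 in the middle evicts the hot pages, which is the kind of pollution a large table scan can cause.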
1.6.1 Sharding
Horizontal sharding distributes rows across multiple tables or databases; vertical sharding splits columns into separate tables or databases. Tools: ShardingSphere, TDDL, Mycat.
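A minimal sketch of hash‑based horizontal sharding follows. It assumes a hypothetical sharding key (user_id) and table naming scheme; middleware such as the tools listed above performs this routing transparently and with more options.

```python
import zlib


def shard_for(user_id, shard_count):
    """Map a sharding key to a shard number with a stable hash.

    zlib.crc32 is used because Python's built-in hash() of strings
    changes between processes.
    """
    return zlib.crc32(str(user_id).encode("utf-8")) % shard_count


def table_name(base, user_id, shard_count):
    """Hypothetical naming scheme: orders_00, orders_01, ..."""
    return f"{base}_{shard_for(user_id, shard_count):02d}"
```

Because the mapping depends on shard_count, changing the number of shards moves most keys; that trade‑off is one reason the shard count is usually chosen generously up front.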
1.6.2 Read/Write Splitting
When reads far exceed writes, use master‑slave replication to offload reads.
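Read/write splitting is often implemented as a small routing layer in the application or a proxy. The sketch below is an assumption‑laden illustration: the class name, the connection labels, and the simple keyword check are all hypothetical.

```python
import itertools


class ReadWriteRouter:
    """Send plain SELECTs to replicas in round-robin order, everything else to the primary."""

    def __init__(self, primary, replicas):
        self.primary = primary
        self._replicas = itertools.cycle(replicas) if replicas else None

    def route(self, sql):
        words = sql.split()
        first_word = words[0].upper() if words else ""
        # Locking reads must see and lock the latest data, so they stay on the primary.
        is_locking_read = "FOR UPDATE" in sql.upper()
        if first_word == "SELECT" and not is_locking_read and self._replicas is not None:
            return next(self._replicas)
        return self.primary
```

Replication is asynchronous by default, so a read that must observe a write made a moment earlier may also need to go to the primary; the sketch does not handle that case.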
2. Elasticsearch Overview
ES is a near‑real‑time distributed search engine built on Lucene, suitable for full‑text search, log analysis, and as a NoSQL JSON store.
2.1 What ES Can Do
Typical use cases include keyword search, log monitoring, and analytics.
2.2 ES Architecture
Before 7.0 the hierarchy was index → type → document; now index is analogous to a table. Mapping defines field types; settings control shards and replicas.
Example mapping snippet (pre‑7.0 syntax, with doc as the mapping type):
{
  "mappings": {
    "doc": {
      "properties": {
        "appname": {
          "type": "text",
          "fields": {
            "keyword": {
              "type": "keyword",
              "ignore_above": 256
            }
          }
        }
      }
    }
  }
}
2.3 Why ES Queries Are Fast
ES uses inverted indexes with a term dictionary and a term index stored in memory (FST), enabling rapid term lookup.
2.3.1 Tokenized Search
Text is tokenized into terms at index time; a query like “Ada” is resolved through the term dictionary to its posting list, with no full scan of the documents.
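A toy inverted index shows the mechanism. This is a sketch only: a Python dict stands in for the term dictionary and term index, the tokenizer is a simple regular expression, and the sample documents are made up.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_inverted_index(docs):
    """Map each term to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in tokenize(text):
            index[term].add(doc_id)
    return index


def search(index, query):
    """Return ids of documents containing every term of the query."""
    terms = tokenize(query)
    if not terms:
        return set()
    result = set(index.get(terms[0], set()))
    for term in terms[1:]:
        result &= index.get(term, set())
    return result


DOCS = {
    1: "Ada Lovelace wrote programs",
    2: "Ada is a language",
    3: "Grace Hopper wrote compilers",
}
INDEX = build_inverted_index(DOCS)
```

Here search(INDEX, "Ada") touches a single dictionary entry rather than every document, which is the property that a leading‑wildcard LIKE in MySQL cannot offer.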
2.3.2 Exact Search
For exact‑match lookups the term‑index advantage largely disappears, and performance may be comparable to a MySQL covering index.
2.4 When to Use ES
Ideal for full‑text search on large text fields, e.g., chat message logs, where MySQL LIKE is inefficient.
2.4.1 Combined Queries
Store searchable fields in ES and full records in MySQL; retrieve IDs from ES then fetch details from MySQL.
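The two‑step pattern can be sketched with in‑memory stand‑ins. Everything here is hypothetical: a dict plays the role of the ES index (term to ids), another dict plays the role of the MySQL table (primary key to row), and no real client library is used.

```python
# Stand-in for ES: searchable term -> matching document ids, in relevance order.
SEARCH_INDEX = {
    "refund": [3, 1],
    "invoice": [2],
}

# Stand-in for MySQL: primary key -> full record.
PRIMARY_STORE = {
    1: {"id": 1, "message": "refund requested", "user": "ada"},
    2: {"id": 2, "message": "invoice sent", "user": "bob"},
    3: {"id": 3, "message": "refund approved", "user": "cleo"},
}


def search_then_fetch(term, search_index, primary_store):
    """Step 1: get ids from the search layer. Step 2: fetch full rows by primary key."""
    ids = search_index.get(term, [])
    # Ids missing from the primary store are skipped, since the two
    # systems may be briefly out of sync.
    return [primary_store[row_id] for row_id in ids if row_id in primary_store]
```

Fetching by primary key keeps the second step cheap, and preserving the id order returned by the search layer keeps the relevance ranking intact.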
2.4.2 ES + HBase
For write‑intensive workloads, pair ES with a column‑store like HBase.
3. HBase Basics
HBase stores data by column families, not rows. RowKey is the primary key, and rows are kept sorted by RowKey in lexicographic (byte) order.
3.1 Storage Model
Unlike row‑oriented MySQL, HBase is column‑oriented, supporting sparse data.
3.2 OLTP vs OLAP
HBase is suited for write‑heavy OLTP scenarios but not for complex analytical queries.
3.3 RowKey Design
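HBase supports only three access patterns, all driven by the RowKey: a single‑row get, a range scan over RowKeys, and a full table scan. RowKey design therefore determines which queries are efficient.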
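The three access patterns can be modeled with a sorted key list. This is a sketch under assumptions: Python strings stand in for byte‑ordered RowKeys, the class and the "user#date" key layout are hypothetical, and the scan treats the start key as inclusive and the stop key as exclusive.

```python
from bisect import bisect_left


class SortedTable:
    """Toy model of a table whose rows are ordered by RowKey."""

    def __init__(self, rows):
        self._rows = dict(rows)
        self._keys = sorted(self._rows)

    def get(self, row_key):
        """Single-row lookup by exact RowKey."""
        return self._rows.get(row_key)

    def scan(self, start=None, stop=None):
        """Range scan over [start, stop); with no bounds it is a full table scan."""
        lo = 0 if start is None else bisect_left(self._keys, start)
        hi = len(self._keys) if stop is None else bisect_left(self._keys, stop)
        return [(key, self._rows[key]) for key in self._keys[lo:hi]]


TABLE = SortedTable({
    "u1#2024-01-02": "b",
    "u1#2024-01-01": "a",
    "u2#2024-01-01": "c",
})
```

Putting the user id first in the key makes "all rows of one user" a contiguous range, for example TABLE.scan("u1#", "u1$"); a query by date alone has no such range and falls back to a full scan.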
3.4 Use Cases
Best for high‑throughput writes and simple lookups by RowKey; provides high reliability and no single point of failure.
4. Conclusion
The article emphasizes systematic debugging, proper index design, and combining technologies (MySQL, ES, HBase) to achieve fast queries in large‑scale systems.
Top Architect
Top Architect focuses on sharing practical architecture knowledge, covering enterprise, system, website, large‑scale distributed, and high‑availability architectures, plus architecture adjustments using internet technologies. We welcome idea‑driven, sharing‑oriented architects to exchange and learn together.
