Databases 21 min read

Why Your MySQL Queries Are Slow and How ElasticSearch & HBase Can Help

This article examines common causes of slow MySQL queries, explains index mechanics and failures, then compares ElasticSearch’s fast tokenized search and HBase’s column‑oriented storage, offering practical guidance on when and how to use each technology.

21CTO

Jul 18, 2021

Why Your MySQL Queries Are Slow and How ElasticSearch & HBase Can Help

1. MySQL Slow Query Experience

Most internet applications are read‑heavy, so query speed matters. Slow queries often stem from index issues, MDL locks, flush waits, row locks, or large‑table bottlenecks.

1.1 Index

MySQL indexes are B+ trees. Proper use of left‑most prefix and composite indexes can dramatically improve performance through index push‑down and covering indexes.

1.1.1 Reasons for Index Failure

Using !=, <>, OR, functions, or expressions on indexed columns.

LIKE patterns that start with %.

String literals without quotes.

Low‑cardinality columns (e.g., gender).

Not matching the left‑most prefix of a composite index.

1.1.2 Why These Cause Failure

MySQL may deem such operations as breaking index order, so the optimizer skips the index. Implicit type conversion and charset conversion also lead to index loss.

Function Operations

When a function is applied to an indexed column, e.g., WHERE length(a) = 6, the optimizer cannot use the index because the value is transformed before lookup.

Implicit Conversion

Implicit type and charset conversions can also cause index loss.

JOOQ rarely encounters implicit type conversion.

Implicit charset conversion may appear in join queries when column types match but charsets differ.

Breaking Order

LIKE patterns with a leading % or unquoted strings can disrupt index ordering, causing the optimizer to ignore the index.

1.1.3 Why Not Index Low‑Selectivity Fields (e.g., gender)

Indexing low‑cardinality fields yields little benefit and may require a full table scan.

For non‑clustered indexes, scanning a large number of rows after using such an index can be slower than a full scan.

1.1.4 Simple Index Techniques

Index push‑down: use composite indexes to filter rows early.

Covering index: keep all needed columns in the index to avoid table lookups.

Prefix index: index only the first N characters of a string.

Avoid functions on indexed columns.

Consider maintenance cost for write‑heavy tables.

1.1.5 Evaluating Wrong Index Choice

If a query uses an unexpected low‑selectivity index, run ANALYZE TABLE to refresh statistics or use FORCE INDEX to guide the optimizer.

1.2 MDL Locks

MySQL 5.5 introduced metadata locks (MDL). DML acquires a read MDL lock; DDL acquires a write MDL lock. Write locks block read locks. Use SHOW PROCESSLIST to see sessions waiting for "Waiting for table metadata lock".

1.3 Flush

Flush commands can be blocked by other statements. Use SHOW PROCESSLIST to detect sessions in "Waiting for table flush" state.

1.4 Row Locks

Row locks occur when a transaction holds a write lock without committing.

1.5 Current Read

InnoDB's default isolation is REPEATABLE READ. A transaction may need to apply undo logs to see a consistent snapshot.

1.6 Large‑Table Scenarios

For tables with billions of rows, even optimized indexes may hit I/O or CPU bottlenecks. Common solutions are sharding (horizontal or vertical) and read/write separation.

1.6.1 Sharding

Solution

If I/O is the bottleneck, use vertical sharding (split by database or column family). If CPU is the bottleneck, use horizontal sharding (split rows across tables).

Horizontal sharding distributes data across many tables; vertical sharding splits by business domain or column family.

Problems

Challenges include unique ID generation, non‑partition‑key queries, and scaling the shard key.

IDs can be auto‑increment, Snowflake, segment, or GUID.

Non‑partition‑key queries often require mapping tables or secondary indexes.

Scaling may require range‑plus‑modulo sharding to minimize data migration.

1.6.2 Read/Write Separation

Why

When read traffic far exceeds write traffic, a master‑slave setup can distribute reads, improve availability, and balance load.

Problems

Stale reads due to replication lag.

Routing logic to decide whether a query goes to master or slave.

1.7 Summary

The above lists common MySQL slow‑query causes and mitigation methods, and introduces typical solutions for large‑data scenarios.

2. How to Evaluate ElasticSearch

ElasticSearch (ES) is a Lucene‑based near‑real‑time distributed search engine used for full‑text search, NoSQL JSON storage, log monitoring, and data analytics.

2.1 What ES Can Do

Typical use cases include full‑text search, log analysis (often with Logstash and Kibana as the ELK stack), and serving as a secondary index for MySQL.

2.2 ES Structure

Before ES 7.0 the hierarchy was Index → Type → Document (similar to database → table → row). Types were removed in 7.0; now an index is analogous to a table.

Key components are mapping (schema) and settings (shard and replica counts). Example mapping shows a text field with a sub‑field keyword for exact matches.

2.3 Why ES Queries Are Fast

ES uses inverted indexes. Terms are indexed, and document IDs are stored in posting lists. A memory‑resident Term Index (FST) points to the on‑disk Term Dictionary, enabling rapid term lookup.

2.3.1 Tokenized Search

Because terms are tokenized, a query like "Ada" can be located directly without scanning the entire dataset.

2.3.2 Exact Search

For exact matches, the advantage of the Term Index disappears, and MySQL's covering index may be faster.

2.4 When to Use ES

2.4.1 Full‑Text Search

MySQL fuzzy string queries are costly; ES handles them efficiently, e.g., searching chat logs.

Tokenization

Chinese requires a dedicated analyzer (e.g., IK) because the default analyzer tokenizes by whitespace.

POST yourindex/_analyze
{
  "field": "yourfield",
  "text": "我可真是个机灵鬼"
}

2.4.2 Combined Queries

ES + MySQL

Store searchable fields and IDs in ES for fast full‑text search, while keeping the full record in MySQL for transactional integrity.

ES + HBASE

For massive write‑heavy workloads, replace MySQL with a distributed store like HBASE, using ES as the search layer.

2.5 Summary

ES is fast because of its in‑memory Term Index and inverted index design, making it ideal for full‑text and tokenized searches, while relational databases excel at exact lookups and transactional workloads.

3. HBASE

3.1 Storage Structure

Unlike row‑oriented relational databases, HBASE stores data by column family. Example tables illustrate the difference.

Name

School

XX Elementary

YY Middle School

Each row has a row key (sorted lexicographically), a timestamp version, column families (e.g., info, area), and dynamic columns within families.

3.2 OLTP and OLAP

Row‑oriented databases suit OLTP (transactional) workloads; column‑oriented stores suit OLAP (analytical) workloads. HBASE is column‑oriented but does not provide full OLAP capabilities.

3.3 RowKey

HBASE only supports three query patterns: single row by RowKey, range scan by RowKey, and full table scan. Good RowKey design is crucial.

3.4 Use Cases

HBASE excels at write‑intensive scenarios with high throughput and strong reliability, while reads are efficient for single‑row or small‑range queries.

4. Conclusion

Software development should be incremental; technology must serve the project, and suitability outweighs novelty. To speed up queries, first eliminate bugs, then apply appropriate indexing, sharding, or search‑engine solutions.

References

Designing high‑concurrency architectures for ten‑thousand‑queries‑per‑second systems.

Building a centralized log analysis platform with ELK.

Comparative analysis of MySQL and Lucene indexes.

In‑depth introduction to HBASE.

Original Source

Signed-in readers can open the original source through BestHub's protected redirect.

Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactand we will review it promptly.

Big Data Indexing mysql HBase Database Performance

Written by

21CTO

21CTO (21CTO.com) offers developers community, training, and services, making it your go‑to learning and service platform.

0 followers

Reader feedback

How this landed with the community

Rate this article

Was this worth your time?

Discussion

0 Comments

Thoughtful readers leave field notes, pushback, and hard-won operational detail here.