Why Your MySQL Queries Are Slow and How to Fix Them with Indexes, ES, and HBase

This article analyzes common causes of slow MySQL queries, especially index misuse, and offers practical indexing techniques. It then explains MDL locks and large-table bottlenecks, and compares Elasticsearch and HBase as complementary solutions for high-performance search and storage.

Code Ape Tech Column

Feb 18, 2021

MySQL Slow Query Causes and Index Optimization

Read‑heavy internet applications rely on fast query execution. Most slow queries stem from index misuse or missing indexes.

Common Index Failure Reasons

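Using != or <>, or wrapping an indexed column in a function or expression, in the WHERE clause.

OR conditions in which at least one branch has no usable index.

LIKE patterns that start with a wildcard (%).

Omitting quotes around string literals, so a string column is compared with a number and implicitly converted.

Low-selectivity columns (e.g., gender) whose predicates still match a large share of the table.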

Not matching the leftmost prefix of a composite index.

Why These Patterns Break Index Usage

A B+-tree index stores the raw column values in sorted order. Once a predicate wraps the column in a function or expression (e.g., WHERE LENGTH(col) = 6), the results no longer follow that order, so MySQL cannot seek into the tree and must evaluate the expression for every row. Implicit type or charset conversion has the same effect, because MySQL converts the column value before comparing it. A leading-wildcard pattern (LIKE '%abc%') gives no fixed prefix to seek on, which rules out a range scan and forces MySQL to scan the whole index or table.
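
The effect of wrapping an indexed column in a function can be observed with a small experiment. The sketch below uses SQLite through Python's built-in sqlite3 module as a stand-in, because it runs without a server; this is an assumption of the example, and MySQL's optimizer and EXPLAIN output differ in detail. The table users, the index idx_users_name, and the helper plan are hypothetical names.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, city TEXT, note TEXT)"
)
conn.execute("CREATE INDEX idx_users_name ON users (name)")
conn.executemany(
    "INSERT INTO users (name, city, note) VALUES (?, ?, ?)",
    [(f"user{i:04d}", f"city{i % 10}", "x") for i in range(1000)],
)


def plan(sql):
    """Return the planner's description of how it would run the query."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    # The last column of each row holds the human-readable plan step.
    return " | ".join(row[-1] for row in rows)


# Predicate on the raw column: the planner can seek into the index.
plan_equality = plan("SELECT * FROM users WHERE name = 'user0042'")

# Predicate on a function of the column: the index order is useless here.
plan_function = plan("SELECT * FROM users WHERE length(name) = 8")

print(plan_equality)
print(plan_function)
```

The first plan reports an index search on idx_users_name, while the second falls back to scanning the table. In MySQL the equivalent check is to run EXPLAIN on both statements and compare the key and rows columns.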

Low‑Selectivity Indexes

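Indexes on columns with very few distinct values often degrade performance: each matching index entry still requires a lookup back to the clustered index to fetch the row, and when a predicate matches a large share of the table those random reads cost more than one sequential full table scan.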

Practical Index Best Practices

Index condition pushdown (ICP): with a composite index, MySQL 5.6 and later can evaluate predicates on the indexed columns inside the index itself and skip the table lookup for entries that fail them.

Covering index: include all columns required by the query in the index to avoid a table lookup.

Prefix index: for long VARCHAR columns, index only the first N characters, choosing N so that the prefix stays selective.

Avoid functions on indexed columns.

Consider maintenance cost for write‑heavy tables; each index adds overhead on INSERT/UPDATE/DELETE.
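
Choosing N for a prefix index is a trade-off between index size and selectivity. One way to reason about it is to measure how many distinct prefixes each candidate length produces on a sample of the column. The sketch below does this in plain Python; the helper names, the 0.95 threshold, and the sample values are all illustrative assumptions, not part of MySQL.

```python
def prefix_selectivity(values, length):
    """Fraction of distinct prefixes of the given length among the values."""
    if not values:
        return 0.0
    return len({v[:length] for v in values}) / len(values)


def shortest_prefix(values, target_ratio=0.95):
    """Smallest prefix length whose selectivity reaches target_ratio
    of the selectivity of the full column values."""
    if not values:
        raise ValueError("values must not be empty")
    full = len(set(values)) / len(values)
    longest = max(len(v) for v in values)
    for n in range(1, longest + 1):
        if prefix_selectivity(values, n) >= target_ratio * full:
            return n
    return longest


# Made-up sample of a long string column.
emails = [
    "alice@example.com",
    "alicia@example.com",
    "bob@example.org",
    "bobby@example.org",
    "carol@example.net",
]

for n in range(1, 7):
    print(n, prefix_selectivity(emails, n))
print("suggested prefix length:", shortest_prefix(emails))
```

On a real table the same measurement can be taken in SQL by comparing COUNT(DISTINCT LEFT(col, N)) against COUNT(DISTINCT col) for a few values of N.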

Diagnosing Wrong Index Choice

Run EXPLAIN to see which index MySQL selects. If the chosen index is sub‑optimal, you can:

Refresh statistics with ANALYZE TABLE tbl_name.

Force a specific index using FORCE INDEX (idx_name).

Metadata Locks (MDL)

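MySQL 5.5 introduced metadata locks (MDL). Queries and DML take a shared MDL on each table they touch and hold it until the transaction ends; DDL needs an exclusive MDL. A DDL statement therefore waits for every open transaction on the table, and while it waits, new queries on that table queue behind it. Use SHOW PROCESSLIST and look for the state “Waiting for table metadata lock” to find the waiting sessions, then trace back to the long-running transaction that blocks them.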

Flush Wait

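FLUSH TABLES must close every open table, so it waits for statements that are still using those tables, and new statements on the same tables then wait for the flush to finish. Those sessions show the state “Waiting for table flush” in SHOW PROCESSLIST.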
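
When the process list is long, it helps to filter it for the two wait states named above. The following is a minimal sketch that assumes the SHOW PROCESSLIST rows have already been fetched as dictionaries (for example through a DB-API cursor that returns dict rows). The function name and the sample rows are made up for illustration.

```python
# The two wait states discussed in the text.
WAIT_STATES = (
    "Waiting for table metadata lock",
    "Waiting for table flush",
)


def waiting_sessions(processlist):
    """Return sessions stuck in one of WAIT_STATES, longest wait first."""
    stuck = [row for row in processlist if row.get("State") in WAIT_STATES]
    return sorted(stuck, key=lambda row: row["Time"], reverse=True)


# Made-up rows shaped like SHOW PROCESSLIST output (subset of columns).
sample = [
    {"Id": 11, "Time": 620, "State": "", "Info": None},
    {"Id": 12, "Time": 45, "State": "Waiting for table metadata lock",
     "Info": "ALTER TABLE t ADD COLUMN c INT"},
    {"Id": 13, "Time": 30, "State": "Waiting for table metadata lock",
     "Info": "SELECT * FROM t"},
    {"Id": 14, "Time": 2, "State": "Waiting for table flush",
     "Info": "SELECT * FROM t2"},
]

for row in waiting_sessions(sample):
    print(row["Id"], row["Time"], row["State"])
```

The filter only lists the victims. The session that holds the lock is usually not in a waiting state itself, so it still has to be identified separately, for example by looking for long-open transactions.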

Row Locks

Uncommitted write transactions hold row locks, causing other sessions to wait until the transaction commits or rolls back.

Repeatable‑Read Isolation (InnoDB Default)

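Each transaction reads from a consistent snapshot, established by its first consistent read. Rows changed by other transactions after that point are reconstructed from the undo log to the version the snapshot should see. In a long-running transaction, a row that has since been updated many times requires walking a long undo chain, which can make even a single-row read slow.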

Large‑Table Considerations

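In tables with billions of rows, even well-indexed queries may hit I/O or CPU limits. InnoDB stores B+-tree nodes in 16 KB pages (the default). A tree three levels deep holds roughly tens of millions of rows, depending on row size; beyond that the tree grows deeper and each lookup touches more pages. Under heavy load the buffer pool may also evict hot pages, lowering the cache hit rate.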
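
A back-of-the-envelope calculation shows how tree depth relates to row count. The sketch below assumes a BIGINT primary key (8 bytes), a 6-byte child pointer, and an average row size of 1 KB; it ignores page headers and fill factor, so the results are rough upper bounds rather than exact figures.

```python
PAGE_SIZE = 16 * 1024   # InnoDB default page size in bytes
KEY_SIZE = 8            # assumed BIGINT primary key
POINTER_SIZE = 6        # assumed size of a child-page pointer
ROW_SIZE = 1024         # assumed average row size in bytes

# Entries per internal page and rows per leaf page under these assumptions.
fanout = PAGE_SIZE // (KEY_SIZE + POINTER_SIZE)
rows_per_leaf = PAGE_SIZE // ROW_SIZE


def max_rows(levels):
    """Approximate row capacity of a clustered index with the given depth."""
    return fanout ** (levels - 1) * rows_per_leaf


for levels in (2, 3, 4):
    print(levels, max_rows(levels))
```

Under these assumptions a three-level tree holds about 21.9 million rows and a four-level tree about 25.6 billion, so a table with billions of rows typically costs one more page read per lookup than a table with tens of millions.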

Two common mitigation strategies:

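Sharding (horizontal or vertical): horizontal sharding splits rows across multiple databases or tables by a shard key, while vertical splitting separates tables or columns by business domain. Tools such as ShardingSphere, TDDL, and Mycat assist with rule definition, data migration, and scaling.

Read/write splitting: use a primary-replica (master-slave) topology to offload read traffic to replicas, improving scalability and availability. Replication is asynchronous by default, so reads that must see the latest write should go to the primary.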
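
Horizontal sharding comes down to a deterministic mapping from shard key to physical table. The sketch below shows one simple hash-modulo scheme; the database and table naming (order_db_N, orders_N), the shard counts, and the use of CRC32 are assumptions for illustration and do not reflect how any of the tools above implement routing.

```python
import zlib


def route(shard_key, db_count=4, tables_per_db=8):
    """Map a shard key to a (database, table) pair using hash modulo."""
    # CRC32 is stable across processes, unlike Python's built-in hash().
    slot = zlib.crc32(str(shard_key).encode("utf-8")) % (db_count * tables_per_db)
    db_index, table_index = divmod(slot, tables_per_db)
    return f"order_db_{db_index}", f"orders_{table_index}"


for user_id in (1001, 1002, 1003):
    print(user_id, route(user_id))
```

A plain modulo scheme is easy to reason about, but changing the shard count moves most keys, which is one reason the migration and scaling features of sharding middleware matter.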

Elasticsearch Overview

Elasticsearch (ES) is a Lucene-based, near-real-time distributed search engine. It excels at full-text search, log aggregation (the ELK stack), and JSON document storage.

Structure Changes

Before ES 7.0 the hierarchy was index → type → document. Mapping types were deprecated in 7.0 and removed in 8.0, so an index now holds documents directly and is roughly analogous to a table.

Why ES Queries Are Fast

ES builds an inverted index: each term maps to a posting list of document IDs. A term dictionary (stored on disk) is complemented by an in‑memory Finite State Transducer (FST) term index, allowing rapid location of the dictionary entry without costly random I/O.
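
The core idea of an inverted index fits in a few lines. The toy sketch below maps each term to a sorted posting list of document IDs and answers a multi-term query by intersecting the lists. It uses whitespace tokenization and a plain dictionary for term lookup, so it does not model Lucene's analyzers, on-disk term dictionary, or FST term index; the function names and sample documents are made up.

```python
from collections import defaultdict


def build_inverted_index(docs):
    """Map each lowercased term to a sorted list of document IDs."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            index[term].add(doc_id)
    return {term: sorted(ids) for term, ids in index.items()}


def search_all(index, query):
    """Return IDs of documents that contain every term in the query."""
    terms = query.lower().split()
    if not terms:
        return []
    postings = [set(index.get(term, [])) for term in terms]
    return sorted(set.intersection(*postings))


# Made-up log lines keyed by document ID.
docs = {
    1: "connection timeout to db",
    2: "db connection refused",
    3: "request timeout",
}

index = build_inverted_index(docs)
print(index["timeout"])
print(search_all(index, "connection db"))
```

The lookup cost depends on the number of query terms and the length of their posting lists, not on the total number of documents, which is why term queries stay fast as the corpus grows.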

Example Search Request

GET yourindex/_search
{
  "from": 0,
  "size": 10,
  "query": {
    "match_phrase": {
      "log": "xxx"
    }
  }
}

This request performs a phrase match, returning documents that contain the exact sequence of terms.

Cluster Inspection Commands

GET /_cat/health?v&pretty – cluster health.

GET /_cat/shards?v – shard allocation.

GET yourindex/_mapping – mapping (schema) definition.

GET yourindex/_settings – index settings (shard count, replicas, etc.).

GET /_cat/indices?v – list all indices in the cluster.

Mapping Example (partial)

"appname": {
  "type": "text",
  "fields": {
    "keyword": {
      "type": "keyword",
      "ignore_above": 256
    }
  }
}

Text fields are analyzed (tokenized) for full‑text search, while the keyword sub‑field stores the exact value for term‑level queries.

When to Use ES

Full-text search: fuzzy, phrase, or proximity queries on large text fields (e.g., chat message search).

Combined queries: store searchable fields and document IDs in ES, keep the full record in MySQL; query ES first, then fetch details from MySQL.

Hybrid architectures: use ES for search and a write-optimized store such as HBase for massive ingestion, linking records via a common key.

HBase Basics

Storage Model

HBase is a column-family NoSQL store. Rows are identified by a row key and stored in lexicographic order of that key. Each column family (e.g., info, area) groups related columns, and columns within a family can be added dynamically.

OLTP vs OLAP

Row-oriented databases excel at OLTP (transactional) workloads, while column-oriented stores suit OLAP (analytical) queries. HBase is optimized for write-heavy, OLTP-style access by key, but it guarantees atomicity only within a single row and is not a full-featured OLAP engine.

RowKey Design

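Natively, HBase supports only three access patterns: single-row lookup by row key, range scans over row keys, and full table scans. A well-designed row key is therefore critical for both performance and data distribution. Lead with the fields you query by (e.g., a region or user identifier), and add a hash or salt prefix when keys would otherwise increase monotonically, since a leading timestamp sends all new writes to a single region.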
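
One common way to combine these ideas is a composite key made of a salt bucket, the entity identifier, and a reversed timestamp. The sketch below builds such a key as a string; the field layout, the separator, the bucket count, and the function name are assumptions for illustration, and a real schema would pick them to match its own query patterns.

```python
import zlib

LONG_MAX = 2**63 - 1  # reversing against this makes newer events sort first


def make_row_key(user_id, event_ts_ms, buckets=16):
    """Build a row key of the form salt|user_id|reversed_timestamp."""
    # The salt is derived from the user ID, so all rows of one user share
    # a prefix and can still be read with a single range scan.
    salt = zlib.crc32(user_id.encode("utf-8")) % buckets
    reversed_ts = LONG_MAX - event_ts_ms
    # Zero-padding keeps lexicographic order equal to numeric order.
    return f"{salt:02d}|{user_id}|{reversed_ts:019d}"


older = make_row_key("user42", 1_600_000_000_000)
newer = make_row_key("user42", 1_700_000_000_000)
print(older)
print(newer)
print(newer < older)
```

Because the salt spreads different users across buckets, writes are distributed over regions, while a scan over the prefix salt|user_id| returns that user's events newest first.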

Typical Use Cases

HBase shines in write‑intensive applications that require fast ingestion and low‑latency reads for single rows or small ranges. Complex ad‑hoc analytics are better served by dedicated OLAP systems.

Written by

Code Ape Tech Column

Former Ant Group P8 engineer, pure technologist, sharing full‑stack Java, job interview and career advice through a column. Site: java-family.cn
