Understanding Slow Queries and Index Optimization in MySQL, Elasticsearch, and HBase

This article examines common causes of slow queries in MySQL, presents practical indexing techniques, and extends the discussion to performance considerations in Elasticsearch and HBase, offering combined solutions such as sharding, read/write splitting, and using ES as a search layer for large‑scale systems.

Top Architect

Jun 28, 2021

1. MySQL Slow Query Experience

Most internet applications are read‑heavy; slow queries often stem from improper indexes.

1.1 Index

InnoDB indexes are B+ trees. Three properties matter most for fast lookups: the leftmost‑prefix rule for composite indexes, index condition pushdown (ICP), and covering indexes.

1.1.1 Causes of Index Failure

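An index on a column does not guarantee that the optimizer will use it. Common reasons an index is skipped or used inefficiently:

The WHERE clause uses != or <>, combines conditions with OR where one side has no usable index, or wraps the indexed column in a function

LIKE with a leading % (for example, LIKE '%abc')

A string column compared with an unquoted number, which forces an implicit type conversion

Low selectivity (e.g., gender)

A composite index queried without its leftmost column(s)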
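The leftmost‑prefix rule from the list above can be modeled in a few lines. The sketch below is a simplified model, not MySQL's optimizer: it assumes a hypothetical composite index on (a, b, c), and the function name usable_index_prefix is made up for illustration. It only reports which leading index columns a set of predicates could use to narrow a tree seek.

```python
def usable_index_prefix(index_columns, equality_columns, range_column=None):
    """Return the leading index columns that can narrow a B+ tree seek.

    Simplified model: equality predicates extend the prefix, the first
    range predicate is the last column that can narrow the seek, and a
    gap in the prefix stops matching.
    """
    used = []
    for column in index_columns:
        if column in equality_columns:
            used.append(column)
        elif column == range_column:
            used.append(column)
            break  # columns after a range predicate cannot narrow the seek
        else:
            break  # gap in the leftmost prefix
    return used


index_abc = ["a", "b", "c"]  # hypothetical index: KEY idx_abc (a, b, c)

print(usable_index_prefix(index_abc, {"a", "b"}))   # WHERE a = ? AND b = ?
print(usable_index_prefix(index_abc, {"b", "c"}))   # WHERE b = ? AND c = ?
print(usable_index_prefix(index_abc, {"a"}, "b"))   # WHERE a = ? AND b > ?
print(usable_index_prefix(index_abc, {"a", "c"}))   # WHERE a = ? AND c = ?
```

In the last case only a narrows the seek. With index condition pushdown the condition on c can still be checked inside the index, but it does not shrink the range that is scanned.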

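A function applied to the indexed column, such as where length(a) = 6, prevents a tree seek: the B+ tree is ordered by the value of a, not by length(a), so the optimizer cannot locate matching entries by descending the tree.

Implicit type or character set conversion has the same effect, because MySQL effectively wraps the indexed column in a conversion function before comparing.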
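Why a function defeats the seek can be seen with a sorted list standing in for the index. This is an illustrative sketch with made‑up values, not InnoDB code: the list is ordered by the column value, so a binary search works for a = 'cherry', while a condition on the length of a has no order to exploit and every entry has to be checked.

```python
from bisect import bisect_left, bisect_right

# Stand-in for index entries on column a, kept in sorted order.
index_on_a = sorted(["kiwi", "apple", "banana", "fig", "cherry", "orange"])


def seek_equal(sorted_keys, value):
    """Binary search, analogous to a B+ tree seek on the raw column value."""
    lo = bisect_left(sorted_keys, value)
    hi = bisect_right(sorted_keys, value)
    return sorted_keys[lo:hi]


def scan_by_length(sorted_keys, length):
    """No usable order for len(key), so every entry is examined."""
    return [key for key in sorted_keys if len(key) == length]


lengths_in_index_order = [len(key) for key in index_on_a]

print(seek_equal(index_on_a, "cherry"))
# The lengths are not sorted even though the keys are.
print(lengths_in_index_order == sorted(lengths_in_index_order))
print(scan_by_length(index_on_a, 6))
```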

1.1.3 Why Not Index Low‑Cardinality Fields

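An index on a low‑cardinality column such as gender filters out few rows. Each match found in a secondary index also needs an extra lookup in the clustered index, so a full table scan is often cheaper.

As a rule of thumb, when a single value matches roughly 30% or more of the rows, the optimizer tends to ignore the index and scan the table; the exact threshold depends on its cost estimates.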

1.1.4 Simple Index Practices

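Index condition pushdown: with a composite index, conditions on indexed columns that cannot narrow the seek are still checked inside the index, before the full row is fetched. Multi‑condition queries benefit from composite indexes for this reason.

Covering index: include every column the query reads in the index, so the lookup never goes back to the clustered index.

Prefix index: for long strings, index only the first N characters, choosing N so that the prefixes stay selective.

Avoid functions on indexed columns.

Weigh the maintenance cost: an index on a frequently updated column must be updated on every write.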
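For the prefix index item, one common way to pick a length is to compare the number of distinct prefixes with the number of rows, which in SQL corresponds to COUNT(DISTINCT LEFT(col, n)) / COUNT(*). The sketch below does the same calculation in memory; the sample emails and the function name prefix_selectivity are hypothetical.

```python
def prefix_selectivity(values, prefix_length):
    """Fraction of distinct prefixes among all values (1.0 means unique)."""
    if not values:
        return 0.0
    prefixes = {value[:prefix_length] for value in values}
    return len(prefixes) / len(values)


# Hypothetical sample of an email column.
emails = [
    "alice@example.com",
    "alina@example.com",
    "albert@example.org",
    "bob@example.com",
    "bobby@example.org",
    "carol@example.com",
]

for n in (1, 3, 5):
    print(n, round(prefix_selectivity(emails, n), 2))
```

The shortest length whose selectivity is close to that of the full column is usually a reasonable candidate. On a real table the ratio should be measured on representative data rather than a handful of rows.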

1.1.5 Evaluating Wrong Index Choice

If EXPLAIN shows the optimizer choosing the wrong index, run ANALYZE TABLE x to refresh the index statistics, or add FORCE INDEX to the query to override the optimizer's choice.

1.2 MDL Locks

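MySQL 5.5 introduced metadata locks (MDL). DML statements take an MDL read lock on the tables they touch; DDL statements take an MDL write lock. A DDL statement waiting behind a long transaction therefore blocks every later query on the same table. Blocked sessions appear in SHOW PROCESSLIST with the state “Waiting for table metadata lock”.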

1.3 Flush

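A flush is normally fast, but it has to wait for statements that are still using the table, and queries that arrive after it wait in turn. Blocked sessions appear in SHOW PROCESSLIST with the state “Waiting for table flush”.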

1.4 Row Locks

A transaction that holds a row lock and has not yet committed blocks other transactions that need to lock the same row.

1.5 Current Read

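InnoDB's default isolation level is REPEATABLE READ. A consistent (snapshot) read rebuilds the row version visible to the transaction by applying undo log records, so a row that has been updated many times since the snapshot was taken can be slow to read. A current read, such as SELECT ... LOCK IN SHARE MODE, reads the latest committed version directly.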

1.6 Large Table Scenarios

For tables with billions of rows, even optimized indexes may hit I/O or CPU bottlenecks. InnoDB buffer pool size and LRU eviction affect cache hit rate.

1.6.1 Sharding

Horizontal sharding distributes the rows of one logical table across several tables or databases; vertical sharding splits columns into separate tables or databases. Tools: ShardingSphere, TDDL, Mycat.
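Horizontal sharding needs a routing rule that maps a sharding key to a physical table. The sketch below shows one simple option, hash modulo shard count. It is an assumption for illustration, not the routing used by any of the tools named above, and the table prefix orders_ and the function names are hypothetical.

```python
import zlib


def shard_for(key, shard_count):
    """Map a sharding key to a shard number with a stable hash."""
    # zlib.crc32 gives the same result in every process,
    # unlike Python's built-in hash() for strings.
    return zlib.crc32(str(key).encode("utf-8")) % shard_count


def table_for(user_id, shard_count=4):
    """Hypothetical naming scheme: orders_0 .. orders_{shard_count - 1}."""
    return f"orders_{shard_for(user_id, shard_count)}"


for user_id in (1001, 1002, 1003):
    print(user_id, table_for(user_id))
```

A modulo rule is easy to reason about, but changing the shard count remaps most keys, so the count is usually chosen with future growth in mind.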

1.6.2 Read/Write Splitting

When reads far exceed writes, use master‑slave replication to offload reads.
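A read/write split needs something that decides where each statement goes. The following is a minimal sketch of such a router, assuming one primary and a list of replicas identified by made‑up address strings; the class name ReadWriteRouter is hypothetical and real middleware handles many more cases.

```python
import itertools


class ReadWriteRouter:
    """Send writes to the primary and spread plain reads over replicas."""

    def __init__(self, primary, replicas):
        self.primary = primary
        # Fall back to the primary when no replica is configured.
        self._replica_cycle = itertools.cycle(replicas or [primary])

    def route(self, sql, in_transaction=False):
        statement = sql.strip().rstrip(";").upper()
        # SELECT ... FOR UPDATE takes locks, so it is treated as a write.
        is_plain_read = (
            statement.startswith("SELECT")
            and not statement.endswith("FOR UPDATE")
        )
        # Reads inside a transaction stay on the primary to see its own writes.
        if is_plain_read and not in_transaction:
            return next(self._replica_cycle)
        return self.primary


router = ReadWriteRouter("primary:3306", ["replica1:3306", "replica2:3306"])
print(router.route("SELECT * FROM orders WHERE id = 1"))
print(router.route("SELECT * FROM orders WHERE id = 2"))
print(router.route("UPDATE orders SET status = 'paid' WHERE id = 1"))
```

Replicas apply changes with some delay, so a read that must see a write made a moment earlier is usually sent to the primary as well.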

2. Elasticsearch Overview

ES is a near real‑time distributed search engine built on Lucene, suitable for full‑text search, log analysis, and use as a NoSQL JSON document store.

2.1 What ES Can Do

Typical use cases include keyword search, log monitoring, and analytics.

2.2 ES Architecture

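Before 7.0 the hierarchy was index → type → document. Mapping types were deprecated in 7.0, so an index now plays the role of a table. The mapping defines field types; the index settings control the number of shards and replicas.

Example mapping snippet (pre‑7.0 style, where doc is the type name):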

{
  "mappings": {
    "doc": {
      "properties": {
        "appname": {
          "type": "text",
          "fields": {
            "keyword": {
              "type": "keyword",
              "ignore_above": 256
            }
          }
        }
      }
    }
  }
}

2.3 Why ES Queries Are Fast

ES uses an inverted index: a term dictionary maps each term to its posting list, and a term index held in memory as a finite state transducer (FST) locates terms in the dictionary quickly.

2.3.1 Tokenized Search

Text fields are split into terms at index time, so a query for “Ada” goes straight to the posting list for that term instead of scanning every document.
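The idea behind tokenized search can be sketched with a dictionary that maps each term to the documents containing it. This is a toy model under stated assumptions: the tokenizer is a simple stand‑in for an ES analyzer, the sample documents are invented, and Lucene's term dictionary and FST are replaced by a plain Python dict.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase and split on non-alphanumeric characters."""
    return [token for token in re.split(r"[^0-9a-z]+", text.lower()) if token]


def build_inverted_index(documents):
    """Map each term to the sorted list of document IDs that contain it."""
    postings = defaultdict(set)
    for doc_id, text in documents.items():
        for term in tokenize(text):
            postings[term].add(doc_id)
    return {term: sorted(ids) for term, ids in postings.items()}


def search(index, query):
    """Return IDs of documents that contain every term in the query."""
    terms = tokenize(query)
    if not terms:
        return []
    result = set(index.get(terms[0], []))
    for term in terms[1:]:
        result &= set(index.get(term, []))
    return sorted(result)


documents = {
    1: "Ada Lovelace wrote the first program",
    2: "Grace Hopper wrote a compiler",
    3: "Ada is also a programming language",
}
index = build_inverted_index(documents)

print(search(index, "Ada"))
print(search(index, "ada wrote"))
```

The lookup cost depends on the number of query terms and the size of their posting lists, not on the total number of documents.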

2.3.2 Exact Search

For exact‑match lookups the term index offers less of an advantage, and performance may be comparable to a MySQL query served by a covering index.

2.4 When to Use ES

Ideal for full‑text search on large text fields, e.g., chat message logs, where MySQL LIKE is inefficient.

2.4.1 Combined Queries

Store searchable fields in ES and full records in MySQL; retrieve IDs from ES then fetch details from MySQL.
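The two‑step lookup can be sketched as follows. The dictionaries below are in‑memory stand‑ins, not real clients: a production system would call an Elasticsearch client in search_ids and a MySQL driver in fetch_rows, and all names and sample records here are hypothetical.

```python
# Stand-in for the search layer: term -> record IDs.
SEARCH_INDEX = {"refund": [3, 7], "invoice": [7, 9]}

# Stand-in for the MySQL table holding the full records, keyed by primary key.
PRIMARY_STORE = {
    3: {"id": 3, "body": "Refund requested for order 18"},
    7: {"id": 7, "body": "Invoice and refund sent"},
    9: {"id": 9, "body": "Invoice attached"},
}


def search_ids(keyword, limit=10):
    """Step 1: ask the search layer for matching IDs only."""
    return SEARCH_INDEX.get(keyword.lower(), [])[:limit]


def fetch_rows(ids):
    """Step 2: primary-key lookups, in spirit SELECT ... WHERE id IN (...)."""
    return [PRIMARY_STORE[i] for i in ids if i in PRIMARY_STORE]


def search_records(keyword, limit=10):
    return fetch_rows(search_ids(keyword, limit))


for row in search_records("refund"):
    print(row["id"], row["body"])
```

fetch_rows skips IDs that are missing from the primary store, since the search layer can briefly lag behind deletes in the database.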

2.4.2 ES + HBase

For write‑intensive workloads, pair ES with a wide‑column store such as HBase: HBase holds the full records and ES indexes the searchable fields.

3. HBase Basics

HBase stores data by column families, not rows. RowKey is the primary key and determines data ordering.

3.1 Storage Model

Unlike row‑oriented MySQL, HBase groups data by column family and stores only the cells that exist, which makes it a good fit for sparse data.

3.2 OLTP vs OLAP

HBase suits write‑heavy OLTP‑style workloads with key‑based access, but not complex analytical (OLAP) queries, because it offers no built‑in secondary indexes or joins.

3.3 RowKey Design

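HBase supports only three query patterns: a single‑row get by RowKey, a range scan over RowKeys, and a full table scan. Design the RowKey around the most frequent query so that it can be served by a get or a range scan.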
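The three access patterns can be modeled with rows kept in RowKey order. The sketch below is a toy in‑memory model, not the HBase client API: the class SortedRowStore and the RowKey layout user_id#date are assumptions chosen so that all rows of one user are adjacent.

```python
from bisect import bisect_left


class SortedRowStore:
    """Toy model of a table whose rows are kept sorted by RowKey."""

    def __init__(self, rows):
        self._rows = dict(rows)
        self._keys = sorted(self._rows)

    def get(self, row_key):
        """Single-row lookup by exact RowKey."""
        return self._rows.get(row_key)

    def scan(self, start_key, stop_key):
        """Rows with start_key <= RowKey < stop_key, in key order."""
        lo = bisect_left(self._keys, start_key)
        hi = bisect_left(self._keys, stop_key)
        return [(key, self._rows[key]) for key in self._keys[lo:hi]]

    def full_scan(self, predicate):
        """Filtering on a value has no key order to use: every row is read."""
        return [(k, self._rows[k]) for k in self._keys if predicate(self._rows[k])]


store = SortedRowStore({
    "u002#20210603": "login",
    "u001#20210615": "purchase",
    "u003#20210610": "refund",
    "u001#20210601": "login",
})

print(store.get("u003#20210610"))
# '$' is the character right after '#', so this range covers every u001 row.
print(store.scan("u001#", "u001$"))
print(store.full_scan(lambda event: event == "login"))
```

With this layout, "all events of one user" is a cheap range scan, while "all login events" still reads the whole table. That asymmetry is why the RowKey should follow the dominant query.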

3.4 Use Cases

Best for high‑throughput writes and simple lookups by RowKey; provides high reliability and no single point of failure.

4. Conclusion

The article emphasizes systematic debugging, proper index design, and combining technologies (MySQL, ES, HBase) to achieve fast queries in large‑scale systems.
