How to Achieve Fast Queries: MySQL Index Optimization, Large‑Table Strategies, Elasticsearch Basics, and HBase Overview
This article explains common causes of slow MySQL queries, how proper indexing and lock handling can improve performance, introduces Elasticsearch’s inverted‑index advantages and suitable use cases, and outlines HBase’s column‑family storage model and row‑key design for large‑scale data.
1. MySQL Slow Query Experience
Most internet applications are read‑heavy, so query speed is critical. Slow queries often stem from improper or missing indexes, index misuse, or lock contention.
1.1 Indexes
MySQL (InnoDB) uses B+‑tree indexes. Effective use relies on the leftmost‑prefix rule, composite indexes, index condition pushdown (ICP), covering indexes, and prefix indexes. Common reasons an index is not used include WHERE clauses with !=, LIKE patterns with a leading wildcard (LIKE '%…'), functions applied to indexed columns, low‑cardinality columns, and predicates that do not match the leftmost prefix of a composite index.
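The leftmost‑prefix rule can be illustrated with a small sketch. It models a composite index as a sorted list of tuples, which is only an approximation of B+‑tree leaf order; the (city, age) index, the sample rows, and the function names are hypothetical.

```python
from bisect import bisect_left, bisect_right

# Hypothetical composite index on (city, age); each entry is (city, age, row_id).
# Entries are kept sorted, which mimics the ordering of B+-tree leaf pages.
index = sorted([
    ("Berlin", 25, 1),
    ("Berlin", 31, 2),
    ("Paris", 25, 3),
    ("Paris", 40, 4),
    ("Rome", 31, 5),
])

def seek_by_city(city):
    """Leftmost column is given: binary search finds one contiguous range."""
    lo = bisect_left(index, (city,))
    hi = bisect_right(index, (city, float("inf")))
    row_ids = [row_id for _, _, row_id in index[lo:hi]]
    return row_ids, hi - lo  # (matches, entries examined)

def scan_by_age(age):
    """Leftmost column is missing: matches are scattered, so every entry is checked."""
    row_ids = [row_id for _, a, row_id in index if a == age]
    return row_ids, len(index)  # (matches, entries examined)

print(seek_by_city("Paris"))
print(scan_by_age(31))
```

Filtering on city touches only the matching range, while filtering on age alone has to walk the whole structure, which is roughly why a predicate that skips the leftmost column cannot use the index for seeking.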
1.1.1 Why Indexes Fail
When an index is not used, EXPLAIN can reveal the issue. Common pitfalls are function calls (e.g., WHERE LENGTH(a)=6) and implicit type or charset conversions.
1.1.2 Low‑Cardinality Fields
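Fields like gender have low selectivity: each value matches a large share of the rows, and every match found through a secondary index still needs a lookup back to the full row. Scanning the whole table is therefore often cheaper than using such an index.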
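A rough way to reason about this is to compute selectivity as distinct values divided by total rows. The sketch below uses made‑up column data and a hypothetical helper name; it is not how the MySQL optimizer estimates cost, only an illustration of the ratio.

```python
def selectivity(values):
    """Distinct values divided by total rows; values near 1.0 are highly selective."""
    return len(set(values)) / len(values)

# Hypothetical column samples, 1,000 rows each.
genders = ["M", "F"] * 500       # 2 distinct values
user_ids = list(range(1000))     # every value is distinct

print(selectivity(genders))   # very low: each value matches about half the table
print(selectivity(user_ids))  # 1.0: each value matches exactly one row
```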
1.1.3 Practical Index Tips
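Rely on index condition pushdown (ICP) so that filters on indexed columns in multi‑condition queries are evaluated inside the storage engine.
Use covering indexes, typically composite indexes that contain every column the query reads, to avoid table lookups.
Use prefix indexes for long string columns.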
Avoid functions on indexed columns.
Consider maintenance cost for frequently updated columns.
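For the prefix‑index tip, one common heuristic is to pick the shortest prefix whose selectivity is close to that of the full column. The following is a minimal sketch of that heuristic; the sample e‑mail values, the 0.95 threshold, and the function names are assumptions for illustration.

```python
def prefix_selectivity(values, length):
    """Share of distinct prefixes of the given length among all rows."""
    return len({v[:length] for v in values}) / len(values)

def shortest_prefix(values, target=0.95):
    """Smallest prefix length whose selectivity reaches `target` of the full column's."""
    full = len(set(values)) / len(values)
    for length in range(1, max(len(v) for v in values) + 1):
        if prefix_selectivity(values, length) / full >= target:
            return length
    return None

# Hypothetical sample of a long string column.
emails = [
    "alice@example.com",
    "alicia@example.com",
    "bob@example.com",
    "bobby@example.com",
    "carol@example.com",
]

print(prefix_selectivity(emails, 4))
print(shortest_prefix(emails))
```

On real data the calculation would run against a sample of the column, and the threshold is a trade‑off between index size and the number of extra rows that must be rechecked.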
1.2 Lock Types
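MySQL 5.5 introduced metadata locks (MDL). Ordinary reads and writes (SELECT, INSERT, UPDATE, DELETE) take a shared MDL, while DDL takes an exclusive one, so a long‑running transaction can block an ALTER TABLE, which in turn blocks later queries on the same table. Use SHOW PROCESSLIST to spot states such as Waiting for table metadata lock or Waiting for table flush, and to find sessions stuck on row‑level lock waits.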
1.3 Large‑Table Solutions
When tables reach billions of rows, CPU or I/O becomes a bottleneck. Common remedies are horizontal sharding (splitting rows across tables or databases), vertical partitioning (splitting columns), and read‑write separation using master‑slave replication. Tools like ShardingSphere, TDDL, or Mycat help implement these patterns.
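The routing logic behind these patterns can be sketched in a few lines. This is an illustrative toy, not how ShardingSphere, TDDL, or Mycat are implemented; the shard count, table prefix, and node names are hypothetical.

```python
import itertools
import zlib

SHARD_COUNT = 4  # hypothetical number of horizontal shards
REPLICAS = itertools.cycle(["replica-1", "replica-2"])  # hypothetical read replicas

def shard_for(user_id):
    """Horizontal sharding: a stable hash of the shard key picks the physical table."""
    h = zlib.crc32(str(user_id).encode("utf-8"))
    return f"orders_{h % SHARD_COUNT}"

def node_for(sql):
    """Read-write separation: SELECTs rotate over replicas, everything else hits the primary."""
    is_read = sql.lstrip().lower().startswith("select")
    return next(REPLICAS) if is_read else "primary"

print(shard_for(42))
print(node_for("SELECT * FROM orders WHERE user_id = 42"))
print(node_for("UPDATE orders SET status = 'paid' WHERE id = 7"))
```

A real router would also have to handle replication lag, for example by sending reads that must see a just‑committed write to the primary.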
2. When to Use Elasticsearch
Elasticsearch (ES) is a Lucene‑based distributed search engine suited for full‑text search, log analysis, and NoSQL‑style JSON storage.
2.1 What ES Can Do
ES supports near‑real‑time search, the ELK stack (Elasticsearch‑Logstash‑Kibana), and can handle massive log volumes.
2.2 ES Structure
Before ES 7.0 the hierarchy was index → type → document. Mapping types were deprecated in 7.0 and removed in 8.0, so the hierarchy is now index → document. The mapping defines field types, and index settings control shard and replica counts.
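As a sketch, the body of an index‑creation request in the typeless form might look like the dictionary below. The field names and the shard and replica counts are made‑up examples; only the settings and mappings keys follow the Elasticsearch index API.

```python
import json

# Hypothetical body for creating a log index (typeless mapping, ES 7.x and later).
index_body = {
    "settings": {
        "number_of_shards": 3,     # primary shards, fixed when the index is created
        "number_of_replicas": 1,   # replica copies per primary shard
    },
    "mappings": {
        "properties": {
            "message": {"type": "text"},       # analyzed for full-text search
            "level": {"type": "keyword"},      # stored as-is for exact matching
            "timestamp": {"type": "date"},
        }
    },
}

print(json.dumps(index_body, indent=2))
```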
2.3 Why ES Queries Are Fast
ES stores an inverted index (term dictionary + posting list) with an in‑memory term index (FST) that quickly locates terms, making prefix and full‑text searches much faster than MySQL’s row‑based scans.
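The idea of an inverted index can be shown with a toy version. In this sketch a plain dictionary stands in for the term dictionary, sets stand in for posting lists, and whitespace splitting stands in for an analyzer; the in‑memory FST term index is not modelled, and the sample log lines are invented.

```python
from collections import defaultdict

# Hypothetical documents: doc_id -> text.
docs = {
    1: "disk full on node a",
    2: "node b restarted",
    3: "disk latency high on node b",
}

def build_inverted_index(docs):
    """Map each term to the set of document ids that contain it (its posting list)."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            index[term].add(doc_id)
    return index

def search_all(index, query):
    """Return ids of documents containing every query term, by intersecting posting lists."""
    terms = query.lower().split()
    if not terms:
        return []
    postings = [index.get(term, set()) for term in terms]
    return sorted(set.intersection(*postings))

inverted = build_inverted_index(docs)
print(search_all(inverted, "disk node"))
print(search_all(inverted, "node b"))
```

A lookup goes straight from a term to the documents that contain it, so the cost depends on the posting lists involved rather than on the total number of rows.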
2.3.1 Tokenized Search
A match_phrase query matches documents whose analyzed tokens appear consecutively and in the same order as in the query, which makes it a good fit for log searches.
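Phrase matching depends on token positions being stored alongside the posting lists. The sketch below shows the principle with a positional index; it is a simplified illustration, not Elasticsearch's implementation, and the function names and log lines are hypothetical.

```python
from collections import defaultdict

def build_positional_index(docs):
    """Map term -> doc_id -> list of token positions."""
    index = defaultdict(lambda: defaultdict(list))
    for doc_id, text in docs.items():
        for pos, term in enumerate(text.lower().split()):
            index[term][doc_id].append(pos)
    return index

def phrase_search(index, phrase):
    """Return ids of documents where the phrase terms occur at consecutive positions."""
    terms = phrase.lower().split()
    if not terms or any(term not in index for term in terms):
        return []
    hits = []
    for doc_id, starts in index[terms[0]].items():
        for start in starts:
            if all(start + i in index[t].get(doc_id, []) for i, t in enumerate(terms)):
                hits.append(doc_id)
                break
    return sorted(hits)

# Hypothetical log lines.
logs = {
    1: "connection timed out",
    2: "timed connection out",
    3: "request timed out after retry",
}
positional = build_positional_index(logs)
print(phrase_search(positional, "timed out"))
```

Document 2 contains both words but not next to each other, so it is excluded, which is the difference between a phrase query and a plain term intersection.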
2.3.2 Exact Search
For exact matches, ES may be comparable to MySQL, especially when covering indexes are used.
2.4 When to Choose ES
Full‑text search where MySQL’s LIKE '%…' is inefficient.
Combined queries where ES handles search and MySQL stores the authoritative data.
3. HBase Overview
HBase is a column‑family NoSQL store. Data is stored by row key (lexicographically sorted) with versions (timestamps). Columns belong to families, allowing sparse, wide tables.
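The logical data model can be pictured as a sorted, nested map: row key, then family:qualifier, then timestamp, then value. The sketch below models that shape with plain dictionaries; the table contents and helper name are hypothetical, and it does not use any HBase client API.

```python
# Hypothetical table: row key -> "family:qualifier" -> {timestamp: value}.
table = {
    "user#1002": {
        "info:name": {1700000200: "Bo"},
    },
    "user#1001": {
        "info:name": {1700000000: "Ann", 1700000500: "Anna"},
        "stats:logins": {1700000100: "7"},
    },
}

def get_latest(table, row_key, column):
    """Return the newest version of a cell, or None if the cell does not exist."""
    versions = table.get(row_key, {}).get(column)
    if not versions:
        return None
    return versions[max(versions)]

print(sorted(table))                                   # rows are visited in row-key order
print(get_latest(table, "user#1001", "info:name"))     # newest of two versions
print(get_latest(table, "user#1002", "stats:logins"))  # sparse: absent cells store nothing
```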
3.1 Storage Model
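Unlike a relational table, which stores each row together, HBase groups data on disk by column family. Writes are buffered in memory and flushed to sorted files, which keeps writes cheap and makes scans over contiguous row‑key ranges efficient.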
3.2 OLTP vs. OLAP
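HBase excels at write‑heavy, key‑based access (OLTP‑style workloads), although it only guarantees atomicity within a single row. It is not designed for complex analytical queries (OLAP) such as joins or ad hoc aggregations.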
3.3 RowKey Design
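HBase natively supports only three access patterns: single‑row get, range scan, and full table scan. Because rows are sorted by row key, good row‑key design is crucial for performance: it decides which reads become cheap range scans and which degrade into full scans.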
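One common row‑key pattern for time‑series data is an entity prefix followed by a reversed timestamp, so that a prefix scan returns the newest rows first. The sketch below simulates this with a sorted list of keys; the key layout, the MAX_TS bound, and the device ids are assumptions, and no HBase client is involved.

```python
from bisect import bisect_left

MAX_TS = 9_999_999_999  # hypothetical upper bound for epoch seconds (10 digits)

def row_key(device_id, ts):
    """Entity prefix plus zero-padded reversed timestamp: newer rows sort first."""
    return f"{device_id}#{MAX_TS - ts:010d}"

def prefix_scan(sorted_keys, prefix):
    """Range scan: seek to the first key with the prefix, stop at the first key without it."""
    start = bisect_left(sorted_keys, prefix)
    result = []
    for key in sorted_keys[start:]:
        if not key.startswith(prefix):
            break
        result.append(key)
    return result

# Hypothetical readings: (device id, epoch seconds).
readings = [("dev-a", 1700000000), ("dev-b", 1700000050), ("dev-a", 1700000100)]
keys = sorted(row_key(device, ts) for device, ts in readings)

print(prefix_scan(keys, "dev-a#"))
```

Leading with the device id keeps one device's rows contiguous; if a single prefix receives most of the writes, a hashed or salted prefix is a frequent alternative to spread load, at the cost of more complex scans.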
3.4 Use Cases
Ideal for massive write workloads, time‑series data, and scenarios where low‑latency inserts are required.
4. Summary
Fast queries rely on proper indexing, lock awareness, and appropriate scaling techniques such as sharding or read‑write separation. Elasticsearch provides fast full‑text search, while HBase offers scalable write‑heavy storage. Choose the right tool based on query patterns and data volume.
Top Architect
Top Architect focuses on sharing practical architecture knowledge, covering enterprise, system, website, large‑scale distributed, and high‑availability architectures, plus architecture adjustments using internet technologies. We welcome idea‑driven, sharing‑oriented architects to exchange and learn together.
