Elasticsearch vs MySQL: How Inverted Indexes Enable Faster Complex Queries

This article explains why Elasticsearch handles complex conditional queries more efficiently than MySQL by using inverted indexes, term dictionaries, skip‑list and roaring bitmap structures, while also discussing the trade‑offs such as slower write performance.

dbaplus Community

Apr 4, 2021

Why Elasticsearch Handles Complex Queries

For a single‑table query, MySQL typically uses one index to narrow the candidate rows (index merge can combine several indexes, but only in limited cases). The remaining predicates are checked row by row after the rows are fetched, which can cause high I/O and CPU usage on large tables. Elasticsearch, built on Lucene, stores a separate inverted index for each field, so every predicate can be resolved through an index look‑up. This has made Elasticsearch a common choice for order, log, and other multi‑condition search scenarios.

Core Concepts Compared with MySQL

Index (Elasticsearch) ≈ Database (MySQL)

Type (deprecated in ES 7.x and removed in 8.0) ≈ Table

Document ≈ Row ; a document consists of Fields (columns)

Mapping defines field types, analogous to a relational Schema

Elasticsearch uses its own Query DSL instead of SQL.
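
As a rough illustration, a two‑condition search (score = 2.2 AND author = "Tom") could be written in the Query DSL as a bool query with two term filters, roughly what a SQL WHERE clause with AND would express. The sketch below only builds the request body as a Python dictionary and prints it as JSON; the field names score and author are assumptions for illustration, and nothing is sent to a cluster.

```python
import json


def build_filter_query(score: float, author: str) -> dict:
    """Build a Query DSL body that ANDs two exact-match conditions."""
    return {
        "query": {
            "bool": {
                # Every clause under "filter" must match; filters do not
                # contribute to relevance scoring.
                "filter": [
                    {"term": {"score": score}},
                    {"term": {"author": author}},
                ]
            }
        }
    }


body = build_filter_query(2.2, "Tom")
print(json.dumps(body, indent=2))
```

Each term clause targets one field, so each condition can be answered from that field's own inverted index.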

Inverted Index and Term Structures

For each searchable field Elasticsearch builds an inverted index that maps terms (e.g., an ISBN or an author name) to a posting list of document IDs containing that term.
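
The mapping from terms to posting lists can be sketched in a few lines of Python. This is a toy model, not Lucene's on‑disk format: the analyzer (lowercase, split on whitespace) and the sample documents are assumptions made for the example.

```python
from collections import defaultdict


def build_inverted_index(docs: dict[int, str]) -> dict[str, list[int]]:
    """Map each term of one field to a sorted list of document IDs."""
    postings = defaultdict(set)
    for doc_id, value in docs.items():
        # Toy analyzer: lowercase the value and split it on whitespace.
        for term in value.lower().split():
            postings[term].add(doc_id)
    # Posting lists are kept sorted so they can be intersected efficiently.
    return {term: sorted(ids) for term, ids in postings.items()}


authors = {1: "Tom", 2: "Jerry", 3: "Tom Jerry", 4: "Alice"}
index = build_inverted_index(authors)
print(index["tom"])    # [1, 3]
print(index["jerry"])  # [2, 3]
```

Looking up a term is then a dictionary access followed by a read of its posting list, instead of a scan over every document.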

Terms are stored in a Term Dictionary in sorted order. Because the dictionary can be too large to keep in memory, Elasticsearch builds a Term Index on top of it using a Burst‑Trie (a compressed prefix tree; Lucene implements this index as a finite state transducer, FST). The term index holds only term prefixes, enabling fast navigation to the relevant region of the term dictionary and reducing disk I/O.

When a query requests a specific term, Elasticsearch performs a binary search on the sorted term dictionary (or uses the term index to locate the correct block) and then reads the associated posting list from disk.
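
The two‑step lookup can be sketched as follows. To keep the example short, it swaps the prefix tree for a simpler stand‑in: a sorted list holding the first term of each dictionary block. BLOCK_SIZE, the function names, and the sample terms are all hypothetical.

```python
from bisect import bisect_left, bisect_right

BLOCK_SIZE = 3  # hypothetical; real dictionary blocks hold many more terms


def build_term_index(sorted_terms: list[str]) -> list[str]:
    """Keep only the first term of each block; this small list stays in memory."""
    return sorted_terms[::BLOCK_SIZE]


def lookup(term: str, sorted_terms: list[str], term_index: list[str]) -> int:
    """Return the position of term in the dictionary, or -1 if it is absent."""
    # Step 1: the in-memory index narrows the search to a single block.
    block = bisect_right(term_index, term) - 1
    if block < 0:
        return -1  # term sorts before every entry in the dictionary
    start = block * BLOCK_SIZE
    end = min(start + BLOCK_SIZE, len(sorted_terms))
    # Step 2: binary search inside that block only (stands in for one disk read).
    pos = bisect_left(sorted_terms, term, start, end)
    if pos < end and sorted_terms[pos] == term:
        return pos
    return -1


terms = sorted(["ada", "alice", "bob", "carl", "dave", "erin", "tom", "zed"])
term_index = build_term_index(terms)
print(term_index)                        # ['ada', 'carl', 'tom']
print(lookup("tom", terms, term_index))  # 6
print(lookup("tim", terms, term_index))  # -1
```

The point of the design is that only the small index has to live in memory, while each lookup touches a single block of the much larger dictionary.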

Skip‑List Intersection for Multi‑Condition Queries

Posting lists are stored with a multi‑level skip list. To compute the intersection of two conditions (e.g., score = 2.2 AND author = "Tom"), Elasticsearch:

Selects the shorter posting list.

Iterates its document IDs.

Uses the skip list to jump forward in the longer list until it reaches an ID ≥ the current one.

Example posting lists:

Score: [2,3,4,5,7,9,10,11]

Author: [3,8,9,12,13]

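Using the skip‑list algorithm, the intersection yields [3, 9], the only IDs present in both lists. Checking every pair of IDs would cost O(N × M) comparisons, while a merge over two sorted lists costs at most O(N + M). The skip list improves on this in practice by jumping over runs of non‑matching IDs in the longer list.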
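
The steps above can be sketched in Python. This is a simplified model, not Lucene's implementation: it uses a single level of skip pointers at a fixed interval rather than a multi‑level skip list, and SKIP_INTERVAL is a hypothetical value chosen to suit the small example.

```python
SKIP_INTERVAL = 3  # hypothetical: a skip pointer jumps 3 entries ahead


def advance(postings: list[int], pos: int, target: int) -> int:
    """Return the first position >= pos whose document ID is >= target."""
    # Follow skip pointers while the entry they land on is still below target;
    # everything jumped over is smaller than that entry, so nothing is missed.
    while (pos + SKIP_INTERVAL < len(postings)
           and postings[pos + SKIP_INTERVAL] < target):
        pos += SKIP_INTERVAL
    # Finish with a short linear scan.
    while pos < len(postings) and postings[pos] < target:
        pos += 1
    return pos


def intersect(a: list[int], b: list[int]) -> list[int]:
    """Intersect two sorted posting lists, iterating over the shorter one."""
    short, long_ = (a, b) if len(a) <= len(b) else (b, a)
    result, pos = [], 0
    for doc_id in short:
        pos = advance(long_, pos, doc_id)
        if pos == len(long_):
            break  # the longer list is exhausted
        if long_[pos] == doc_id:
            result.append(doc_id)
    return result


score = [2, 3, 4, 5, 7, 9, 10, 11]
author = [3, 8, 9, 12, 13]
print(intersect(score, author))  # [3, 9]
```

On the example lists, both 3 and 9 appear in Score and Author, so the result is [3, 9]. When the sketch looks for 8 in the Score list, it jumps from position 1 straight to position 4 instead of visiting each entry in between.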

Roaring Bitmap Caching (Bitset Strategy)

Elasticsearch also caches posting lists (the document IDs matching a filter) in memory using Roaring Bitmaps, a compressed bitmap format that stays compact for both sparse and dense sets of IDs.

Key properties:

The 32‑bit integer space is divided into 2^16 (= 65 536) containers based on the high 16 bits.

If a container holds ≤ 4 096 entries, it stores them as an ordered unsigned short array (2 bytes per entry, so at most 8 KB).

If a container holds > 4 096 entries, it stores a full 2^16‑bit bitset (fixed 8 KB). Some Roaring implementations add a third, run‑length‑encoded (RLE) container type for long runs of consecutive values.

This design avoids the 512 MB memory cost of a plain bitset for a full 2^32 range while still supporting fast logical AND operations on posting lists.
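
The container rule can be sketched as below. It is a minimal model that assumes only the two container types described above (no run‑length containers) and is not the layout used by any particular library; the function names and sample IDs are hypothetical.

```python
from array import array
from bisect import bisect_left

ARRAY_LIMIT = 4096           # entries; threshold taken from the text above
BITSET_BYTES = 65536 // 8    # 2^16 bits = 8 192 bytes


def build_containers(doc_ids):
    """Group 32-bit IDs by their high 16 bits and pick a container per group."""
    groups = {}
    for doc_id in sorted(set(doc_ids)):
        groups.setdefault(doc_id >> 16, []).append(doc_id & 0xFFFF)
    containers = {}
    for high, lows in groups.items():
        if len(lows) <= ARRAY_LIMIT:
            # Sparse group: sorted array of 16-bit values.
            containers[high] = ("array", array("H", lows))
        else:
            # Dense group: fixed-size bitset, one bit per possible low value.
            bits = bytearray(BITSET_BYTES)
            for low in lows:
                bits[low >> 3] |= 1 << (low & 7)
            containers[high] = ("bitset", bits)
    return containers


def contains(containers, doc_id):
    """Check membership by locating the container, then the low 16 bits."""
    entry = containers.get(doc_id >> 16)
    if entry is None:
        return False
    kind, data = entry
    low = doc_id & 0xFFFF
    if kind == "array":
        i = bisect_left(data, low)
        return i < len(data) and data[i] == low
    return bool(data[low >> 3] & (1 << (low & 7)))


ids = [1, 5, 70000] + list(range(131072, 131072 + 5000))
containers = build_containers(ids)
print({high: kind for high, (kind, _) in containers.items()})
# {0: 'array', 1: 'array', 2: 'bitset'}
print(contains(containers, 70000), contains(containers, 70001))  # True False
```

The 4 096 threshold is the break‑even point: 4 096 entries at 2 bytes each take 8 192 bytes, the same as the fixed bitset, so an array is only worth using below that size. For comparison, a plain bitset over the full range needs 2^32 / 8 bytes, which is 512 MB.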

Performance Trade‑offs

The rich indexing structures (inverted index, term dictionary, skip lists, Roaring bitmap cache) let Elasticsearch answer complex Boolean queries quickly, even on large data sets. However, they introduce overhead during data ingestion:

Indexing is slower than MySQL because each field must be tokenized, written to the term dictionary, and stored in posting lists.

Newly indexed documents become searchable only after a refresh cycle (default 1 s), so Elasticsearch offers near‑real‑time search rather than immediate visibility of writes.
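
The visibility delay can be pictured with a toy model in which writes land in a buffer and only become searchable once a refresh runs. The class and method names below are hypothetical and the refresh is triggered by hand; this is not Elasticsearch's client API.

```python
class ToyIndex:
    """Toy model: indexed documents become searchable only after refresh()."""

    def __init__(self):
        self.buffer = []      # indexed but not yet visible to search
        self.searchable = []  # visible to search

    def index(self, doc: dict) -> None:
        self.buffer.append(doc)

    def refresh(self) -> None:
        # Elasticsearch runs this periodically (1 s by default, per the text);
        # here it is called explicitly to keep the example deterministic.
        self.searchable.extend(self.buffer)
        self.buffer.clear()

    def search(self, field: str, value) -> list[dict]:
        return [doc for doc in self.searchable if doc.get(field) == value]


idx = ToyIndex()
idx.index({"id": 1, "author": "Tom"})
print(idx.search("author", "Tom"))  # []
idx.refresh()
print(idx.search("author", "Tom"))  # [{'id': 1, 'author': 'Tom'}]
```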

Conclusion

Elasticsearch’s architecture—field‑level inverted indexes, burst‑trie term indexes, skip‑list intersection, and Roaring bitmap caching—makes it far more suitable than MySQL for queries that involve multiple conditions on large data sets. The trade‑off is higher write latency and a short delay before newly indexed data is searchable.
