Why Elasticsearch Creates Too Many Segments and How Lucene Flush Works

The article explains how Elasticsearch’s use of Lucene’s flush mechanism, concurrent shard writes, and IndexWriter buffering lead to an excess of small segments, outlines the flush conditions, and offers guidance on managing write concurrency for better performance.

DevOps Coach

Nov 26, 2019

Understanding Lucene Flush in Elasticsearch

In Lucene, a flush writes the in‑memory indexing buffer out as a new segment. The closest Elasticsearch operation is a refresh, not an Elasticsearch flush.

Why Excess Segments Matter

During bulk indexing with esrally, more segments than expected were observed. Too many segments degrade search speed, slow index opening (due to random I/O while loading each segment), and increase memory usage.

Search must traverse all segments and merge results.

Opening a shard can become N‑times slower; force‑merging segments reduces open time.

Memory consumption drops after reducing segment count (e.g., from 11 GB to 6 GB in a 1 TB index).

Concurrent Writes per Shard

A single Elasticsearch shard permits multi‑threaded concurrent writes. If a cluster has only one shard, write concurrency equal to the CPU core count can fully utilize the CPU.

Each shard holds an InternalEngine that wraps Lucene’s IndexWriter. The typical write flow is:

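import java.nio.file.Paths;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.TextField;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.NIOFSDirectory;

// initialization: open the index directory and create a writer with default settings
Directory index = new NIOFSDirectory(Paths.get("/index"));
IndexWriterConfig config = new IndexWriterConfig();
IndexWriter writer = new IndexWriter(index, config);
// create a document with one stored text field
Document doc = new Document();
doc.add(new TextField("url", "www.elasticsearchbook.cn", Field.Store.YES));
// index the document (buffered in memory until a flush)
writer.addDocument(doc);
// commit: flush buffered documents and fsync them to disk
writer.commit();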

Updates acquire a lock on the document’s _id, but new documents are written concurrently via the shared IndexWriter.
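The per-`_id` locking mentioned above can be sketched with plain JDK classes. This is a simplified illustration and not Elasticsearch's actual code: `IdLockSketch`, `update`, and `hammer` are hypothetical names, and the version counter stands in for whatever read-modify-write an update performs. Updates to the same `_id` are serialized, while updates to different ids do not wait on each other.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.locks.ReentrantLock;

/** Hypothetical sketch of per-_id locking; not Elasticsearch's real classes. */
final class IdLockSketch {
    // One lock per _id. This sketch never removes entries; a production
    // implementation would need to release locks that are no longer in use.
    private final ConcurrentHashMap<String, ReentrantLock> locks = new ConcurrentHashMap<>();
    private final ConcurrentHashMap<String, Integer> versions = new ConcurrentHashMap<>();

    /** Read-modify-write of one document's version, serialized per _id. */
    int update(String id) {
        ReentrantLock lock = locks.computeIfAbsent(id, k -> new ReentrantLock());
        lock.lock();
        try {
            int next = versions.getOrDefault(id, 0) + 1;
            versions.put(id, next);
            return next;
        } finally {
            lock.unlock();
        }
    }

    int version(String id) {
        return versions.getOrDefault(id, 0);
    }

    /** Runs several writer threads that all update the same _id. */
    static int hammer(String id, int threads, int updatesPerThread) {
        IdLockSketch engine = new IdLockSketch();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < updatesPerThread; i++) {
                    engine.update(id);
                }
            });
            workers[t].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return engine.version(id);
    }
}
```

Because the lock is keyed by `_id`, no update is lost even when several threads target the same document, and writers on other ids are unaffected.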

How IndexWriter Supports Concurrency

When multiple threads write through the same IndexWriter, each thread that is indexing at that moment works on its own DocumentsWriterPerThread (DWPT). Each DWPT has its own buffer, which is eventually flushed as an independent segment. High write concurrency therefore produces many small segments.
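A toy model of this effect, using only the JDK: every writer thread gets a private buffer, and a flush turns each non-empty buffer into one segment. `PerThreadBufferSketch` and `indexAndFlush` are hypothetical names, and the model ignores buffer reuse and merging, so treat it as a sketch of the idea and not of Lucene's implementation.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical model: one private buffer per writer thread, one segment per buffer. */
final class PerThreadBufferSketch {
    /** Indexes totalDocs with the given number of threads; returns the doc count of each segment. */
    static List<Integer> indexAndFlush(int threads, int totalDocs) {
        List<List<String>> buffers = new ArrayList<>();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            List<String> buffer = new ArrayList<>(); // touched by exactly one writer thread
            buffers.add(buffer);
            final int first = t;
            workers[t] = new Thread(() -> {
                // Each thread takes every threads-th document.
                for (int d = first; d < totalDocs; d += threads) {
                    buffer.add("doc-" + d);
                }
            });
            workers[t].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        // "Flush": every non-empty buffer becomes its own segment.
        List<Integer> segmentSizes = new ArrayList<>();
        for (List<String> buffer : buffers) {
            if (!buffer.isEmpty()) {
                segmentSizes.add(buffer.size());
            }
        }
        return segmentSizes;
    }
}
```

With the same 1,000 documents, `indexAndFlush(1, 1000)` yields a single segment of 1,000 documents, while `indexAndFlush(8, 1000)` yields eight segments of 125 documents each.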

Detailed Document Write Process

Allocate DWPT

Each writing thread checks out a DWPT from a LIFO free list. If the list is empty, a new DWPT is created; otherwise the most recently released DWPT is reused, which limits the number of active buffers.

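// freeList holds idle DWPT states; the most recently released one is last
if (freeList.isEmpty()) {
    // no idle DWPT available: create a new one
    return newThreadState();
} else {
    // LIFO: reuse the most recently released DWPT
    threadState = freeList.remove(freeList.size() - 1);
}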
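To make the LIFO behavior concrete, here is a self-contained sketch of such a free list. `LifoBufferPool`, `Buffer`, `checkout`, and `release` are hypothetical names chosen for illustration; Lucene's real pool has more state and different locking.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical LIFO pool of write buffers, loosely modeled on the fragment above. */
final class LifoBufferPool {
    static final class Buffer {
        final int id;
        final List<String> docs = new ArrayList<>();

        Buffer(int id) {
            this.id = id;
        }
    }

    private final List<Buffer> freeList = new ArrayList<>();
    private int created = 0;

    /** Returns the most recently released buffer, or a new one if none is idle. */
    synchronized Buffer checkout() {
        if (freeList.isEmpty()) {
            return new Buffer(created++);
        }
        return freeList.remove(freeList.size() - 1);
    }

    /** Puts a buffer back at the end of the list, making it the next one handed out. */
    synchronized void release(Buffer buffer) {
        freeList.add(buffer);
    }

    synchronized int createdCount() {
        return created;
    }
}
```

A single writer that checks out and releases in a loop keeps getting the same buffer, so only one buffer ever exists. New buffers appear only when several checkouts overlap, which is why the buffer count tracks the real write concurrency.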

Write Document to DWPT Buffer

The actual buffer write is omitted for brevity.

Check Flush Conditions

Lucene decides to flush based on three checks:

Document count reaches a threshold (Elasticsearch leaves this unlimited).

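Total buffer usage across all DWPTs exceeds a limit (in Elasticsearch the budget comes from the indexing buffer setting, indices.memory.index_buffer_size, which defaults to 10 % of heap memory and is shared by all shards on the node).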

A single DWPT’s buffer exceeds RAMPerThreadHardLimitMB (default 1945 MB, rarely reached).
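The three checks can be written down as one small decision function. This is a sketch under assumptions: `FlushCheckSketch`, `Decision`, and `DISABLED` are hypothetical names, the order of the checks is arbitrary here, and the thresholds are passed in by the caller so that no real default is hard-coded.

```java
/** Hypothetical restatement of the three flush checks; not Lucene's FlushPolicy API. */
final class FlushCheckSketch {
    enum Decision { NONE, FLUSH_BY_DOC_COUNT, FLUSH_BY_TOTAL_RAM, FLUSH_BY_PER_THREAD_LIMIT }

    /** Marker for "no document-count limit", as when Elasticsearch leaves it unlimited. */
    static final int DISABLED = -1;

    static Decision check(int bufferedDocs, int maxBufferedDocs,
                          long totalBufferedBytes, long totalRamLimitBytes,
                          long thisBufferBytes, long perThreadHardLimitBytes) {
        // 1. Document count threshold (skipped when disabled).
        if (maxBufferedDocs != DISABLED && bufferedDocs >= maxBufferedDocs) {
            return Decision.FLUSH_BY_DOC_COUNT;
        }
        // 2. Combined size of all per-thread buffers.
        if (totalBufferedBytes >= totalRamLimitBytes) {
            return Decision.FLUSH_BY_TOTAL_RAM;
        }
        // 3. Size of this one buffer against the per-thread hard limit.
        if (thisBufferBytes >= perThreadHardLimitBytes) {
            return Decision.FLUSH_BY_PER_THREAD_LIMIT;
        }
        return Decision.NONE;
    }

    /** 10 % of the heap, the default indexing budget described above. */
    static long defaultIndexingBudgetBytes(long heapBytes) {
        return heapBytes / 10;
    }
}
```

For a 30 GB heap, `defaultIndexingBudgetBytes` gives a budget of 3 GB. With a budget that large, the total-RAM check rarely fires during a short bulk run, so in practice the periodic refresh is what cuts most segments.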

Additional flush triggers in Elasticsearch include:

Periodic refresh

Manual refresh

flush (pre‑7.x triggers a refresh; post‑7.x does not)

syncedFlush

Execute Flush

When a DWPT is marked for flush, it is checked out of the DWPT pool without acquiring a lock, so ongoing writes continue on a new DWPT. The flush runs only on the marked DWPT and does not block bulk indexing.

During a periodic refresh, Elasticsearch calls Lucene’s flushAllThreads(), which marks all DWPTs as pending, moves them to flush queues, and then flushes them without blocking new writes.
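The core idea, detaching the full buffer so that writers immediately continue on a fresh one, can be sketched as follows. `SwapOnFlushSketch` and the `whileFlushing` hook are hypothetical and exist only to make the behavior observable. Unlike the lock-free checkout described above, this sketch takes a brief lock for the swap to keep the code short.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch: swap out the active buffer, flush it while writes continue. */
final class SwapOnFlushSketch {
    private List<String> active = new ArrayList<>();
    private final List<List<String>> segments = new ArrayList<>();

    synchronized void add(String doc) {
        active.add(doc);
    }

    /** Flushes the current buffer; whileFlushing runs during the slow part, outside any lock. */
    void flush(Runnable whileFlushing) {
        List<String> pending;
        synchronized (this) {
            if (active.isEmpty()) {
                return; // nothing buffered, no segment to write
            }
            pending = active;           // detach the marked buffer
            active = new ArrayList<>(); // new writes land in a fresh buffer
        }
        whileFlushing.run(); // stands in for the slow segment write
        synchronized (this) {
            segments.add(new ArrayList<>(pending));
        }
    }

    synchronized int segmentCount() {
        return segments.size();
    }

    synchronized int segmentSize(int index) {
        return segments.get(index).size();
    }

    synchronized int bufferedDocs() {
        return active.size();
    }
}
```

If a document is added while a flush is in progress, it lands in the fresh buffer and appears in the next segment, not the one being written.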

Key Takeaways

This analysis (based on Elasticsearch 7.1) shows that a Lucene flush writes data to the OS cache, like an Elasticsearch refresh, and does not perform an fsync; an Elasticsearch flush maps to a Lucene commit.

A single shard can handle concurrent writes efficiently, but excessive concurrency per shard creates many DWPTs, leading to a proliferation of small segments. When bulk indexing, balance the total concurrency across shards to avoid segment explosion.
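As a rough back-of-the-envelope aid, the worst case can be estimated by assuming that every refresh flushes one segment per concurrently active writer on the shard. `SegmentEstimate` is a hypothetical helper, and the result is only an upper bound: it ignores background merges and refreshes that find some buffers empty.

```java
/** Hypothetical upper-bound estimate of segments created before merging. */
final class SegmentEstimate {
    static long maxNewSegments(int writersPerShard, long windowSeconds, long refreshIntervalSeconds) {
        if (refreshIntervalSeconds <= 0) {
            throw new IllegalArgumentException("refreshIntervalSeconds must be positive");
        }
        long refreshes = windowSeconds / refreshIntervalSeconds;
        // At most one new segment per active writer buffer per refresh.
        return refreshes * writersPerShard;
    }
}
```

With 8 concurrent writers on one shard and a 1-second refresh interval, one minute of bulk indexing can create up to 480 small segments on that shard, compared with up to 60 for a single writer.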

The principles are similar in other systems such as HBase, where a memstore is flushed to disk while new writes continue in a fresh memstore.
