Why Elasticsearch Creates Too Many Segments and How Lucene Flush Works
The article explains how Elasticsearch’s use of Lucene’s flush mechanism, concurrent shard writes, and IndexWriter buffering lead to an excess of small segments, outlines the flush conditions, and offers guidance on managing write concurrency for better performance.
Understanding Lucene Flush in Elasticsearch
In Lucene, a flush writes buffered in‑memory data out as a new segment. In Elasticsearch terms this corresponds to a refresh operation, not to the Elasticsearch flush API.
Why Excess Segments Matter
During bulk indexing with esrally, more segments than expected were observed. Too many segments degrade search speed, slow index opening (due to random I/O while loading each segment), and increase memory usage.
Search must traverse all segments and merge results.
Opening a shard can become N‑times slower; force‑merging segments reduces open time.
Memory consumption drops after reducing segment count (e.g., from 11 GB to 6 GB in a 1 TB index).
Concurrent Writes per Shard
A single Elasticsearch shard permits multi‑threaded concurrent writes. If a cluster has only one shard, write concurrency equal to the CPU core count can fully utilize the CPU.
Each shard holds an InternalEngine that wraps Lucene’s IndexWriter. The typical write flow is:
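import java.nio.file.Paths;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.TextField;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.NIOFSDirectory;
// initialization: open the index directory and create a writer with the default config
Directory index = new NIOFSDirectory(Paths.get("/index"));
IndexWriterConfig config = new IndexWriterConfig();
IndexWriter writer = new IndexWriter(index, config);
// create a document
Document doc = new Document();
doc.add(new TextField("url", "www.elasticsearchbook.cn", Field.Store.YES));
// index the document, then commit so it is durable on disk
writer.addDocument(doc);
writer.commit();
Updates acquire a lock on the document’s _id, but new documents are written concurrently via the shared IndexWriter.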
How IndexWriter Supports Concurrency
When multiple threads call IndexWriter, it creates a DocumentsWriterPerThread (DWPT) for each thread. Each DWPT has its own buffer, which is eventually flushed as an independent segment file. High write concurrency therefore produces many small segments.
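The relationship between writer threads and segment count can be illustrated with a small toy model. This is only a sketch using the Java standard library, not Lucene code: the class `PerThreadBufferDemo` and its methods are hypothetical, and keying buffers by thread id is a simplification (Lucene hands out DWPTs from a pool, as described in the next section).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;

/** Toy model: one in-memory buffer per writer thread, one segment per flushed buffer. */
class PerThreadBufferDemo {
    // Hypothetical stand-in for the DWPT buffers, keyed by writer thread id.
    private final Map<Long, List<String>> buffers = new ConcurrentHashMap<>();

    /** Appends a document to the calling thread's private buffer. */
    void addDocument(String doc) {
        long id = Thread.currentThread().getId();
        buffers.computeIfAbsent(id, k -> new ArrayList<>()).add(doc);
    }

    /** Turns every non-empty buffer into its own "segment" and returns the segment sizes. */
    List<Integer> flushAll() {
        List<Integer> segmentSizes = new ArrayList<>();
        for (List<String> buffer : buffers.values()) {
            if (!buffer.isEmpty()) {
                segmentSizes.add(buffer.size());
            }
        }
        buffers.clear();
        return segmentSizes;
    }

    /** Runs `threads` concurrent writers, each adding `docsPerThread` documents, then flushes once. */
    static List<Integer> run(int threads, int docsPerThread) {
        PerThreadBufferDemo writer = new PerThreadBufferDemo();
        CountDownLatch allStarted = new CountDownLatch(threads);
        List<Thread> workers = new ArrayList<>();
        for (int t = 0; t < threads; t++) {
            Thread worker = new Thread(() -> {
                allStarted.countDown();
                try {
                    allStarted.await(); // ensure all writers are alive at the same time
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
                for (int d = 0; d < docsPerThread; d++) {
                    writer.addDocument("doc-" + d);
                }
            });
            workers.add(worker);
            worker.start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return writer.flushAll();
    }
}
```

Calling `PerThreadBufferDemo.run(4, 100)` yields four segments of 100 documents each, whereas a single writer thread adding the same 400 documents would yield one segment.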
Detailed Document Write Process
Allocate DWPT
For each concurrent thread, a DWPT is allocated and placed in a LIFO list. If the list is empty, a new DWPT is created; otherwise, the most recently used DWPT is reused, limiting the number of active buffers.
// Excerpt: reuse the most recently released DWPT if one is free, otherwise create a new one
if (freeList.isEmpty()) {
    return newThreadState();
} else {
    threadState = freeList.remove(freeList.size() - 1);
}
Write Document to DWPT Buffer
The actual buffer write is omitted for brevity.
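Taken together, the allocation and buffer-write steps can be sketched as a small LIFO pool. This is a simplified illustration, not Lucene's implementation: `DwptPoolSketch`, `Dwpt`, `acquire`, `release`, and `createdCount` are hypothetical names, and the buffer is just a list of strings.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy LIFO pool of per-thread writers; names are illustrative, not Lucene's API. */
class DwptPoolSketch {
    /** Hypothetical stand-in for a DWPT: an id plus an in-memory document buffer. */
    static final class Dwpt {
        final int id;
        final List<String> buffer = new ArrayList<>();

        Dwpt(int id) {
            this.id = id;
        }
    }

    private final List<Dwpt> freeList = new ArrayList<>();
    private int created = 0;

    /** Checks out the most recently released DWPT, or creates one if none is free. */
    synchronized Dwpt acquire() {
        if (freeList.isEmpty()) {
            return new Dwpt(created++);
        }
        return freeList.remove(freeList.size() - 1);
    }

    /** Returns a DWPT to the top of the LIFO list. */
    synchronized void release(Dwpt dwpt) {
        freeList.add(dwpt);
    }

    /** Buffers one document: acquire a DWPT, append to its buffer, release it. */
    int addDocument(String doc) {
        Dwpt dwpt = acquire();
        try {
            dwpt.buffer.add(doc);
            return dwpt.id;
        } finally {
            release(dwpt);
        }
    }

    /** Number of DWPTs created so far, i.e. the peak number held at the same time. */
    synchronized int createdCount() {
        return created;
    }
}
```

With sequential writes, the same DWPT is reused for every document, so only one buffer ever exists. A second DWPT is created only when one is already checked out, which is why the number of buffers tracks peak write concurrency rather than the total number of requests.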
Check Flush Conditions
Lucene decides to flush based on three checks:
Document count reaches a threshold (Elasticsearch leaves this unlimited).
Total buffer usage across all DWPTs exceeds a limit (default 10 % of heap memory, derived from indexing buffer).
A single DWPT’s buffer exceeds RAMPerThreadHardLimitMB (default 1945 MB, rarely reached).
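The three checks above can be sketched as a single decision function. This is a minimal illustration under stated assumptions, not Lucene's flush policy: `FlushChecksSketch`, `Reason`, and `defaultIndexingBufferBytes` are hypothetical names, the order of the checks is chosen for readability, and the choice of which DWPT to flush when the total limit is hit is not modeled.

```java
/** Toy version of the three flush checks; thresholds and names are illustrative. */
class FlushChecksSketch {
    /** Marks the document-count check as switched off (the "unlimited" case). */
    static final int DISABLED = -1;

    enum Reason { NONE, DOC_COUNT, TOTAL_BUFFER, PER_THREAD_HARD_LIMIT }

    final int maxBufferedDocs;
    final long totalBufferLimitBytes;
    final long perThreadHardLimitBytes;

    FlushChecksSketch(int maxBufferedDocs, long totalBufferLimitBytes, long perThreadHardLimitBytes) {
        this.maxBufferedDocs = maxBufferedDocs;
        this.totalBufferLimitBytes = totalBufferLimitBytes;
        this.perThreadHardLimitBytes = perThreadHardLimitBytes;
    }

    /** Indexing buffer size under the 10 % of heap default described in the text. */
    static long defaultIndexingBufferBytes(long heapBytes) {
        return heapBytes / 10;
    }

    /** Returns the first flush condition that is met, or NONE. */
    Reason check(int docsInThisDwpt, long bytesInThisDwpt, long bytesAcrossAllDwpts) {
        if (maxBufferedDocs != DISABLED && docsInThisDwpt >= maxBufferedDocs) {
            return Reason.DOC_COUNT;
        }
        if (bytesAcrossAllDwpts > totalBufferLimitBytes) {
            return Reason.TOTAL_BUFFER;
        }
        if (bytesInThisDwpt > perThreadHardLimitBytes) {
            return Reason.PER_THREAD_HARD_LIMIT;
        }
        return Reason.NONE;
    }
}
```

For example, with a 10 GiB heap the indexing buffer works out to 1 GiB. Because that total is smaller than the 1945 MB per-DWPT hard limit, the total-buffer check fires first in this model, which is consistent with the hard limit being rarely reached.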
Additional flush triggers in Elasticsearch include:
Periodic refresh
Manual refresh
flush (pre‑7.x triggers a refresh; post‑7.x does not)
syncedFlush
Execute Flush
When a DWPT is marked for flush, it is checked out of the DWPT pool without acquiring a lock, so ongoing writes continue on a new DWPT. The flush runs only on the marked DWPT and does not block bulk indexing.
During a periodic refresh, Elasticsearch calls Lucene’s flushAllThreads(), which marks all DWPTs as pending, moves them to flush queues, and then flushes them without blocking new writes.
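The mark-then-flush pattern can be sketched as a buffer swap. This is a toy model rather than Lucene's code: `FlushAllSketch`, `markAllForFlush`, and `flushPending` are hypothetical names, writers are identified by a plain slot index, and a "segment" is represented only by its document count.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

/** Toy model of "mark all buffers pending, then flush them while writes continue". */
class FlushAllSketch {
    private List<List<String>> active = new ArrayList<>();           // buffers accepting writes
    private final Queue<List<String>> flushQueue = new ArrayDeque<>(); // buffers waiting to be flushed

    /** Adds a document to the buffer at slot `writer`, creating buffers as needed. */
    synchronized void addDocument(int writer, String doc) {
        while (active.size() <= writer) {
            active.add(new ArrayList<>());
        }
        active.get(writer).add(doc);
    }

    /** Moves every non-empty active buffer to the flush queue; later writes get fresh buffers. */
    synchronized int markAllForFlush() {
        int marked = 0;
        for (List<String> buffer : active) {
            if (!buffer.isEmpty()) {
                flushQueue.add(buffer);
                marked++;
            }
        }
        active = new ArrayList<>();
        return marked;
    }

    /** Drains the queue, turning each pending buffer into one "segment" (its document count). */
    List<Integer> flushPending() {
        List<Integer> segments = new ArrayList<>();
        while (true) {
            List<String> buffer;
            synchronized (this) {
                buffer = flushQueue.poll();
            }
            if (buffer == null) {
                return segments;
            }
            // The slow write to disk would happen here, outside the lock.
            segments.add(buffer.size());
        }
    }
}
```

Only the short swap in `markAllForFlush` holds the lock. Documents added after the swap land in new buffers and become part of the next round of segments, so indexing is not stalled while the pending buffers are written out.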
Key Takeaways
This analysis (based on Elasticsearch 7.1) shows that Lucene’s flush writes data to the OS cache (similar to an Elasticsearch refresh) but does not perform an fsync. Elasticsearch’s flush, by contrast, maps to Lucene’s commit.
A single shard can handle concurrent writes efficiently, but excessive concurrency per shard creates many DWPTs, leading to a proliferation of small segments. When bulk indexing, balance the total concurrency across shards to avoid segment explosion.
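A rough back-of-the-envelope helper makes the trade-off concrete. It rests on assumptions that are not from the original analysis: writers are spread evenly across shards, every writer's buffer is non-empty at each refresh, and merging is ignored. The class and method names are hypothetical.

```java
/** Rough upper-bound arithmetic for new segments per shard; all names are illustrative. */
class SegmentEstimateSketch {
    /** Concurrent writers landing on one shard, assuming an even spread (rounded up). */
    static int writersPerShard(int totalWriters, int shards) {
        return (totalWriters + shards - 1) / shards;
    }

    /** Upper bound on segments one shard creates over `refreshes` refresh cycles, before merging. */
    static long maxNewSegmentsPerShard(int totalWriters, int shards, long refreshes) {
        return (long) writersPerShard(totalWriters, shards) * refreshes;
    }
}
```

Under these assumptions, 32 bulk writers against a single shard can create up to 32 segments per refresh, while the same 32 writers spread over 8 shards create at most 4 per shard per refresh.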
The principles are similar in other systems such as HBase, where a memstore is flushed to disk while new writes continue in a fresh memstore.
