Export 1 Billion Elasticsearch Docs in 3 Hours Using PIT + Slice
This guide explains how to reliably export over a billion Elasticsearch documents within a few hours by using Point‑In‑Time (PIT) snapshots combined with parallel Slice processing, covering diagnostics, performance modeling, consistency levels, failure recovery, and resource isolation.
Background and Problem
Exporting billions of records directly from a production Elasticsearch cluster often fails:
- from/size beyond 10,000 hits the index.max_result_window limit, causing errors or deep-paging OOM.
- The Scroll API is slow, pins search contexts open, and puts heavy load on the cluster.
- Single-threaded SearchAfter works in theory but can take 50+ hours, which is unacceptable.
How can we export massive data efficiently, consistently, and without impacting online services?
Interview‑Style Answer Structure (Three Layers)
1️⃣ Quick Diagnosis
- Total data volume? Document size?
- One‑time or periodic export?
- Consistency requirement? Strong or eventual?
- Available CPU / memory / network?
- Is it allowed to affect live ES query performance?
2️⃣ Solution Evolution
1. Break the limit: avoid from/size.
2. Evolve: Scroll → SearchAfter.
3. Final: PIT + Slice parallelism.
3️⃣ Solution Comparison
| Approach | Data volume | Duration | Drawback | Rating |
| --- | --- | --- | --- | --- |
| Scroll | < 10 M | 5–10 h | high search-context overhead | ★★ |
| SearchAfter (single thread) | < 50 M | 50 h+ | no parallelism | ★★ |
| PIT + Slice parallel | > 1 B | 2–3 h | requires throttling | ★★★★★ |
Core Solution: PIT + Slice Parallel Export
1️⃣ Create PIT
```java
// HLRC: the keep-alive is passed in the CreatePitRequest constructor
CreatePitRequest request = new CreatePitRequest(TimeValue.timeValueMinutes(30), "my_index");
CreatePitResponse response = client.createPit(request, RequestOptions.DEFAULT);
String pitId = response.getId();
```
This pins a point-in-time view, so the export reads a consistent snapshot of the index regardless of ongoing writes.
2️⃣ Slice Parallel Strategy
```java
int primaryShards = 10;
// 1–2 slices per primary shard, capped so the cluster is not overwhelmed
int slices = Math.min(primaryShards * 2, 32);
```
Guideline: use 1–2 slices per primary shard, with a total upper limit of roughly 20–32 slices.
3️⃣ Search Configuration
```java
SearchSourceBuilder sourceBuilder = new SearchSourceBuilder()
    .size(5000)
    .sort("_shard_doc") // required tiebreaker for PIT + SearchAfter
    .fetchSource(new String[]{"id", "create_time", "amount"}, null)
    .timeout(TimeValue.timeValueMinutes(5));
```
The sort field must be `_shard_doc` for PIT + SearchAfter; `_doc` is used only with Scroll.
4️⃣ Slice + SearchAfter Loop
```java
// Slice and PIT are configured on the source builder, not on the SearchRequest;
// each worker thread owns its own sourceBuilder for its sliceId.
sourceBuilder.slice(new SliceBuilder(sliceId, maxSlices))
             .pointInTimeBuilder(new PointInTimeBuilder(pitId)
                 .setKeepAlive(TimeValue.timeValueMinutes(5)));

while (true) {
    // No index on the request: the PIT id already identifies it
    SearchRequest request = new SearchRequest().source(sourceBuilder);
    SearchResponse response = client.search(request, RequestOptions.DEFAULT);
    SearchHit[] hits = response.getHits().getHits();
    if (hits.length == 0) break; // this slice is exhausted
    writeToFile(response.getHits());
    // Continue the next page after the last hit's sort values
    sourceBuilder.searchAfter(hits[hits.length - 1].getSortValues());
}
```
Index Design Impact
Mapping design largely determines export speed. Field-type guidance:
- keyword / date fields: optimal.
- text fields: slow to fetch and bulky on the wire.
- nested fields: a performance disaster.
Consistency Level Design
- Strong consistency (no data loss): use PIT.
- Eventual consistency (seconds of lag acceptable): use plain SearchAfter.
- Relaxed consistency (reporting): partition the export by time range; no strict snapshot needed.
Performance and Capacity Model
Assumptions: 1 KB per document → 1 TB raw for 1 B docs; gzip reduces to ~300 GB.
Network: 1 Gbps ≈ 125 MB/s → theoretical 40 min for 300 GB; real‑world 2‑3 h.
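The back-of-envelope model above can be written down as a small calculation, using the article's assumptions (1 KB per document, gzip to roughly 30%, a 1 Gbps link at 125 MB/s):

```java
public class CapacityModel {
    // Raw size in GB for a given doc count and per-doc size in KB
    static double rawGb(long docs, int docKb) {
        return docs * (double) docKb / (1024 * 1024);
    }

    // Compressed size given a compression ratio (0.3 ≈ gzip on typical JSON)
    static double compressedGb(double rawGb, double ratio) {
        return rawGb * ratio;
    }

    // Theoretical transfer time in minutes at linkMBps megabytes per second
    static double transferMinutes(double gb, double linkMBps) {
        return gb * 1024 / linkMBps / 60; // GB → MB, divide by MB/s, s → min
    }

    public static void main(String[] args) {
        double raw = rawGb(1_000_000_000L, 1);   // ≈ 954 GB, i.e. ~1 TB raw
        double gz  = compressedGb(raw, 0.3);     // ≈ 286 GB, i.e. ~300 GB gzipped
        double min = transferMinutes(gz, 125);   // ≈ 39 min theoretical on 1 Gbps
        System.out.printf("%.0f GB raw, %.0f GB gzipped, %.0f min%n", raw, gz, min);
    }
}
```

The gap between the ~40 min theoretical floor and the real-world 2–3 h is spent on query latency, serialization, throttling, and retries.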
Progress Tracking & Resume (Redis)
```json
{
  "status": "running|completed|failed",
  "exported": 12000000,
  "searchAfter": [...]
}
```
On failure, resume directly from the last searchAfter value without a full restart.
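A sketch of the checkpoint-and-resume logic, with an in-memory map standing in for Redis; the field names mirror the JSON above, while `CheckpointStore`, `save`, and `resumePoint` are illustrative names, not a real client API:

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

public class CheckpointStore {
    // In a real deployment these maps would be Redis hashes keyed by taskId:sliceId
    private final Map<String, Object[]> lastSortValues = new HashMap<>();
    private final Map<String, Long> exported = new HashMap<>();

    // Persist progress after every exported page
    public void save(String sliceKey, long exportedSoFar, Object[] searchAfter) {
        exported.put(sliceKey, exportedSoFar);
        lastSortValues.put(sliceKey, Arrays.copyOf(searchAfter, searchAfter.length));
    }

    // On restart: null means start the slice from the beginning, otherwise resume
    public Object[] resumePoint(String sliceKey) {
        return lastSortValues.get(sliceKey);
    }

    public long exportedCount(String sliceKey) {
        return exported.getOrDefault(sliceKey, 0L);
    }
}
```

Each slice saves a checkpoint after every page; on restart the worker feeds `resumePoint(...)` back into `sourceBuilder.searchAfter(...)` and continues where it stopped.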
Retry with Exponential Backoff
```java
long retryDelayMs = (1L << retryCount) * 1000L; // 2^retryCount seconds, in ms
```
Delays: retry 1 → 2 s, 2 → 4 s, 3 → 8 s, 4 → 16 s, etc.
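Expanded into a small helper with a delay cap and full jitter; the cap and jitter are additions beyond the formula above, commonly used to avoid unbounded waits and synchronized retry storms across slices:

```java
import java.util.concurrent.ThreadLocalRandom;

public class Backoff {
    static final long MAX_DELAY_MS = 60_000; // never wait more than a minute

    // Deterministic part: 2^retryCount seconds, capped at MAX_DELAY_MS
    static long baseDelayMs(int retryCount) {
        return Math.min((1L << retryCount) * 1000L, MAX_DELAY_MS);
    }

    // Full jitter: sleep a uniform amount in [0, baseDelay] so that
    // parallel slice workers do not all retry at the same instant
    static long jitteredDelayMs(int retryCount) {
        return ThreadLocalRandom.current().nextLong(baseDelayMs(retryCount) + 1);
    }
}
```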
Resource Isolation & Business Protection
```yaml
network:
  throttle_mbps: 100
es:
  search_priority: low
storage:
  local_ssd: true
```
Data export may be slower, but online services must remain stable.
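The throttle_mbps setting can be enforced client-side with a simple pacing calculation: given how many bytes a writer has sent and how long it has been running, how long must it pause to stay under the cap? A sketch under that assumption; production code would typically use a token-bucket rate limiter instead:

```java
public class Throttle {
    private final double maxBytesPerSec;

    public Throttle(int mbps) {
        this.maxBytesPerSec = mbps * 1_000_000.0 / 8; // megabits/s → bytes/s
    }

    // Milliseconds the writer should pause so that bytesSent / elapsed <= cap
    public long requiredPauseMs(long bytesSent, long elapsedMs) {
        double minElapsedMs = bytesSent / maxBytesPerSec * 1000;
        return Math.max(0L, (long) Math.ceil(minElapsedMs - elapsedMs));
    }
}
```

At 100 Mbps (12.5 MB/s), a worker that pushed 12.5 MB in half a second must sleep another half second before its next page.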
Data‑Security Boundaries
Mis‑export prevention – dual‑approval workflow.
Sensitive fields – whitelist only.
File leakage – AES encryption.
Host compromise – isolate in a dedicated subnet.
Platform‑Level Design (Enterprise‑Scale)
Modules: Task Management, Permission Approval, PIT Management, Slice Scheduling, Checkpoint‑Resume, Audit Logging.
Why PIT Beats Scroll
- PIT keeps a lightweight reference to the index segments, not a full per-request search context.
- PIT avoids the context accumulation that plagues Scroll under heavy use.
- PIT naturally supports Slice for parallelism.
- PIT expires automatically after its keep-alive, so no manual cleanup is needed.
One‑Line Summary
This is a complete transformation from a simple export script to an enterprise‑grade Elasticsearch offline data extraction platform.
Ray's Galactic Tech
