
Elasticsearch Bulk Writes Succeed but _source Is Empty: 6‑Hour Debugging Story & Pitfall Guide

When the Elasticsearch Bulk API reports successful writes and Count shows the documents, but Search and Get return an empty _source, the cause can be a mapping that disables _source. This write-up retraces the six-hour investigation that ended at that root cause, then gives a step-by-step debugging checklist and the fix.

Mingyi World Elasticsearch

Phenomenon

Bulk API returns (10, []): ten documents indexed without errors. Count API reports count: 10. Search API returns hits whose _source fields are empty objects ({}), and Get API returns _source: null or no _source at all.

Impact

Down‑stream validation that reads the written data fails.

Users repeatedly retry, assuming network, permission or API problems.

Investigation steps taken

1. Bulk format changes

Switched action from { "index": {}, "doc": {...} } to { "_index": "xxx", "_source": {...} } and added _type: "_doc" (7.x).

Adjusted chunk_size, refresh, request_timeout.

Result: Bulk still reported success, Count remained correct, Search still returned empty _source.
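For reference, the second action shape from this step can be sketched as a small generator. This is an illustration only: build_actions, the sample documents, and the use of an "id" field as the document ID are hypothetical, and handing the result to the Python client's bulk helper is assumed rather than shown.

```python
from typing import Any, Dict, Iterable, Iterator


def build_actions(index_name: str, docs: Iterable[Dict[str, Any]]) -> Iterator[Dict[str, Any]]:
    """Yield one bulk-helper style action per document (hypothetical helper)."""
    for doc in docs:
        # The document body travels under "_source"; "_index" names the target.
        action: Dict[str, Any] = {"_index": index_name, "_source": doc}
        # Assumption about the data: reuse the document's own "id" as _id.
        # Leaving _id out lets Elasticsearch generate one.
        if "id" in doc:
            action["_id"] = doc["id"]
        yield action
```

Even a well-formed action list like this produces the same symptom when the mapping disables _source, which is consistent with the format changes having no effect.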

2. Refresh timing

Used refresh='wait_for' / refresh=True, called indices.refresh(), and added time.sleep(2) before querying.

Result: No change; Count was visible, confirming that refresh was not the issue.

3. ES version / API compatibility

Tried both index(..., document=doc) and index(..., body=doc), branched logic for 7.x vs 8.x/9.x.

Result: Still no _source content.

4. Per‑document Index calls

Iterated over data and called client.index() for each document, adding retry and logging.

Result: Same symptom – success responses, correct Count, empty _source.

5. Expanded validation

Added Get calls for each document and extensive logging.

Result: Get also returned empty _source, confirming that the document existed but contained no stored source.
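The distinction this step relies on, "document missing" versus "document found but without _source", can be made explicit. The sketch below assumes the Get response has already been parsed into a plain dict; classify_get_response and its return labels are hypothetical names.

```python
from typing import Any, Dict


def classify_get_response(resp: Dict[str, Any]) -> str:
    """Classify a Get API response body that was parsed into a dict."""
    if not resp.get("found", False):
        return "missing"           # the document does not exist
    if resp.get("_source") is None:
        return "found_no_source"   # exists, but no original JSON came back
    return "found_with_source"
```

A result of "found_no_source" for every document points away from the write path and toward index metadata.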

Root cause

The index mapping disables _source:

{
  "mappings": {
    "_source": { "enabled": false },
    "properties": { ... }
  }
}

Consequences:

Documents are still analyzed and written to the inverted index, so they remain searchable and Count works.

Bulk/Index calls report success because the indexing pipeline succeeds.

The original JSON is not stored; therefore Search and Get cannot return _source.

Why _source.enabled=false appears

LLM‑generated mapping templates sometimes include the setting for “search‑only” use cases.

Copied log or monitoring templates often disable _source to save storage.

Older default templates may have the flag turned off.
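If a template is the suspect, the response of GET /_index_template can be scanned for the flag. The sketch below assumes the composable-template response shape (an "index_templates" list whose entries hold "name" and "index_template.template.mappings"); templates_disabling_source is a hypothetical helper, and it does not look into component templates or legacy templates.

```python
from typing import Any, Dict, List


def templates_disabling_source(response: Dict[str, Any]) -> List[str]:
    """Return names of index templates whose own mappings disable _source."""
    names: List[str] = []
    for entry in response.get("index_templates", []):
        index_template = entry.get("index_template") or {}
        template = index_template.get("template") or {}
        mappings = template.get("mappings") or {}
        source_cfg = mappings.get("_source") or {}
        # Only an explicit false counts; a missing key means the default (enabled).
        if source_cfg.get("enabled") is False:
            names.append(entry.get("name", "<unnamed>"))
    return names
```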

Correct fix

When creating the index, explicitly enable _source:

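def create_index(self, index_name, mapping=None):
    """Create an index whose mapping always keeps _source enabled."""
    if mapping:
        # Shallow copy so the caller's dict is not modified.
        mappings_body = dict(mapping.get('mappings', {}))
        # Core: force enable _source
        mappings_body['_source'] = {'enabled': True}
        # Only the 'mappings' section is forwarded; other keys in `mapping` are ignored.
        self.client.indices.create(index=index_name, body={'mappings': mappings_body})
    else:
        # No mapping supplied: _source is enabled by default unless an index template says otherwise.
        self.client.indices.create(index=index_name)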

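The single line mappings_body['_source'] = {'enabled': True} overrides any enabled: false in the supplied mapping. The setting takes effect only at index creation: an index that already has _source disabled must be recreated and its documents written again from the original data, because the old index no longer holds the original JSON.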
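A variant of the same idea, offered as a sketch rather than as the author's code: a pure helper that returns a new index-creation body, so it can be unit-tested without a cluster. The name with_source_enabled is hypothetical. Unlike the function above, it keeps any other top-level keys such as settings, and it never mutates its input.

```python
import copy
from typing import Any, Dict, Optional


def with_source_enabled(index_body: Optional[Dict[str, Any]]) -> Dict[str, Any]:
    """Return a copy of an index-creation body with _source forced on."""
    # Deep copy so nested dicts in the caller's body stay untouched.
    body: Dict[str, Any] = copy.deepcopy(index_body) if index_body else {}
    mappings = body.setdefault("mappings", {})
    # Replaces the whole _source object, so any includes/excludes
    # configured there are dropped as well.
    mappings["_source"] = {"enabled": True}
    return body
```

The returned dict would then be passed to the index-creation call in place of the raw mapping.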

Debugging checklist for “bulk succeeds, count >0, _source empty”

Check index mapping: GET /your_index/_mapping. Verify that _source.enabled is not false.

Confirm index‑creation logic does not set _source.enabled: false (including generated or copied templates).

If mapping is correct, then revisit write‑path details such as bulk format, refresh options, or API version compatibility.
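The checklist order can be sketched as a single function over data that has already been fetched. It assumes the GET /your_index/_mapping response is a dict keyed by the concrete index name with a "mappings" object underneath; run_checklist, source_disabled, and the returned messages are hypothetical.

```python
from typing import Any, Dict


def source_disabled(mappings: Dict[str, Any]) -> bool:
    """True only when the mappings explicitly set _source.enabled to false."""
    return (mappings.get("_source") or {}).get("enabled", True) is False


def run_checklist(mapping_response: Dict[str, Any], index_name: str,
                  creation_body: Dict[str, Any]) -> str:
    """Walk the three checks in order and report the first one that fails."""
    # 1. The live mapping, as returned by GET /<index>/_mapping.
    live = (mapping_response.get(index_name) or {}).get("mappings") or {}
    if source_disabled(live):
        return "live mapping disables _source"
    # 2. The body the application sends when it creates the index.
    if source_disabled(creation_body.get("mappings") or {}):
        return "index-creation body disables _source"
    # 3. Only now is the write path worth revisiting.
    return "mapping is fine: revisit bulk format, refresh options, API version"
```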

Practical advice for AI‑assisted debugging

Use hypothesis elimination: a normal Count rules out refresh or shard‑allocation problems.

When Get also lacks _source, rule out query-syntax issues: Get fetches a document by ID and involves no query.

Prioritize inspecting index metadata (the _source setting) before adding more code branches.

Leverage official diagnostic APIs:

GET /index/_mapping
GET /index/_settings
GET /index/_doc/{id}

(inspect full response)
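From Python, the same three requests can be gathered in one call. This sketch assumes client is an Elasticsearch instance from the official Python client, created elsewhere; collect_diagnostics is a hypothetical wrapper, and the function only forwards to the client's get_mapping, get_settings, and get methods.

```python
from typing import Any, Dict


def collect_diagnostics(client: Any, index_name: str, doc_id: str) -> Dict[str, Any]:
    """Fetch mapping, settings, and one document so they can be read side by side."""
    return {
        # GET /<index>/_mapping
        "mapping": client.indices.get_mapping(index=index_name),
        # GET /<index>/_settings
        "settings": client.indices.get_settings(index=index_name),
        # GET /<index>/_doc/<id>
        "doc": client.get(index=index_name, id=doc_id),
    }
```

Printing the whole result, rather than only the fields the application normally reads, is what exposes an unexpected _source entry in the mapping.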

After fixing the mapping, remove the extra compatibility and validation layers to keep the codebase concise.

Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contact admin@besthub.dev and we will review it promptly.
