How to Trim Massive JSON Outputs for Real‑World AI Agents

The article explains why raw JSON from document‑parsing APIs overwhelms an AI agent's context window and presents a practical workflow that separates readable Markdown content from metadata, uses prompt engineering, and leverages sandboxed code to keep agents efficient and accurate.


Problem Overview

When building AI agents that need to read long documents, developers often feed the raw JSON response from a document‑parsing API directly into the language model. The JSON contains pixel‑level bounding boxes, OCR confidence scores, and other structural metadata, which quickly exhausts the model’s context window without providing useful reasoning material.

Illustrative Example: Construction Change‑Order Review Agent

A client required an agent to review a 100‑page construction change order, comparing it against contracts and pricing tables. The Reducto parser produced roughly 200,000 lines of JSON, including bounding‑box coordinates for every text block. Loading this JSON into the model prevented any meaningful analysis.

Root Cause

The Reducto parsing API returns a high‑fidelity representation of the source document—coordinates, confidence scores, block types—optimised for document viewers, not for direct consumption by a language model. Most tokens are metadata rather than readable text.

Solution: Separate Content from Metadata

Before the agent processes the data, transform the API response into two artifacts:

1. Extract the textual content (e.g., .result.chunks[0].content) and write it to a .md file.

2. Discard coordinates, confidence scores, and other block‑level details from the main context.

3. Store the discarded metadata in sandbox files (CSV or JSON) for on‑demand queries.

The resulting Markdown file is clean, readable, and suitable for prompting the model.
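A minimal sketch of that separation step follows. The response shape mirrors the .result.chunks[*].content path mentioned above, but block‑level fields such as "blocks", "bbox", "confidence", and "type" are illustrative assumptions, not the exact Reducto schema.

```python
# Split a parser response into readable Markdown and a metadata table.
# Field names below are illustrative, not the exact Reducto schema.
def split_content_and_metadata(response: dict) -> tuple[str, list[dict]]:
    markdown_parts, metadata_rows = [], []
    block_id = 0
    for chunk in response["result"]["chunks"]:
        markdown_parts.append(chunk["content"])    # readable text -> .md file
        for block in chunk.get("blocks", []):      # structural metadata -> sandbox
            metadata_rows.append({
                "block_id": block_id,
                "bbox": block.get("bbox"),         # pixel coordinates
                "confidence": block.get("confidence"),
                "type": block.get("type"),
            })
            block_id += 1
    return "\n\n".join(markdown_parts), metadata_rows

# Tiny hypothetical response for demonstration.
response = {"result": {"chunks": [{
    "content": "Section 1: General Provisions",
    "blocks": [{"bbox": [72, 90, 540, 110], "confidence": 0.98, "type": "heading"}],
}]}}
markdown, metadata = split_content_and_metadata(response)
```

Only `markdown` goes into the model's context; `metadata` is written to the sandbox for later queries.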

Preserving Block‑Level References

When the agent must highlight a precise region in the original PDF, the metadata cannot be lost. Number each block in the Markdown and keep the full JSON in the sandbox. The agent can then run a short pandas snippet to retrieve the corresponding bounding box.

Block 0 | SECTION 1: GENERAL PROVISIONS
Block 1 | The Contractor shall furnish all labor, materials, and equipment...
Block 2 | The unit prices set forth in Exhibit B shall govern all pricing...
Block 3 | 1.1 Scope of Work
Block 4 | The work covered by this change order includes modifications to...
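When the agent needs one bounding box, a lookup along these lines suffices; the file name and column names here are illustrative assumptions, not a fixed schema.

```python
import pandas as pd

# Block metadata sits in a sandbox CSV; the agent retrieves a single
# bounding box by block number instead of loading the full JSON.
pd.DataFrame([
    {"block_id": 0, "page": 1, "bbox": "[72, 90, 540, 110]"},
    {"block_id": 1, "page": 1, "bbox": "[72, 120, 540, 180]"},
]).to_csv("blocks.csv", index=False)

df = pd.read_csv("blocks.csv")
row = df.loc[df["block_id"] == 1].iloc[0]  # look up Block 1
print(f"page {row['page']}, bbox {row['bbox']}")
```

The agent runs this inside its sandbox, so only a few dozen tokens of result, not 200,000 lines of JSON, ever enter the context window.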

Prompt Pattern Improvement

Original prompt: "Read every attachment document from start to finish."

This instruction forces the model to load the entire document, exhausting the context budget.

Revised prompt: "Answer questions using attachment documents. Employ sub‑agents or tools to manage context. If a file is large, search relevant sections instead of loading the whole document."

The revised prompt encourages the agent to navigate, search, and retrieve only the needed fragments.

Full Pipeline for Document‑Intensive Agents

1. Document upload

2. Reducto parsing API (full‑fidelity JSON)

3. Pre‑processing (~20 lines of code):
   - Extract block content → numbered .md file
   - Store metadata → sandbox CSV/JSON files (optional chapter index)

4. Agent sandbox reads the .md file and reasons over it

5. When a block reference is needed, the agent runs a short code snippet (e.g., pandas) to pull the bounding box from the stored metadata

6. Application layer renders the highlighted region in the original PDF
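The pre‑processing step really is about twenty lines. A sketch, with sample blocks and field names that are illustrative assumptions:

```python
import csv

# Number each block into a Markdown file and park the metadata in a
# sandbox CSV. Sample blocks and field names are illustrative.
blocks = [
    {"content": "SECTION 1: GENERAL PROVISIONS", "page": 1, "bbox": [72, 90, 540, 110]},
    {"content": "1.1 Scope of Work", "page": 1, "bbox": [72, 130, 540, 150]},
]

with open("document.md", "w") as md:
    for i, block in enumerate(blocks):
        md.write(f"Block {i} | {block['content']}\n")  # numbered, readable text

with open("blocks.csv", "w", newline="") as f:
    writer = csv.DictWriter(f, fieldnames=["block_id", "page", "bbox"])
    writer.writeheader()
    for i, block in enumerate(blocks):                 # metadata, queried on demand
        writer.writerow({"block_id": i, "page": block["page"], "bbox": block["bbox"]})
```

The shared block numbers are what link the two artifacts: the model cites "Block 1" in prose, and the application resolves it to a page and bounding box via the CSV.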

Alternative approaches such as vector‑search chunking, hierarchical summarization, or fine‑tuned retrieval are also viable, but for use‑cases requiring precise cross‑document citations, separating content from metadata and feeding clean Markdown to the agent yields the best results.

Core Principles

Increasing the context window size alone does not solve the problem; a million‑token window filled with 90% coordinate metadata performs worse than a 200k‑token window containing clean Markdown plus powerful tooling. Removing irrelevant tokens frees capacity for the reasoning that matters.

Reducto’s API is designed for maximal fidelity, which is ideal for developers building on top of it. However, agents need a trimmed view. Bridging that gap consistently improves accuracy and performance.

[Diagram of the pipeline]

Original source: https://x.com/raunakdoesdev/status/2029610657008783407

Written by High Availability Architecture.
