How to Trim Massive JSON Outputs for Real‑World AI Agents
The article explains why raw JSON from document‑parsing APIs overwhelms an AI agent's context window and presents a practical workflow that separates readable Markdown content from metadata, uses prompt engineering, and leverages sandboxed code to keep agents efficient and accurate.
Problem Overview
When building AI agents that need to read long documents, developers often feed the raw JSON response from a document‑parsing API directly into the language model. The JSON contains pixel‑level bounding boxes, OCR confidence scores, and other structural metadata, which quickly exhausts the model’s context window without providing useful reasoning material.
Illustrative Example: Construction Change‑Order Review Agent
A client required an agent to review a 100‑page construction change order, comparing it against contracts and pricing tables. The Reducto parser produced roughly 200,000 lines of JSON, including bounding‑box coordinates for every text block. Loading this JSON into the model prevented any meaningful analysis.
Root Cause
The Reducto parsing API returns a high‑fidelity representation of the source document—coordinates, confidence scores, block types—optimised for document viewers, not for direct consumption by a language model. Most tokens are metadata rather than readable text.
Solution: Separate Content from Metadata
Before the agent processes the data, transform the API response into two artifacts:
Extract the textual content (e.g., .result.chunks[0].content) and write it to a .md file.
Discard coordinates, confidence scores, and other block‑level details from the main context.
Store the discarded metadata in sandbox files (CSV or JSON) for on‑demand queries.
The resulting Markdown file is clean, readable, and suitable for prompting the model.
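A minimal sketch of this pre‑processing step, assuming a Reducto‑style response with `.result.chunks[*].content` and a hypothetical per‑chunk `blocks` list carrying `type`, `confidence`, and `bbox` fields; adjust the field names to the actual API shape:

```python
import csv
import json

def split_content_and_metadata(response_path, md_path, meta_path):
    """Split a Reducto-style parse response into two artifacts:
    readable Markdown for the prompt, and a metadata CSV kept in
    the sandbox for on-demand lookups. Field names here follow the
    .result.chunks[*].content shape mentioned in the text and are
    otherwise assumptions about the response schema."""
    with open(response_path) as f:
        result = json.load(f)["result"]

    md_lines = []
    meta_rows = []
    for i, chunk in enumerate(result["chunks"]):
        md_lines.append(chunk["content"])
        # Keep bounding boxes and confidence scores out of the
        # prompt, but queryable later.
        for block in chunk.get("blocks", []):
            meta_rows.append({
                "chunk": i,
                "type": block.get("type"),
                "confidence": block.get("confidence"),
                "bbox": json.dumps(block.get("bbox")),
            })

    with open(md_path, "w") as f:
        f.write("\n\n".join(md_lines))

    with open(meta_path, "w", newline="") as f:
        writer = csv.DictWriter(
            f, fieldnames=["chunk", "type", "confidence", "bbox"]
        )
        writer.writeheader()
        writer.writerows(meta_rows)
```

The agent's prompt then only ever sees the `.md` file; the CSV stays on disk in the sandbox until a block-level question actually comes up.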
Preserving Block‑Level References
When the agent must highlight a precise region in the original PDF, the metadata cannot be lost. Number each block in the Markdown and keep the full JSON in the sandbox. The agent can then run a short pandas snippet to retrieve the corresponding bounding box.
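That pandas lookup might look like the following sketch, assuming a hypothetical sandbox CSV with `block_id`, `page`, and `bbox` columns (bbox stored as a JSON array):

```python
import json

import pandas as pd

def bbox_for_block(meta_csv, block_id):
    """Look up the page and bounding box for a numbered Markdown
    block. The column names ('block_id', 'page', 'bbox') are an
    assumed schema for the metadata stored in the sandbox."""
    df = pd.read_csv(meta_csv)
    row = df.loc[df["block_id"] == block_id].iloc[0]
    return int(row["page"]), json.loads(row["bbox"])
```

The full 200,000‑line JSON never enters the context window; the agent only pulls the one coordinate set it needs, when it needs it.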
The numbered Markdown looks like:
Block 0 | Section 1: General Provisions
Block 1 | The Contractor shall provide all labor, materials, and equipment...
Block 2 | The unit prices specified in Exhibit B shall serve as the basis for all pricing...
Block 3 | 1.1 Scope of Work
Block 4 | The work covered by this change order includes modifications to...
Prompt Pattern Improvement
"Read every attachment document from start to finish."
This instruction forces the model to load the entire document, exhausting the context budget.
"Answer questions using attachment documents. Employ sub‑agents or tools to manage context. If a file is large, search relevant sections instead of loading the whole document."
The revised prompt encourages the agent to navigate, search, and retrieve only the needed fragments.
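One concrete tool that makes the revised prompt actionable is a search function the agent can call instead of reading the whole file. A minimal sketch, assuming the `Block N | text` line format shown above (function name and signature are illustrative):

```python
def search_blocks(md_path, query, max_hits=5):
    """Return only the numbered blocks whose text matches the query
    (case-insensitive), capped at max_hits, so a large file never
    has to be loaded into the agent's context in full."""
    hits = []
    with open(md_path) as f:
        for line in f:
            if query.lower() in line.lower():
                hits.append(line.rstrip("\n"))
                if len(hits) >= max_hits:
                    break
    return hits
```

Exposed as a tool, this lets the agent answer "what does the contract say about unit prices?" by retrieving two or three blocks rather than 100 pages.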
Full Pipeline for Document‑Intensive Agents
Document upload
Reducto parsing API (full‑fidelity JSON)
Pre‑processing (~20 lines of code)
Extract block content → numbered .md file
Store metadata → sandbox CSV/JSON files (optional chapter index)
Agent sandbox reads the .md file and reasons over it
When a block reference is needed, the agent runs a short code snippet (e.g., pandas) to pull the bounding box from the stored metadata
Application layer renders the highlighted region in the original PDF
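The optional chapter index from the pre‑processing step can be built with a few lines as well. A sketch, assuming the `Block N | text` convention used earlier and a simple heuristic that numbered headings (like "1.1 Scope of Work") mark chapter starts; both the format and the heuristic are assumptions to tune per document:

```python
import json

def build_chapter_index(md_path, index_path):
    """Map Markdown headings to the block numbers where they appear,
    so the agent can jump straight to a section. Expects one
    'Block N | text' entry per line; lines without that separator
    are skipped."""
    index = {}
    with open(md_path) as f:
        for line in f:
            prefix, _, text = line.partition(" | ")
            text = text.strip()
            # Treat short numbered lines like "1.1 Scope of Work"
            # as chapter headings; tune this heuristic per document.
            if text and text[0].isdigit() and len(text) < 80:
                block_id = int(prefix.split()[1])
                index[text] = block_id
    with open(index_path, "w") as f:
        json.dump(index, f, indent=2)
    return index
```

The agent can then read this small index first and decide which blocks to fetch, instead of scanning the whole Markdown file.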
Alternative approaches, such as vector‑search chunking, hierarchical summarization, or fine‑tuned retrieval, are also viable. For use cases that require precise cross‑document citations, however, separating content from metadata and feeding clean Markdown to the agent yields the best results.
Core Principles
Increasing the context window size alone does not solve the problem: a million‑token window that is 90% coordinate metadata performs worse than a 200k‑token window containing clean Markdown plus powerful tooling. Removing irrelevant tokens frees capacity for the reasoning that matters.
Reducto’s API is designed for maximal fidelity, which is ideal for developers building on top of it. However, agents need a trimmed view. Bridging that gap consistently improves accuracy and performance.
Image illustrating the pipeline (not reproduced here).
Original source: https://x.com/raunakdoesdev/status/2029610657008783407
Published on the High Availability Architecture official account.