vivo Internet Technology
Sep 10, 2025 · Artificial Intelligence
How Structured Input Boosts Multimodal LLMs in Document QA Without Retraining
This article presents a training‑free, architecture‑agnostic method that leverages LaTeX‑style structured inputs to preserve document hierarchy and spatial relationships, thereby improving multimodal large language model performance on document question answering tasks across multiple benchmarks.
AI · DocQA · Document Understanding
