Alibaba Cloud Big Data AI Platform
Mar 20, 2024 · Artificial Intelligence
How M2Doc Boosts Document Layout Analysis with Plug‑in Multimodal Fusion
This article introduces M2Doc, a plug‑in multimodal fusion approach that equips visual‑only object detectors with textual and semantic awareness, detailing its early‑ and late‑fusion modules, experimental validation on DocLayNet, M6Doc and PubLayNet, and future research directions.
AI · M2Doc · document layout analysis
