Tag

XYCut


AntTech
Jun 15, 2022 · Artificial Intelligence

XYLayoutLM: Towards Layout-Aware Multimodal Networks for Visually-Rich Document Understanding

XYLayoutLM is a layout-aware multimodal network for visually-rich document understanding. It uses an augmented XY-Cut to generate robust reading orders and a Dilated Conditional Position Encoding to handle variable-length inputs, and it reports state-of-the-art performance on the XFUN and FUNSD datasets.
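To make the reading-order idea concrete, here is a minimal sketch of a plain recursive XY-Cut in Python. It is not the augmented variant from the paper. The `(x0, y0, x1, y1)` box format, the function names `split_by_gaps` and `xy_cut`, and the sample coordinates are all assumptions made for illustration.

```python
from typing import List, Optional, Tuple

Box = Tuple[int, int, int, int]  # assumed format: (x0, y0, x1, y1)


def split_by_gaps(boxes: List[Box], indices: List[int], axis: int) -> List[List[int]]:
    """Group box indices into runs separated by empty space along one axis.

    axis 0 projects onto x (finds column gaps); axis 1 projects onto y (row gaps).
    """
    order = sorted(indices, key=lambda i: boxes[i][axis])
    groups = [[order[0]]]
    reach = boxes[order[0]][axis + 2]  # farthest extent covered so far
    for i in order[1:]:
        if boxes[i][axis] > reach:     # empty gap found: start a new group
            groups.append([i])
        else:
            groups[-1].append(i)
        reach = max(reach, boxes[i][axis + 2])
    return groups


def xy_cut(boxes: List[Box], indices: Optional[List[int]] = None, axis: int = 1) -> List[int]:
    """Return box indices in reading order by recursively alternating cuts."""
    if indices is None:
        indices = list(range(len(boxes)))
    if len(indices) <= 1:
        return list(indices)
    groups = split_by_gaps(boxes, indices, axis)
    if len(groups) == 1:
        # No gap on this axis: try the other one.
        axis = 1 - axis
        groups = split_by_gaps(boxes, indices, axis)
        if len(groups) == 1:
            # No cut possible at all: fall back to top-to-bottom, left-to-right.
            return sorted(indices, key=lambda i: (boxes[i][1], boxes[i][0]))
    result: List[int] = []
    for group in groups:
        result.extend(xy_cut(boxes, group, 1 - axis))
    return result


# Hypothetical page: a full-width header above two columns, listed out of order.
boxes = [
    (55, 30, 100, 50),  # 0: right column, second block
    (0, 20, 45, 34),    # 1: left column, first block
    (0, 0, 100, 10),    # 2: header
    (55, 20, 100, 28),  # 3: right column, first block
    (0, 36, 45, 50),    # 4: left column, second block
]
order = xy_cut(boxes)
print(order)  # [2, 1, 4, 3, 0]
```

The header is separated first by a horizontal cut, then the remaining blocks are split into two columns by a vertical cut, so the left column is read in full before the right one. A plain cut like this is sensitive to small shifts in box positions, which is the kind of brittleness a more robust reading-order method would need to address.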

Document Understanding · Multimodal · Vision Transformer
10 min read