Tagged articles
5 articles
Machine Heart
May 17, 2026 · Artificial Intelligence

Is Multimodal RAG the Cure for Enterprise Knowledge‑Base Bottlenecks? The ‘Where to Retrieve’ Challenge

The article analyzes how multimodal Retrieval‑Augmented Generation expands retrieval objects beyond text chunks, why the "where to retrieve" problem is as critical as "what to retrieve" in enterprise knowledge bases, and how Google Gemini's File Search and recent industry research illustrate the shift toward verifiable, multimodal evidence.

AI Retrieval · Document AI · Enterprise Knowledge Base
7 min read
HyperAI Super Neural
Sep 26, 2025 · Artificial Intelligence

Redefining Next‑Gen OCR: IBM’s Open‑Source Granite‑Docling‑258M for Unified Structure and Content Understanding

IBM’s newly released open‑source model Granite‑Docling‑258M tackles the long‑standing challenge of converting diverse digital documents into machine‑readable, structured data. It preserves layout, tables, and formulas and supports multiple languages, while remaining lightweight at 258 M parameters and outperforming its predecessor, SmolDocling‑256M‑Preview.

Docling · Document AI · IBM
5 min read
DataFunTalk
Jun 29, 2024 · Artificial Intelligence

Document Intelligence in the Financial Sector: Technologies, Challenges, and Future Directions

This presentation reviews the technical scope of document intelligence, its specific applications and challenges in finance, recent advances in document analysis, recognition, and understanding, and outlines future research directions for large‑model and multimodal solutions in processing complex financial documents.

Deep Learning · Document AI · large models
28 min read
AntTech
Nov 15, 2023 · Artificial Intelligence

Reading Order Matters: Information Extraction from Visually‑rich Documents by Token Path Prediction

The paper identifies disordered reading order as a critical obstacle to information extraction from visually‑rich documents, proposes a Token Path Prediction model with a grid‑label formulation, introduces the re‑annotated FUNSD‑r and CORD‑r datasets, and reports state‑of‑the‑art performance on NER, entity linking, and reading‑order prediction tasks.

Document AI · Layout Analysis · NER
17 min read
Laiye Technology Team
May 18, 2022 · Artificial Intelligence

Overview of Document Intelligence Models: StrucText, LayoutLMv3, and GraphDoc

This article reviews three representative document intelligence models—StrucText, LayoutLMv3, and GraphDoc—detailing their input features, feature fusion strategies, self‑supervised tasks, and underlying architectures, and explains how they learn embeddings for segments, words, or regions to enable classification and key‑value extraction.

Document AI · Layout Analysis · graph neural networks
15 min read