Baidu Geek Talk
Jul 26, 2021 · Artificial Intelligence
Document Rendering and Structured Extraction Techniques in Baidu Wenku
Baidu Wenku converts every document type to PDF and parses the PDF into a proprietary format. On PC it renders pages with an absolute-position layout; for mobile devices it transforms that layout into flow-type structured data by re-typesetting the layout, extracting OOXML structures, and detecting charts. The result is adaptive layouts, accurate formula rendering, and interactive chart extraction.
Mobile Optimization · OOXML parsing · PDF conversion
12 min read
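
As a rough illustration of the step from absolute-position layout to flow-type data, the sketch below groups positioned text blocks into lines and emits them in reading order. It is only a minimal sketch under stated assumptions: the `Block` fields, the `to_flow` function, and the `line_tolerance` threshold are hypothetical names chosen for this example, not Baidu Wenku's proprietary format or algorithm.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Block:
    """A text fragment with an absolute position (hypothetical structure)."""
    text: str
    x: float       # left edge
    y: float       # top edge, growing downward
    height: float  # glyph box height


def to_flow(blocks: List[Block], line_tolerance: float = 0.5) -> List[str]:
    """Group absolutely positioned blocks into lines in reading order.

    Blocks whose top edges differ by at most line_tolerance * height from
    the first block of the current line are treated as the same line.
    """
    lines: List[List[Block]] = []
    for block in sorted(blocks, key=lambda b: (b.y, b.x)):
        if lines:
            anchor = lines[-1][0]
            if abs(block.y - anchor.y) <= line_tolerance * anchor.height:
                lines[-1].append(block)
                continue
        lines.append([block])
    # Within each line, order fragments left to right and join them.
    return [
        " ".join(b.text for b in sorted(line, key=lambda b: b.x))
        for line in lines
    ]


if __name__ == "__main__":
    page = [
        Block("world", 60, 9.5, 12),
        Block("Hello", 10, 10, 12),
        Block("Next", 10, 30, 12),
    ]
    for line in to_flow(page):
        print(line)
```

Once the text is in this line-ordered form, a mobile client can reflow it to any screen width, which a fixed absolute layout does not allow. A real pipeline would also need to handle columns, tables, formulas, and charts, which this sketch ignores.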