Tagged articles
6 articles
James' Growth Diary
May 13, 2026 · Artificial Intelligence

Multimodal RAG: A Complete Guide to Ingesting Images, Tables, and PDFs

This article examines the blind spot of text‑only RAG for visual content and compares three multimodal ingestion strategies: CLIP embeddings, image‑to‑text captioning with a MultiVectorRetriever, and ColPali visual retrieval. It also covers table‑specific handling, presents end‑to‑end TypeScript implementations, and lists common pitfalls to avoid when deploying production‑grade multimodal RAG pipelines.

CLIP · ColPali · Image Captioning
22 min read
Wu Shixiong's Large Model Academy
Mar 22, 2026 · Artificial Intelligence

How to Overcome MinerU’s Top 9 Limitations for Reliable Document Parsing

This article examines MinerU’s strengths and nine critical shortcomings—such as reading order errors, split tables, merged cells, OCR misrecognition, formula handling, heading hierarchy loss, output inconsistency, hardware limits, and licensing issues—and provides concrete improvement strategies and interview‑ready talking points for engineers.

Document Parsing · Interview Tips · MinerU
12 min read
Continuous Delivery 2.0
Sep 11, 2025 · Artificial Intelligence

Building Scalable Enterprise RAG: Lessons, Pitfalls, and Proven Solutions

This article shares practical lessons from building a large‑scale enterprise RAG system, covering imperfect data, document quality scoring, hierarchical chunking, metadata design, semantic‑search failures, open‑source model choices, and table handling to achieve reliable AI‑driven search.

Enterprise AI · Open-source models · RAG
13 min read
Full-Stack Cultivation Path
Jul 15, 2024 · Fundamentals

Open-Source PDF Table Extraction with Camelot: Quick‑Start Guide

This article explains why extracting tables from PDFs is a common bottleneck, introduces the open‑source Camelot library, walks through installing Ghostscript and Camelot, shows a minimal Python script that converts PDF tables to CSV, explains how to resolve a typical runtime error, and demonstrates the companion Excalibur web UI for interactive extraction.

Camelot · Excalibur · PDF extraction
5 min read
Open Source Linux
Jan 10, 2022 · Fundamentals

Extract PDF Tables in 3 Lines with Camelot: A Python Guide

Camelot is a Python library that extracts tables from PDF files into pandas DataFrames in just a few lines of code, which makes it useful for researchers and developers who need to convert PDF‑embedded tables into usable data.

CLI · Camelot · PDF extraction
4 min read