Tagged articles

Document processing

23 articles · Page 1 of 1

AI Large-Model Wave and Transformation Guide

Jun 15, 2026 · Artificial Intelligence

Top 5 Must-Install VSCode Claude Code Skills for 2026

The article explains why Claude Code can misbehave, introduces the Skill system as a set of coding conventions and domain knowledge, recommends five essential Skills with exact install commands, provides a pitfall‑avoidance table, compares Copilot and Claude Code paths, and suggests a minimal effective Skill combo.

AI coding · Claude Code · Document processing

0 likes · 8 min read

SuanNi

Apr 30, 2026 · Artificial Intelligence

Deploy a 24/7 Document Recognition Toolbox with the PaddleOCR Image on the Cloud

This guide explains how to use Baidu's open‑source PaddleOCR engine—its full OCR and layout analysis pipeline, multi‑language support, and output formats—to set up a continuously running document recognition service on the 算网 GPU cloud platform, including environment preparation, model configuration, and inference execution.

Document processing · GPU · MagicMind

0 likes · 6 min read

Java Tech Enthusiast

Mar 7, 2026 · Artificial Intelligence

Explore Cutting‑Edge Open‑Source AI Skills for Video, Docs, and Social Media Automation

This article introduces several open‑source AI Skills—including Remotion, YouTube‑clipper, skill‑from‑masters, NotebookLM, Markdown‑to‑X publisher, and Anthropic's Agent Skills—detailing their purpose, core features, installation commands, and repository links for developers seeking automation solutions.

Claude · Document processing · Open-source

0 likes · 7 min read

Old Zhang's AI Learning

Jan 28, 2026 · Artificial Intelligence

RAG-Anything: A Universal RAG Framework for PDFs, Office Docs, and Images

RAG-Anything is an open-source, end-to-end multimodal RAG framework that ingests PDFs, Office files, images, and scientific papers, parses them with high fidelity using MinerU, builds a multimodal knowledge graph, and enables hybrid retrieval. The article also notes the framework's resource and dependency considerations.

AI · Document processing · Knowledge Base

0 likes · 7 min read

Old Meng AI Explorer

Jan 18, 2026 · Artificial Intelligence

How BabelDOC Preserves PDF Layout While Translating & OneAIFW Shields Your Data

Two open‑source projects—BabelDOC, a Python‑based PDF translator that retains original formatting using AI models, and OneAIFW, a Zig‑and‑Rust local AI firewall that anonymizes sensitive data before LLM queries—offer practical, privacy‑preserving solutions for researchers and developers.

AI privacy · Data Protection · Document processing

0 likes · 8 min read

php Courses

Jan 13, 2026 · Artificial Intelligence

Boosting Document Barcode Extraction with PHP and AI: A Step‑by‑Step Guide

This article explains how to combine PHP with AI services to reliably locate, decode, and batch‑process barcodes from scanned documents and PDFs, covering tool setup, code examples, performance tips, and security considerations.

AI · Barcode Extraction · Batch Automation

0 likes · 11 min read

DataFunSummit

Oct 30, 2025 · Artificial Intelligence

How Multimodal Large Models Are Revolutionizing Document Processing and OCR

This article explores how the explosion of unstructured data exposes the limits of traditional OCR and shows how emerging multimodal large language models provide end‑to‑end document understanding, reduce pipeline complexity, cut training costs, enable hybrid retrieval‑augmented generation, and drive real‑world industry deployments.

AI · Document processing · Large Language Model

0 likes · 28 min read

Old Meng AI Explorer

Oct 30, 2025 · Artificial Intelligence

How PaddleOCR Turns Handwritten Notes and PDFs into Editable Text in Seconds

This article explains how PaddleOCR, an open‑source OCR engine from Baidu, achieves high‑accuracy text extraction from handwritten notes, scanned PDFs, invoices, IDs and multilingual documents, offering offline cross‑platform support, free commercial use, and step‑by‑step guidance for rapid deployment.

Automation · Document processing · OCR

0 likes · 10 min read

DataFunSummit

Jul 23, 2025 · Artificial Intelligence

Multimodal RAG: Techniques, Challenges, and Scaling the Future of AI

This article presents a comprehensive overview of multimodal Retrieval‑Augmented Generation (RAG), detailing three implementation paths—semantic extraction, Transformer‑based, and Visual Language Model approaches—along with scaling strategies using tensor indexing, performance comparisons, and guidance on selecting the most suitable technical route.

AI Retrieval · Document processing · Multimodal RAG

0 likes · 12 min read

Sohu Tech Products

Jan 8, 2025 · Artificial Intelligence

Multimodal RAG: Implementation Paths and Development Prospects

The talk outlines Multimodal RAG implementation routes, comparing OCR‑based object recognition, transformer encoder‑decoder encoding, and Visual Language Model processing, explains the ColPali late‑interaction method for multi‑dimensional vector matching, addresses scaling tensors with binarization and reranking, and recommends a hybrid long‑term strategy where VLM excels on abstract imagery while traditional OCR remains valuable.

ColPali · Document processing · Multimodal RAG

0 likes · 10 min read

Alibaba Cloud Developer

Dec 11, 2024 · Artificial Intelligence

How to Extract Multimodal File Information with AI on Alibaba Cloud

This tutorial walks you through using Alibaba Cloud's Bailian AI service to deploy a web service that extracts text, image, audio, and video information from multimodal documents, covering resource setup, application deployment, and step-by-step extraction examples.

AI · Alibaba Cloud · Document processing

0 likes · 5 min read

Lobster Programming

Nov 1, 2024 · Backend Development

How to Parse PDFs and Extract Metadata with Apache Tika and Spring Boot

This guide explains Apache Tika's document parsing capabilities, shows how to download and run the Tika app, demonstrates extracting text and metadata from a PDF, and provides step‑by‑step instructions for integrating Tika into a Spring Boot project with full code examples.

Apache Tika · Document processing · Java

0 likes · 7 min read

Python Programming Learning Circle

May 13, 2024 · Fundamentals

Using python-docx to Create and Manipulate Word Documents in Python

This article introduces the python-docx library, explains how to install it, and provides a complete code example that demonstrates creating a Word document with headings, styled paragraphs, images, tables, and saving the file, helping Python developers automate document processing.

Document processing · Word · python-docx

0 likes · 5 min read

Python Programming Learning Circle

Feb 18, 2024 · Backend Development

Introduction, Installation, and Usage of PyMuPDF (Python Bindings for MuPDF)

This article provides a comprehensive overview of PyMuPDF, covering its purpose as Python bindings for the lightweight MuPDF viewer, detailed installation instructions, essential dependencies, naming conventions, and extensive usage examples for opening documents, accessing pages, extracting text and images, manipulating PDFs, and saving changes.

Document processing · Library · MuPDF

0 likes · 12 min read

Python Programming Learning Circle

Nov 30, 2023 · Fundamentals

Introduction and Usage Guide for PyMuPDF (Python Bindings for MuPDF)

This article provides a comprehensive overview of PyMuPDF, covering its relationship to MuPDF, core features, installation methods, import conventions, and detailed usage examples for opening documents, handling pages, extracting text and images, and performing PDF-specific operations such as merging, splitting, and saving.

Document processing · Library · MuPDF

0 likes · 12 min read

DataFunTalk

Nov 10, 2022 · Artificial Intelligence

A Comprehensive Overview of OCR Technology Development and Engineering Practices

This article reviews the 40‑year evolution of Optical Character Recognition, discusses its integration with Intelligent Document Processing, outlines recent research hotspots such as scene text recognition and domain‑specific symbol detection, and shares practical engineering experiences and future directions from Datagrand.

Document processing · Intelligent Document Processing · OCR

0 likes · 24 min read

Sohu Tech Products

Sep 28, 2022 · Fundamentals

PyMuPDF (Python bindings for MuPDF) – Introduction, Features, Installation and Usage Guide

This article provides a comprehensive overview of PyMuPDF, the Python binding for the lightweight MuPDF library, covering its purpose, supported document formats, key features such as rendering, text extraction and PDF manipulation, installation methods, and detailed code examples for common operations.

Document processing · MuPDF · PDF

0 likes · 13 min read

Laiye Technology Team

Jul 16, 2022 · Artificial Intelligence

Seal (Stamp) Recognition in Intelligent Document Processing: Challenges, Methods, and Experiments

This article explains how intelligent document processing uses deep‑learning‑based seal detection and OCR techniques—enhanced YOLOv5, multi‑label loss, combined NMS, and end‑to‑end models such as Mask‑TextSpotter, ABCNet, PGNet, and TrOCR—to overcome diverse stamp styles, background interference, and image quality issues, presenting experimental results that surpass commercial OCR vendors.

AI · Document processing · OCR

0 likes · 13 min read

Python Programming Learning Circle

May 9, 2022 · Fundamentals

Introduction and Usage Guide for PyMuPDF (Python Bindings for MuPDF)

This article provides a comprehensive overview of PyMuPDF, the Python binding for the lightweight MuPDF library, covering its installation, core features such as page rendering, text and image extraction, PDF manipulation, and detailed code examples for common document‑processing tasks.

Document processing · MuPDF · PDF

0 likes · 12 min read

Python Programming Learning Circle

Jan 7, 2022 · Fundamentals

Using python-docx: Document Structure and Basic Operations

This article introduces the python‑docx library, explains its document model—including Document, Paragraph, Run, and Table objects—and provides practical Python code examples for creating, modifying, and styling Word documents, inserting headings, page breaks, tables, and images.

Document processing · Word Automation · code example

0 likes · 6 min read

Programmer DD

Jul 10, 2020 · Fundamentals

How Search Engines Work: Inside Document and Query Processing

This article explains the core components of a search engine—document processing, query processing, and matching—detailing each step from indexing to ranking, and discusses the document features that influence relevance, providing a comprehensive overview of information retrieval fundamentals.

Document processing · Information Retrieval · Query Processing

0 likes · 20 min read

Architect

Jun 22, 2020 · Fundamentals

Fundamentals of Search Engine Architecture: Document Processing, Query Processing, Indexing, and Matching

This article explains the core components and processing steps of a search engine—document processor, query processor, indexing, and matching—detailing how documents are normalized, tokenized, filtered, weighted, and stored in an inverted index to support effective information retrieval.

Document processing · Information Retrieval · Query Processing

0 likes · 20 min read

Python Programming Learning Circle

Oct 25, 2019 · Backend Development

Automate Word with Python: Master win32com for Document Manipulation

This tutorial explains how to use Python's win32com library to control Microsoft Word, covering installation, creating and displaying documents, working with Selection, Range, Font, ParagraphFormat, PageSetup and Styles objects, and providing a complete example that formats a document to meet national standards.

COM · Document processing · Python automation

0 likes · 14 min read
