Tagged articles
5 articles
Data STUDIO
Oct 10, 2025 · Fundamentals

Mastering PDF Manipulation in Python with PyPDF2

This article introduces the PDF format, surveys popular Python PDF libraries, and provides a step‑by‑step guide to installing PyPDF2, extracting metadata and text, rotating, merging, splitting, encrypting, and watermarking PDF files using concrete code examples and explanations.

PDF encryption · PDF extraction · PDF manipulation
0 likes · 13 min read
Full-Stack Cultivation Path
Jul 15, 2024 · Fundamentals

Open-Source PDF Table Extraction with Camelot: Quick‑Start Guide

This article explains why extracting tables from PDFs is a common bottleneck, introduces the open‑source Camelot library, walks through installing Ghostscript and Camelot, shows a minimal Python script to convert PDFs to CSV, resolves a typical runtime error, and demonstrates the companion Excalibur web UI for interactive extraction.

Camelot · Excalibur · PDF extraction
0 likes · 5 min read
Python Crawling & Data Mining
Oct 16, 2023 · Fundamentals

How to Automate PDF Invoice Cleaning and Splitting with Python

This article walks through a Python automation solution for cleaning and restructuring invoice data extracted from PDFs. It details how to remove unwanted brackets, split columns, and handle encoding issues, and it provides sample code and screenshots to guide readers through the process.

PDF extraction · automation · invoice-processing
0 likes · 4 min read
Open Source Linux
Jan 10, 2022 · Fundamentals

Extract PDF Tables in 3 Lines with Camelot: A Python Guide

Camelot is a Python library that pulls tables from PDF files into Pandas DataFrames with just a few lines of code, giving researchers and developers a quick way to convert PDF‑embedded tables into usable data.

CLI · Camelot · PDF extraction
0 likes · 4 min read