Tag

File Parsing


Architecture Digest
Apr 25, 2025 · Information Security

Integrating Apache Tika with Spring Boot for Sensitive Information Detection and Data Leak Prevention

This guide shows how to integrate Apache Tika into a Spring Boot application that automatically extracts file content, uses regular expressions to detect sensitive data such as ID numbers, credit card numbers, and phone numbers, and implements data leak protection through a REST API. Code examples are included.

Apache Tika · Data Leak Prevention · File Parsing
0 likes · 22 min read
Python Programming Learning Circle
Nov 11, 2021 · Fundamentals

Python Techniques for Crawling TXT, CSV, PDF, and Word Documents

This article introduces Python 3 methods for retrieving TXT, CSV, PDF, and Word documents from the web using urllib, regular expressions, and file-specific processing steps, with practical code examples and workflow guidance for building web crawlers.

Data Extraction · File Parsing · Python
0 likes · 3 min read