Tagged articles

proxies

3 articles

Oct 19, 2024 · Backend Development

Common Techniques for Python Web Crawling: GET/POST, Proxies, Cookies, Headers, Captcha, Gzip, and Multithreading

This article outlines essential Python web‑crawling techniques—including basic GET/POST requests, proxy usage, cookie management, header spoofing, captcha handling, gzip compression, and multithreaded fetching—to help developers build efficient and robust crawlers.

Python · cookies · crawling

4 min read

Python Programming Learning Circle

Dec 17, 2020 · Backend Development

Request Header Spoofing and Anti‑Anti‑Scraping Techniques for Web Crawlers

This article explains how to disguise a web crawler's identity by customizing request headers, managing request frequency with sleep and proxy settings, and tackling common anti‑scraping mechanisms such as captchas, dynamic loading, and encrypted content using tools like Selenium.

anti-scraping · proxies · request headers

6 min read

Python Programming Learning Circle

Oct 19, 2019 · Backend Development

How to Bypass Anti‑Scraping Measures: User‑Agent, Cookies & Proxies

This guide explains practical techniques such as faking User‑Agent headers, rotating cookies, adding random delays, and using proxy pools to prevent IP bans while crawling large amounts of data from websites with anti‑scraping defenses.

User-Agent · Web Scraping · anti-scraping

4 min read
