Tagged articles
19 articles
Python Programming Learning Circle
Aug 19, 2022 · Backend Development

Essential Python Web Scraping Techniques: GET/POST Requests, Proxy IPs, Cookie Handling, Header Spoofing, Gzip Compression, and Multithreading

This article presents a comprehensive guide to Python web scraping, covering basic GET and POST requests with urllib2, using proxy IPs, managing cookies, disguising as a browser via custom headers, handling gzip-compressed responses, and accelerating crawls with a simple multithreaded worker pool.
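
As a rough illustration of two of the techniques named above (browser-like headers and gzip-compressed responses), here is a minimal sketch. It assumes Python 3, where `urllib.request` takes the place of the Python 2 `urllib2` module the article uses. The `BROWSER_UA` string and the helper names are hypothetical and chosen only for this example.

```python
import gzip
import urllib.request

# Hypothetical browser-like User-Agent string, used only for illustration.
BROWSER_UA = "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 Chrome/120.0 Safari/537.36"


def build_request(url):
    """Return a Request that presents browser-like headers and accepts gzip."""
    return urllib.request.Request(
        url,
        headers={"User-Agent": BROWSER_UA, "Accept-Encoding": "gzip"},
    )


def decode_body(raw, content_encoding):
    """Decompress the body only if the server reported gzip encoding."""
    if content_encoding and "gzip" in content_encoding.lower():
        return gzip.decompress(raw)
    return raw


def fetch(url, timeout=10):
    """Fetch a URL and return the response bytes, decompressed if needed."""
    with urllib.request.urlopen(build_request(url), timeout=timeout) as resp:
        return decode_body(resp.read(), resp.headers.get("Content-Encoding"))
```

Checking the `Content-Encoding` response header before decompressing matters because a server may ignore the `Accept-Encoding` request header and send the body uncompressed.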

Gzip · Proxy · cookies
8 min read
Python Programming Learning Circle
Apr 13, 2021 · Backend Development

Python Web Scraping Techniques: GET/POST Requests, Proxy IP, Cookies, Header Spoofing, Gzip Compression, and Multithreading

This article provides a comprehensive Python web‑scraping guide covering basic GET/POST requests with urllib2, proxy handling, cookie management, header manipulation to mimic browsers, gzip compression handling, page parsing with regular expressions and parsing libraries, simple captcha strategies, and a multithreaded thread‑pool example.
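
The regular-expression parsing mentioned above could look roughly like the following sketch. The pattern is a deliberately simplified assumption that only handles quoted `href` attributes in anchor tags; `LINK_RE` and `extract_links` are hypothetical names, not taken from the article.

```python
import re

# Simplified pattern: matches href="..." or href='...' inside <a> tags.
# It does not handle unquoted attribute values or malformed markup.
LINK_RE = re.compile(r'<a\s[^>]*?href=["\']([^"\']+)["\']', re.IGNORECASE)


def extract_links(html):
    """Return every href value found in anchor tags, in document order."""
    return LINK_RE.findall(html)
```

A regex like this is fine for quick extraction from pages with predictable markup. For irregular HTML, a parsing library is usually the more robust choice.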

Gzip · Header Spoofing · Proxy
8 min read
MaGe Linux Operations
Mar 1, 2021 · Backend Development

Bypass Ant Short‑Term Rental Site Anti‑Scraping with Python Cookies

This tutorial explains how to analyze the Ant Short‑Term Rental website's anti‑scraping mechanisms, extract the required Cookie and User‑Agent headers, and use Python's urllib2 and BeautifulSoup to reliably crawl rental listings, save the data to CSV, and optionally extend the scraper with Selenium.
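
A minimal sketch of the request and CSV steps described above, assuming Python 3's `urllib.request` in place of `urllib2`. The HTML parsing step is left out, so rows are assumed to be dictionaries that have already been extracted. The function names, field names, and sample values are hypothetical.

```python
import csv
import io
import urllib.request


def build_listing_request(url, cookie, user_agent):
    """Attach the Cookie and User-Agent values copied from a browser session."""
    return urllib.request.Request(
        url, headers={"Cookie": cookie, "User-Agent": user_agent}
    )


def rows_to_csv(rows, fieldnames):
    """Serialize a list of dicts to CSV text with a header row."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()
```

For example, `rows_to_csv([{"title": "Loft", "price": "300"}], ["title", "price"])` yields a header line followed by one data line, which can then be written to a file.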

Data Extraction · beautifulsoup · cookies
12 min read
MaGe Linux Operations
Dec 31, 2018 · Backend Development

Master Python Web Scraping: 8 Essential urllib2 Techniques

This guide walks through eight practical Python urllib2 techniques for web crawling, covering basic GET/POST requests, proxy usage, cookie management, header spoofing, page parsing with regex and BeautifulSoup, captcha handling, gzip compression, and multithreaded fetching with a simple thread pool.
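
The simple thread pool mentioned above might be sketched as follows, assuming Python 3's `queue` and `threading` modules. `run_pool` is a hypothetical helper, and `fetch` stands for any callable that takes a URL, so the sketch does not depend on a particular HTTP library.

```python
import queue
import threading


def run_pool(urls, fetch, workers=4):
    """Fetch every URL using a fixed number of worker threads.

    Returns a dict mapping each URL to fetch(url), or to the exception raised.
    """
    tasks = queue.Queue()
    for url in urls:
        tasks.put(url)
    results = {}
    lock = threading.Lock()

    def worker():
        while True:
            try:
                url = tasks.get_nowait()
            except queue.Empty:
                return  # no work left, so the thread exits
            try:
                value = fetch(url)
            except Exception as exc:  # one bad URL should not kill the worker
                value = exc
            with lock:
                results[url] = value

    threads = [threading.Thread(target=worker) for _ in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

Threads suit this workload because the workers spend most of their time waiting on the network rather than computing.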

Gzip · Proxy · Python
8 min read
MaGe Linux Operations
Jul 1, 2014 · Backend Development

Master Python Web Scraping: Proxies, Login, Multithreading, and Captcha Hacks

This guide walks through practical Python web‑scraping techniques using urllib2, covering basic page fetching, proxy usage, cookie handling for logins, form submission, header spoofing, anti‑hotlink tricks, multithreaded crawling, and strategies for bypassing simple captchas, all illustrated with code snippets.
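
Proxy usage and form submission, two of the techniques listed above, can be sketched as below. This assumes Python 3's `urllib.request` and `urllib.parse` rather than `urllib2`. The helper names, proxy address, and form fields are hypothetical.

```python
import urllib.parse
import urllib.request


def build_proxy_opener(proxy_url):
    """Return an opener that routes HTTP and HTTPS traffic through one proxy."""
    handler = urllib.request.ProxyHandler({"http": proxy_url, "https": proxy_url})
    return urllib.request.build_opener(handler)


def build_form_request(url, fields):
    """Encode form fields as the request body; supplying data makes it a POST."""
    body = urllib.parse.urlencode(fields).encode("utf-8")
    return urllib.request.Request(url, data=body)
```

A call such as `build_proxy_opener("http://127.0.0.1:8080").open(build_form_request(url, fields))` would then submit the form through the proxy, assuming a proxy is actually listening at that address.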

Captcha · Proxy · Web Scraping
7 min read