Master Python Requests: Web Scraping Basics with GET, POST, and File Saving

This tutorial walks you through installing the Python requests library, using GET, POST, and PUT methods, handling query parameters, setting custom headers to bypass anti‑scraping measures, and saving both HTML content and images to local files, complete with runnable code examples.

Raymond Ops
Raymond Ops
Raymond Ops
Master Python Requests: Web Scraping Basics with GET, POST, and File Saving

Key Topics Covered

How web interaction works Using requests.get and requests.post Response object attributes and methods Opening and saving files in Python

Installing the requests Library

For Windows, run:

pip install -i https://pypi.tuna.tsinghua.edu.cn/simple requests

For Linux, prepend sudo if necessary:

sudo pip install -i https://pypi.tuna.tsinghua.edu.cn/simple requests

1. Crawl a Baidu Page

# First crawler example – fetch Baidu homepage
import requests
response = requests.get("http://www.baidu.com")
response.encoding = response.apparent_encoding
print("Status code:" + str(response.status_code))
print(response.text)

2. GET Request Example

# GET request to httpbin.org
import requests
response = requests.get("http://httpbin.org/get")
print(response.status_code)
print(response.text)

3. POST Request Example

# POST request to httpbin.org
import requests
response = requests.post("http://httpbin.org/post")
print(response.status_code)
print(response.text)

4. PUT Request Example

# PUT request to httpbin.org
import requests
response = requests.put("http://httpbin.org/put")
print(response.status_code)
print(response.text)

5. GET with URL Parameters

# GET with query string in URL
import requests
response = requests.get("http://httpbin.org/get?name=hezhi&age=20")
print(response.status_code)
print(response.text)

6. GET with Params Dictionary

# GET using params dict
import requests
data = {"name": "hezhi", "age": 20}
response = requests.get("http://httpbin.org/get", params=data)
print(response.status_code)
print(response.text)

7. POST with Params Dictionary

# POST using params dict
import requests
data = {"name": "hezhi", "age": 20}
response = requests.post("http://httpbin.org/post", params=data)
print(response.status_code)
print(response.text)

8. Bypassing Anti‑Scraping (Example: Zhihu)

# First request without headers – likely blocked
import requests
response = requests.get("http://www.zhihu.com")
print("No headers status:" + str(response.status_code))
# Request with a realistic User‑Agent header
headers = {"User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/80.0.3987.122 Safari/537.36"}
response = requests.get("http://www.zhihu.com", headers=headers)
print("With headers status:" + str(response.status_code))
print(response.text)

9. Save Crawled HTML to a Local File

import requests
url = "http://www.baidu.com"
response = requests.get(url)
response.encoding = "utf-8"
print("Response type:" + str(type(response)))
print("Status code:" + str(response.status_code))
print("Headers:" + str(response.headers))
print("Content:" + response.text)
# Write to file
with open("D:\\crawler\\baidu.html", "w", encoding="utf-8") as file:
    file.write(response.text)

10. Download an Image and Save It

import requests
response = requests.get("https://www.baidu.com/img/baidu_jgylogo3.gif")
with open("D:\\crawler\\baidu_logo.gif", "wb") as file:
    file.write(response.content)
Original Source

Signed-in readers can open the original source through BestHub's protected redirect.

Sign in to view source
Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactadmin@besthub.devand we will review it promptly.

PythonHTTPTutorialrequestsfile-ioweb-scraping
Raymond Ops
Written by

Raymond Ops

Linux ops automation, cloud-native, Kubernetes, SRE, DevOps, Python, Golang and related tech discussions.

0 followers
Reader feedback

How this landed with the community

Sign in to like

Rate this article

Was this worth your time?

Sign in to rate
Discussion

0 Comments

Thoughtful readers leave field notes, pushback, and hard-won operational detail here.