Python Web Scraping of China Weather Forecast (7‑Day) Using Requests, lxml, and CSV Export
This tutorial shows how to crawl the China Meteorological Administration website to obtain today's weather and a six‑day forecast, handle Chinese encoding, extract data with XPath, and save the results into a CSV file using Python's requests and lxml libraries.
The target is the China Meteorological Administration website (http://www.weather.com.cn/). The script retrieves today's weather together with the forecast for the next six days, seven days in total.
Because the required data resides in static HTML, the page can be parsed directly and the relevant elements located via XPath expressions.
Code implementation begins by importing the necessary libraries:
import requests
from icecream import ic
from lxml import etree
Since the page contains Chinese characters, the script first prints the response encoding and then forces the response to use the detected encoding to avoid garbled text:
# resp is the Response object returned by requests for the forecast page
print(resp.encoding)  # ISO-8859-1 (example output)
resp.encoding = resp.apparent_encoding
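To see why the override matters, here is a minimal sketch that uses only the standard library and no network access. It assumes the page bytes are UTF-8 (an assumption for illustration; the article only shows that requests reported ISO-8859-1) and shows what happens when those bytes are decoded with the wrong codec, which is effectively what `resp.text` does when `resp.encoding` is wrong. The sample string and variable names are hypothetical.

```python
# Illustration only: sample text and names are not from the original script.
raw = "西安天气".encode("utf-8")       # bytes as a UTF-8 page would send them

garbled = raw.decode("iso-8859-1")    # wrong codec: one character per byte
fixed = raw.decode("utf-8")           # right codec: original text restored

print(repr(garbled))  # unreadable Latin-1 characters
print(fixed)          # the original Chinese text
```

Setting `resp.encoding = resp.apparent_encoding` asks requests to guess the codec from the body instead of trusting the HTTP header, so `resp.text` is decoded the way `fixed` is here.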
After fixing the encoding, the HTML content is parsed with etree.HTML, and an XPath query extracts all li elements that hold daily weather data (one li per day, so seven in total):
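# content is the decoded page text (presumably resp.text)
html = etree.HTML(content)
# Despite the name, uls holds the <li> elements, one per forecast day
uls = html.xpath("//div[@class='left-div'][1]/div[@id='7d']/ul/li")
print(len(uls))  # 7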
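The same query can be tried offline against a small inline document. The sketch below is only an approximation: `SAMPLE_HTML` is a hypothetical, simplified stand-in for the forecast markup (three days instead of seven), and the live page is likely to carry more attributes and nesting.

```python
from lxml import etree

# Hypothetical, simplified stand-in for the forecast page structure.
SAMPLE_HTML = """
<html><body>
  <div class="left-div">
    <div id="7d">
      <ul>
        <li><h1>7日(今天)</h1></li>
        <li><h1>8日(明天)</h1></li>
        <li><h1>9日(后天)</h1></li>
      </ul>
    </div>
  </div>
</body></html>
"""

html = etree.HTML(SAMPLE_HTML)

# Same XPath as in the article: every <li> under the div with id "7d".
items = html.xpath("//div[@class='left-div'][1]/div[@id='7d']/ul/li")
print(len(items))  # one element per <li> in the sample

# A relative XPath (leading "./") searches inside a single <li> only.
first_day = items[0].xpath("./h1/text()")
print(first_day)
```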
For each li element, the script extracts the date, weather description, low and high temperatures, and wind information using specific XPath expressions, producing output similar to:
ic| date: ['7日(今天)', '8日(明天)', '9日(后天)', '10日(周五)', '11日(周六)', '12日(周日)', '13日(周一)']
    weather: ['阴', '阴转多云', '多云转阴', '小雨', '小雨转阴', '晴', '晴']
    high_temp: ['12', '7', '11', '10', '8', '8', '10']
    low_temp: ['2℃', '0℃', '1℃', '2℃', '-1℃', '0℃', '0℃']
    wind: ['3-4级转<3级', '<3级', '<3级', '<3级', '3-4级转<3级', '<3级', '<3级']
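The article does not show the XPath expressions for the individual fields, so the following is a sketch under stated assumptions rather than the author's code. The tag and class names in `SAMPLE_HTML` (`h1`, `wea`, `tem`, `win`) are illustrative guesses at the per-day markup and should be checked against the real page source before use. The list names match those in the output above.

```python
from lxml import etree

# Assumed markup for each day's <li>; tag and class names are guesses.
SAMPLE_HTML = """
<ul>
  <li>
    <h1>7日(今天)</h1>
    <p class="wea">阴</p>
    <p class="tem"><span>12</span>/<i>2℃</i></p>
    <p class="win"><i>3-4级转&lt;3级</i></p>
  </li>
  <li>
    <h1>8日(明天)</h1>
    <p class="wea">阴转多云</p>
    <p class="tem"><span>7</span>/<i>0℃</i></p>
    <p class="win"><i>&lt;3级</i></p>
  </li>
</ul>
"""

root = etree.HTML(SAMPLE_HTML)

# One XPath per field; each call returns a list with one entry per <li>.
date = root.xpath("//li/h1/text()")
weather = root.xpath("//li/p[@class='wea']/text()")
high_temp = root.xpath("//li/p[@class='tem']/span/text()")
low_temp = root.xpath("//li/p[@class='tem']/i/text()")
wind = root.xpath("//li/p[@class='win']/i/text()")

print(date, weather, high_temp, low_temp, wind)
```

Collecting each field into its own list keeps the code short, but the lists stay aligned only if every day has every field. If a field can be missing for some day, looping over the li elements and running relative queries on each one is the more defensive choice.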
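The extracted data is then written to a local file named "西安天气.csv" with UTF‑8 encoding. Despite the .csv extension, the fields are labeled and separated by tabs rather than commas, and because the file is opened in append mode ('a+'), each run adds seven more lines to the existing content: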
with open('西安天气.csv', 'a+', encoding='utf-8') as file:
    for i in range(7):
        file.write(date[i] + ':\t')
        file.write(weather[i] + '\t')
        file.write('最高气温:' + high_temp[i] + '\t')
        file.write('最低气温:' + low_temp[i] + '\t')
        file.write('风力:' + wind[i] + '\t')
        file.write('\n')
Finally, the resulting CSV content is displayed, showing the date, weather condition, high/low temperatures, and wind level for each of the seven days.
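If a comma-separated file that spreadsheet tools and the csv module can read back is preferred, one possible alternative is sketched below using Python's standard csv module. This is not the original author's approach: the function name `write_forecast`, the English header names, and the choice of overwrite mode ("w") instead of append are all assumptions made for illustration.

```python
import csv


def write_forecast(path, date, weather, high_temp, low_temp, wind):
    """Write a header row, then one comma-separated row per day."""
    # newline="" lets the csv module control line endings itself.
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(["date", "weather", "high_temp", "low_temp", "wind"])
        # zip stops at the shortest list, so uneven input cannot raise
        # an IndexError the way a fixed range(7) loop could.
        writer.writerows(zip(date, weather, high_temp, low_temp, wind))
```

A call such as `write_forecast('西安天气.csv', date, weather, high_temp, low_temp, wind)` would replace the manual loop, with the csv module handling any quoting the values need.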
*Disclaimer: This article is compiled from online sources; copyright belongs to the original author. If any rights are infringed, please contact us for removal or authorization.*