How to Scrape COVID‑19 Data with Python and Visualize It on a Map
This tutorial shows how to use Python libraries such as requests, lxml, pandas and pyecharts to crawl the latest COVID‑19 statistics from Baidu, store them in an Excel file, and create interactive province‑level maps that illustrate confirmed, death and recovery numbers.
Introduction
The article starts by questioning a statement made by a Chinese academic about vaccine trials and proposes to verify it with data using Python.
Required Libraries
Data acquisition: requests, lxml, json, openpyxl (json is part of the Python standard library and needs no installation).
Data visualization: pandas, pyecharts.
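A quick way to see which of these packages are already importable is sketched below, using importlib from the standard library. The helper name missing_packages is made up for this example.

```python
from importlib.util import find_spec

# Import names of the libraries listed above
REQUIRED = ["requests", "lxml", "json", "openpyxl", "pandas", "pyecharts"]

def missing_packages(names):
    """Return the names that cannot be found on the current Python path."""
    return [name for name in names if find_spec(name) is None]

missing = missing_packages(REQUIRED)
print("missing:", missing if missing else "none")
```

Anything reported as missing can then be installed with pip.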
pip install requests lxml openpyxl pandas pyecharts
Data Acquisition
The target URL is https://voice.baidu.com/act/newpneumonia/newpneumonia. After sending a GET request with a custom User‑Agent, the HTML is parsed with lxml.etree and the embedded JSON is extracted via XPath.
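The extraction logic can be tried offline before touching the live page. The sketch below uses the standard library's html.parser as a stand-in for lxml so it runs without extra packages. The sample HTML is made up and only mimics the structure described above, and JsonScriptParser and extract_case_list are names introduced for this example.

```python
import json
from html.parser import HTMLParser

class JsonScriptParser(HTMLParser):
    """Collect the text of every <script type="application/json"> element."""

    def __init__(self):
        super().__init__()
        self._in_json_script = False
        self.blocks = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/json":
            self._in_json_script = True
            self.blocks.append("")

    def handle_endtag(self, tag):
        if tag == "script":
            self._in_json_script = False

    def handle_data(self, data):
        # Script content may arrive in several chunks, so append to the current block
        if self._in_json_script:
            self.blocks[-1] += data

def extract_case_list(html_text):
    """Return component[0]["caseList"] from the first embedded JSON block."""
    parser = JsonScriptParser()
    parser.feed(html_text)
    parser.close()
    if not parser.blocks:
        raise ValueError("no <script type='application/json'> block found")
    payload = json.loads(parser.blocks[0])
    return payload["component"][0]["caseList"]

# Made-up sample page; the numbers are placeholders, not real statistics.
SAMPLE_HTML = """
<html><body>
<script type="application/json">
{"component": [{"caseList": [
  {"area": "A", "confirmed": "10", "died": "", "crued": "7"},
  {"area": "B", "confirmed": "3", "died": "1", "crued": "2"}
]}]}
</script>
</body></html>
"""

case_list = extract_case_list(SAMPLE_HTML)
print([row["area"] for row in case_list])
```

Raising an error when no JSON block is found makes a changed page layout fail loudly rather than with an IndexError on the [0] lookup.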
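import json
import requests
from lxml import etree
import openpyxl
url = 'https://voice.baidu.com/act/newpneumonia/newpneumonia'
# Replace the placeholder with a real browser User-Agent string
headers = {"User-Agent": "your_user_agent"}
# Fetch the page HTML
response = requests.get(url=url, headers=headers).text
# Parse the HTML and take the text of the <script type="application/json"> element
html = etree.HTML(response)
json_text = html.xpath('//script[@type="application/json"]/text()')[0]
# The per-province records sit under component[0]["caseList"]
result = json.loads(json_text)["component"][0]["caseList"]
Saving to Excel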
The extracted fields (province, confirmed, deaths, cured) are written into an Excel workbook.
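The row-building step can be isolated in a small function, which makes it easy to check without the network. The sketch below assumes the page delivers counts as strings (possibly empty) and converts them to integers so that Excel stores numbers. clean_row, to_int and FIELDS are names introduced here, not part of the original script.

```python
# Keys as used in the original code ("crued" is kept exactly as written there)
FIELDS = ("confirmed", "died", "crued")

def to_int(value):
    """Convert a count to int, treating '' or None as 0 (assumed input format)."""
    if value is None or value == "":
        return 0
    return int(value)

def clean_row(record):
    """Build one worksheet row: [area, confirmed, died, cured]."""
    return [record["area"]] + [to_int(record.get(key, "")) for key in FIELDS]

sample = {"area": "A", "confirmed": "10", "died": "", "crued": "7"}  # placeholder values
print(clean_row(sample))
```

With such a helper, the body of the loop reduces to ws.append(clean_row(line)).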
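wb = openpyxl.Workbook()
ws = wb.active
ws.title = "国内疫情"  # sheet title: "Domestic epidemic"
# Header row: province, cumulative confirmed, deaths, cured
ws.append(["省份", "累计确诊", "死亡", "治愈"])
for line in result:
    # "crued" (sic) is the key name as written in the original code
    line_name = [line["area"], line["confirmed"], line["died"], line["crued"]]
    # Replace empty strings with 0
    line_name = [0 if v == '' else v for v in line_name]
    ws.append(line_name)
wb.save('./china.xlsx')
Data Visualization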
pandas reads the Excel file, and the pyecharts Map chart draws three choropleth maps of China: confirmed, deaths and cured.
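The Map chart receives its data as [name, value] pairs, and the max_ argument of VisualMapOpts sets the top of the color scale. The sketch below builds the pairs and derives an upper bound from the data rather than from a fixed number. to_pairs and color_scale_max are hypothetical helper names, and the sample values are placeholders.

```python
def to_pairs(names, values):
    """Pair each province name with its value, as [name, value] lists."""
    return [list(pair) for pair in zip(names, values)]

def color_scale_max(values, floor=1):
    """Pick an upper bound for the color scale: the largest value, but at least `floor`."""
    return max(max(values, default=0), floor)

provinces = ["A", "B", "C"]  # placeholder names
cured = [7, 2, 0]            # placeholder counts

pairs = to_pairs(provinces, cured)
scale_max = color_scale_max(cured)
print(pairs)
print(scale_max)
```

The result could be passed as VisualMapOpts(max_=color_scale_max(cured)), so that the province with the largest count sits at the top of the scale instead of every value above a fixed limit sharing one color.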
import pandas as pd
from pyecharts.charts import Map, Page
from pyecharts import options as opts
# Align East Asian characters when printing DataFrames to the console
pd.set_option('display.unicode.ambiguous_as_wide', True)
pd.set_option('display.unicode.east_asian_width', True)
# Load the workbook written in the previous step
df = pd.read_excel('china.xlsx')
province = df['省份'].tolist()       # province
confirmed = df['累计确诊'].tolist()  # cumulative confirmed
deaths = df['死亡'].tolist()         # deaths
cured = df['治愈'].tolist()          # cured
# One map per metric; each receives [province, value] pairs
cured_map = (Map().add("治愈", [list(z) for z in zip(province, cured)], "china")
             .set_global_opts(title_opts=opts.TitleOpts(),
                              visualmap_opts=opts.VisualMapOpts(max_=200)))
confirmed_map = (Map().add("累计确诊", [list(z) for z in zip(province, confirmed)], "china")
                 .set_global_opts(title_opts=opts.TitleOpts(),
                                  visualmap_opts=opts.VisualMapOpts(max_=200)))
deaths_map = (Map().add("死亡", [list(z) for z in zip(province, deaths)], "china")
              .set_global_opts(title_opts=opts.TitleOpts(),
                               visualmap_opts=opts.VisualMapOpts(max_=200)))
# Put the three maps on one draggable page and render it
page = Page(layout=Page.DraggablePageLayout)
page.add(confirmed_map, deaths_map, cured_map)
page.render()
After rendering, the layout in render.html can be adjusted in the browser and the result saved as my_test.html using Page.save_resize_html with a configuration JSON.
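A minimal sketch of that last step follows. It assumes the layout configuration was saved from the browser as chart_config.json (the file name is an assumption; use whatever name the file was saved under) and imports pyecharts only when both input files exist. resize_if_ready is a hypothetical wrapper, not a pyecharts function.

```python
import os

def resize_if_ready(source="render.html", cfg_file="chart_config.json", dest="my_test.html"):
    """Apply a saved layout config to a rendered page.

    Returns the output path, or None if either input file is missing.
    """
    if not (os.path.exists(source) and os.path.exists(cfg_file)):
        return None
    # Imported lazily so the file check above works even without pyecharts installed
    from pyecharts.charts import Page
    Page.save_resize_html(source, cfg_file=cfg_file, dest=dest)
    return dest

# Paths here are placeholders that are not expected to exist
outcome = resize_if_ready("no_such_render.html", "no_such_config.json", "out.html")
print("nothing to do" if outcome is None else "written to " + outcome)
```

Called with no arguments after the configuration file has been saved next to render.html, the function would write my_test.html.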
Conclusion
The guide demonstrates the complete workflow, from crawling COVID‑19 statistics and storing them in Excel to visualizing the data on a map of China with pyecharts, and serves as a practical example of Python web scraping and data visualization.
