Visualizing Historical National Games Medal Rankings with Python
This tutorial shows how to collect, clean, and visualize historical medal data from the Chinese National Games with Python. It covers extracting rows with regular expressions, organizing them into pandas DataFrames, and drawing static line charts with Matplotlib and interactive charts with PyEcharts, with code snippets and practical tips along the way.
This article shows a step‑by‑step example of gathering and visualizing medal tables from past Chinese National Games using Python.
The raw data is copied from a public web page and then cleaned with Sublime Text's regular‑expression replace feature, so that only rows delimited by the "┃" character remain.
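The same cleanup could also be done in Python instead of an editor. The sketch below is only an illustration, not the author's actual procedure: the raw page layout is not shown in this article, so the sample text, the `clean_rows` helper, and the assumption that data rows are the only lines containing "┃" are all hypothetical. Blank lines are kept because the parsing step relies on them to separate editions.

```python
import re

# Hypothetical raw text with made-up numbers; the real page layout is not shown.
RAW = """\
Sample table A (made-up numbers)
┃ 1 ┃ TeamA ┃ 10 ┃
┃ 2 ┃ TeamB ┃ 7 ┃

Sample table B (made-up numbers)
┃ 1 ┃ TeamB ┃ 9 ┃
"""


def clean_rows(raw: str) -> str:
    """Keep blank separators and data rows, reduced to '┃'-separated fields."""
    cleaned = []
    for line in raw.splitlines():
        if not line.strip():
            cleaned.append("")  # blank line: boundary between two tables
            continue
        # Assumption: only data rows contain the '┃' delimiter.
        if "┃" not in line:
            continue
        # Drop leading and trailing border characters and spaces.
        line = re.sub(r"^[\s┃]+|[\s┃]+$", "", line)
        # Remove the whitespace around each remaining delimiter.
        line = re.sub(r"\s*┃\s*", "┃", line)
        cleaned.append(line)
    return "\n".join(cleaned)


print(clean_rows(RAW))
```

Writing the returned string to a text file would give input in the shape the parsing step expects: rank, team, and medal counts separated by "┃", with one blank line between editions.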
The cleaned text is read into Python, and each non‑blank line is split on the "┃" delimiter. Blank lines mark the boundary between editions, and each edition's rows are stored in a pandas.DataFrame. Every DataFrame is appended to a list and written to its own sheet of an Excel workbook:
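import pandas as pd

is_change = True   # True while waiting for the first row of a new edition
count = 12         # sheet numbering counts down from edition 12
all_data = []      # record all editions
data = []          # record one edition
watches = set()    # every team that has placed in the top three

# The context manager saves the workbook on exit; writer.save() was removed in pandas 2.0.
# .xlsx is used because recent pandas can no longer write .xls files (needs openpyxl).
with pd.ExcelWriter('history.xlsx') as writer:
    with open('history.txt', encoding='utf-8') as f:  # assumes the file is saved as UTF-8
        for line in f:
            if line.strip():
                if is_change:
                    is_change = False
                data.append([l.strip() for l in line.split('┃')])
            else:
                # A blank line closes the current edition.
                # Note: the final edition is only saved if the file ends with a blank line.
                if not is_change:
                    df = pd.DataFrame(data)
                    all_data.append(df)
                    df.to_excel(writer, sheet_name=f'No.{count}', index=False)
                    top3 = list(df[1][:3])  # column 1 holds the team names
                    watches.update(top3)
                    is_change = True
                    count -= 1
                    data = []

After reversing the list of DataFrames, the script extracts the gold‑medal counts for the teams that have ever placed in the top three (Shanghai, Beijing, Shandong, Guangdong, Jiangsu, PLA, Liaoning) and stores them in a dictionary: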
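gold = {}
all_data.reverse()
for team in watches:
    gold[team] = []
    for df in all_data:
        # Column 1 holds the team name and column 2 its gold-medal count.
        # .iloc[0] takes the single matching value; it raises IndexError
        # if the team is missing from an edition.
        gold[team].append(float(df.loc[df[1] == team, 2].iloc[0]))

Matplotlib is then used to draw a static line chart that compares the gold‑medal trends of these teams over the years: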
import matplotlib.pyplot as plt

for team in gold:
    plt.plot(gold[team], linewidth=4, label=team)
plt.legend()

Because Chinese characters may appear as squares, the script sets a custom font (e.g., SimHei) to render them correctly:
from matplotlib import font_manager

# Option 1: load the font file directly (simhei.ttf must exist at this path)
font = font_manager.FontProperties(fname='simhei.ttf', size=15)
plt.legend(prop=font)

# Option 2: set the default sans-serif font globally (SimHei must be installed)
plt.rcParams['font.sans-serif'] = ['SimHei']

For interactive visualizations, PyEcharts is used. The same data is plotted as a line chart that supports mouse hover and hiding or showing individual series, and that can be rendered directly in a Jupyter notebook or saved as an HTML file:
from pyecharts.charts import Line

line = Line()
# Edition numbers are passed as strings so they are treated as category labels.
line.add_xaxis(xaxis_data=[str(n) for n in range(1, 13)])
for team in gold:
    line.add_yaxis(team, gold[team])
line.render_notebook()  # use .render('file.html') outside Jupyter

Two practical notes: the code was tested in a Jupyter notebook, so .render_notebook() works out of the box; and PyEcharts has undergone major updates, so readers should consult the documentation that matches their installed version.
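To find out which documentation applies, it helps to check the installed version first. The following is a minimal sketch using only the standard library; the `installed_version` helper is a hypothetical name introduced here, not part of PyEcharts.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package: str):
    """Return the installed version string of `package`, or None if it is absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


ver = installed_version("pyecharts")
if ver is None:
    print("pyecharts is not installed")
else:
    # The `from pyecharts.charts import Line` import used in this article
    # is the 1.x-and-later style; 0.5.x releases used a different API.
    print(f"pyecharts {ver} is installed")
```

If the reported version starts with 0.5, the snippets in this article will most likely need adapting to the older API.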
The article ends with a disclaimer that the content is collected from the web and the original author holds the copyright.
Python Programming Learning Circle
A global community of Chinese Python developers offering technical articles, columns, original video tutorials, and problem sets. Topics include web full‑stack development, web scraping, data analysis, natural language processing, image processing, machine learning, automated testing, DevOps automation, and big data.