Student Score Ranking and Distribution Analysis Using Python and Tencent Hunyuan Model
Using Tencent's Hunyuan model, the tutorial walks through a Python workflow that scrapes a student‑score table from a web page, saves it as CSV and Excel, cleans missing values, computes total and average scores, and visualizes their distributions with matplotlib, illustrating how LLMs can accelerate data‑analysis coding while still needing human verification.
Recently Tencent released its self‑developed large language model, Hunyuan, which can understand natural language, generate code, and assist in programming tasks. This article demonstrates how Hunyuan can help develop a common Python data‑analysis workflow: fetching a student‑score table from a web page, storing the data, cleaning it, performing calculations, and visualizing the results.
Steps:
Get data – the tutorial uses a test URL https://python666.cn/static/score.html that contains a single HTML table. Hunyuan provides code to download the page and extract the table into a pandas DataFrame.
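The article does not reproduce the download code Hunyuan generated, so the following is only a minimal sketch of the idea using the standard library. The class `TableParser`, the helpers `parse_table` and `fetch_table`, the UTF-8 decoding, and the sample column names are all assumptions made for illustration; the real page's columns are not shown in the source.

```python
from html.parser import HTMLParser
from urllib.request import urlopen


class TableParser(HTMLParser):
    """Collect the cell text of every <tr> in an HTML document."""

    def __init__(self):
        super().__init__()
        self.rows = []      # finished rows, each a list of cell strings
        self._row = None    # row being built, or None outside <tr>
        self._cell = None   # text pieces of the current cell, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None
            self._cell = None


def parse_table(html):
    """Return the table as a list of rows; the first row is the header."""
    parser = TableParser()
    parser.feed(html)
    return parser.rows


def fetch_table(url):
    """Download the page and parse its table (needs network access).

    Assumes the page is UTF-8 encoded, which the source does not confirm.
    """
    with urlopen(url) as response:
        return parse_table(response.read().decode("utf-8"))


# Offline demonstration on a made-up table (hypothetical column names).
sample = """
<table>
  <tr><th>id</th><th>name</th><th>math</th></tr>
  <tr><td>1</td><td>A</td><td>90</td></tr>
  <tr><td>2</td><td>B</td><td></td></tr>
</table>
"""
rows = parse_table(sample)
print(rows)
```

The rows can then be turned into a DataFrame with `pd.DataFrame(rows[1:], columns=rows[0])`. When `lxml` (or `bs4` with `html5lib`) is installed, `pd.read_html(url)` is a common shortcut that returns a list of DataFrames, one per table on the page.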
Store data – the DataFrame is first saved as a CSV file and then, at the request of the author, converted to an Excel file using DataFrame.to_excel.
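The storage step might look roughly like the sketch below. The file name `output.csv`, the helper names `save_csv` and `csv_to_excel`, the `utf-8-sig` encoding, and the sample data are assumptions; only `output.xlsx` and the sheet name `Sheet1` appear in the article's final script.

```python
import os
import tempfile

import pandas as pd


def save_csv(df, path):
    # index=False keeps the row index out of the file.
    # utf-8-sig is an assumption: the BOM helps Excel detect UTF-8 text.
    df.to_csv(path, index=False, encoding="utf-8-sig")


def csv_to_excel(csv_path, xlsx_path):
    # Writing .xlsx requires an Excel engine such as openpyxl to be installed.
    df = pd.read_csv(csv_path, encoding="utf-8-sig")
    df.to_excel(xlsx_path, sheet_name="Sheet1", index=False)
    return df


# Made-up data standing in for the scraped table.
scores = pd.DataFrame({"id": [1, 2], "name": ["A", "B"], "math": [90.0, None]})

# Round-trip through CSV in a temporary directory to check nothing is lost.
with tempfile.TemporaryDirectory() as tmp:
    csv_path = os.path.join(tmp, "output.csv")
    save_csv(scores, csv_path)
    restored = pd.read_csv(csv_path, encoding="utf-8-sig")

print(restored)
```

Calling `csv_to_excel("output.csv", "output.xlsx")` afterwards would produce the workbook that the later steps read back.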
Read data – the saved Excel file is read back with pd.read_excel so that further processing can be performed without repeatedly scraping the web page.
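Clean data – missing scores are filled with 0 via df.fillna(0), so that no NaN values remain in the table. Note that pandas already skips NaN in sum() and mean() by default; filling with 0 leaves the totals unchanged but makes each missing score count as a zero in the average.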
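How zero-filling interacts with the later calculations can be checked on a tiny example. The scores below are made up; the sketch relies only on the fact that pandas reductions skip NaN by default (`skipna=True`).

```python
import pandas as pd

# One student's scores with one missing value (made-up numbers).
row = pd.Series([80.0, 90.0, None])

# Default behaviour: NaN is skipped, so the mean is (80 + 90) / 2.
mean_skipping_nan = row.mean()

# After fillna(0): the missing score counts as 0, so the mean is (80 + 90 + 0) / 3.
mean_zero_filled = row.fillna(0).mean()

# The total is the same either way, because skipping NaN and adding 0 agree.
sum_skipping_nan = row.sum()
sum_zero_filled = row.fillna(0).sum()

print(mean_skipping_nan, mean_zero_filled, sum_skipping_nan, sum_zero_filled)
```

Whether a missing score should count as zero or be left out of the average is a business decision, so it is worth confirming before accepting generated code that fills with 0.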
Process data – total and average scores for each student are computed using df.iloc[:, 2:11].sum(axis=1) and df.iloc[:, 2:11].mean(axis=1), and the results are added as new columns "总分" and "平均分".
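The slice `2:11` selects column positions 2 through 10, that is, nine columns. The sketch below assumes a layout of two identifying columns followed by nine score columns, which is inferred from the slice and not stated in the article; the column names and scores are hypothetical.

```python
import pandas as pd

# Nine hypothetical score columns after two identifying columns.
subjects = [f"subject_{i}" for i in range(1, 10)]
data = {"id": [1, 2], "name": ["A", "B"]}
for offset, column in enumerate(subjects):
    data[column] = [60 + offset, 70 + offset]
df = pd.DataFrame(data)

# Positions 2..10 are selected; the end index 11 is excluded.
score_cols = df.iloc[:, 2:11]

# Row-wise total and average, stored under the article's column names.
df["总分"] = score_cols.sum(axis=1)
df["平均分"] = score_cols.mean(axis=1)

print(df[["name", "总分", "平均分"]])
```

Because the selection is positional, it silently picks up the wrong columns if the table layout changes; selecting the score columns by name is a sturdier option when the names are known.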
Visualize data – distribution histograms for total scores and average scores are plotted with matplotlib. The tutorial shows how to set a Chinese font (e.g., 'Songti SC') to display labels correctly, and how to arrange the two histograms in a 2×1 subplot layout.
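'Songti SC' is a font bundled with macOS, so the setting may not work elsewhere. One way to make it portable, sketched here as an assumption and not as the article's method, is to pick the first installed font from a list of candidates. The helper `pick_font` and the candidate list are hypothetical.

```python
import matplotlib
from matplotlib import font_manager


def pick_font(candidates):
    """Return the first candidate font that matplotlib can find, else None."""
    installed = {font.name for font in font_manager.fontManager.ttflist}
    for name in candidates:
        if name in installed:
            return name
    return None


# Candidate CJK fonts commonly found on macOS, Windows, and Linux.
CANDIDATES = ["Songti SC", "SimHei", "Microsoft YaHei", "Noto Sans CJK SC"]

chosen = pick_font(CANDIDATES)
if chosen is not None:
    matplotlib.rcParams["font.sans-serif"] = [chosen]
    # Keep the minus sign renderable with fonts that lack the Unicode minus.
    matplotlib.rcParams["axes.unicode_minus"] = False

print(chosen)
```

If no candidate is installed, `chosen` is `None` and the default font is left in place, in which case Chinese labels will likely render as empty boxes.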
Summary – the final section reflects on the usefulness of Hunyuan as a coding assistant, noting that while the model can generate functional code quickly, developers still need to verify and adjust the output for specific business logic.
The complete Python script produced with Hunyuan’s assistance is shown below:
import pandas as pd
import matplotlib.pyplot as plt
# Read the data from the Excel file
df = pd.read_excel("output.xlsx", sheet_name="Sheet1")
# Fill missing values with 0
df = df.fillna(0)
# Compute the total and average score per student (column positions 2 through 10)
mean_values = df.iloc[:, 2:11].mean(axis=1)
sum_values = df.iloc[:, 2:11].sum(axis=1)
df["总分"] = sum_values
df["平均分"] = mean_values
# Set a Chinese font ('Songti SC' ships with macOS; use an installed CJK font on other systems)
plt.rcParams['font.sans-serif'] = ['Songti SC']
plt.rcParams['axes.unicode_minus'] = False
# Create a 2x1 subplot layout
fig, axes = plt.subplots(nrows=2, ncols=1, figsize=(6, 8))
# Plot the histogram of total scores
axes[0].hist(df['总分'], bins=20, color='blue', edgecolor='black', alpha=0.7)
axes[0].set_title('总分')
axes[0].set_xlabel('分数')
axes[0].set_ylabel('人数')
# Plot the histogram of average scores
axes[1].hist(df['平均分'], bins=20, color='red', edgecolor='black', alpha=0.7)
axes[1].set_title('平均分')
axes[1].set_xlabel('分数')
axes[1].set_ylabel('人数')
# Show the figure
plt.tight_layout()
plt.show()
Overall, the example shows that large language models like Hunyuan can significantly accelerate the development of data‑analysis scripts by generating boilerplate code, handling library APIs, and producing visualizations, while still requiring human oversight for domain‑specific adjustments.
Tencent Cloud Developer
Official Tencent Cloud community account that brings together developers, shares practical tech insights, and fosters an influential tech exchange community.
