Student Score Ranking and Distribution Analysis Using Python and Tencent Hunyuan Model

Using Tencent's Hunyuan model, the tutorial walks through a Python workflow that scrapes a student‑score table from a web page, saves it as CSV and Excel, cleans missing values, computes total and average scores, and visualizes their distributions with matplotlib, illustrating how LLMs can accelerate data‑analysis coding while still needing human verification.

Tencent Cloud Developer

Dec 7, 2023

Recently Tencent released its self‑developed large language model, Hunyuan, which can understand natural language, generate code, and assist in programming tasks. This article demonstrates how Hunyuan can help develop a common Python data‑analysis workflow: fetching a student‑score table from a web page, storing the data, cleaning it, performing calculations, and visualizing the results.

Steps:

Get data – the tutorial uses a test URL https://python666.cn/static/score.html that contains a single HTML table. Hunyuan provides code to download the page and extract the table into a pandas DataFrame.
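
The article does not reproduce the download code that Hunyuan generated, so the following is only a minimal sketch of this step. It assumes the page is UTF-8 and that the first row of the table is the header; the names fetch_html, FirstTableParser, parse_first_table, and table_to_frame are hypothetical. Only the standard library is used for downloading and parsing, so no extra HTML parser has to be installed.

```python
from html.parser import HTMLParser
from urllib.request import urlopen

import pandas as pd

SCORE_URL = "https://python666.cn/static/score.html"  # test URL from the article


def fetch_html(url, timeout=10.0):
    """Download a page and decode it as UTF-8 (an assumed encoding)."""
    with urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8", errors="replace")


class FirstTableParser(HTMLParser):
    """Collect the cell text of the first <table> on a page, row by row."""

    def __init__(self):
        super().__init__()
        self.rows = []       # finished rows, each a list of cell strings
        self._row = None     # row being built, or None outside <tr>
        self._cell = None    # text fragments of the open cell, or None
        self._depth = 0      # table nesting depth
        self._done = False   # set once the first table has closed

    def _close_cell(self):
        if self._cell is not None and self._row is not None:
            self._row.append("".join(self._cell).strip())
        self._cell = None

    def _close_row(self):
        self._close_cell()
        if self._row:
            self.rows.append(self._row)
        self._row = None

    def handle_starttag(self, tag, attrs):
        if self._done:
            return
        if tag == "table":
            self._depth += 1
        elif self._depth == 1 and tag == "tr":
            self._close_row()
            self._row = []
        elif self._depth == 1 and tag in ("td", "th") and self._row is not None:
            self._close_cell()
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None and not self._done:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if self._done or self._depth == 0:
            return
        if tag == "table":
            self._depth -= 1
            if self._depth == 0:
                self._close_row()
                self._done = True
        elif self._depth == 1 and tag in ("td", "th"):
            self._close_cell()
        elif self._depth == 1 and tag == "tr":
            self._close_row()


def parse_first_table(html):
    """Return the first table of an HTML document as a list of rows."""
    parser = FirstTableParser()
    parser.feed(html)
    parser.close()
    return parser.rows


def table_to_frame(rows):
    """Use the first row as the header; convert columns that are fully numeric."""
    if not rows:
        raise ValueError("no table rows found")
    header, *records = rows
    df = pd.DataFrame(records, columns=header)
    for col in df.columns:
        text = df[col].astype(str).str.strip()
        numbers = pd.to_numeric(text, errors="coerce")
        # Convert only if every non-empty cell parsed as a number
        if not (numbers.isna() & (text != "")).any():
            df[col] = numbers
    return df
```

With network access, the table could then be loaded with df = table_to_frame(parse_first_table(fetch_html(SCORE_URL))). pandas also offers pd.read_html for the same job, but it needs an additional parser library (lxml, or BeautifulSoup4 with html5lib).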

Store data – the DataFrame is first saved as a CSV file and then, at the request of the author, converted to an Excel file using DataFrame.to_excel.
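
A sketch of the two storage calls, under stated assumptions: the CSV file name output.csv and the helper names save_csv and csv_to_excel are hypothetical, while output.xlsx and Sheet1 match the names used in the script later in this article. Writing .xlsx files requires an Excel writer engine such as openpyxl to be installed.

```python
from pathlib import Path

import pandas as pd


def save_csv(df, path="output.csv"):
    """Write the table as CSV without the index column.

    utf-8-sig adds a byte-order mark, which helps Excel detect UTF-8
    when the file contains Chinese headers.
    """
    df.to_csv(path, index=False, encoding="utf-8-sig")
    return Path(path)


def csv_to_excel(csv_path="output.csv", xlsx_path="output.xlsx"):
    """Convert the saved CSV to an Excel workbook (needs e.g. openpyxl)."""
    df = pd.read_csv(csv_path, encoding="utf-8-sig")
    df.to_excel(xlsx_path, sheet_name="Sheet1", index=False)
    return Path(xlsx_path)
```

Passing index=False in both calls keeps the row index out of the files, so the column positions stay the same when the data is read back.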

Read data – the saved Excel file is read back with pd.read_excel so that further processing can be performed without repeatedly scraping the web page.

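Clean data – missing scores are filled with 0 via df.fillna(0), so a missing score counts as zero in the totals and averages computed in the next step. Because pandas skips NaN values in sum and mean by default, this is a modelling choice (it lowers the average of any student with a missing score) rather than a requirement for the calculation to run.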
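
To make the effect of the fill concrete, here is a small illustration with made-up data (the student labels, subject names, and scores are not from the article). It compares the per-student average when NaN is skipped with the average after filling with 0.

```python
import pandas as pd

# Made-up scores: student_a has no math score
scores = pd.DataFrame(
    {
        "chinese": [80.0, 70.0],
        "math": [float("nan"), 90.0],
        "english": [100.0, 80.0],
    },
    index=["student_a", "student_b"],
)

mean_skipping_nan = scores.mean(axis=1)            # NaN cells are ignored by default
mean_zero_filled = scores.fillna(0).mean(axis=1)   # NaN cells count as 0

comparison = pd.DataFrame(
    {"skip_missing": mean_skipping_nan, "fill_zero": mean_zero_filled}
)
print(comparison)
```

For student_a the average is 90 when the missing score is skipped and 60 when it is treated as 0; student_b, who has no missing score, gets 80 either way. Which behavior is right depends on whether a missing score means "absent" or "not yet recorded".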

Process data – total and average scores for each student are computed using df.iloc[:, 2:11].sum(axis=1) and df.iloc[:, 2:11].mean(axis=1), and the results are added as new columns "总分" and "平均分".
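
The slice 2:11 selects the columns at positions 2 through 10, that is nine columns, because the end of the slice is exclusive. The sketch below uses a made-up table and assumes the first two columns are identifiers followed by nine subject columns; the English column names are hypothetical. It shows that selecting by position and by name give the same result.

```python
import pandas as pd

subject_cols = [f"subject_{i}" for i in range(1, 10)]  # nine made-up subject names
df = pd.DataFrame(
    [[1, "student_a"] + [80] * 9, [2, "student_b"] + [90] * 9],
    columns=["id", "name"] + subject_cols,
)

by_position = df.iloc[:, 2:11]   # positions 2..10; position 11 is excluded
by_name = df[subject_cols]       # the same columns, selected explicitly

df["总分"] = by_name.sum(axis=1)     # total score per student
df["平均分"] = by_name.mean(axis=1)  # average score per student
print(df[["name", "总分", "平均分"]])
```

Selecting by name is more robust if the page later adds or reorders columns, since a positional slice would silently pick up the wrong ones.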

Visualize data – distribution histograms for total scores and average scores are plotted with matplotlib. The tutorial shows how to set a Chinese font (e.g., 'Songti SC') to display labels correctly, and how to arrange the two histograms in a 2×1 subplot layout.
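
'Songti SC' ships with macOS, so the labels may render as empty boxes on other systems. A hedged sketch of one way to choose a font that is actually installed follows; the candidate list and the helper names pick_font, installed_font_names, and use_chinese_font are assumptions, not part of the original tutorial.

```python
# Candidate fonts with Chinese glyphs; the list is an assumption, extend as needed
CJK_FONT_CANDIDATES = [
    "Songti SC",            # macOS (the font used in the article)
    "PingFang SC",          # macOS
    "SimHei",               # Windows
    "Microsoft YaHei",      # Windows
    "Noto Sans CJK SC",     # common on Linux
    "WenQuanYi Micro Hei",  # common on Linux
]


def pick_font(candidates, available):
    """Return the first candidate that appears in `available`, else None."""
    for name in candidates:
        if name in available:
            return name
    return None


def installed_font_names():
    """Names of the fonts matplotlib has discovered on this machine."""
    # Imported here because building the font list can be slow on first use
    from matplotlib import font_manager

    return {font.name for font in font_manager.fontManager.ttflist}


def use_chinese_font(candidates=CJK_FONT_CANDIDATES):
    """Put the first installed candidate at the front of the sans-serif list."""
    import matplotlib.pyplot as plt

    chosen = pick_font(candidates, installed_font_names())
    if chosen is not None:
        current = list(plt.rcParams["font.sans-serif"])
        plt.rcParams["font.sans-serif"] = [chosen] + current
        plt.rcParams["axes.unicode_minus"] = False
    return chosen
```

Calling use_chinese_font() before plotting returns the chosen font name, or None if no candidate is installed, in which case the labels would need to be written in English or a font installed first.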

Summary – the final section reflects on the usefulness of Hunyuan as a coding assistant, noting that while the model can generate functional code quickly, developers still need to verify and adjust the output for specific business logic.

The complete Python script produced with Hunyuan’s assistance is shown below:

import pandas as pd
import matplotlib.pyplot as plt
# Read the data from the Excel file
df = pd.read_excel("output.xlsx", sheet_name="Sheet1")
# Fill missing values with 0
df = df.fillna(0)
# Compute the total score ("总分") and average score ("平均分") per student
mean_values = df.iloc[:, 2:11].mean(axis=1)
sum_values = df.iloc[:, 2:11].sum(axis=1)
df["总分"] = sum_values
df["平均分"] = mean_values
# Set a font that can render the Chinese labels
plt.rcParams['font.sans-serif'] = ['Songti SC']
plt.rcParams['axes.unicode_minus'] = False
# Create a 2x1 subplot layout
fig, axes = plt.subplots(nrows=2, ncols=1, figsize=(6, 8))
# Plot the histogram of total scores (x: score "分数", y: number of students "人数")
axes[0].hist(df['总分'], bins=20, color='blue', edgecolor='black', alpha=0.7)
axes[0].set_title('总分')
axes[0].set_xlabel('分数')
axes[0].set_ylabel('人数')
# Plot the histogram of average scores
axes[1].hist(df['平均分'], bins=20, color='red', edgecolor='black', alpha=0.7)
axes[1].set_title('平均分')
axes[1].set_xlabel('分数')
axes[1].set_ylabel('人数')
# Show the figure
plt.tight_layout()
plt.show()
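
The ranking mentioned in the title can be derived from the "总分" column. The sketch below is an assumption about how that could be done rather than code from the original tutorial; the data, the name column, and the rank column are made up for illustration.

```python
import pandas as pd

# Made-up totals; in the script above this column is computed from the score table
df = pd.DataFrame(
    {
        "name": ["student_a", "student_b", "student_c", "student_d"],
        "总分": [650.0, 702.0, 650.0, 588.0],
    }
)

# Highest total gets rank 1; tied students share the smallest rank of their group
df["rank"] = df["总分"].rank(ascending=False, method="min").astype(int)

# Order the table by rank, breaking ties by name for a stable listing
ranked = df.sort_values(["rank", "name"]).reset_index(drop=True)
print(ranked)
```

With method="min", the two students tied at 650 both receive rank 2 and the next student receives rank 4, which matches the usual convention for exam rankings.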

Overall, the example shows that large language models like Hunyuan can speed up the development of data‑analysis scripts by generating boilerplate code, handling library APIs, and producing visualizations, while still requiring human oversight for domain‑specific adjustments.
