Using pandas to Convert Excel Birthdates to Age and Perform Grouped Statistics
This tutorial demonstrates how to read an Excel file with pandas, convert an 8‑digit birthdate column to datetime, calculate age, and then use groupby to produce aggregated statistics by age and gender, including combined summaries for all students and females only.
First, the Excel file 学生信息表.xlsx is loaded using pandas.read_excel :
<code>import pandas as pd
file_name = r'学生信息表.xlsx'
data = pd.read_excel(file_name)
</code>Next, the 出生日期 column, which stores dates as an 8‑digit string (YYYYMMDD), is converted to a proper datetime type with pd.to_datetime :
<code>data['出生日期'] = pd.to_datetime(data['出生日期'], format='%Y%m%d')
</code>To calculate each student's age, the current year is obtained via the datetime module and subtracted from the birth year:
<code>import datetime as dt
now = dt.datetime.today().year
data['周岁'] = now - data['出生日期'].dt.year
</code>For grouped statistics, the data is grouped by both 周岁 (age) and 性别 (gender) using groupby , and the count of 学生姓名 is aggregated:
<code>age = data.groupby(['周岁', '性别'])
age_group = age.agg({'学生姓名':'count'}).T
</code>If a summary of total counts per age is needed, a separate groupby on 周岁 is performed:
<code>age_single = data.groupby(['周岁'])
age_single_group = age_single.agg({'学生姓名':'count'}).T
age_single_group.index = ['合计']
</code>To obtain counts for females only, the dataset is filtered where 性别 == '女' before grouping:
<code>age_single_female = data[data['性别'] == '女'].groupby(['周岁'])
age_single_female_group = age_single_female.agg({'学生姓名':'count'}).T
age_single_female_group.index = ['仅女生']
</code>Finally, the two summary tables are concatenated vertically to produce a combined view:
<code>age_union = pd.concat([age_single_group, age_single_female_group], join='outer')
age_union.fillna(0)
</code>The resulting tables show age‑wise counts for all students, gender‑wise breakdowns, and a separate count for female students, illustrating how pandas simplifies data cleaning, transformation, and aggregation tasks.
Python Programming Learning Circle
A global community of Chinese Python developers offering technical articles, columns, original video tutorials, and problem sets. Topics include web full‑stack development, web scraping, data analysis, natural language processing, image processing, machine learning, automated testing, DevOps automation, and big data.
How this landed with the community
Was this worth your time?
0 Comments
Thoughtful readers leave field notes, pushback, and hard-won operational detail here.