
Master Multi-Level Grouping in pandas: Turn Sub‑Conditions into Column Names

This tutorial shows how to group pandas data by two columns, count occurrences, and reshape the result so that the values of the second column become column headers. It presents three practical methods with code examples.

Python Crawling & Data Mining

Apr 15, 2022


Data Requirement

The goal is to transform the data below: group by two selected columns (Name and Course), count the rows in each group, and then use the values of the second column (Course) as column names in a new DataFrame.

import pandas as pd
data = {"Name": ["Jack","Jack","Jason","Jason","Rose","Rose"],
        "Course": ["Chinese","Chinese","Chinese","Russian","English","English"],
        "Date": ["20220112","20220113","20220112","20220114","20220112","20220113"]}
df = pd.DataFrame(data)

Requirement Decomposition

Turning row values into column names points to unstack, applied to the appropriate index level. For the counting step, use either value_counts or groupby followed by count.
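As a rough sketch of these two building blocks, the snippet below counts each pair and then unstacks the inner index level. It assumes the sample data from the Data Requirement section (rebuilt here so the snippet runs on its own); the names `counts` and `wide` are illustrative only.

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Jack", "Jack", "Jason", "Jason", "Rose", "Rose"],
    "Course": ["Chinese", "Chinese", "Chinese", "Russian", "English", "English"],
    "Date": ["20220112", "20220113", "20220112", "20220114", "20220112", "20220113"],
})

# Step 1: count the rows for each (Name, Course) pair.
# The result is a Series indexed by a two-level MultiIndex.
counts = df.groupby(["Name", "Course"]).size()

# Step 2: move the inner index level ("Course") into the columns.
# Pairs that never occur in the data show up as NaN.
wide = counts.unstack("Course")
print(wide)
```

The intermediate `counts` Series is worth printing once: seeing the two-level index makes it clear which level unstack moves into the columns.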

Solution

Method 1

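Use value_counts and then unstack to reshape the data. DataFrame.value_counts requires pandas 1.1 or later.

After unstacking the Course index level, strip the leftover Course label from the column index and reset the row index so that Name becomes an ordinary column. Name and Course pairs that never occur appear as NaN.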

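# Count each (Name, Course) pair, then pivot the Course level into columns
result = df.value_counts(['Name','Course']).unstack('Course')
# Drop the "Course" label from the column index
result.columns = result.columns.values
# Turn the Name index back into an ordinary column
result.reset_index(inplace=True)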
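Pairs that never occur, such as Jack and Russian, come out of unstack as NaN, which also turns the counts into floats. If zeros are preferred, one option is the fill_value argument of unstack. A minimal sketch, assuming the sample data above (rebuilt so the snippet runs on its own) and pandas 1.1 or later; `wide` is an illustrative name.

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Jack", "Jack", "Jason", "Jason", "Rose", "Rose"],
    "Course": ["Chinese", "Chinese", "Chinese", "Russian", "English", "English"],
    "Date": ["20220112", "20220113", "20220112", "20220114", "20220112", "20220113"],
})

# fill_value=0 fills the gaps left by (Name, Course) pairs that never occur.
wide = df.value_counts(["Name", "Course"]).unstack("Course", fill_value=0)
wide.columns = wide.columns.values  # drop the "Course" label on the column index
wide = wide.reset_index()           # make Name an ordinary column again
print(wide)
```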

Method 2

The same logic works with a groupby aggregation followed by unstack.

The result matches Method 1, and most of the steps can be written as a single method chain.

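# Count rows per (Name, Course) group, pivot Course into columns, restore Name as a column
result = df.groupby(['Name','Course'])['Course'].count().unstack('Course').reset_index()
# Drop the "Course" label from the column index
result.columns = result.columns.values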
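The whole of Method 2 can also be written as one chain, with no separate assignment to the columns. The sketch below is a variant rather than the code above: it swaps in size() for the count on the Course column (the two agree here because the sample data has no missing values), fills absent pairs with 0, and uses rename_axis to clear the column label. The name `result` is illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Jack", "Jack", "Jason", "Jason", "Rose", "Rose"],
    "Course": ["Chinese", "Chinese", "Chinese", "Russian", "English", "English"],
    "Date": ["20220112", "20220113", "20220112", "20220114", "20220112", "20220113"],
})

result = (
    df.groupby(["Name", "Course"])
    .size()                            # rows per (Name, Course) pair
    .unstack("Course", fill_value=0)   # Course values become columns
    .rename_axis(columns=None)         # clear the "Course" label on the columns
    .reset_index()                     # Name becomes an ordinary column
)
print(result)
```

Because every step returns a new object, the original `df` is left untouched and can be reused for the other methods.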

Method 3

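Use pandas pivot_table or crosstab, which work much like Excel pivot tables. With pivot_table, pass values explicitly; otherwise every remaining column is aggregated and the result carries a two-level column header.

# Count the Date entries for each Name (rows) and Course (columns)
pd.pivot_table(df, index=['Name'], columns=['Course'], values='Date', aggfunc='count')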

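Alternatively, use crosstab. Here dropna=False keeps columns whose entries are all NaN; with values and aggfunc supplied, Name and Course pairs that never occur still appear as NaN rather than 0.

# Cross-tabulate Name against Course, counting the Course entries in each cell
result = pd.crosstab(df['Name'], df['Course'], values=df['Course'], dropna=False, aggfunc='count')
# Drop the "Course" label from the column index, then restore Name as a column
result.columns = result.columns.values
result.reset_index(inplace=True)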
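For a plain frequency table, crosstab can also be called without values and aggfunc. In that form it counts co-occurrences directly and reports absent pairs as 0. A minimal sketch, assuming the sample data above (rebuilt so the snippet runs on its own); `table` is an illustrative name.

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Jack", "Jack", "Jason", "Jason", "Rose", "Rose"],
    "Course": ["Chinese", "Chinese", "Chinese", "Russian", "English", "English"],
    "Date": ["20220112", "20220113", "20220112", "20220114", "20220112", "20220113"],
})

# With no values/aggfunc, crosstab counts how often each
# (Name, Course) pair occurs; pairs that never occur are 0.
table = pd.crosstab(df["Name"], df["Course"])

# Clear the "Course" label on the columns and make Name an ordinary column.
table = table.rename_axis(columns=None).reset_index()
print(table)
```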

Summary

This article demonstrates three pandas techniques (value_counts with unstack, groupby with unstack, and pivot_table or crosstab) for grouping by two columns, counting, and reshaping the result so that the values of the second column become column headers. All three reach the same table, so choose whichever fits the problem and the surrounding code best.

