How to Group and Aggregate Data in Pandas: 5 Practical Methods

This article walks through a Python data‑processing problem and presents five solutions, two using pandas and three using plain Python and itertools. Each comes with a complete code snippet and shows how to group values by key and collect them into a dictionary.

Python Crawling & Data Mining

May 28, 2022

1. Introduction

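In a Python community, a user asked how to process the data shown in a screenshot (not reproduced here). The raw data consists of a list of numeric categories and a list of corresponding IDs. The goal is to collect the IDs that belong to each category into a dictionary keyed by category.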

2. Implementation

Method 1

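Use pandas to read the Excel file, group by the numeric column, and convert the grouped lists to a dictionary.

import pandas as pd

# Read the two columns and name them explicitly
df = pd.read_excel('1.xlsx', names=['num', 'data'])
# Collect the IDs of each category into a list
df = df.groupby("num").agg(list)
# DataFrame.to_dict() is keyed by column name, so pick the 'data' mapping
res = df.to_dict()["data"]
print(res)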

Method 2

Iterating over the two lists and building a dictionary manually.

num=[1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 2.0, 2.0, 2.0, 2.0, 2.0, 2.0, 2.0, 2.0, 3.0, 3.0, 3.0, 3.0, 3.0, 3.0]
data=['201825301001', '201825301002', '201825301004', '201825301005', '201825301006', '201825301007', '201825301008', '201825301009', '201825301010', '201825301011', '201825301012', '201825301013', '201825301014', '201825301015', '201825301016', '201825301017', '201825301018', '201825301019', '201825305001', '201825305002']
result = {}
for k, v in zip(num, data):
    if k in result:
        result[k].append(v)
    else:
        result[k] = [v]
print(result)
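
The membership check in the loop can also be delegated to the standard library. The following is a minimal sketch, not part of the original answers, that uses collections.defaultdict on a shortened sample of the same num and data lists; the name grouped is chosen here for illustration.

```python
from collections import defaultdict

# Shortened sample in the same shape as the article's num/data lists.
num = [1.0, 1.0, 2.0, 2.0, 3.0]
data = ['201825301001', '201825301002', '201825301008', '201825301009', '201825301016']

grouped = defaultdict(list)  # a missing key starts out as an empty list
for k, v in zip(num, data):
    grouped[k].append(v)

result = dict(grouped)  # convert back to a plain dict for printing
print(result)
```

The loop body shrinks to a single append because defaultdict creates the empty list the first time a key is seen.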

Method 3

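Using itertools.groupby to group the pairs. groupby only merges consecutive items that share a key, so this approach relies on num already being sorted. The key function also converts the keys from float to int.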

from itertools import groupby
num=[1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 2.0, 2.0, 2.0, 2.0, 2.0, 2.0, 2.0, 2.0, 3.0, 3.0, 3.0, 3.0, 3.0, 3.0]
data=['201825301001', '201825301002', '201825301004', '201825301005', '201825301006', '201825301007', '201825301008', '201825301009', '201825301010', '201825301011', '201825301012', '201825301013', '201825301014', '201825301015', '201825301016', '201825301017', '201825301018', '201825301019', '201825305001', '201825305002']
result = {k: [i[1] for i in v] for k, v in groupby(zip(num, data), key=lambda x: int(x[0]))}
print(result)
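
itertools.groupby starts a new group every time the key changes, so input that is not ordered by key needs to be sorted first. The sketch below illustrates this with a hypothetical unsorted sample (the values 'a' to 'e' are placeholders, not data from the original question).

```python
from itertools import groupby
from operator import itemgetter

# Hypothetical unsorted sample: keys 1.0 and 2.0 each appear in two separate runs.
num = [1.0, 2.0, 1.0, 3.0, 2.0]
data = ['a', 'b', 'c', 'd', 'e']

pairs = list(zip(num, data))

# Without sorting, a later run of a key overwrites the earlier one in the dict.
unsorted_result = {k: [v for _, v in grp] for k, grp in groupby(pairs, key=itemgetter(0))}

# Sorting by key first makes equal keys adjacent. sorted() is stable, so the
# original order of the values within each key is preserved.
sorted_pairs = sorted(pairs, key=itemgetter(0))
sorted_result = {k: [v for _, v in grp] for k, grp in groupby(sorted_pairs, key=itemgetter(0))}

print(unsorted_result)  # {1.0: ['c'], 2.0: ['e'], 3.0: ['d']}
print(sorted_result)    # {1.0: ['a', 'c'], 2.0: ['b', 'e'], 3.0: ['d']}
```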

Method 4

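Accumulating the lists with dict.get in a loop, then converting the keys to int with a comprehension.

# same num and data as before
result = {}
for k, v in zip(num, data):
    # get() returns the list collected so far, or an empty list for a new key
    result[k] = result.get(k, []) + [v]

# A comprehension cannot read the dict it is building, so the accumulation
# happens in the loop above and the comprehension only converts the keys.
result = {int(k): v for k, v in result.items()}
print(result)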
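
If a single expression is preferred, one option is to iterate over the distinct keys and filter the pairs for each key. This is a sketch on a shortened sample of the same lists, not code from the original answers. It scans the data once per distinct key, so it suits a small number of groups.

```python
# Shortened sample in the same shape as the article's num/data lists.
num = [1.0, 1.0, 2.0, 2.0, 3.0]
data = ['201825301001', '201825301002', '201825301008', '201825301009', '201825301016']

# dict.fromkeys(num) yields each distinct key once, in first-seen order.
result = {
    int(key): [v for k, v in zip(num, data) if k == key]
    for key in dict.fromkeys(num)
}
print(result)
```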

Method 5

Creating a pandas DataFrame and using groupby with list aggregation.

import pandas as pd

# same num and data as before
df = pd.DataFrame({'num': num, 'data': data})
df = df.groupby("num").agg(list)
res = df.to_dict()["data"]
print(res)
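
Selecting the column before aggregating returns a Series, whose to_dict() gives the mapping directly without indexing by column name. The sketch below shows this variant on a shortened sample, along with a per-category count as a second aggregation; the name counts is chosen here for illustration.

```python
import pandas as pd

# Shortened sample in the same shape as the article's num/data lists.
num = [1.0, 1.0, 2.0, 2.0, 3.0]
data = ['201825301001', '201825301002', '201825301008', '201825301009', '201825301016']
df = pd.DataFrame({'num': num, 'data': data})

# Selecting 'data' first yields a Series, so to_dict() is keyed by category.
res = df.groupby('num')['data'].agg(list).to_dict()
print(res)

# Another aggregation on the same groups: number of IDs per category.
counts = df.groupby('num')['data'].count().to_dict()
print(counts)
```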

3. Conclusion

The article demonstrates five ways to group values by key and collect them into a dictionary: two with pandas groupby and three with plain Python (a manual loop, itertools.groupby, and dict.get). The snippets can be adapted to similar data‑processing problems.
