How to Group and Map Data in Pandas: 5 Practical Methods

This article walks through a common Python data-processing task: grouping string codes by their numeric identifiers. It presents five solutions with code, two built on pandas and three on the standard library, that turn two parallel lists into a dictionary of lists.

Python Crawling & Data Mining

Apr 12, 2025

1. Introduction

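A question posted in a Python community asked how to process the following data: two parallel lists, where each entry in num is the group number for the string code at the same position in data. The goal is a dictionary that maps each group number to the list of codes in that group.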

num = [1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 2.0, 2.0, 2.0, 2.0, 2.0, 2.0, 2.0, 2.0, 3.0, 3.0, 3.0, 3.0, 3.0, 3.0]

data = ['201825301001', '201825301002', '201825301004', '201825301005', '201825301006', '201825301007', '201825301008', '201825301009', '201825301010', '201825301011', '201825301012', '201825301013', '201825301014', '201825301015', '201825301016', '201825301017', '201825301018', '201825301019', '201825305001', '201825305002']

2. Implementation

Method 1

One suggestion was to read the data from an Excel file with pandas, group the rows by the numeric column, and convert the grouped result to a dictionary.

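import pandas as pd

# Assumes 1.xlsx has a header row followed by two columns: the group number and the code.
df = pd.read_excel('1.xlsx', names=['num', 'data'])
# Collect every value that shares a 'num' into one list per group.
df = df.groupby('num').agg(list)
res = df.to_dict()['data']
print(res)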

The result is a dictionary whose keys are the group numbers and whose values are the lists of codes in each group.

Method 2

Another approach uses a simple loop to build the dictionary.

num=[1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 2.0, 2.0, 2.0, 2.0, 2.0, 2.0, 2.0, 2.0, 3.0, 3.0, 3.0, 3.0, 3.0, 3.0]
num=[int(i) for i in num]
data=['201825301001', '201825301002', '201825301004', '201825301005', '201825301006', '201825301007', '201825301008', '201825301009', '201825301010', '201825301011', '201825301012', '201825301013', '201825301014', '201825301015', '201825301016', '201825301017', '201825301018', '201825301019', '201825305001', '201825305002']
result = {}
for k, v in zip(num, data):
    if k in result:
        result[k].append(v)
    else:
        result[k] = [v]
print(result)
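The membership test in the loop above can be avoided with collections.defaultdict from the standard library. The following is a minimal sketch, not part of the original answers; it uses a shortened sample in the same shape as num and data, and the name grouped is chosen here for illustration.

```python
from collections import defaultdict

# Shortened sample in the same shape as the article's num/data lists.
num = [1.0, 1.0, 2.0, 2.0, 3.0]
data = ['201825301001', '201825301002', '201825301008', '201825301009', '201825301016']

# A missing key is created with an empty list on first access.
grouped = defaultdict(list)
for k, v in zip(num, data):
    grouped[int(k)].append(v)

# Convert back to a plain dict for printing or further use.
result = dict(grouped)
print(result)
```

Converting to a plain dict at the end is optional; it mainly prevents later lookups from silently creating new empty groups.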

Method 3

itertools.groupby achieves the same result in a more functional style. It only merges consecutive items that share a key, so this works here because num is already sorted.

from itertools import groupby
num=[1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 2.0, 2.0, 2.0, 2.0, 2.0, 2.0, 2.0, 2.0, 3.0, 3.0, 3.0, 3.0, 3.0, 3.0]
data=['201825301001', '201825301002', '201825301004', '201825301005', '201825301006', '201825301007', '201825301008', '201825301009', '201825301010', '201825301011', '201825301012', '201825301013', '201825301014', '201825301015', '201825301016', '201825301017', '201825301018', '201825301019', '201825305001', '201825305002']
result = {k: [i[1] for i in v] for k, v in groupby(zip(num, data), key=lambda x: int(x[0]))}
print(result)
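itertools.groupby starts a new group every time the key changes, so unsorted input produces repeated keys, and in a dict comprehension the later run overwrites the earlier one. The sketch below illustrates this with a small hypothetical sample (the values 'a' to 'e' are placeholders, not data from the question) and shows that sorting by key first restores the full groups.

```python
from itertools import groupby
from operator import itemgetter

# Hypothetical unsorted sample: group 1 appears again after group 2.
num = [1.0, 2.0, 1.0, 3.0, 2.0]
data = ['a', 'b', 'c', 'd', 'e']

pairs = list(zip(num, data))

# Without sorting, the later run of a key overwrites the earlier one.
unsorted_result = {int(k): [v for _, v in g] for k, g in groupby(pairs, key=itemgetter(0))}

# sorted() is stable, so values keep their original order within each group.
sorted_pairs = sorted(pairs, key=itemgetter(0))
sorted_result = {int(k): [v for _, v in g] for k, g in groupby(sorted_pairs, key=itemgetter(0))}

print(unsorted_result)  # {1: ['c'], 2: ['e'], 3: ['d']}
print(sorted_result)    # {1: ['a', 'c'], 2: ['b', 'e'], 3: ['d']}
```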

Method 4

A variant of the loop that uses dict.get with a default empty list, so no explicit membership test is needed.

num=[1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 2.0, 2.0, 2.0, 2.0, 2.0, 2.0, 2.0, 2.0, 3.0, 3.0, 3.0, 3.0, 3.0, 3.0]
data=['201825301001', '201825301002', '201825301004', '201825301005', '201825301006', '201825301007', '201825301008', '201825301009', '201825301010', '201825301011', '201825301012', '201825301013', '201825301014', '201825301015', '201825301016', '201825301017', '201825301018', '201825301019', '201825305001', '201825305002']
result = {}
for k, v in zip(num, data):
    key = int(k)
    # Take the list built so far (empty if the key is new) and extend it with v.
    result[key] = result.get(key, []) + [v]
print(result)
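The `result.get(k, []) + [v]` pattern builds a new list on every iteration. If that copying matters for larger inputs, dict.setdefault appends in place instead. This is a sketch on a shortened sample, offered as an alternative rather than as the approach from the original discussion.

```python
# Shortened sample in the same shape as the article's num/data lists.
num = [1.0, 1.0, 2.0, 2.0, 3.0]
data = ['201825301001', '201825301002', '201825301008', '201825301009', '201825301016']

result = {}
for k, v in zip(num, data):
    # setdefault returns the existing list for the key, or stores and returns a new one.
    result.setdefault(int(k), []).append(v)

print(result)
```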

Method 5

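The lists can also be loaded into a pandas DataFrame directly and grouped there, with no Excel file involved. Because num holds floats, the resulting keys are 1.0, 2.0, and 3.0 rather than integers.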

num=[1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 2.0, 2.0, 2.0, 2.0, 2.0, 2.0, 2.0, 2.0, 3.0, 3.0, 3.0, 3.0, 3.0, 3.0]
data=['201825301001', '201825301002', '201825301004', '201825301005', '201825301006', '201825301007', '201825301008', '201825301009', '201825301010', '201825301011', '201825301012', '201825301013', '201825301014', '201825301015', '201825301016', '201825301017', '201825301018', '201825301019', '201825305001', '201825305002']
import pandas as pd
df = pd.DataFrame({'num': num, 'data': data})
df = df.groupby('num').agg(list)
res = df.to_dict()['data']
print(res)
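If integer keys are preferred, as in Methods 2 to 4, the num column can be cast before grouping. The sketch below also selects the data column before aggregating, so the result is a Series that converts straight to the final dictionary. It uses a shortened sample, and the method chain is one possible arrangement rather than the one from the original answer.

```python
import pandas as pd

# Shortened sample in the same shape as the article's num/data lists.
num = [1.0, 1.0, 2.0, 2.0, 3.0]
data = ['201825301001', '201825301002', '201825301008', '201825301009', '201825301016']

df = pd.DataFrame({'num': num, 'data': data})

res = (
    df.astype({'num': int})   # cast 1.0 -> 1 so the dictionary keys are integers
      .groupby('num')['data']  # group only the 'data' column by 'num'
      .agg(list)               # one list of codes per group
      .to_dict()               # Series -> {key: list}
)
print(res)
```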

3. Summary

The article presents five ways to convert parallel lists of numeric keys and string values into a dictionary that groups the strings by key: two pandas solutions (reading from Excel or building a DataFrame, then groupby and agg(list)), a plain loop, itertools.groupby, and a dict.get variant. Readers can choose the approach that best fits their workflow.
