How to Group and Map Data in Pandas: 5 Practical Methods

This article walks through a common Python data-processing task: grouping string codes by their numeric identifiers. It presents five solutions with code snippets, two using Pandas and three using plain Python, each turning two parallel lists into a dictionary of lists.

Python Crawling & Data Mining

1. Introduction

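A question was posted in a Python community about how to process the following data. The two lists are parallel: each number in num labels the string code at the same position in data. The goal is a dictionary whose keys are the numbers and whose values are lists of the codes that share that number.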

num = [1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 2.0, 2.0, 2.0, 2.0, 2.0, 2.0, 2.0, 2.0, 3.0, 3.0, 3.0, 3.0, 3.0, 3.0]

data = ['201825301001', '201825301002', '201825301004', '201825301005', '201825301006', '201825301007', '201825301008', '201825301009', '201825301010', '201825301011', '201825301012', '201825301013', '201825301014', '201825301015', '201825301016', '201825301017', '201825301018', '201825301019', '201825305001', '201825305002']
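To make the goal concrete, here is a minimal sketch of the target structure on a shortened sample. The names num_sample, data_sample, and expected are illustrative and not from the original question; the codes themselves are taken from the lists above.

```python
# Shortened, illustrative sample of the two parallel lists.
num_sample = [1.0, 1.0, 2.0, 3.0, 3.0]
data_sample = ['201825301001', '201825301002', '201825301008',
               '201825301016', '201825301017']

# Target shape: one key per distinct number, each holding its codes in order.
expected = {
    1: ['201825301001', '201825301002'],
    2: ['201825301008'],
    3: ['201825301016', '201825301017'],
}

# Flattening the target back into (key, code) pairs should reproduce the inputs.
flattened = [(k, v) for k, values in expected.items() for v in values]
assert flattened == [(int(k), v) for k, v in zip(num_sample, data_sample)]
```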

2. Implementation

Method 1

One suggestion was to read the data from an Excel file, group the rows by the numeric column, collect each group's codes into a list, and convert the result to a dictionary.

import pandas as pd

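# Load the two columns and label them 'num' and 'data'.
df = pd.read_excel('1.xlsx', names=['num', 'data'])
# Collect the codes of each group into a list; 'num' becomes the index.
df = df.groupby('num').agg(list)
# to_dict() returns {column: {index: value}}, so select the 'data' column.
res = df.to_dict()['data']
print(res)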

The output matches the expected mapping.

Method 2

Another approach uses a simple loop to build the dictionary.

num=[1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 2.0, 2.0, 2.0, 2.0, 2.0, 2.0, 2.0, 2.0, 3.0, 3.0, 3.0, 3.0, 3.0, 3.0]
num=[int(i) for i in num]
data=['201825301001', '201825301002', '201825301004', '201825301005', '201825301006', '201825301007', '201825301008', '201825301009', '201825301010', '201825301011', '201825301012', '201825301013', '201825301014', '201825301015', '201825301016', '201825301017', '201825301018', '201825301019', '201825305001', '201825305002']
result = {}
for k, v in zip(num, data):
    if k in result:
        # Key already seen: append to its existing list.
        result[k].append(v)
    else:
        # First occurrence of this key: start a new list.
        result[k] = [v]
print(result)
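The membership test can also be avoided altogether. As a sketch, collections.defaultdict from the standard library creates the empty list the first time a key is accessed. This is an alternative to the loop above, not one of the original answers, and the shortened lists are illustrative.

```python
from collections import defaultdict

# Shortened, illustrative versions of the article's lists.
num = [1.0, 1.0, 2.0, 2.0, 3.0]
data = ['201825301001', '201825301002', '201825301008',
        '201825301009', '201825301016']

# defaultdict(list) supplies an empty list for any key not seen before.
grouped = defaultdict(list)
for k, v in zip(num, data):
    grouped[int(k)].append(v)

# Convert back to a plain dict so later lookups of missing keys raise KeyError.
result = dict(grouped)
print(result)
```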

Method 3

itertools.groupby achieves the same result in a more functional style. It groups only consecutive items with equal keys, which works here because num is already sorted.

from itertools import groupby
num=[1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 2.0, 2.0, 2.0, 2.0, 2.0, 2.0, 2.0, 2.0, 3.0, 3.0, 3.0, 3.0, 3.0, 3.0]
data=['201825301001', '201825301002', '201825301004', '201825301005', '201825301006', '201825301007', '201825301008', '201825301009', '201825301010', '201825301011', '201825301012', '201825301013', '201825301014', '201825301015', '201825301016', '201825301017', '201825301018', '201825301019', '201825305001', '201825305002']
result = {k: [i[1] for i in v] for k, v in groupby(zip(num, data), key=lambda x: int(x[0]))}
print(result)
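If the keys did not arrive in order, groupby would emit a separate group each time the key changes, and later groups would overwrite earlier ones in the dict comprehension. A minimal sketch that sorts the pairs by key first, assuming the extra sort is acceptable (the shuffled sample lists and the helper key_of are illustrative):

```python
from itertools import groupby

# Illustrative input where equal keys are NOT adjacent.
num = [2.0, 1.0, 2.0, 3.0, 1.0]
data = ['201825301008', '201825301001', '201825301009',
        '201825301016', '201825301002']


def key_of(pair):
    """Return the integer grouping key of a (num, code) pair."""
    return int(pair[0])


# sorted() is stable, so codes keep their original relative order within a key.
pairs = sorted(zip(num, data), key=key_of)
result = {k: [code for _, code in group]
          for k, group in groupby(pairs, key=key_of)}
print(result)
```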

Method 4

A variant of the loop that uses dict.get with a default value, so no explicit membership test is needed.

num=[1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 2.0, 2.0, 2.0, 2.0, 2.0, 2.0, 2.0, 2.0, 3.0, 3.0, 3.0, 3.0, 3.0, 3.0]
data=['201825301001', '201825301002', '201825301004', '201825301005', '201825301006', '201825301007', '201825301008', '201825301009', '201825301010', '201825301011', '201825301012', '201825301013', '201825301014', '201825301015', '201825301016', '201825301017', '201825301018', '201825301019', '201825305001', '201825305002']
result = {}
for k, v in zip(num, data):
    # Use an int key; get() returns [] the first time a key is seen.
    result[int(k)] = result.get(int(k), []) + [v]
print(result)

Method 5

The same Pandas grouping works without an Excel file by building the DataFrame directly from the two lists.

num=[1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 2.0, 2.0, 2.0, 2.0, 2.0, 2.0, 2.0, 2.0, 3.0, 3.0, 3.0, 3.0, 3.0, 3.0]
data=['201825301001', '201825301002', '201825301004', '201825301005', '201825301006', '201825301007', '201825301008', '201825301009', '201825301010', '201825301011', '201825301012', '201825301013', '201825301014', '201825301015', '201825301016', '201825301017', '201825301018', '201825301019', '201825305001', '201825305002']
import pandas as pd
df = pd.DataFrame({'num': num, 'data': data})
df = df.groupby('num').agg(list)
res = df.to_dict()['data']
print(res)
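Because num holds floats, the keys produced by the snippet above are 1.0, 2.0, and 3.0 rather than the integers the plain-Python methods produce. One option, sketched here on shortened illustrative lists and assuming pandas is installed, is to cast the column to int and select the data column before aggregating, so that the resulting Series converts to the mapping directly.

```python
import pandas as pd

# Shortened, illustrative versions of the article's lists.
num = [1.0, 1.0, 2.0, 2.0, 3.0]
data = ['201825301001', '201825301002', '201825301008',
        '201825301009', '201825301016']

df = pd.DataFrame({'num': num, 'data': data})
# Cast the float keys to integers before grouping.
df['num'] = df['num'].astype(int)
# Selecting 'data' first yields a Series of lists indexed by 'num'.
res = df.groupby('num')['data'].agg(list).to_dict()
print(res)
```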

3. Summary

The article presents five solutions for converting parallel lists of numeric keys and string values into a dictionary that groups the strings by key. Two use Pandas (groupby followed by list aggregation) and three use plain Python (an explicit loop, itertools.groupby, and dict.get), so readers can choose the approach that best fits their workflow.
