How to Group and Map Data in Pandas: 5 Practical Methods
This article walks through a common Python data-processing task: grouping string codes by their numeric identifiers. It presents five solutions with code snippets, two built on Pandas and three on plain Python and itertools, so readers can turn two parallel lists into a dictionary of lists.
1. Introduction
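A question was posted in a Python community about how to process the following data: two parallel lists, where each numeric identifier in num labels the string code at the same position in data. The goal is a dictionary that maps each identifier to the list of its codes.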
num = [1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 2.0, 2.0, 2.0, 2.0, 2.0, 2.0, 2.0, 2.0, 3.0, 3.0, 3.0, 3.0, 3.0, 3.0]
data = ['201825301001', '201825301002', '201825301004', '201825301005', '201825301006', '201825301007', '201825301008', '201825301009', '201825301010', '201825301011', '201825301012', '201825301013', '201825301014', '201825301015', '201825301016', '201825301017', '201825301018', '201825301019', '201825305001', '201825305002']
2. Implementation
Method 1
A suggestion was given to read the data from an Excel file, group by the numeric column, and convert the groups to a dictionary.
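import pandas as pd
# Read the Excel file and name its two columns
df = pd.read_excel('1.xlsx', names=['num', 'date'])
# Collect all values that share the same 'num' into one list per group
df = df.groupby('num').agg(list)
# to_dict() is keyed by column name first, so select the 'date' column
res = df.to_dict()['date']
print(res)
The output matches the expected mapping.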
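For readers without the Excel file at hand, the same grouping can be tried on a small in-memory frame. The sketch below is an illustration rather than the original answer: the sample rows are made up, and selecting the column before aggregating is just one way to get the dictionary in a single chain.

```python
import pandas as pd

# Hypothetical stand-in for the contents of '1.xlsx' (sample rows are illustrative)
df = pd.DataFrame({
    'num': [1.0, 1.0, 2.0],
    'date': ['201825301001', '201825301002', '201825301008'],
})

# Select the column first, so agg(list) yields a Series that to_dict() maps directly
res = df.groupby('num')['date'].agg(list).to_dict()
print(res)
```

Selecting the column up front avoids the intermediate dictionary keyed by column name.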
Method 2
Another approach uses a simple loop to build the dictionary.
num=[1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 2.0, 2.0, 2.0, 2.0, 2.0, 2.0, 2.0, 2.0, 3.0, 3.0, 3.0, 3.0, 3.0, 3.0]
num=[int(i) for i in num]
data=['201825301001', '201825301002', '201825301004', '201825301005', '201825301006', '201825301007', '201825301008', '201825301009', '201825301010', '201825301011', '201825301012', '201825301013', '201825301014', '201825301015', '201825301016', '201825301017', '201825301018', '201825301019', '201825305001', '201825305002']
result={}
for k, v in zip(num, data):
    if k in result:
        # Key already seen: append to its list
        result[k].append(v)
    else:
        # First occurrence: start a new list
        result[k] = [v]
print(result)
Method 3
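The same result can be achieved in a more functional style with itertools.groupby. This relies on num already being sorted, because groupby only merges consecutive items that share a key.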
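To see why ordering matters, consider what happens when a key reappears later in the input: itertools.groupby starts a new group each time the key changes, and a dict comprehension then keeps only the last group for that key. A minimal sketch of the pitfall and of sorting first, using made-up sample values and a hypothetical helper named group_pairs:

```python
from itertools import groupby

# Illustrative unsorted input (not the data from the question)
num = [1.0, 2.0, 1.0]
data = ['a', 'b', 'c']

def group_pairs(pairs):
    # Later groups with the same key overwrite earlier ones in the dict
    return {k: [item[1] for item in g]
            for k, g in groupby(pairs, key=lambda x: int(x[0]))}

unsorted_result = group_pairs(zip(num, data))
# sorted() is stable, so values keep their original order within each key
sorted_result = group_pairs(sorted(zip(num, data), key=lambda x: x[0]))

print(unsorted_result)  # {1: ['c'], 2: ['b']}
print(sorted_result)    # {1: ['a', 'c'], 2: ['b']}
```

The lists in this article are already ordered by num, so the snippet that follows needs no extra sort.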
from itertools import groupby
num=[1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 2.0, 2.0, 2.0, 2.0, 2.0, 2.0, 2.0, 2.0, 3.0, 3.0, 3.0, 3.0, 3.0, 3.0]
data=['201825301001', '201825301002', '201825301004', '201825301005', '201825301006', '201825301007', '201825301008', '201825301009', '201825301010', '201825301011', '201825301012', '201825301013', '201825301014', '201825301015', '201825301016', '201825301017', '201825301018', '201825301019', '201825305001', '201825305002']
result = {k: [i[1] for i in v] for k, v in groupby(zip(num, data), key=lambda x: int(x[0]))}
print(result)
Method 4
This variant builds the dictionary in a single pass using dict.get with a default value, then converts the float keys to integers. It needs no imports.
num=[1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 2.0, 2.0, 2.0, 2.0, 2.0, 2.0, 2.0, 2.0, 3.0, 3.0, 3.0, 3.0, 3.0, 3.0]
data=['201825301001', '201825301002', '201825301004', '201825301005', '201825301006', '201825301007', '201825301008', '201825301009', '201825301010', '201825301011', '201825301012', '201825301013', '201825301014', '201825301015', '201825301016', '201825301017', '201825301018', '201825301019', '201825305001', '201825305002']
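result = {}
for k, v in zip(num, data):
    # get() returns [] for an unseen key, so each key starts from an empty list
    result[k] = result.get(k, []) + [v]
# Convert the float keys (1.0, 2.0, ...) to integers without touching the lists
result = {int(k): v for k, v in result.items()}
print(result)
Method 5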
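The last method builds a Pandas DataFrame from the two lists and groups it directly, without going through an Excel file. The keys of the resulting dictionary remain floats (1.0, 2.0, 3.0).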
num=[1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 2.0, 2.0, 2.0, 2.0, 2.0, 2.0, 2.0, 2.0, 3.0, 3.0, 3.0, 3.0, 3.0, 3.0]
data=['201825301001', '201825301002', '201825301004', '201825301005', '201825301006', '201825301007', '201825301008', '201825301009', '201825301010', '201825301011', '201825301012', '201825301013', '201825301014', '201825301015', '201825301016', '201825301017', '201825301018', '201825301019', '201825305001', '201825305002']
import pandas as pd
df = pd.DataFrame({'num': num, 'data': data})
df = df.groupby('num').agg(list)
res = df.to_dict()['data']
print(res)
3. Summary
The article presents five solutions for converting parallel lists of numeric keys and string values into a dictionary that groups the strings by their numeric key. Two of them use Pandas (groupby followed by list aggregation) and three use plain Python (an explicit loop, itertools.groupby, and dict.get), so readers can choose the approach that best fits their workflow.
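When choosing between these styles, a quick sanity check is to compare the loop-based and Pandas-based results directly. The sketch below uses a shortened sample and collections.defaultdict, which is not one of the five methods but a common standard-library shortcut for the loop approach; the variable names are illustrative.

```python
from collections import defaultdict

import pandas as pd

# Shortened illustrative sample in the same shape as the article's lists
num = [1.0, 1.0, 2.0, 3.0, 3.0]
data = ['201825301001', '201825301002', '201825301008',
        '201825305001', '201825305002']

# Loop style: defaultdict creates the empty list on first access
by_loop = defaultdict(list)
for k, v in zip(num, data):
    by_loop[k].append(v)

# Pandas style: group the column and collect each group into a list
by_pandas = (
    pd.DataFrame({'num': num, 'data': data})
    .groupby('num')['data']
    .agg(list)
    .to_dict()
)

# Both approaches should produce the same mapping
print(dict(by_loop) == by_pandas)
```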
Python Crawling & Data Mining