How to Reshape Complex Python Dictionaries into Structured Data with Simple Code
This article walks through a real‑world Python dictionary processing challenge, demonstrates several solutions—from basic loops to itertools grouping and Pandas aggregation—provides complete code snippets, and explains how to obtain the desired structured output.
1. Introduction
Hello, I am PiPi. A member of a Python community asked how to process a list of dictionaries, each containing a time, a content field and a nested speaker list. The expected result is a single merged dictionary in which the speakers that share the same time and content are combined into one list.
The original data is shown below.
a = [
    {
        'time': '8:30-9:30',
        'content': '开场致词',  # "opening remarks"
        'speaker': [{'name': '李明', 'hs': '重庆附属永川'}]
    },
    {
        'time': '8:30-9:30',
        'content': '开场致词',
        'speaker': [{'name': '主席:李伟', 'hs': '苏州附属院'}]
    },
    {
        'time': '8:30-9:30',
        'content': '开场致词',
        'speaker': [{'name': '王斌', 'hs': '佛山市院'}]
    }
]
2. Basic Implementation
The first solution, shared by a community member, uses a simple loop to collect speakers under the same key.
My own implementation follows the same idea but adds explicit merging of the static part of the dictionary.
a = [
    {
        'time': '8:30-9:30',
        'content': '开场致词',
        'speaker': [{'name': '李明', 'hs': '重庆附属永川'}]
    },
    {
        'time': '8:30-9:30',
        'content': '开场致词',
        'speaker': [{'name': '主席:李伟', 'hs': '苏州附属院'}]
    },
    {
        'time': '8:30-9:30',
        'content': '开场致词',
        'speaker': [{'name': '王斌', 'hs': '佛山市院'}]
    }
]
new_dict = {}
for item in a:
    # extend (rather than append) keeps 'speaker' a flat list of speaker dicts
    new_dict.setdefault('speaker', []).extend(item['speaker'])
# The static part of the dictionary is written out by hand
front_dict = {'time': '8:30-9:30', 'content': '开场致词'}
final_dict = {**front_dict, **new_dict}
print(final_dict)
The result is correct for this input, although the code contains some redundancy: the time and content values are hard-coded rather than read from the data.
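One way the redundancy might be reduced is to key an intermediate dictionary on the (time, content) pair, so nothing is hard-coded and several different time slots can be handled in one pass. The following is only a sketch: the helper name `merge_speakers` is hypothetical, and it assumes the desired output is a flat list of speaker dictionaries per group.

```python
a = [
    {"time": "8:30-9:30", "content": "开场致词",
     "speaker": [{"name": "李明", "hs": "重庆附属永川"}]},
    {"time": "8:30-9:30", "content": "开场致词",
     "speaker": [{"name": "主席:李伟", "hs": "苏州附属院"}]},
    {"time": "8:30-9:30", "content": "开场致词",
     "speaker": [{"name": "王斌", "hs": "佛山市院"}]},
]


def merge_speakers(items):
    """Merge entries that share the same (time, content) pair.

    Hypothetical helper, not part of the original article.
    """
    merged = {}  # dicts keep insertion order, so groups stay in first-seen order
    for item in items:
        key = (item["time"], item["content"])
        # Create the group on first sight, then reuse it for later items
        group = merged.setdefault(
            key,
            {"time": item["time"], "content": item["content"], "speaker": []},
        )
        # extend flattens the nested speaker lists into one list
        group["speaker"].extend(item["speaker"])
    return list(merged.values())


result = merge_speakers(a)
print(result)
```

Because each group gets its own new `speaker` list, the input dictionaries in `a` are left unmodified.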
3. Using Pandas
A more concise solution leverages Pandas to group and aggregate the speakers.
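As a rough illustration of the idea, the grouping can be spelled out step by step. This is a sketch rather than the community member's exact code: the `flatten` helper is hypothetical, and passing `sort=False` to keep groups in first-seen order is an assumption about the desired ordering.

```python
from itertools import chain

import pandas as pd

a = [
    {"time": "8:30-9:30", "content": "开场致词",
     "speaker": [{"name": "李明", "hs": "重庆附属永川"}]},
    {"time": "8:30-9:30", "content": "开场致词",
     "speaker": [{"name": "主席:李伟", "hs": "苏州附属院"}]},
    {"time": "8:30-9:30", "content": "开场致词",
     "speaker": [{"name": "王斌", "hs": "佛山市院"}]},
]


def flatten(speaker_lists):
    """Concatenate the per-row speaker lists of one group (hypothetical helper)."""
    return list(chain.from_iterable(speaker_lists))


df = pd.DataFrame(a)  # one row per original dictionary
merged = (
    df.groupby(["time", "content"], sort=False)["speaker"]
    .agg(flatten)                 # one flat speaker list per (time, content)
    .reset_index()                # turn the group keys back into columns
    .to_dict(orient="records")    # back to a list of plain dictionaries
)
print(merged)
```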
4. Optimized Solutions
Another community member refined the code with itertools.groupby and operator.itemgetter to produce the merged structure in a single expression.
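from itertools import groupby
from operator import itemgetter
# Note: groupby only merges consecutive items, so sort `a` by the same key
# first if entries with equal time and content are not already adjacent.
[dict(zip(('time', 'content', 'speaker'),
          (*key, sum([i['speaker'] for i in value], []))))
 for key, value in groupby(a, itemgetter('time', 'content'))]
The Pandas approach can also be written more compactly: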
import pandas as pd
pd.DataFrame(a).groupby(['time', 'content']).speaker.sum().reset_index().to_dict(orient='records')
5. Conclusion
This article presented a practical Python dictionary processing problem, explored multiple implementations—including basic loops, itertools grouping, and Pandas aggregation—and provided complete, runnable code snippets that achieve the desired merged output.