
How to Group and Order Pandas Data by Original Appearance – 6 Clever Methods

This article demonstrates six ways to group the values of a Pandas DataFrame column so that identical values sit together while the groups keep the order in which each value first appears, with code and result screenshots for the methods.

Python Crawling & Data Mining

Introduction

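The author presents a Pandas data‑processing challenge: given a data column, build a new column in which identical values are grouped together and the groups appear in the order each value first occurs in the original column. Duplicates are kept, so the new column has the same length as the original. The initial DataFrame is shown below, with the expected output already filled in as the "new" column.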

import pandas as pd

df = pd.DataFrame({
    'data': ['A1', 'D3', 'B2', 'C4', 'A1', 'A2', 'B2', 'B3', 'C3', 'C4', 'D5', 'D3'],
    'new': ['A1', 'A1', 'D3', 'D3', 'B2', 'B2', 'C4', 'C4', 'A2', 'B3', 'C3', 'D5']
})
print(df)
# Please add your code. The "new" column should contain the grouped‑and‑ordered result.
print(df)

The expected output is shown in the following image:

Result image

Method 1

A solution contributed by "猫药师Kelly" is shown in the image below.

Method 1 result

Method 2

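from collections import Counter  # required by the line below; df is the DataFrame from the Introduction
df['newnew'] = sum([[k]*v for k, v in Counter(df['data']).items()], [])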

The resulting DataFrame is shown in the following screenshot:

Method 2 result
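
The one-liner in Method 2 relies on two behaviours: Counter is a dict subclass, so on Python 3.7+ its keys keep the order in which they were first seen, and sum(..., []) concatenates a list of lists. The following is a minimal sketch that unpacks those steps; the names counts, runs and grouped are illustrative and do not come from the original article.

```python
from collections import Counter

import pandas as pd

df = pd.DataFrame({
    'data': ['A1', 'D3', 'B2', 'C4', 'A1', 'A2', 'B2', 'B3', 'C3', 'C4', 'D5', 'D3'],
})

# Counter is a dict subclass, so on Python 3.7+ its keys keep first-appearance order.
counts = Counter(df['data'])

# Repeat each value as many times as it occurs: [['A1', 'A1'], ['D3', 'D3'], ...]
runs = [[value] * n for value, n in counts.items()]

# sum(..., []) concatenates the sub-lists into one flat list of the original length.
df['grouped'] = sum(runs, [])
print(df)
```

Note that sum over lists copies the accumulated list at every step, so for long columns the itertools.chain form used in Method 3 is likely the better choice.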

Method 3

import pandas as pd
from collections import Counter
from itertools import chain

df = pd.DataFrame({
    'data': ['A1', 'D3', 'B2', 'C4', 'A1', 'A2', 'B2', 'B3', 'C3', 'C4', 'D5', 'D3'],
    'new': ['A1', 'A1', 'D3', 'D3', 'B2', 'B2', 'C4', 'C4', 'A2', 'B3', 'C3', 'D5']
})
print(df)
df['newnew'] = [*chain(*([k]*v for k, v in Counter(df['data']).items()))]
print(df)

Result screenshot:

Method 3 result

Method 4

import pandas as pd

df = pd.DataFrame({
    'data': ['A1', 'D3', 'B2', 'C4', 'A1', 'A2', 'B2', 'B3', 'C3', 'C4', 'D5', 'D3'],
    'new': ['A1', 'A1', 'D3', 'D3', 'B2', 'B2', 'C4', 'C4', 'A2', 'B3', 'C3', 'D5']
})
print(df)
df['new2'] = df['data'].unique().repeat(df['data'].value_counts(sort=False))
print(df)

Result screenshot:

Method 4 result
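
Method 4 works only if value_counts(sort=False) returns its counts in the same order as unique(). That ordering may not hold on every pandas version, so a more defensive variant can align the counts to the unique values explicitly before repeating. This is a sketch under that assumption; the names order, counts and grouped are illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    'data': ['A1', 'D3', 'B2', 'C4', 'A1', 'A2', 'B2', 'B3', 'C3', 'C4', 'D5', 'D3'],
})

# Distinct values in order of first appearance.
order = df['data'].unique()

# Occurrences per value; reindex forces the counts into the same order as `order`,
# whatever order value_counts happened to return them in.
counts = df['data'].value_counts(sort=False).reindex(order)

# Repeat each distinct value by its own count.
df['grouped'] = order.repeat(counts.to_numpy())
print(df)
```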

Method 5

import pandas as pd

df = pd.DataFrame({
    'data': ['A1', 'D3', 'B2', 'C4', 'A1', 'A2', 'B2', 'B3', 'C3', 'C4', 'D5', 'D3'],
    'new': ['A1', 'A1', 'D3', 'D3', 'B2', 'B2', 'C4', 'C4', 'A2', 'B3', 'C3', 'D5']
})
print(df)
df['new3'] = df['data'].astype('category').cat.reorder_categories(df['data'].unique()).sort_values().values
print(df)

Result screenshot:

Method 5 result
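
Method 5 works because sorting a categorical column orders values by their category position rather than alphabetically, and the categories were set to the first-appearance order. The same idea can be sketched with pd.factorize, which is a different technique from the one in the article: it assigns each value an integer code in order of first appearance, so sorting the codes groups the values. The names codes, uniques and grouped are illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'data': ['A1', 'D3', 'B2', 'C4', 'A1', 'A2', 'B2', 'B3', 'C3', 'C4', 'D5', 'D3'],
})

# codes[i] is the first-appearance rank of df['data'][i]; uniques holds the distinct values.
codes, uniques = pd.factorize(df['data'])

# Sorting the integer codes groups equal values and keeps groups in first-appearance order;
# take() maps the sorted codes back to the original values.
df['grouped'] = uniques.take(np.sort(codes)).to_numpy()
print(df)
```

Unlike Method 5, this leaves the new column with the same dtype family as the source column instead of a categorical dtype.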

Method 6

import pandas as pd

df = pd.DataFrame({
    'data': ['A1', 'D3', 'B2', 'C4', 'A1', 'A2', 'B2', 'B3', 'C3', 'C4', 'D5', 'D3'],
    'new': ['A1', 'A1', 'D3', 'D3', 'B2', 'B2', 'C4', 'C4', 'A2', 'B3', 'C3', 'D5']
})
print(df)
df['new4'] = sorted(df['data'].tolist(), key=df['data'].tolist().index)
print(df)

Result screenshot:

Method 6 result
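
Method 6 calls list.index once per element, and each call scans the list from the start, so the work grows quadratically with the column length. A minimal sketch of a linear-lookup variant precomputes each value's first position in a dictionary and uses that as the sort key. The names values, first_pos and grouped are illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    'data': ['A1', 'D3', 'B2', 'C4', 'A1', 'A2', 'B2', 'B3', 'C3', 'C4', 'D5', 'D3'],
})

values = df['data'].tolist()

# Record the index at which each value first appears; setdefault keeps the earliest index.
first_pos = {}
for i, value in enumerate(values):
    first_pos.setdefault(value, i)

# Sort by first position; equal values share a key, so they end up adjacent.
df['grouped'] = sorted(values, key=first_pos.__getitem__)
print(df)
```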

Conclusion

The article showcases six techniques, some based on Pandas and some on plain Python, for building a column in which identical values are grouped together and the groups follow the order of first appearance in the source column. Each approach is shown with its result screenshot, and readers are encouraged to experiment and share alternative solutions.
