How to Group and Order Pandas Data by Original Appearance – 6 Clever Methods

This article demonstrates six different ways to use Pandas for grouping and ordering the values in a DataFrame column according to their original occurrence order, providing complete code examples, explanations, and visual results for each method.

Python Crawling & Data Mining

May 31, 2022

Introduction

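The author presents a Pandas data‑processing challenge: given a data column, build a new column in which equal values are grouped together and the groups are ordered by where each value first appears in the original column, keeping every duplicate. In the DataFrame below, the data column is the input and the new column holds the expected result.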

import pandas as pd

df = pd.DataFrame({
    'data': ['A1', 'D3', 'B2', 'C4', 'A1', 'A2', 'B2', 'B3', 'C3', 'C4', 'D5', 'D3'],
    'new': ['A1', 'A1', 'D3', 'D3', 'B2', 'B2', 'C4', 'C4', 'A2', 'B3', 'C3', 'D5']
})
print(df)
# Please add your code. The "new" column should contain the grouped‑and‑ordered result.
print(df)

Resulting output is displayed in the following image:

Method 1

A solution contributed by "猫药师Kelly" is shown in the image below.

Method 2

from collections import Counter  # required for the line below
df['newnew'] = sum([[k]*v for k, v in Counter(df['data']).items()], [])

The resulting DataFrame is shown in the following screenshot:
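
Method 2 appears to rest on two facts: Counter is a dict subclass, so on Python 3.7+ its keys come back in the order they were first seen, and sum(..., []) concatenates the per-key lists into one. A self-contained sketch of the same idea follows; the intermediate names counts and grouped are introduced here only for illustration.

```python
from collections import Counter

import pandas as pd

df = pd.DataFrame({
    'data': ['A1', 'D3', 'B2', 'C4', 'A1', 'A2', 'B2', 'B3', 'C3', 'C4', 'D5', 'D3'],
})

# Counter keeps its keys in first-seen order (dict ordering, Python 3.7+).
counts = Counter(df['data'])

# Repeat each key by its count, then concatenate the small lists into one.
grouped = sum([[k] * v for k, v in counts.items()], [])

df['newnew'] = grouped
print(df)
```

Concatenating with sum copies the growing list at every step, so for long columns the itertools.chain form used in Method 3 is likely the better choice.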

Method 3

import pandas as pd
from collections import Counter
from itertools import chain

df = pd.DataFrame({
    'data': ['A1', 'D3', 'B2', 'C4', 'A1', 'A2', 'B2', 'B3', 'C3', 'C4', 'D5', 'D3'],
    'new': ['A1', 'A1', 'D3', 'D3', 'B2', 'B2', 'C4', 'C4', 'A2', 'B3', 'C3', 'D5']
})
print(df)
df['newnew'] = [*chain(*([k]*v for k, v in Counter(df['data']).items()))]
print(df)

Result screenshot:

Method 4

import pandas as pd

df = pd.DataFrame({
    'data': ['A1', 'D3', 'B2', 'C4', 'A1', 'A2', 'B2', 'B3', 'C3', 'C4', 'D5', 'D3'],
    'new': ['A1', 'A1', 'D3', 'D3', 'B2', 'B2', 'C4', 'C4', 'A2', 'B3', 'C3', 'D5']
})
print(df)
df['new2'] = df['data'].unique().repeat(df['data'].value_counts(sort=False))
print(df)

Result screenshot:
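
Method 4 pairs unique(), which returns values in order of first appearance, with the counts from value_counts(sort=False), so it is only correct when both come back in the same order. That ordering of value_counts may depend on the pandas version. The sketch below is a variant, not the original one-liner: it aligns the counts to the unique values explicitly with reindex. The names uniques and counts are illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'data': ['A1', 'D3', 'B2', 'C4', 'A1', 'A2', 'B2', 'B3', 'C3', 'C4', 'D5', 'D3'],
})

# Distinct values in order of first appearance.
uniques = df['data'].unique()

# Occurrence counts, explicitly reordered to match `uniques`.
counts = df['data'].value_counts(sort=False).reindex(uniques)

# Repeat each distinct value by its own count.
df['new2'] = np.repeat(np.asarray(uniques, dtype=object), counts.to_numpy())
print(df)
```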

Method 5

import pandas as pd

df = pd.DataFrame({
    'data': ['A1', 'D3', 'B2', 'C4', 'A1', 'A2', 'B2', 'B3', 'C3', 'C4', 'D5', 'D3'],
    'new': ['A1', 'A1', 'D3', 'D3', 'B2', 'B2', 'C4', 'C4', 'A2', 'B3', 'C3', 'D5']
})
print(df)
df['new3'] = df['data'].astype('category').cat.reorder_categories(df['data'].unique()).sort_values().values
print(df)

Result screenshot:
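
Method 5 works because sorting a categorical column follows the order of its categories rather than alphabetical order, and reorder_categories sets that order to the first-appearance order returned by unique(). The same steps are unpacked below as a sketch; the intermediate names order and cat are illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    'data': ['A1', 'D3', 'B2', 'C4', 'A1', 'A2', 'B2', 'B3', 'C3', 'C4', 'D5', 'D3'],
})

# Distinct values in order of first appearance.
order = df['data'].unique()

# Convert to categorical, then make the category order match `order`.
cat = df['data'].astype('category').cat.reorder_categories(order)

# Sorting a categorical sorts by category position, not alphabetically.
# .values drops the index, so the result is assigned by position
# instead of being realigned to the original row labels.
df['new3'] = cat.sort_values().values
print(df)
```

Without .values, the assignment would align on the index and put every value back in its original row, which undoes the sort.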

Method 6

import pandas as pd

df = pd.DataFrame({
    'data': ['A1', 'D3', 'B2', 'C4', 'A1', 'A2', 'B2', 'B3', 'C3', 'C4', 'D5', 'D3'],
    'new': ['A1', 'A1', 'D3', 'D3', 'B2', 'B2', 'C4', 'C4', 'A2', 'B3', 'C3', 'D5']
})
print(df)
df['new4'] = sorted(df['data'].tolist(), key=df['data'].tolist().index)
print(df)

Result screenshot:
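
Method 6 sorts the values by the position of their first occurrence: list.index returns the first matching index, so all copies of a value share one sort key and end up adjacent. Because list.index is a linear scan, every key lookup costs O(n). A possible variant, not part of the original article, precomputes the first positions in a dictionary; the names values and first_pos are illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    'data': ['A1', 'D3', 'B2', 'C4', 'A1', 'A2', 'B2', 'B3', 'C3', 'C4', 'D5', 'D3'],
})

values = df['data'].tolist()

# Map each distinct value to the index of its first occurrence.
first_pos = {}
for i, v in enumerate(values):
    first_pos.setdefault(v, i)  # keeps the earliest index only

# Sort by first-occurrence index; each key lookup is now O(1).
df['new4'] = sorted(values, key=first_pos.__getitem__)
print(df)
```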

Conclusion

The article showcases six distinct Pandas techniques for generating a column that reflects the original order of elements in another column, illustrating each approach with full code and visual output, and encourages readers to experiment and share alternative solutions.
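
For readers trying out their own solutions, a small check against the expected new column can help. The sketch below assumes a hypothetical helper named matches_expected, which is not from the original article, and uses the Method 6 expression as the candidate.

```python
import pandas as pd

df = pd.DataFrame({
    'data': ['A1', 'D3', 'B2', 'C4', 'A1', 'A2', 'B2', 'B3', 'C3', 'C4', 'D5', 'D3'],
    'new': ['A1', 'A1', 'D3', 'D3', 'B2', 'B2', 'C4', 'C4', 'A2', 'B3', 'C3', 'D5']
})


def matches_expected(candidate, expected):
    """Return True if the candidate values equal the expected ones, position by position."""
    return list(candidate) == list(expected)


# Candidate under test: the Method 6 expression.
candidate = sorted(df['data'].tolist(), key=df['data'].tolist().index)
print(matches_expected(candidate, df['new']))
```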
