Useful but Uncommon Pandas Functions: to_period, cumsum, groupby, and Category dtype

This article demonstrates several lesser‑known but highly useful Pandas functions—including to_period for period conversion, cumsum with groupby for cumulative sums, and the memory‑efficient Category dtype—through a step‑by‑step example DataFrame with code snippets and output illustrations.

Python Programming Learning Circle

May 18, 2022

In this tutorial we showcase some uncommon yet very handy Pandas functions using a sample DataFrame with three columns (date, class, amount) and 100 rows.

First, we create the DataFrame:

import numpy as np
import pandas as pd
df = pd.DataFrame({
    "date": pd.date_range(start="2021-11-20", periods=100, freq="D"),
    "class": ["A","B","C","D"] * 25,
    "amount": np.random.randint(10, 100, size=100)
})

df.head()

The DataFrame contains a continuous date column, a categorical class column with four distinct values, and a random integer amount column.

1. to_period

The to_period method converts datetime values to a specific time period such as month ("M") or quarter ("Q"), enabling proper time‑series grouping.

df["month"] = df["date"].dt.to_period("M")
df["quarter"] = df["date"].dt.to_period("Q")

df.head()

We can view the counts of each month and quarter:

df["month"].value_counts()
# output
2021-12   31
2022-01   31
2022-02   27
2021-11   11
Freq: M, Name: month, dtype: int64

--------------------------
df["quarter"].value_counts()
# output
2022Q1   58
2021Q4   42
Freq: Q-DEC, Name: quarter, dtype: int64
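Beyond counting rows, a period column can serve as a grouping key. The following is a minimal sketch, not part of the original walkthrough: it rebuilds the sample frame with a fixed random seed (an assumption added for reproducibility) and sums amount per month. The names monthly_total and month_starts are illustrative.

```python
import numpy as np
import pandas as pd

# Rebuild the sample frame; the fixed seed is an assumption added for reproducibility.
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "date": pd.date_range(start="2021-11-20", periods=100, freq="D"),
    "class": ["A", "B", "C", "D"] * 25,
    "amount": rng.integers(10, 100, size=100),
})
df["month"] = df["date"].dt.to_period("M")

# Total amount per calendar month; the result is indexed by monthly periods.
monthly_total = df.groupby("month")["amount"].sum()

# to_timestamp() converts the periods back to datetimes at the start of each month.
month_starts = monthly_total.index.to_timestamp()

print(monthly_total)
print(month_starts)
```

Grouping by the period column keeps the months in chronological order, which is harder to guarantee when grouping by formatted month-name strings.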

2. cumsum and groupby

The cumsum function computes the cumulative sum of a column. Applied directly, it gives the running total of amount:

df["cumulative_sum"] = df["amount"].cumsum()
df.head()

To obtain cumulative sums per class, we combine groupby with cumsum:

df["class_cum_sum"] = df.groupby("class")["amount"].cumsum()

Viewing the result for class "A":

df[df["class"] == "A"].head()

The new column class_cum_sum contains cumulative totals calculated separately for each class.
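One way to sanity-check the grouped running total is to compare the last value in each class against that class's plain sum. This is a small sketch under the same assumptions as the sample data (a seeded random amount column); last_running and class_totals are illustrative names.

```python
import numpy as np
import pandas as pd

# Seeded sample data; the seed is an assumption added for reproducibility.
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "class": ["A", "B", "C", "D"] * 25,
    "amount": rng.integers(10, 100, size=100),
})
df["class_cum_sum"] = df.groupby("class")["amount"].cumsum()

# The last running total in each class should equal that class's plain sum.
last_running = df.groupby("class")["class_cum_sum"].last()
class_totals = df.groupby("class")["amount"].sum()

print((last_running == class_totals).all())
```

Note that groupby followed by cumsum returns a Series aligned to the original row index, which is why it can be assigned straight back as a new column.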

3. Category dtype

Columns with a limited set of values can be stored as the category dtype, which uses less memory than the default object type.

df.dtypes
# output
date                datetime64[ns]
class                       object
amount                       int64
month                    period[M]
quarter              period[Q-DEC]
cumulative_sum               int64
class_cum_sum                int64

Convert the class column to a categorical type:

df["class_category"] = df["class"].astype("category")
df.dtypes
# output includes
class_category    category
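To see where the savings come from, it can help to look at how a categorical column is laid out: the distinct labels are kept once, and each row stores a small integer code. The sketch below uses a standalone Series with the same values as the class column; the variable names are illustrative.

```python
import pandas as pd

# Same values as the "class" column in the sample DataFrame.
classes = pd.Series(["A", "B", "C", "D"] * 25)
as_category = classes.astype("category")

# The distinct labels are stored once ...
print(as_category.cat.categories)

# ... and each row holds a small integer code pointing into that list.
print(as_category.cat.codes.head())
print(as_category.cat.codes.dtype)
```

With only four categories, the codes fit in a one-byte integer per row.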

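A memory usage comparison shows that the categorical column takes less than half the memory of the object column. Note that memory_usage() without deep=True counts only the 8-byte pointers of an object column, not the strings they point to: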

df.memory_usage()
# output
Index               128
date                800
class               800
amount              800
month               800
quarter             800
cumulative_sum      800
class_cum_sum       800
class_category      304
dtype: int64

The saving is only 496 bytes on this 100-row dataset, but it grows with the number of rows, so the category dtype can save substantial memory on large datasets whose columns hold few distinct values.
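To get a feel for the effect at scale, the comparison can be repeated on a longer column. This is a rough sketch rather than a benchmark: the one-million-row size is arbitrary, and deep=True is used so that the strings behind the object column are counted as well.

```python
import pandas as pd

# One million rows is an arbitrary size chosen for illustration.
classes = pd.Series(["A", "B", "C", "D"] * 250_000, dtype="object")
as_category = classes.astype("category")

# deep=True also counts the Python string objects behind an object column,
# which the default shallow measurement leaves out.
object_bytes = classes.memory_usage(index=False, deep=True)
category_bytes = as_category.memory_usage(index=False, deep=True)

print(object_bytes, category_bytes)
```

The exact byte counts depend on the pandas and Python versions, so they are not reproduced here.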
