Accelerating Pandas apply: Up to 600× Speedup with Swifter, Vectorization, dtype Conversion, and .values

This article demonstrates how to speed up the slow pandas apply function, by up to six hundred times according to the source, using Swifter for parallel execution, vectorized pandas/NumPy operations, dtype downcasting, and direct .values array manipulation, with timing comparisons for each step.

Python Programming Learning Circle

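Packages such as Dask and cuDF can accelerate data processing, but many users still rely on pandas, whose row-wise apply is notoriously slow. This article walks through four techniques that speed it up. The title cites up to 600×; the timings reported below go from 18.4 s to 74.9 ms, which is roughly 246×.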

Baseline (apply only): Applying a custom function row by row (axis=1) to a 1,000,000‑row DataFrame takes 18.4 seconds.

import pandas as pd
import numpy as np

df = pd.DataFrame(np.random.randint(0, 11, size=(1000000, 5)),
                  columns=('a','b','c','d','e'))

def func(a,b,c,d,e):
    if e == 10:
        return c*d
    elif e < 10 and e >= 5:
        return c+d
    else:
        return a+b

df['new'] = df.apply(lambda x: func(x['a'],x['b'],x['c'],x['d'],x['e']), axis=1)
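The article does not show how its timings were taken. A minimal sketch of one way to measure the baseline, assuming time.perf_counter as the clock and a smaller 10,000-row frame (the name small and the seeded generator are illustrative, not from the source) so that it finishes quickly:

```python
import time

import numpy as np
import pandas as pd

# Assumption: 10,000 rows instead of the article's 1,000,000, to keep the run short.
rng = np.random.default_rng(0)
small = pd.DataFrame(rng.integers(0, 11, size=(10_000, 5)),
                     columns=list("abcde"))

def func(a, b, c, d, e):
    # Same branching logic as the article's func.
    if e == 10:
        return c * d
    elif e < 10 and e >= 5:
        return c + d
    else:
        return a + b

start = time.perf_counter()
small["new"] = small.apply(
    lambda x: func(x["a"], x["b"], x["c"], x["d"], x["e"]), axis=1)
elapsed = time.perf_counter() - start

print(f"apply on {len(small):,} rows took {elapsed:.3f} s")
```

Absolute numbers depend on the machine and library versions, so only the relative change between techniques is meaningful.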

1. Swifter parallelization: Installing swifter and calling apply through its df.swifter accessor reduces the wall time to 7.67 seconds.

import swifter

df['new'] = df.swifter.apply(lambda x: func(x['a'],x['b'],x['c'],x['d'],x['e']), axis=1)

2. Vectorization with pandas/NumPy: Rewriting the logic as vectorized column operations brings the execution time down to 421 ms.

df['new'] = df['c'] * df['d']            # start with the e == 10 result everywhere
mask = df['e'] < 10
df.loc[mask, 'new'] = df['c'] + df['d']  # overwrite rows with e < 10
mask = df['e'] < 5
df.loc[mask, 'new'] = df['a'] + df['b']  # overwrite rows with e < 5 again
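The vectorized version relies on overwriting rows in a particular order, and it matches func only because e never exceeds 10 in this data (a row with e > 10 would keep c*d, whereas func would return a+b). A small sketch that checks the two approaches against each other, assuming a 2,000-row sample frame named check (an illustrative name):

```python
import numpy as np
import pandas as pd

def func(a, b, c, d, e):
    # Same branching logic as the article's func.
    if e == 10:
        return c * d
    elif e < 10 and e >= 5:
        return c + d
    else:
        return a + b

rng = np.random.default_rng(42)
check = pd.DataFrame(rng.integers(0, 11, size=(2_000, 5)),
                     columns=list("abcde"))

# Reference result from the slow row-wise apply.
expected = check.apply(
    lambda x: func(x["a"], x["b"], x["c"], x["d"], x["e"]), axis=1)

# Vectorized result, using the same overwrite order as the article.
check["new"] = check["c"] * check["d"]
mask = check["e"] < 10
check.loc[mask, "new"] = check["c"] + check["d"]
mask = check["e"] < 5
check.loc[mask, "new"] = check["a"] + check["b"]

matches = bool((check["new"] == expected).all())
print("vectorized result matches apply:", matches)
```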

3. Downcasting column dtypes to int16: Converting columns a through d to int16 before running the vectorized operations further cuts the time to 116 ms.

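for col in ('a','b','c','d'):
    df[col] = df[col].astype(np.int16)

# same vectorized operations as above, now on int16 columns
df['new'] = df['c'] * df['d']  # e == 10
mask = df['e'] < 10
df.loc[mask, 'new'] = df['c'] + df['d']
mask = df['e'] < 5
df.loc[mask, 'new'] = df['a'] + df['b']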
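Downcasting is safe here because every value lies between 0 and 10, so the largest possible result (10 * 10 = 100) fits comfortably in int16. The sketch below, assuming the columns start as int64 (set explicitly here, since the default integer dtype can vary by platform) and a 100,000-row frame named frame, shows the memory reduction that comes with the cast:

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
frame = pd.DataFrame(
    rng.integers(0, 11, size=(100_000, 5), dtype=np.int64),
    columns=list("abcde"))

cols = ["a", "b", "c", "d"]
before = int(frame[cols].memory_usage(index=False).sum())

for col in cols:
    frame[col] = frame[col].astype(np.int16)

after = int(frame[cols].memory_usage(index=False).sum())

# int64 uses 8 bytes per value and int16 uses 2, so the columns shrink 4x.
print(f"before: {before:,} bytes, after: {after:,} bytes")

# Guard against overflow: the largest product of two values in 0..10 is 100.
largest_product = 10 * 10
fits = largest_product <= np.iinfo(np.int16).max
```

With a wider value range the same cast could silently overflow, so checking the column minimum and maximum before downcasting is a sensible precaution.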

4. Using .values (NumPy arrays): Computing the initial product and the boolean masks on the underlying NumPy arrays reduces the wall time to 74.9 ms.

df['new'] = df['c'].values * df['d'].values
mask = df['e'].values < 10
df.loc[mask, 'new'] = df['c'] + df['d']
mask = df['e'].values < 5
df.loc[mask, 'new'] = df['a'] + df['b']
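The snippet above still builds the two sums as pandas Series and assigns them through df.loc. As a variation that is not part of the article's measurements, the whole computation can be kept on NumPy arrays with nested np.where calls and written back to the frame once. This is a sketch under that assumption, with frame and result as illustrative names; no timing is claimed for it.

```python
import numpy as np
import pandas as pd

def func(a, b, c, d, e):
    # Same branching logic as the article's func, used only for verification.
    if e == 10:
        return c * d
    elif e < 10 and e >= 5:
        return c + d
    else:
        return a + b

rng = np.random.default_rng(7)
frame = pd.DataFrame(rng.integers(0, 11, size=(5_000, 5)),
                     columns=list("abcde"))

# Pull each column out as a plain NumPy array.
a, b, c, d, e = (frame[col].values for col in "abcde")

# Nested np.where mirrors the if / elif / else in func.
result = np.where(e == 10, c * d,
                  np.where((e >= 5) & (e < 10), c + d, a + b))
frame["new"] = result

# Verify against the pure-Python function, row by row.
expected = np.array([func(*row) for row in zip(a, b, c, d, e)])
same = bool(np.array_equal(result, expected))
print("array-only result matches func:", same)
```

The pandas documentation recommends Series.to_numpy() over .values for new code; .values is used here only to match the article.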

Experiment summary: The timings improve from 18.4 s (plain apply) → 7.67 s (apply + Swifter) → 421 ms (vectorized) → 116 ms (vectorized + int16) → 74.9 ms (vectorized + int16 + .values), a cumulative speedup of roughly 246× over plain apply, or more than two orders of magnitude. Swifter's parallelism contributes about 2.4×; replacing apply with vectorized column operations is the largest single step, and dtype downcasting and direct array access trim the remainder.
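The speedup factors follow directly from the reported timings. A short sketch of the arithmetic, using only the numbers quoted above (the dictionary name timings_s is illustrative):

```python
# Wall times reported in the article, converted to seconds.
timings_s = {
    "apply": 18.4,
    "apply + Swifter": 7.67,
    "vectorized": 0.421,
    "vectorized + int16": 0.116,
    "vectorized + int16 + .values": 0.0749,
}

baseline = timings_s["apply"]

# Speedup of each technique relative to plain apply.
speedups = {name: baseline / seconds for name, seconds in timings_s.items()}

for name, factor in speedups.items():
    print(f"{name:30s} {factor:7.1f}x")
```

On these figures the final configuration comes out at roughly 246× faster than plain apply, which is below the 600× in the title.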

