Big Data 14 min read

Advanced Pandas Data Manipulation Techniques

This article provides a comprehensive guide to using Pandas for complex queries, data type conversion, sorting, adding and modifying data, advanced filtering, iteration, and functional operations, offering numerous code examples that illustrate how to efficiently clean, transform, and analyze tabular data in Python.

Python Programming Learning Circle
Python Programming Learning Circle
Python Programming Learning Circle
Advanced Pandas Data Manipulation Techniques

Complex Queries

Demonstrates logical operations, slicing, .loc , .iloc , function filtering, comparison functions, df.query , df.eval , and df.filter for selecting data based on conditions.

<code># Logical operation examples
df.Q1 > 36
~(df.Q1 < 60) & (df['team'] == 'C')

# Slice with logical expression
df[df['Q1'] == 8]

# .loc example
df.loc[df['Q1'] > 90, 'Q1':]

# df.query example
df.query('Q1 > Q2 > 90')
</code>

Data Type Conversion

Shows how to infer types, specify dtypes, use astype , and convert to datetime or timedelta.

<code># Infer types
df = df.infer_objects()

# Specify dtypes on read
df = pd.read_excel(data, dtype={'team':'string','Q1':'int32'})

# Convert with astype
df['Q1'] = df['Q1'].astype('int32')

# Convert to datetime
t = pd.Series(['20200801','20200802'])
</code>

Data Sorting

Explains index sorting with sort_index , value sorting with sort_values , mixed sorting, and nsmallest / nlargest for selecting extreme values.

<code># Index sorting
df.sort_index()

# Value sorting
df.sort_values('Q4')

df.sort_values(by=['team','Q1'], ascending=[True, False])

# Smallest / largest
s.nsmallest(3)
df.nlargest(5, 'Q1')
</code>

Adding and Modifying Data

Details modifying values, replacing data, filling missing values, renaming columns/indexes, adding columns via assignment, insert , assign , evaluating expressions, adding rows, appending DataFrames, and deleting rows or NaNs.

<code># Modify a single value
df.iloc[0,0] = 'Lily'

# Replace values
df.replace(0, 5)

# Fill missing values
df.fillna(0)

df.rename(columns={'team':'class'})

df['foo'] = df.Q1 + df.Q2

df.insert(2, 'total', df.sum(1))

df.assign(total=df.sum(1), Q=100)

# Add a new row
df.loc[100] = ['tom','A',88,88,88,88]

# Append another DataFrame
df = df.append(df2)

# Drop NaNs
df.dropna()
</code>

Advanced Filtering

Introduces df.where , np.where , df.mask , and df.lookup for sophisticated filtering.

<code># df.where example
df.where(df > 70)

# np.where example
np.where(df >= 60, 'Pass', 'Fail')

# df.mask example
df.mask(s > 80)

# df.lookup example
df.lookup([1,3,4], ['Q1','Q2','Q3'])
</code>

Iteration

Covers iterating over Series, DataFrame rows with iterrows , itertuples , items , and column iteration.

<code># Iterate over a column
for i in df.name:
    print(i)

# iterrows
for idx, row in df.iterrows():
    print(idx, row['name'], row.Q1)

# itertuples
for row in df.itertuples():
    print(row)
</code>

Function Application

Describes pipe , apply , applymap , map , agg , transform , and copy methods for functional data processing.

<code># pipe chaining
(df.pipe(h)
   .pipe(g, arg1=a)
   .pipe(f, arg2=b, arg3=c))

# apply on a column
df.name.apply(lambda x: x.lower())

# applymap on entire DataFrame
df.applymap(len)

# map on a Series
df.team.map({'A':'Class1','B':'Class2'})

# aggregation
df.agg(['max','min'])

# transform
df.transform(lambda x: x*2)

# copy
df_copy = df.copy()
</code>
Pythondata analysispandasfilteringsortingdata-manipulationAggregation
Python Programming Learning Circle
Written by

Python Programming Learning Circle

A global community of Chinese Python developers offering technical articles, columns, original video tutorials, and problem sets. Topics include web full‑stack development, web scraping, data analysis, natural language processing, image processing, machine learning, automated testing, DevOps automation, and big data.

0 followers
Reader feedback

How this landed with the community

login Sign in to like

Rate this article

Was this worth your time?

Sign in to rate
Discussion

0 Comments

Thoughtful readers leave field notes, pushback, and hard-won operational detail here.