Comprehensive Guide to Pandas Indexing Methods: loc, iloc, Boolean Indexing, Set/Reset Index, Multi‑Index, Alignment, Sorting, Dropping, and Advanced Techniques
This article provides a comprehensive guide to Pandas indexing in Python, covering basic loc and iloc selection, Boolean indexing, setting and resetting indices, multi‑level indexing, index alignment, sorting, dropping, and advanced methods such as at, iat, and query, with complete code examples.
Pandas is a powerful Python library for data analysis, and its flexible indexing capabilities are central to efficient data manipulation. This guide introduces the most commonly used indexing methods with concrete code examples and practical use cases.
1. Basic Indexing
1.1 loc[] – label‑based indexing for selecting rows and columns.
import pandas as pd
# Create DataFrame
df = pd.DataFrame({
    'A': [1, 2, 3, 4],
    'B': [5, 6, 7, 8]
}, index=['a', 'b', 'c', 'd'])
print("Select a single row:")
print(df.loc['a'])
print("Select multiple rows:")
print(df.loc[['a', 'b']])
print("Select a single column:")
print(df.loc[:, 'A'])
print("Select multiple columns:")
print(df.loc[:, ['A', 'B']])
print("Conditional selection:")
print(df.loc[df['A'] > 2, 'B'])
1.2 iloc[] – integer‑position based indexing.
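Because iloc works on positions, its slices follow Python's half-open convention and exclude the stop position, whereas loc label slices include both endpoints. The sketch below is illustrative only; the frame name `demo` is an assumption that mirrors the df used in this guide.

```python
import pandas as pd

# Hypothetical demo frame, same shape as the guide's df
demo = pd.DataFrame({'A': [1, 2, 3, 4], 'B': [5, 6, 7, 8]},
                    index=['a', 'b', 'c', 'd'])

# Position slice: the stop position is excluded, so only positions 0 and 1
by_position = demo.iloc[0:2]

# Label slice: both endpoints are included, so rows 'a', 'b', and 'c'
by_label = demo.loc['a':'c']

print(by_position)
print(by_label)
```

Mixing up the two conventions is a common source of off-by-one errors when code is switched from labels to positions.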
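# Select a single row
print("Select a single row:")
print(df.iloc[0])
# Select multiple rows
print("Select multiple rows:")
print(df.iloc[[0, 1]])
# Select a single column
print("Select a single column:")
print(df.iloc[:, 0])
# Select multiple columns
print("Select multiple columns:")
print(df.iloc[:, [0, 1]])
# Conditional selection (iloc needs a plain boolean array, hence .values)
print("Conditional selection:")
print(df.iloc[(df['A'] > 2).values, 1])
2. Boolean Indexing – filter data using boolean conditions.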
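Conditions are combined with the element-wise operators `&` (and), `|` (or), and `~` (not), with each comparison wrapped in parentheses; `isin()` tests membership in a list of values. A minimal sketch, assuming a hypothetical frame named `demo` that mirrors the df used in this guide:

```python
import pandas as pd

# Hypothetical demo frame, same shape as the guide's df
demo = pd.DataFrame({'A': [1, 2, 3, 4], 'B': [5, 6, 7, 8]},
                    index=['a', 'b', 'c', 'd'])

# OR: rows where A is below 2 or B is above 7
either = demo[(demo['A'] < 2) | (demo['B'] > 7)]

# NOT: rows where A is not greater than 2
negated = demo[~(demo['A'] > 2)]

# Membership: rows where A is one of the listed values
members = demo[demo['A'].isin([2, 4])]

print(either)
print(negated)
print(members)
```

The parentheses matter because `&` and `|` bind more tightly than comparison operators in Python.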
# Single condition
print("Single condition:")
print(df[df['A'] > 2])
# Multiple conditions
print("Multiple conditions:")
print(df[(df['A'] > 2) & (df['B'] < 8)])
3. Setting and Resetting Index
3.1 set_index() – set one or more columns as the index.
# Set a single column as the index
df_set = df.set_index('A')
print("DataFrame after setting a single-column index:")
print(df_set)
# Set multiple columns as the index
df_set_multi = df.set_index(['A', 'B'])
print("DataFrame after setting a multi-column index:")
print(df_set_multi)
3.2 reset_index() – restore the default integer index.
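By default reset_index() moves the old index into a regular column; passing `drop=True` discards it instead. The following sketch shows both behaviours on a hypothetical frame named `demo` (an illustrative name, not part of the original examples):

```python
import pandas as pd

# Hypothetical demo frame with a string index
demo = pd.DataFrame({'A': [1, 2, 3, 4], 'B': [5, 6, 7, 8]},
                    index=['a', 'b', 'c', 'd'])

# Default: the old (unnamed) index becomes a column called 'index'
kept = demo.reset_index()

# drop=True: the old index is thrown away
dropped = demo.reset_index(drop=True)

print(kept)
print(dropped)
```

Both calls return a new DataFrame and leave `demo` unchanged.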
# Reset the index
df_reset = df_set.reset_index()
print("DataFrame after resetting the index:")
print(df_reset)
4. Multi‑Level Index (Hierarchical Indexing)
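A hierarchical index also allows selection on an inner level, which loc with a leading label cannot express directly. One way to do this is `xs()` with the `level` argument. The sketch below assumes level names `outer` and `inner` and a frame named `demo`, all chosen here purely for illustration:

```python
import pandas as pd

# Hypothetical two-level index with illustrative level names
idx = pd.MultiIndex.from_tuples(
    [('a', 'x'), ('a', 'y'), ('b', 'x'), ('b', 'y')],
    names=['outer', 'inner'],
)
demo = pd.DataFrame({'A': [1, 2, 3, 4], 'B': [5, 6, 7, 8]}, index=idx)

# Cross-section: every row whose inner level equals 'x'
inner_x = demo.xs('x', level='inner')

print(inner_x)
```

The selected level is removed from the result, so `inner_x` is indexed by the remaining `outer` level only.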
# Create a DataFrame with a multi-level index
index = pd.MultiIndex.from_tuples([('a', 'x'), ('a', 'y'), ('b', 'x'), ('b', 'y')])
df_multi = pd.DataFrame({'A': [1, 2, 3, 4], 'B': [5, 6, 7, 8]}, index=index)
print("DataFrame with a multi-level index:")
print(df_multi)
print("Select one specific multi-level key:")
print(df_multi.loc[('a', 'x')])
# A first-level label alone selects every row under it
print("Select a subset of the multi-level index:")
print(df_multi.loc['a'])
5. Index Alignment – Pandas automatically aligns indices during operations.
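When two frames are added with `+`, labels that exist in only one of them yield NaN in the result. If a missing label should instead be treated as zero, `DataFrame.add()` accepts a `fill_value`. A minimal sketch, with `left` and `right` as hypothetical names for two partially overlapping frames:

```python
import pandas as pd

# Two hypothetical frames whose indices overlap only on 'b' and 'c'
left = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6]}, index=['a', 'b', 'c'])
right = pd.DataFrame({'A': [7, 8, 9], 'B': [10, 11, 12]}, index=['b', 'c', 'd'])

# Plain addition: rows 'a' and 'd' become NaN
plain = left + right

# fill_value=0 substitutes 0 for a label missing from one side
filled = left.add(right, fill_value=0)

print(plain)
print(filled)
```

Note that the results are floating point, since NaN forces the integer columns to be upcast during alignment.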
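# Create two DataFrames with partially overlapping indices
df1 = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6]}, index=['a', 'b', 'c'])
df2 = pd.DataFrame({'A': [7, 8, 9], 'B': [10, 11, 12]}, index=['b', 'c', 'd'])
# Addition after index alignment; labels present in only one frame ('a', 'd') give NaN
result = df1 + df2
print("Addition after index alignment:")
print(result)
6. Index Operations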
6.1 sort_index() – sort the DataFrame by its index.
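The df used throughout this guide is already in index order, so the effect of sorting is easier to see on a shuffled frame. The sketch below uses a hypothetical frame named `demo` with out-of-order rows and columns, and shows the `ascending` and `axis` parameters:

```python
import pandas as pd

# Hypothetical frame with shuffled row labels and column order
demo = pd.DataFrame({'B': [5, 6, 7, 8], 'A': [1, 2, 3, 4]},
                    index=['c', 'a', 'd', 'b'])

# Sort rows by index label, ascending (the default)
asc = demo.sort_index()

# Sort rows by index label, descending
desc = demo.sort_index(ascending=False)

# Sort columns by name instead of rows
cols = demo.sort_index(axis=1)

print(asc)
print(desc)
print(cols)
```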
df_sorted = df.sort_index()
print("DataFrame sorted by index:")
print(df_sorted)
6.2 drop() – remove rows or columns.
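Besides the `axis` argument, drop() also accepts the keyword arguments `index=` and `columns=`, which can make the intent clearer, and `errors='ignore'` to skip labels that are not present. A short sketch on a hypothetical frame named `demo`:

```python
import pandas as pd

# Hypothetical demo frame, same shape as the guide's df
demo = pd.DataFrame({'A': [1, 2, 3, 4], 'B': [5, 6, 7, 8]},
                    index=['a', 'b', 'c', 'd'])

# Remove rows 'a' and 'b' and column 'B' in one call
trimmed = demo.drop(index=['a', 'b'], columns=['B'])

# A missing label normally raises KeyError; errors='ignore' skips it
safe = demo.drop(index=['z'], errors='ignore')

print(trimmed)
print(safe)
```

drop() returns a new DataFrame, so `demo` itself still has all four rows and both columns afterwards.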
# Drop a row by label
df_dropped_row = df.drop('a')
print("DataFrame after dropping a row:")
print(df_dropped_row)
# Drop a column by label
df_dropped_col = df.drop('A', axis=1)
print("DataFrame after dropping a column:")
print(df_dropped_col)
7. Advanced Indexing Techniques
7.1 at[] and iat[] – fast scalar access.
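The same accessors can also write a single cell, which is a common use when updating individual values. A minimal sketch, using a hypothetical frame named `demo` so that the df from the earlier examples is left untouched:

```python
import pandas as pd

# Hypothetical demo frame, same shape as the guide's df
demo = pd.DataFrame({'A': [1, 2, 3, 4], 'B': [5, 6, 7, 8]},
                    index=['a', 'b', 'c', 'd'])

# Label-based scalar write: row 'a', column 'A'
demo.at['a', 'A'] = 10

# Position-based scalar write: second row, second column
demo.iat[1, 1] = 60

print(demo)
```

Both accessors handle exactly one cell; selecting or assigning several rows or columns still calls for loc or iloc.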
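# Access a single element by label with at[]
print("Access a single element with at[]:")
print(df.at['a', 'A'])
# Access a single element by position with iat[]
print("Access a single element with iat[]:")
print(df.iat[0, 0])
7.2 query() – filter using a string expression.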
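Inside a query string, column names are written bare, conditions can be joined with `and` / `or`, and a Python variable from the surrounding scope can be referenced with an `@` prefix. A short sketch, where `demo` and `threshold` are illustrative names chosen for this example:

```python
import pandas as pd

# Hypothetical demo frame, same shape as the guide's df
demo = pd.DataFrame({'A': [1, 2, 3, 4], 'B': [5, 6, 7, 8]},
                    index=['a', 'b', 'c', 'd'])

threshold = 2  # illustrative local variable

# Reference the local variable with @
above = demo.query('A > @threshold')

# Combine two conditions in a single expression
combined = demo.query('A > @threshold and B < 8')

print(above)
print(combined)
```

The second query is equivalent to the Boolean-indexing form `demo[(demo['A'] > threshold) & (demo['B'] < 8)]`.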
# Conditional selection with query()
print("Conditional selection with query():")
print(df.query('A > 2'))
Conclusion
Pandas indexing offers a rich set of tools for selecting, filtering, and manipulating data. Mastering loc, iloc, Boolean indexing, set/reset index, multi‑index, alignment, sorting, dropping, and advanced methods like at, iat, and query will greatly enhance data‑processing efficiency.
Test Development Learning Exchange