Using Pandas concat and merge to Combine Multiple Datasets
This tutorial shows how to use the pandas concat and merge functions to combine DataFrames by rows and by columns, reset the index with ignore_index, and join on one or more keys, with code examples for inner, left, right, and outer joins.
Test Development Learning Exchange
Objective: Learn how to use Pandas to merge multiple datasets.
Content: the concat function, which stacks DataFrames along an axis, and the merge function, which joins DataFrames on key columns.
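As a quick orientation before the full walkthrough, the sketch below contrasts the two functions on a pair of small frames. The frames, the English column names, and the values are hypothetical stand-ins for the tutorial's data, chosen so that the rows are deliberately in a different order in each frame.

```python
import pandas as pd

# Hypothetical frames; the rows of `costs` are in a different order than `sales`.
sales = pd.DataFrame({"name": ["Ann", "Bob", "Cat"], "sales": [120, 150, 130]})
costs = pd.DataFrame({"name": ["Cat", "Ann", "Bob"], "cost": [100, 80, 90]})

# concat with axis=1 pairs rows by index label, so row 0 of `sales` (Ann)
# ends up next to row 0 of `costs` (Cat), and "name" appears twice.
side_by_side = pd.concat([sales, costs], axis=1)

# merge pairs rows by the value of the key column, so each name gets its own cost.
joined = pd.merge(sales, costs, on="name")

print(side_by_side)
print(joined)
```

In short, concat is a reasonable choice when the frames already line up by position or index, while merge is usually the safer choice when rows have to be matched by a shared column.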
Code examples:
import pandas as pd

# Create the first dataset.
# Column names are kept in Chinese: 姓名 = name, 部门 = department, 销售额 = sales.
data1 = {'姓名': ['张三', '李四', '王五'],
         '部门': ['销售部', '市场部', '技术部'],
         '销售额': [120, 150, 130]}
df1 = pd.DataFrame(data1)
print(f"First dataset:\n{df1}")
# Create the second dataset: same columns as the first, but different people.
data2 = {'姓名': ['赵六', '孙七', '周八'],
         '部门': ['财务部', '人力资源部', '销售部'],
         '销售额': [140, 160, 170]}
df2 = pd.DataFrame(data2)
print(f"Second dataset:\n{df2}")
# Concatenate by rows. Each frame keeps its own index, so labels 0-2 appear twice.
df_concat_rows = pd.concat([df1, df2])
print(f"Dataset after row-wise concatenation:\n{df_concat_rows}")
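# Concatenate by columns with a frame that has different columns (成本 = cost).
# axis=1 aligns rows on the index rather than on 姓名, so 姓名 appears twice in the result.
data3 = {'姓名': ['张三', '李四', '王五'],
         '成本': [80, 90, 100]}
df3 = pd.DataFrame(data3)
print(f"Third dataset:\n{df3}")
df_concat_cols = pd.concat([df1, df3], axis=1)
print(f"Dataset after column-wise concatenation:\n{df_concat_cols}")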
# Concatenate by rows with ignore_index=True to get a fresh 0..n-1 index.
df_concat_index = pd.concat([df1, df2], ignore_index=True)
print(f"Dataset after row-wise concatenation with a reset index:\n{df_concat_index}")
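# Inner join on 姓名: keeps only names present in both frames.
# df1 and df2 share no names, so this result is empty. The overlapping
# columns 部门 and 销售额 get the default _x/_y suffixes.
df_merge_inner = pd.merge(df1, df2, on='姓名', how='inner')
print(f"Dataset after inner join:\n{df_merge_inner}")
# Left join: keeps every row of df1; df2's columns are NaN where no match exists.
df_merge_left = pd.merge(df1, df2, on='姓名', how='left')
print(f"Dataset after left join:\n{df_merge_left}")
# Right join: keeps every row of df2; df1's columns are NaN where no match exists.
df_merge_right = pd.merge(df1, df2, on='姓名', how='right')
print(f"Dataset after right join:\n{df_merge_right}")
# Outer join: keeps every row from both frames.
df_merge_outer = pd.merge(df1, df2, on='姓名', how='outer')
print(f"Dataset after outer join:\n{df_merge_outer}")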
# Merge on multiple keys: rows must match on both 姓名 and 部门.
data4 = {'姓名': ['张三', '李四', '王五'],
         '部门': ['销售部', '市场部', '技术部'],
         '成本': [80, 90, 100]}
df4 = pd.DataFrame(data4)
print(f"Fourth dataset:\n{df4}")
df_merge_multi_key = pd.merge(df1, df4, on=['姓名', '部门'], how='inner')
print(f"Dataset after merging on multiple keys:\n{df_merge_multi_key}")
Practice: repeat the same steps to combine two datasets that share identical columns.
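One possible way to approach the practice task is sketched below. The two quarterly frames and their English column names are hypothetical, not part of the original exercise. The sketch stacks the rows, labels each block by its source with keys=, and merges on the shared key with custom suffixes.

```python
import pandas as pd

# Hypothetical quarterly data; both frames have identical columns.
q1 = pd.DataFrame({"name": ["Ann", "Bob"], "sales": [120, 150]})
q2 = pd.DataFrame({"name": ["Bob", "Cat"], "sales": [160, 170]})

# Stack the rows and renumber the index 0..n-1.
stacked = pd.concat([q1, q2], ignore_index=True)

# keys= adds an outer index level naming the source of each block of rows.
labeled = pd.concat([q1, q2], keys=["q1", "q2"])

# An inner merge keeps only names present in both frames; suffixes= replaces
# the default _x/_y on the overlapping "sales" column.
both = pd.merge(q1, q2, on="name", how="inner", suffixes=("_q1", "_q2"))

print(stacked)
print(labeled)
print(both)
```

When both frames have the same columns, concat adds rows while merge adds columns, so which one fits depends on whether the goal is a longer table or a side-by-side comparison.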
Summary: After this exercise you should be able to combine datasets with Pandas using both concat and merge, covering row-wise and column-wise concatenation, index resetting, the four join types, and multi-key joins.
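Because df1 and df2 in the walkthrough share no names, the four join types differ there only in which unmatched rows survive. The sketch below uses hypothetical frames with partially overlapping keys (the names and values are illustrative) to make the difference visible, and uses the indicator parameter of merge to show where each row came from.

```python
import pandas as pd

# Hypothetical frames: Bob and Cat appear in both, Ann only on the left,
# Dan only on the right.
left = pd.DataFrame({"name": ["Ann", "Bob", "Cat"], "sales": [120, 150, 130]})
right = pd.DataFrame({"name": ["Bob", "Cat", "Dan"], "cost": [90, 100, 110]})

# Count the rows produced by each join type.
counts = {
    how: len(pd.merge(left, right, on="name", how=how))
    for how in ["inner", "left", "right", "outer"]
}
print(counts)

# indicator=True adds a _merge column with left_only, right_only, or both.
outer = pd.merge(left, right, on="name", how="outer", indicator=True)
print(outer)

# Map each name to its origin for a compact view.
origin = dict(zip(outer["name"], outer["_merge"].astype(str)))
print(origin)
```

Checking the _merge column after an outer join is a handy way to spot keys that failed to match before committing to an inner or left join.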