Fundamentals 5 min read

How to Split and Explode Multiple Columns in Pandas to Flatten Excel Data

This article walks through a real‑world Pandas problem where product descriptions and quantities stored as comma‑separated strings in Excel are split and exploded into separate rows, explains why NaN values appear, and shows how converting columns to string resolves the issue.

Python Crawling & Data Mining
Python Crawling & Data Mining
Python Crawling & Data Mining
How to Split and Explode Multiple Columns in Pandas to Flatten Excel Data

1. Introduction

Hello everyone, I’m PiPi. A fan asked a Pandas data‑processing question: the Excel file contains a "商品内容" (product description) column and a "数量" (quantity) column where each cell holds multiple values separated by commas, and they need to be expanded into separate rows.

Below is the original code they tried:

jigou_df = pd.read_excel(jigou_path)
data = {'商品内容':jigou_df['商品内容'], '数量':jigou_df['数量']}
df = pd.DataFrame(data)
df_expanded = df.assign(数量=df['数量'].astype(str).str.split(',')).explode('数量')
df_expanded.reset_index(drop=True, inplace=True)
df_expanded

The result was not as expected.

2. Solution Process

A hint was given: first split both columns, then explode them.

Using

df = df.explode(['商品内容', '数量']).reset_index(drop=True)

works, but the following complete implementation was provided:

import pandas as pd

# Read the Excel file
df = pd.read_excel("机构订单_202401091514_1.0.xlsx")

# Split the "商品内容" column
df['商品内容'] = df['商品内容'].str.split('、')
# Split the "数量" column
df['数量'] = df['数量'].str.split('、')

# Explode both columns
result = df.explode(['商品内容', '数量']).reset_index(drop=True)
print(result)

The output shows the data correctly expanded, but the last two rows appear as nan. This happens because those cells are numeric in the original Excel file, and after splitting they become nan.

Fix: convert the "数量" column to string before splitting:

df['数量'] = df['数量'].astype("string").str.split('、')

Alternatively, specify dtype=str when reading the file, or use astype or a lambda conversion during map.

3. Summary

The article demonstrates how to handle a common Pandas issue: splitting multi‑value cells and exploding them into separate rows, and why datatype mismatches can produce nan values. Converting columns to string before splitting resolves the problem.

Original Source

Signed-in readers can open the original source through BestHub's protected redirect.

Sign in to view source
Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactadmin@besthub.devand we will review it promptly.

data cleaningsplitexplodedtype
Python Crawling & Data Mining
Written by

Python Crawling & Data Mining

Life's short, I code in Python. This channel shares Python web crawling, data mining, analysis, processing, visualization, automated testing, DevOps, big data, AI, cloud computing, machine learning tools, resources, news, technical articles, tutorial videos and learning materials. Join us!

0 followers
Reader feedback

How this landed with the community

Sign in to like

Rate this article

Was this worth your time?

Sign in to rate
Discussion

0 Comments

Thoughtful readers leave field notes, pushback, and hard-won operational detail here.