Master Excel‑Pandas Integration: From Data Import to Visualization in Python

This tutorial pairs Excel's interactive features with Python's Pandas library for common data tasks: reading, generating, filtering, sorting, handling missing values, deduplicating, merging, grouping, calculating, summarizing, visualizing, sampling, building pivot tables, and VLOOKUP-style lookups. Along the way it notes where each tool is the better fit.

Python Crawling & Data Mining

Jun 30, 2020

Preface

Excel and Python are both common tools for data analysis. This article walks through each operation twice, once as interactive steps in Excel and once as Python code, covering data reading, generation, calculation, modification, statistics, sampling, searching, visualization, and storage.

Data Reading

Explanation: Read local Excel data.

In Excel, open the target folder, select the file, and open it.

In Pandas, a single line of code reads a local Excel file (pd.read_excel), a text file (pd.read_csv), or a table on a web page (pd.read_html), e.g.:

pd.read_excel("示例数据.xlsx")
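The same one-line pattern applies to delimited text. The sketch below is only an illustration: it uses an in-memory buffer in place of a file on disk so that it runs without any sample data, and the column names and values are invented.

```python
import io

import pandas as pd

# Stand-in for a tab-separated .txt file; the contents are made up.
raw_text = "name\tsalary\nAnn\t4800\nBob\t7200\n"

# read_csv handles any delimited text; sep names the delimiter.
df = pd.read_csv(io.StringIO(raw_text), sep="\t")

print(df.shape)  # (2, 2)
```

For a workbook, pd.read_excel takes a path in the same way and accepts sheet_name to pick a sheet. Reading .xlsx files also needs an Excel engine such as openpyxl to be installed.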

Data Generation

Explanation: Generate data with specified format/quantity.

To generate a 10×2 matrix of uniformly distributed random numbers (0‑1) in Excel, enter =RAND() in a cell and fill it across a range 10 rows tall and 2 columns wide.

In Pandas, combine it with NumPy (imported as np, with Pandas imported as pd) to generate the same matrix in one line:

pd.DataFrame(np.random.rand(10,2))
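A slightly fuller sketch of the same idea, with the imports spelled out. It swaps in NumPy's Generator API with a fixed seed so the numbers are repeatable; the seed value and the column names are arbitrary choices, not part of the original example.

```python
import numpy as np
import pandas as pd

# A seeded generator makes the "random" table reproducible between runs.
rng = np.random.default_rng(seed=0)

# 10 rows x 2 columns of uniform values in the half-open interval [0, 1).
df = pd.DataFrame(rng.random((10, 2)), columns=["a", "b"])

print(df.shape)  # (10, 2)
```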

Data Storage

Explanation: Store table data locally.

In Excel, click Save and set the format/file name.

In Pandas, call the method on the DataFrame itself: df.to_excel("filename.xlsx") writes it to an Excel workbook, and df.to_csv("filename.csv") writes CSV. Either method accepts a relative or absolute path.
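As a rough illustration of saving and reloading, the sketch below writes a small made-up table to a temporary directory and reads it back. It uses CSV so that it runs without an Excel engine installed; the file name and data are hypothetical.

```python
import tempfile
from pathlib import Path

import pandas as pd

df = pd.DataFrame({"name": ["Ann", "Bob"], "salary": [4800, 7200]})

with tempfile.TemporaryDirectory() as tmp:
    csv_path = Path(tmp) / "salaries.csv"
    # index=False leaves the row labels out of the file.
    df.to_csv(csv_path, index=False)
    restored = pd.read_csv(csv_path)
    # Excel output follows the same pattern but needs an engine such as openpyxl:
    # df.to_excel(Path(tmp) / "salaries.xlsx", index=False)
```

Passing index=False is a common choice when the index is just the default 0, 1, 2, ... row numbering, since it would otherwise be written as an extra unnamed column.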

Data Filtering

Explanation: Filter data according to specified criteria.

In Excel, filter the example data for salaries greater than 5000.

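In Pandas, use conditional filtering: df[df['薪资水平']>5000] for a single condition (薪资水平 is the salary column), or combine multiple conditions with & (and) and | (or). Wrap each condition in parentheses when combining, because & and | bind more tightly than comparison operators.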
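A minimal sketch of single and combined conditions. The sample rows and the city column are invented for illustration; only the 薪资水平 (salary) column name comes from the article.

```python
import pandas as pd

df = pd.DataFrame({
    "city": ["Beijing", "Shanghai", "Beijing", "Shenzhen"],
    "薪资水平": [4500, 8000, 12000, 5000],  # salary
})

# Single condition: salaries strictly above 5000.
high = df[df["薪资水平"] > 5000]

# AND: each condition sits in its own parentheses.
high_in_beijing = df[(df["薪资水平"] > 5000) & (df["city"] == "Beijing")]

# OR: rows matching either condition.
top_or_shenzhen = df[(df["薪资水平"] > 10000) | (df["city"] == "Shenzhen")]
```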

Data Insertion

Explanation: Insert specified data at a given position.

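In Excel, place the cursor, right‑click to add a row/column, and optionally fill the new column with the IF function, e.g. =IF(G2>10000,"高","低") to label salaries as high ("高") or low ("低").

In Pandas, assigning to a new column name adds the column. The labels themselves can come from pd.cut, which bins numeric values, or from a conditional such as np.where, and df.insert places a column at a specific position.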
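The Excel formula above can be approximated in a few ways. This sketch shows three, assuming a made-up salary column; the new column names (level, level_cut, row_id) are hypothetical.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"薪资水平": [8000, 10000, 15000]})  # salary

# Equivalent of =IF(G2>10000,"高","低"): "高" (high) above 10000, else "低" (low).
df["level"] = np.where(df["薪资水平"] > 10000, "高", "低")

# pd.cut does the same with bins. Intervals are right-closed by default,
# so 10000 falls in (-inf, 10000] and is labelled "低".
df["level_cut"] = pd.cut(
    df["薪资水平"], bins=[-np.inf, 10000, np.inf], labels=["低", "高"]
)

# insert places a column at an explicit position (0 = first column).
df.insert(0, "row_id", list(range(1, len(df) + 1)))
```

np.where is the closer match for a two-way IF, while pd.cut scales better once there are several salary bands.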

Data Deletion

Explanation: Delete specified rows/columns/cells.

In Excel, right‑click the target cell/row/column and choose Delete.

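In Pandas, delete a column in place with del df['new_col'], or use df.drop to remove columns or rows by label. Rows can also be removed by keeping only those that match a condition.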
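A short sketch of the common deletion patterns, using an invented three-row table. Only new_col and 薪资水平 appear in the article; the rest is illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    "name": ["Ann", "Bob", "Cid"],
    "薪资水平": [4800, 7200, 9100],  # salary
    "new_col": [1, 2, 3],
})

# drop returns a new DataFrame and leaves df unchanged.
without_col = df.drop(columns=["new_col"])
without_row = df.drop(index=[1])  # removes the row labelled 1

# Removing rows by condition: keep only the rows you want.
at_least_5000 = df[df["薪资水平"] >= 5000]

# del modifies df in place.
del df["new_col"]
```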

Data Sorting

Explanation: Sort data according to specified requirements.

In Excel, click the Sort button and sort salaries from high to low.

In Pandas, sort by salary in descending order with:

df.sort_values("薪资水平", ascending=False, inplace=True)
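For illustration, the sketch below sorts a small invented table by one key and then by two. It returns new DataFrames instead of using inplace=True, which is a stylistic choice here and not a requirement; the city column is hypothetical.

```python
import pandas as pd

df = pd.DataFrame({
    "city": ["Shanghai", "Beijing", "Beijing"],
    "薪资水平": [9100, 4800, 7200],  # salary
})

# Highest salary first; the original row labels travel with the rows.
by_salary = df.sort_values("薪资水平", ascending=False)

# Two keys: city A to Z, then salary high to low within each city.
by_city_then_salary = df.sort_values(["city", "薪资水平"], ascending=[True, False])
```

Calling reset_index(drop=True) on the result renumbers the rows from 0 if the old labels are no longer needed.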

Missing Value Handling

Explanation: Process missing (null) values according to requirements.

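In Excel, use Home → Find & Select → Go To Special → Blanks to select the empty cells, then fill them with the previous value or by another method.

In Pandas, count the missing values in each column with df.isnull().sum() and fill them, e.g. by carrying the previous row's value forward:

df = df.ffill()

Older code writes this as df.fillna(axis=0, method='ffill'), but recent Pandas releases deprecate the method argument of fillna.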
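A minimal sketch of counting and filling gaps, assuming a made-up salary column with two missing entries. It uses ffill for the forward fill and fillna for a constant.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"薪资水平": [5000, np.nan, 7000, np.nan]})  # salary

# Number of missing values per column.
missing_counts = df.isnull().sum()

# Forward fill: each gap takes the value from the row above it.
forward_filled = df.ffill()

# Alternative: replace every gap with a fixed value.
zero_filled = df.fillna(0)
```

A forward fill leaves a gap in the first row untouched, since there is no earlier value to copy.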

Data Deduplication

Explanation: Remove duplicate values according to requirements.

In Excel, use Data → Remove Duplicates and select columns, e.g., deduplicate by creation time.

In Pandas, use df.drop_duplicates(['创建时间'], inplace=True) to achieve the same result (创建时间 is the creation-time column). By default the first row of each duplicate group is kept.
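The effect of the keep parameter is easiest to see on a tiny example. The rows below are invented; only the column names follow the article.

```python
import pandas as pd

df = pd.DataFrame({
    "创建时间": ["2020-06-01", "2020-06-01", "2020-06-02"],  # creation time
    "薪资水平": [5000, 6000, 7000],  # salary
})

# Default keep="first": the earliest row in each duplicate group survives.
keep_first = df.drop_duplicates(["创建时间"])

# keep="last" retains the final row of each group instead.
keep_last = df.drop_duplicates(["创建时间"], keep="last")
```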

Format Modification

Explanation: Change the format of specified data.

In Excel, select cells → Right‑click → Format Cells to choose the desired format.

In Pandas, use

df['创建时间'] = df['创建时间'].dt.strftime('%Y-%m-%d')

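to reformat dates. This requires the column to already hold datetime values (convert with pd.to_datetime if it does not), and the result is a column of strings, not dates.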
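A sketch of the full round trip, assuming the timestamps start out as text in a slash-separated layout. The input format and values are made up for illustration.

```python
import pandas as pd

df = pd.DataFrame({"创建时间": ["2020/06/30 08:15:00", "2020/07/01 21:40:00"]})

# Parse the text into real datetimes first; .dt only works on datetime columns.
df["创建时间"] = pd.to_datetime(df["创建时间"], format="%Y/%m/%d %H:%M:%S")

# Then render each value as a YYYY-MM-DD string.
df["创建时间"] = df["创建时间"].dt.strftime("%Y-%m-%d")

print(df["创建时间"].tolist())  # ['2020-06-30', '2020-07-01']
```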

Data Lookup (VLOOKUP)

Explanation: Use VLOOKUP to find data.

In Excel, VLOOKUP is a core function for lookup operations.

In Pandas, there is no function named VLOOKUP; merging two DataFrames on a shared key gives the same result. The example reads the table, splits it into two DataFrames, sets matching indexes, and then uses update (or merge) to carry the looked-up values across.
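One common way to express a VLOOKUP is a left merge on the key column. The sketch below assumes two invented tables joined on a hypothetical emp_id column; it is not the article's exact dataset or method.

```python
import pandas as pd

# Table that needs values filled in.
staff = pd.DataFrame({"emp_id": [101, 102, 104], "name": ["Ann", "Bob", "Dee"]})

# Lookup table holding the values to fetch.
salaries = pd.DataFrame({"emp_id": [101, 102, 103], "薪资水平": [5000, 8000, 6500]})

# how="left" keeps every row of staff, like VLOOKUP filling a column;
# keys missing from the lookup table come back as NaN instead of #N/A.
result = staff.merge(salaries, on="emp_id", how="left")

# For a single column, mapping the key through an indexed Series also works.
mapped = staff["emp_id"].map(salaries.set_index("emp_id")["薪资水平"])
```

Unlike VLOOKUP, a merge returns one output row per match, so duplicate keys in the lookup table would multiply rows in the result.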

Conclusion

The tutorial shows how Pandas can replicate most common Excel operations through code. Excel is strongest at interactive tasks such as pivot tables, while Pandas is strongest at grouping, calculation, and integration with other Python libraries such as NumPy. Choosing the tool that fits the task matters more than committing to one of them.
