Master Excel‑Pandas Integration: From Data Import to Visualization in Python
This tutorial demonstrates how to combine Excel’s interactive features with Python’s Pandas library to perform comprehensive data operations—including reading, generating, filtering, sorting, handling missing values, deduplication, merging, grouping, calculation, statistics, visualization, sampling, pivot tables, and VLOOKUP—showing when each tool excels.
Preface
Excel and Python are both common tools for data analysis. This article shows how to use dynamic Excel visuals together with Python code to perform data reading, generation, calculation, modification, statistics, sampling, searching, visualization, and storage.
Data Reading
Explanation: Read local Excel data.
In Excel, open the target folder, select the file, and open it.
In Pandas, you can read local Excel or txt files, or directly read tables from the web with a single line of code, e.g.:
pd.read_excel("示例数据.xlsx")
Data Generation
Explanation: Generate data with specified format/quantity.
To generate a 10×2 matrix of uniformly distributed random numbers (0‑1) in Excel, enter =RAND() and fill it across a range of 10 rows by 2 columns.
In Pandas, combine NumPy to generate the same matrix with one line:
pd.DataFrame(np.random.rand(10,2))
Data Storage
Explanation: Store table data locally.
In Excel, click Save and set the format/file name.
In Pandas, call the method on the DataFrame itself: df.to_excel("filename.xlsx") writes the DataFrame to a workbook, and df.to_csv writes CSV. The file name may be a relative or an absolute path.
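The generate, store, and read steps described so far can be sketched as one round trip. This is a minimal sketch, not the article's own code: it writes CSV rather than Excel so that it runs without an Excel engine such as openpyxl, and the column names a and b and the file name demo.csv are made up for illustration.

```python
import os
import tempfile

import numpy as np
import pandas as pd

# 10x2 matrix of uniform random numbers in [0, 1); the seed is only for repeatability
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.random((10, 2)), columns=["a", "b"])  # hypothetical column names

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "demo.csv")  # hypothetical file name
    df.to_csv(path, index=False)          # df.to_excel(...) is the Excel counterpart
    restored = pd.read_csv(path)          # pd.read_excel(...) is the Excel counterpart

print(restored.shape)
```

Passing index=False keeps the row index out of the file, so the table read back has the same two columns as the one written.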
Data Filtering
Explanation: Filter data according to specified criteria.
In Excel, filter the example data for salaries greater than 5000.
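In Pandas, use a boolean condition: df[df['薪资水平']>5000] filters on a single condition. To combine conditions, join them with & (and) or | (or) and wrap each condition in parentheses, because these operators bind more tightly than comparisons.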
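As a small illustration of single and combined conditions, the sketch below uses made-up rows. Only the 薪资水平 (salary) column comes from the article; the city column and all values are assumptions.

```python
import pandas as pd

# Made-up sample data; "city" is a hypothetical column added for the example
df = pd.DataFrame({
    "city": ["Beijing", "Shanghai", "Beijing", "Shenzhen"],
    "薪资水平": [4500, 12000, 8000, 5000],
})

# Single condition: salary above 5000
high = df[df["薪资水平"] > 5000]

# AND: each condition needs its own parentheses
both = df[(df["薪资水平"] > 5000) & (df["city"] == "Beijing")]

# OR: rows matching either condition
either = df[(df["薪资水平"] > 10000) | (df["city"] == "Shenzhen")]

print(high, both, either, sep="\n\n")
```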
Data Insertion
Explanation: Insert specified data at a given position.
In Excel, place the cursor, right‑click to add a row/column, and optionally use the IF function, e.g. =IF(G2>10000,"高","低") to label high/low salaries and add a column.
In Pandas, build the new column's values with pd.cut or a similar method and assign them to a new column name; df.insert places the column at a specific position.
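One possible way to reproduce the IF-based labelling is sketched below with made-up salaries. The column names label and label_cut are hypothetical, and np.where is shown as an alternative to cut rather than as the article's method.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"薪资水平": [4500, 12000, 8000, 10000]})  # made-up salaries

# np.where mirrors =IF(G2>10000,"高","低"); insert() puts the column at position 0
df.insert(0, "label", np.where(df["薪资水平"] > 10000, "高", "低"))

# pd.cut does the same with bins; intervals are right-closed by default,
# so exactly 10000 falls into the lower bin
df["label_cut"] = pd.cut(
    df["薪资水平"],
    bins=[-np.inf, 10000, np.inf],
    labels=["低", "高"],
)

print(df)
```

pd.cut scales more naturally than np.where when there are more than two salary bands, since each extra band is just another bin edge and label.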
Data Deletion
Explanation: Delete specified rows/columns/cells.
In Excel, right‑click the target cell/row/column and choose Delete.
In Pandas, delete a column with del df['new_col'], and remove rows with df.drop or by keeping only the rows that pass a boolean filter.
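A minimal sketch of the deletion options, using made-up data. Unlike del, df.drop returns a new DataFrame and leaves the original untouched.

```python
import pandas as pd

# Made-up data; new_col stands in for a column added earlier
df = pd.DataFrame({"薪资水平": [4500, 12000, 8000], "new_col": ["低", "高", "低"]})

no_col = df.drop(columns=["new_col"])   # drop a column by name
no_row = df.drop(index=[0])             # drop a row by index label
filtered = df[df["薪资水平"] != 8000]    # drop rows by condition

print(no_col, no_row, filtered, sep="\n\n")
```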
Data Sorting
Explanation: Sort data according to specified requirements.
In Excel, click the Sort button and sort salaries from high to low.
In Pandas, use df.sort_values("薪资水平", ascending=False, inplace=True).
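The same sort can also be written without inplace, which returns a new DataFrame and is easier to chain. The sketch below uses made-up salaries.

```python
import pandas as pd

df = pd.DataFrame({"薪资水平": [4500, 12000, 8000]})  # made-up salaries

# Sort from high to low, then renumber the rows 0..n-1
ranked = df.sort_values("薪资水平", ascending=False).reset_index(drop=True)

print(ranked)
```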
Missing Value Handling
Explanation: Process missing (null) values according to requirements.
In Excel, use Find & Select → Go To Special → Blanks to select the empty cells, then fill them with the previous value or by another method.
In Pandas, count missing values per column with df.isnull().sum() and fill them, e.g. with a forward fill (recent pandas versions deprecate the method argument of fillna in favor of ffill):
df = df.ffill()
Data Deduplication
Explanation: Remove duplicate values according to requirements.
In Excel, use Data → Remove Duplicates and select columns, e.g., deduplicate by creation time.
In Pandas, use df.drop_duplicates(['创建时间'], inplace=True) to achieve the same result.
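The forward fill from the missing-value section and the de-duplication above can be sketched together on a small made-up table. The values are assumptions, and keep="first" is the pandas default, written out here only for clarity.

```python
import numpy as np
import pandas as pd

# Made-up data with two missing salaries and one repeated creation time
df = pd.DataFrame({
    "创建时间": ["2020-01-01", "2020-01-01", "2020-01-02", "2020-01-03"],
    "薪资水平": [5000, np.nan, 7000, np.nan],
})

missing = df.isnull().sum()   # missing-value count per column
filled = df.ffill()           # carry the previous row's value forward

# Keep the first row for each creation time
deduped = filled.drop_duplicates(subset=["创建时间"], keep="first")

print(missing, deduped, sep="\n\n")
```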
Format Modification
Explanation: Change the format of specified data.
In Excel, select cells → Right‑click → Format Cells to choose the desired format.
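In Pandas, use df['创建时间'] = df['创建时间'].dt.strftime('%Y-%m-%d') to reformat dates. The .dt accessor only works on a datetime column, so convert text dates with pd.to_datetime first.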
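A short sketch of the date reformatting, assuming the column arrives as text. The sample timestamps and their input format are made up.

```python
import pandas as pd

# Made-up timestamps stored as strings
df = pd.DataFrame({"创建时间": ["2020/03/01 08:30:00", "2020/03/15 17:45:10"]})

# Convert text to datetime first; .dt is only available on datetime columns
df["创建时间"] = pd.to_datetime(df["创建时间"], format="%Y/%m/%d %H:%M:%S")

# Reformat as year-month-day strings
df["创建时间"] = df["创建时间"].dt.strftime("%Y-%m-%d")

print(df)
```

After strftime the column holds strings again, so any further date arithmetic should happen before this step.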
Data Lookup (VLOOKUP)
Explanation: Use VLOOKUP to find data.
In Excel, VLOOKUP is a core function for lookup operations.
In Pandas, there is no direct VLOOKUP function; the equivalent is a key-based join between DataFrames. In the example, the table is read and split into two DataFrames, both are indexed by the lookup key, and update (or merge) then pulls the matching values from one into the other.
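A VLOOKUP-style lookup might look like the sketch below. The two tables, the key column emp_id, and all values are hypothetical; a left merge is used here, with Series.map shown as a lighter alternative for fetching a single column.

```python
import pandas as pd

# Hypothetical tables: orders needs the name stored in staff
orders = pd.DataFrame({"emp_id": [3, 1, 4], "amount": [250, 100, 75]})
staff = pd.DataFrame({"emp_id": [1, 2, 3], "name": ["Ann", "Bob", "Cid"]})

# Left merge keeps every row of orders; keys with no match get NaN,
# much like VLOOKUP returning #N/A
looked_up = orders.merge(staff, on="emp_id", how="left")

# Alternative for a single column: map the key through an indexed Series
orders["name"] = orders["emp_id"].map(staff.set_index("emp_id")["name"])

print(looked_up)
```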
Conclusion
The tutorial shows how Pandas can replicate most common Excel operations in code. Excel is strong at interactive tasks such as pivot tables, while Pandas is strong at grouping, calculation, and integration with other Python libraries such as NumPy, so the best results come from choosing the tool that fits the task.
Python Crawling & Data Mining
