Excel Data Comparison Tool Using Python Tkinter and Pandas
This article introduces a Python-based Excel data comparison tool that uses Tkinter for a graphical interface and Pandas for data handling, detailing its features, step-by-step code explanation, full source code, and tips for ensuring unique row insertion during comparison.
This guide presents a desktop application built with Python that allows users to compare data across columns in an Excel file. The tool provides a simple graphical interface powered by Tkinter, lets users select files and columns, automatically detects mismatches, and displays the differing rows clearly.
Tool Features Overview
File selection with a browse dialog.
Column selection for the comparison baseline.
Automatic detection and display of rows where values differ from the selected column.
Result view showing full row content with highlighted mismatches.
Code Explanation
The script starts by importing required libraries: tkinter and its extensions for the GUI, pandas for Excel data handling, and various Tkinter utilities for dialogs and styling.
import tkinter as tk
from tkinter import ttk, filedialog, messagebox, font
import pandas as pdA main window is created and titled "Excel数据对比工具".
# 主窗口配置
root = tk.Tk()
root.title("Excel数据对比工具")Key variables are defined to store the file path, the DataFrame, the selected column, and a set to track inserted row indices.
# 变量定义
file_path = tk.StringVar()
df = None
column_var = tk.StringVar(root)
column_var.set("请选择列")
inserted_indices = set()The select_file function lets the user choose an Excel file, reads it with pandas.read_excel , and updates the column menu and Treeview.
# 文件选择函数
def select_file():
global df
path = filedialog.askopenfilename(filetypes=[("Excel Files", "*.xlsx")])
file_path.set(path)
try:
df = pd.read_excel(path, engine='openpyxl')
update_column_menu()
update_treeview_columns()
except Exception as e:
messagebox.showerror("错误", f"读取文件时发生错误:{e}")Helper functions update_column_menu and update_treeview_columns keep the UI in sync with the loaded DataFrame.
# 列选择菜单更新函数
def update_column_menu():
if df is not None:
menu = columns_menu['menu']
menu.delete(0, 'end')
for col in df.columns:
menu.add_command(label=col, command=lambda col=col: column_var.set(col))
# Treeview列更新函数
def update_treeview_columns():
global df
if df is not None:
result_tree["columns"] = df.columns.tolist()
for col in list(result_tree["columns"]):
if col not in df.columns:
result_tree.heading(col, text="")
result_tree.column(col, width=0)
for col in df.columns:
result_tree.heading(col, text=col)
max_str = df[col].astype(str).replace('nan', '').max()
max_width = my_font.measure(max_str) + 10
result_tree.column(col, width=max_width, anchor="center")The core compare_columns function clears previous results, iterates over all columns except the target, finds mismatched rows, and inserts each unique row into the Treeview with a distinct tag for styling.
# 数据对比函数
def compare_columns():
global df
result_tree.delete(*result_tree.get_children())
inserted_indices.clear()
target_column = column_var.get()
if df is None or target_column not in df.columns:
messagebox.showwarning("警告", "请先选择文件并选择一个列。")
return
update_treeview_columns()
for col in df.columns:
if col != target_column:
mismatches = df[df[target_column] != df[col]]
if not mismatches.empty:
for index, row in mismatches.iterrows():
if index not in inserted_indices:
values = tuple(row)
result_tree.insert("", "end", values=values, tags=('mismatch',))
inserted_indices.add(index)
style = ttk.Style()
style.configure('Treeview', rowheight=30, font=('Arial', 12))
style.map('Treeview', background=[('selected', 'blue')], foreground=[('selected', 'white')])
style.configure('mismatch.Treeview.Item', background='lightcoral')The GUI layout includes labels, entry fields, buttons for browsing and starting the comparison, and a Treeview with a scrollbar to display results.
# GUI元素创建
tk.Label(root, text="选择Excel文件:").pack()
tk.Entry(root, textvariable=file_path).pack()
tk.Button(root, text="浏览", command=select_file).pack()
result_frame = ttk.Frame(root)
result_frame.pack(fill=tk.BOTH, expand=True)
scrollbar = ttk.Scrollbar(result_frame)
scrollbar.pack(side=tk.RIGHT, fill=tk.Y)
result_tree = ttk.Treeview(result_frame, yscrollcommand=scrollbar.set, show="headings")
result_tree.pack(side=tk.LEFT, fill=tk.BOTH, expand=True)
scrollbar.config(command=result_tree.yview)
my_font = font.Font(family='Helvetica', size=12)
columns_menu = tk.OptionMenu(root, column_var, "")
columns_menu.pack()
tk.Button(root, text="开始对比", command=compare_columns).pack()
root.mainloop()Tips
Each row is inserted only once to avoid duplicate entries.
Using pandas DataFrames enables efficient reading and comparison of Excel data.
Test Development Learning Exchange
Test Development Learning Exchange
How this landed with the community
Was this worth your time?
0 Comments
Thoughtful readers leave field notes, pushback, and hard-won operational detail here.