Fundamentals 6 min read

Automate Contract Management with Python: Extract, Clean, and Move Files

This article walks through a Python automation solution that extracts contract names from Excel using pandas and regex, cleans the data, and then matches and moves the corresponding contract files to a target directory, complete with code examples and results.

Python Crawling & Data Mining
Python Crawling & Data Mining
Python Crawling & Data Mining
Automate Contract Management with Python: Extract, Clean, and Move Files

1. Introduction

A fan asked for a Python automation solution to process a contract spreadsheet: extract contract names with regex, remove characters like "第" and "批", and then move matching contract files to a new folder.

2. Implementation

The first step reads the Excel file, extracts the contract name, converts Chinese numerals, and cleans unwanted characters.

import pandas as pd
import re
from chinesenumber import NumberParser
numberparse = NumberParser()

df = pd.read_excel("test.xlsx")
# Extract contract name inside parentheses
df["合同名称"] = df["合同名称"].str.extract(r"(.*?).*?((.*?))")
# Convert Chinese numbers to Arabic
df["合同名称_new1"] = df["合同名称"].apply(lambda x: numberparse.numberify(x))
# Remove characters "第" and "批"
df['合同名称_new2'] = df['合同名称_new1'].str.replace(r'(第|批)', '', regex=True)
print(df["合同名称_new2"])
df.to_excel('test1.xlsx')

The cleaned contract names are displayed in the following image:

Cleaned contract names
Cleaned contract names

The second step walks the source directory, finds files whose names contain the cleaned contract name, and copies them to the target folder.

import pandas as pd
import re
import os
import shutil

def copy_file(file_name):
    for root, dirs, files in os.walk(source_path):
        for file in files:
            if file_name in file:
                print(file)
                shutil.copyfile(root + '\\' + file, target_path + '\\' + file)
                print(root + '\\' + file + ' copied -> ' + target_path)

if __name__ == '__main__':
    source_path = r'xxx\source_path'
    target_path = r'C:\Users\Desktop\res'
    df = pd.read_excel("test1.xlsx")
    df["合同名称_new2"] = df["合同名称_new2"].apply(lambda x: copy_file(x))
    print("over!")

After execution, the files are successfully moved as shown in the image below.

File moving result
File moving result

3. Summary

The tutorial demonstrates a complete Python workflow for extracting, cleaning, and relocating contract files, providing reusable code snippets for similar data‑processing automation tasks.

Original Source

Signed-in readers can open the original source through BestHub's protected redirect.

Sign in to view source
Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactadmin@besthub.devand we will review it promptly.

Pythondata-processing
Python Crawling & Data Mining
Written by

Python Crawling & Data Mining

Life's short, I code in Python. This channel shares Python web crawling, data mining, analysis, processing, visualization, automated testing, DevOps, big data, AI, cloud computing, machine learning tools, resources, news, technical articles, tutorial videos and learning materials. Join us!

0 followers
Reader feedback

How this landed with the community

Sign in to like

Rate this article

Was this worth your time?

Sign in to rate
Discussion

0 Comments

Thoughtful readers leave field notes, pushback, and hard-won operational detail here.