How to Simplify Image Filename Deduplication in Python

This article walks through a practical Python example for deduplicating image file names, compares an initial verbose implementation with a more concise solution, and demonstrates how to reduce redundant conditional checks for cleaner, more readable code.

Python Crawling & Data Mining

1. Introduction

Hello everyone, I’m PiPi. While helping a follower with a simple request, I discovered a useful Python case worth sharing.

2. Requirement Clarification

The follower needed a Python script to deduplicate image file names: variants of the same image, such as J1.png and J1-2.png, should collapse to a single base code (J1).

3. Initial Implementation

The original code used multiple conditional branches to handle different separators and extensions:

material_picture_code = []
list3 = ['J0.jpg', 'J1.png', 'J1-2.png', 'J20.png', 'J36.png', 'J5_01.jpg', 'J5_02.png']
for file in list3:
    if '-' in file:
        # Hyphenated variant, e.g. 'J1-2.png' -> 'J1'
        duplicate_material_picture = file.split('-')[0]
        material_picture_code.append(duplicate_material_picture)
    elif '_' in file:
        # Underscore variant, e.g. 'J5_01.jpg' -> 'J5'
        duplicate_material_picture = file.split('_')[0]
        material_picture_code.append(duplicate_material_picture)
    elif file.endswith('.png'):
        # Plain .png name, e.g. 'J20.png' -> 'J20'
        material_picture_code.append(file.split('.png')[0])
    else:
        # Everything else is treated as a plain .jpg name
        material_picture_code.append(file.split('.jpg')[0])

print(material_picture_code)
# ['J0', 'J1', 'J1', 'J20', 'J36', 'J5', 'J5']

Although it produces the expected result, the code is redundant: its three separate checks all do the same job of cutting the name at its first separator or extension.
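
Note that the loop collects one code per file, so codes shared by several files still appear more than once. A minimal sketch of the final deduplication step, assuming first-seen order should be preserved (the name unique_codes is illustrative and not part of the original script):

```python
# Codes as collected by the loop above for the sample list.
material_picture_code = ['J0', 'J1', 'J1', 'J20', 'J36', 'J5', 'J5']

# Dict keys are unique and keep insertion order (Python 3.7+),
# so this drops repeats without reordering the codes.
unique_codes = list(dict.fromkeys(material_picture_code))

print(unique_codes)
# ['J0', 'J1', 'J20', 'J36', 'J5']
```

If order does not matter, set(material_picture_code) would work as well.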

4. Optimized Solution

Following a peer's suggestion, the logic was simplified to extract the base name before the first separator or extension, which shortens the code and makes it easier to read.
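
One way this simplification could look, as a minimal sketch rather than the peer's actual code: a single regular-expression split on the first '-', '_' or '.' replaces every branch. The helper name base_code and the variable unique_codes are hypothetical, and the sketch assumes the code part of a file name never contains one of those three characters.

```python
import re

# Hypothetical helper: keep everything before the first '-', '_' or '.'.
def base_code(filename):
    return re.split(r"[-_.]", filename, maxsplit=1)[0]

list3 = ['J0.jpg', 'J1.png', 'J1-2.png', 'J20.png', 'J36.png', 'J5_01.jpg', 'J5_02.png']

# One expression replaces the four branches of the original loop.
material_picture_code = [base_code(file) for file in list3]
print(material_picture_code)
# ['J0', 'J1', 'J1', 'J20', 'J36', 'J5', 'J5']

# Drop repeats while keeping first-seen order.
unique_codes = list(dict.fromkeys(material_picture_code))
print(unique_codes)
# ['J0', 'J1', 'J20', 'J36', 'J5']
```

The first list matches what the branching version prints for the same input. Because the split stops at the first separator, it also handles extensions other than .png and .jpg, which the original else branch does not.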

5. Conclusion

This short Python case demonstrates an efficient way to handle file name deduplication without multiple conditional branches, making the script more maintainable and easier to understand.
