
How to Extract Exact 6‑Digit Codes from Mixed Text Using Python Regex

This article walks through extracting six‑digit numeric codes from strings containing letters, numbers, and symbols using Python regular expressions and pandas, presenting multiple community‑sourced approaches, complete code examples, and visual explanations to help readers solve similar data extraction challenges.

Python Crawling & Data Mining

1. Introduction

The author shares a question from a Python community: how to extract a run of exactly six consecutive digits from a column whose values mix upper- and lower-case letters, digits, and symbols, while ignoring digit runs that are longer or shorter than six.

2. Implementation

Several contributors proposed solutions. The first approach enumerates every candidate pattern for each row and then merges the results; the original post illustrates it with a diagram.

A more concrete code snippet is provided:

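def extract_digits(my_list):
    """Collect every six-character window that consists only of digits."""
    target_digits = []
    for item in my_list:
        for i in range(len(item)):
            window = item[i:i+6]  # shorter than six near the end of the string
            if len(window) == 6 and window.isdigit():
                target_digits.append(window)
    return target_digits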

my_list = ['abc123', '123456', 'xyz789', '9876543', '12qw345', '12345678']
target_digits = extract_digits(my_list)
print(target_digits)

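The snippet runs, but it still needs refinement: because it tests every six-character window, it also returns slices taken from longer digit runs. For the sample list it prints ['123456', '987654', '876543', '123456', '234567', '345678'], so '9876543' and '12345678' contribute results even though neither is a six-digit code.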
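One possible refinement of the loop-based approach, offered as a sketch rather than as the contributor's own fix, is to walk each string character by character and keep a digit run only when its total length is exactly six. The function name `extract_exact_six` is hypothetical and does not appear in the original discussion.

```python
def extract_exact_six(items):
    """Return digit runs that are exactly six characters long (hypothetical helper)."""
    results = []
    for item in items:
        run = ''
        # A trailing sentinel space flushes a digit run that ends the string.
        for ch in item + ' ':
            if ch in '0123456789':
                run += ch
            else:
                if len(run) == 6:
                    results.append(run)
                run = ''
    return results


my_list = ['abc123', '123456', 'xyz789', '9876543', '12qw345', '12345678']
print(extract_exact_six(my_list))  # ['123456']
```

Tracking whole runs instead of fixed windows means that seven- and eight-digit strings are rejected outright rather than being sliced into overlapping six-digit pieces.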

Another contributor suggests using regular expressions with pandas for CSV data:

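import re
import pandas as pd

# '示例.csv' ("example.csv") is GBK-encoded; '理由' is the "reason" column
df = pd.read_csv('示例.csv', encoding='gbk')
# Six digits preceded by a non-digit and followed by a non-digit or end of string
pattern = r'\D(\d{6})(?=\D|$)'
# '提取单号' ("extracted order number"): first match in the row, or 0 if none
df['提取单号'] = df['理由'].map(lambda x: re.findall(pattern, x)[0] if len(re.findall(pattern, x)) >= 1 else 0)
print(df)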

The expected output is displayed in the following image.

Although the solution works, a small issue remains for future discussion.
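One limitation of the pattern can be checked directly, although the source does not say whether it is the issue the author had in mind: the leading `\D` must consume an actual character, so a code at the very start of a string is missed. A minimal sketch of an alternative, assuming lookarounds are acceptable here, uses `(?<!\d)` and `(?!\d)` so that neither boundary consumes text. The sample strings and the helper name `first_code` are made up for illustration.

```python
import re

original = re.compile(r'\D(\d{6})(?=\D|$)')   # pattern from the article
bounded = re.compile(r'(?<!\d)\d{6}(?!\d)')   # assumed alternative using lookarounds


def first_code(text, default=0):
    """Return the first standalone six-digit run, or `default` if there is none."""
    match = bounded.search(text)
    return match.group(0) if match else default


samples = ['123456 refund', 'order A654321 late', 'id 1234567', 'no code']
for s in samples:
    # Show what each pattern finds for the same input.
    print(repr(s), original.findall(s), first_code(s))
```

With the CSV example, the helper could be applied as `df['理由'].map(first_code)`, which also avoids running `re.findall` twice per row.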

3. Conclusion

The article summarizes the Python regex data‑extraction problem, presents multiple community‑derived solutions with code, and helps readers apply these techniques to their own datasets.

