Master Python Regex: Extract Any Date Format with Simple Patterns
This article walks through building and refining a Python regular expression that matches common Chinese date formats, including year‑month‑day, year/month/day, and year‑month. It explains each component, tests the pattern against sample strings, and presents the final pattern.
In this tutorial we demonstrate how to build a Python regular expression that can match various Chinese date formats such as “2018年6月7日”, “2018/6/7”, “2018-6-7”, “2018-06-07”, “2018-06” and “2018”.
We start with a simple pattern and iteratively refine it, explaining each component:
Use .* to match any preceding characters.
Match the literal text “高考时间是”.
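Match the four‑digit year with \d{4}, followed by one separator from the character class [年/-].
Match the month with \d{1,2}, followed by one separator from [月/-].
Match the day with \d{1,2}; alternation (|) with the end‑of‑string anchor ($) lets the pattern also accept a string that ends before the day.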
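The steps above can be sketched by assembling the pattern from its parts. This is a minimal illustration rather than the article's original code: the names PREFIX, YEAR, MONTH and DAY, the capture group, and the sample string are assumptions added for readability.

```python
import re

# Each piece mirrors one step of the walkthrough (names are illustrative).
PREFIX = r".*高考时间是\s*"   # any leading text, the literal label, optional spaces
YEAR = r"\d{4}[年/-]"         # four-digit year plus one separator
MONTH = r"\d{1,2}[月/-]"      # one- or two-digit month plus one separator
DAY = r"\d{1,2}"              # one- or two-digit day

# The day is optional: either a day follows the month separator,
# or the string ends right there ($).
pattern = re.compile(PREFIX + "(" + YEAR + MONTH + "(?:" + DAY + "|$))")

m = pattern.match("高考时间是2018年6月7日")
print(m.group(1))  # 2018年6月7 (the trailing 日 is not part of the group)
```

Because the month piece still requires a separator, a string that stops right after the month digits, such as “高考时间是2018-06”, does not match this version.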
After defining the pattern we test it against several example strings (string2 through string6 in the original code). Strings in which a separator follows the month match successfully; a string such as “2018-06” fails because nothing follows the month. Making the month separator optional, or adding an alternative group for that case, lets the pattern match these strings as well.
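A small test loop makes this kind of refinement easy to see. The sketch below compares a version that requires the month separator with one that makes it optional; the sample strings, the names strict and relaxed, and the matched helper are hypothetical stand-ins, since the original test strings are not reproduced here.

```python
import re

# Hypothetical sample strings, one per format named in the article.
samples = [
    "高考时间是2018年6月7日",
    "高考时间是2018/6/7",
    "高考时间是2018-6-7",
    "高考时间是2018-06-07",
    "高考时间是2018-06",
    "高考时间是2018",
]

# Strict: a separator is required after the month.
strict = re.compile(r".*高考时间是\s*(\d{4}[年/-]\d{1,2}[月/-](?:\d{1,2}|$))")
# Relaxed: separator plus day, a trailing separator, or nothing after the month.
relaxed = re.compile(r".*高考时间是\s*(\d{4}[年/-]\d{1,2}(?:[月/-]\d{1,2}|[月/-]$|$))")

def matched(pattern, text):
    """Return the captured date text, or None when the pattern does not match."""
    m = pattern.match(text)
    return m.group(1) if m else None

strict_results = [matched(strict, s) for s in samples]
relaxed_results = [matched(relaxed, s) for s in samples]

for s, a, b in zip(samples, strict_results, relaxed_results):
    print(s, "| strict:", a, "| relaxed:", b)
```

On these samples the relaxed version additionally captures “2018-06”. Neither version handles the year-only string, because both still require a separator after the year.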
The final regular expression looks like:
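.*高考时间是\s*\d{4}[年/-]\d{1,2}[月/-]\d{1,2}日|.*高考时间是\s*\d{4}[年/-]\d{1,2}[月/-]$
As printed, the first alternative requires a trailing 日 and the second requires a separator after the month at the end of the string, so the pattern covers forms such as “2018年6月7日” and “2018年6月”. Forms without those suffixes, such as “2018/6/7”, “2018-06” and “2018”, match only once the day, the 日 suffix and the separators are made optional.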
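For comparison, here is a minimal sketch of a single pattern in which the month, the day, and the 年/月/日 suffixes are each optional, so that every format listed at the top of the article is extracted. The pattern, the extract_date helper, and the sample strings are illustrative assumptions, not the article's original code. It uses re.search, so no leading .* is needed.

```python
import re

# Hypothetical pattern: everything after the four-digit year is optional.
DATE = re.compile(
    r"高考时间是\s*"
    r"(\d{4}"                      # four-digit year
    r"(?:[年/-]\d{1,2}"            # optional: separator + month
    r"(?:[月/-]\d{1,2}日?|月)?"     # optional: separator + day (+ 日), or a bare 月
    r"|年)?)"                      # or just a bare 年 after the year
)

def extract_date(text):
    """Return the date text that follows the label, or None if there is none."""
    m = DATE.search(text)
    return m.group(1) if m else None

samples = [
    "高考时间是2018年6月7日",
    "高考时间是2018/6/7",
    "高考时间是2018-6-7",
    "高考时间是2018-06-07",
    "高考时间是2018-06",
    "高考时间是2018",
]
results = [extract_date(s) for s in samples]
for s, r in zip(samples, results):
    print(s, "->", r)
```

Nesting the optional groups keeps the pattern in one piece; the trade-off is that it is looser than the two-alternative form and will also accept mixed separators such as “2018年6-7”.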
Python Crawling & Data Mining
