Common Regular Expressions and Their Syntax
This article introduces the fundamentals of regular expressions, explains their components, syntax rules such as quantifiers and anchors, and provides a collection of frequently used regex patterns for tasks like email validation, ID numbers, dates, IP addresses, and URLs.
In the era of big data, testing often involves large volumes of data, and regular expressions are an essential tool for quickly locating and replacing target data. This article collects commonly used regex patterns.
1. Composition and Use
Regular expressions consist of ordinary characters (e.g., a‑z) and special characters called meta‑characters. They are used mainly to match strings, search for substrings, and perform replacements.
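The three uses named above can be sketched with Python's standard re module. This is only an illustration: the sample text and the variable names are made up for this example.

```python
import re

text = "order 1024 shipped, order 2048 pending"

# Matching: does the whole string consist of lowercase letters only?
is_lower = re.fullmatch(r"[a-z]+", "regex") is not None

# Searching: find the first run of digits inside a longer string.
first = re.search(r"\d+", text)
first_number = first.group() if first else None

# Replacing: mask every run of digits.
masked = re.sub(r"\d+", "####", text)

print(is_lower, first_number, masked)
```

Here re.fullmatch tests the entire string, re.search finds the first occurrence anywhere, and re.sub replaces every occurrence.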
2. Syntax
Regex characters include ordinary characters, non‑printable characters, special characters, quantifiers, and anchors. The focus here is on quantifiers and anchors.
Quantifiers
* : matches the preceding sub‑expression zero or more times, e.g., ok* matches o, ok, and okkk.
+ : matches the preceding sub‑expression one or more times, e.g., ok+ matches ok and okkk but not o.
? : matches zero or one occurrence of the preceding sub‑expression, e.g., go(od)? matches go and good. On its own, ? is greedy; placed after another quantifier (as in o+?), it makes that quantifier non‑greedy.
{n} : matches exactly n times, e.g., o{2} matches the oo in good but finds nothing in not.
{n,} : matches n or more times, e.g., o{2,} matches the whole run of o in gooooood.
{n,m} : matches between n and m times, e.g., o{1,3} matches the first three o in gooooood.
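The quantifier rules above can be checked with a short Python sketch. The helper full and the test word (built with exactly seven o characters so the counts are unambiguous) are assumptions made for this illustration.

```python
import re

def full(pattern, s):
    """Return True if the whole string s matches pattern."""
    return re.fullmatch(pattern, s) is not None

# * allows zero k, + requires at least one k.
star = [full(r"ok*", s) for s in ("o", "ok", "okkk")]
plus = [full(r"ok+", s) for s in ("o", "ok", "okkk")]

# ? makes the group (od) optional, but it may appear at most once.
optional = [full(r"go(od)?", s) for s in ("go", "good", "goodod")]

word = "g" + "o" * 7 + "d"  # hypothetical test word with seven o's

exact = re.findall(r"o{2}", "good")      # exactly two o's
exact_none = re.findall(r"o{2}", "not")  # a single o is not enough
at_least = re.findall(r"o{2,}", word)    # greedy: takes the whole run
ranged = re.findall(r"o{1,3}", word)     # greedy: up to three at a time
lazy = re.findall(r"o{1,3}?", word)      # trailing ? makes it non-greedy

print(star, plus, optional)
print(exact, exact_none, at_least, ranged, lazy)
```

Note how the same {1,3} quantifier splits the run into groups of three when greedy, and into single characters once a trailing ? makes it non‑greedy.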
Anchors
^ : matches the start of a string, e.g., ^test matches strings beginning with test.
$ : matches the end of a string, e.g., test$ matches strings ending with test.
\b : matches a word boundary, e.g., \bGoo matches the Goo in Good.
\B : matches a position that is not a word boundary, e.g., \BGoo does not match Good but matches the Goo in wowGood.
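Note: [0-9]+ is equivalent to \d+ (one or more digits), and [^0-9]+ is equivalent to \D+ (one or more non‑digit characters). Inside square brackets, a leading ^ negates the character class rather than anchoring to the start of the string. In Unicode‑aware engines, \d may also match digits outside 0-9.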
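The anchors, and the difference between ^ outside and inside square brackets, can be illustrated with a small Python sketch. The sample strings are chosen for this example only.

```python
import re

def found(pattern, s):
    """Return True if pattern matches anywhere in s."""
    return re.search(pattern, s) is not None

starts = found(r"^test", "test case")     # test is at the start
not_start = found(r"^test", "a test")     # test is not at the start
ends = found(r"test$", "unit test")       # test is at the end

boundary = found(r"\bGoo", "Good")        # Goo begins a word
non_boundary = found(r"\BGoo", "Good")    # fails: Goo sits on a boundary
inside_word = found(r"\BGoo", "wowGood")  # Goo is inside a word

sample = "a1b22c333"
digits = re.findall(r"[0-9]+", sample)       # runs of digits
non_digits = re.findall(r"[^0-9]+", sample)  # ^ inside [] negates the class

print(starts, not_start, ends, boundary, non_boundary, inside_word)
print(digits, non_digits)
```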
3. Common Regular Expressions
1. Alphanumeric and underscore validation: ^[A-Za-z0-9_]+$ (equivalent to ^\w+$ when \w is limited to ASCII)
2. Chinese characters: [\u4e00-\u9fa5]
3. Email address: ^\w+([-+.]\w+)*@\w+([-.]\w+)*\.\w+([-.]\w+)*$
4. 18‑digit ID number: ^[1-9]\d{5}(18|19|([23]\d))\d{2}((0[1-9])|(10|11|12))(([0-2][1-9])|10|20|30|31)\d{3}[0-9Xx]$
5. Date format (YYYY-MM-DD; checks the shape only, not calendar validity): ^\d{4}-\d{1,2}-\d{1,2}$
6. IP address: ((?:(?:25[0-5]|2[0-4]\d|[01]?\d?\d)\.){3}(?:25[0-5]|2[0-4]\d|[01]?\d?\d))
7. Internet URL: ^https?://([\w-]+\.)+[\w-]+(/[\w./?%&=-]*)?$
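As a rough sketch, the patterns above could be collected into a small Python validation helper. The dictionary keys, the function name is_valid, and all sample values are hypothetical; the ID number in particular is a made-up value whose checksum is not verified. Three details differ from a literal copy of the list: every pattern is applied to the whole string, \w is restricted to ASCII for the first check, and the URL character class is written with the hyphen last, because Python's re rejects a range that starts with \w.

```python
import re

PATTERNS = {
    "word": re.compile(r"\w+", re.ASCII),
    "chinese": re.compile(r"[\u4e00-\u9fa5]+"),
    "email": re.compile(r"\w+([-+.]\w+)*@\w+([-.]\w+)*\.\w+([-.]\w+)*"),
    "id18": re.compile(
        r"[1-9]\d{5}(18|19|([23]\d))\d{2}((0[1-9])|(10|11|12))"
        r"(([0-2][1-9])|10|20|30|31)\d{3}[0-9Xx]"
    ),
    "date": re.compile(r"\d{4}-\d{1,2}-\d{1,2}"),
    "ipv4": re.compile(
        r"(?:(?:25[0-5]|2[0-4]\d|[01]?\d?\d)\.){3}"
        r"(?:25[0-5]|2[0-4]\d|[01]?\d?\d)"
    ),
    # Hyphen placed last in the class so it is read as a literal.
    "url": re.compile(r"https?://([\w-]+\.)+[\w-]+(/[\w./?%&=-]*)?"),
}

def is_valid(kind: str, value: str) -> bool:
    """Return True if value, taken as a whole, matches the named pattern."""
    return PATTERNS[kind].fullmatch(value) is not None

checks = {
    "word": is_valid("word", "user_01"),
    "chinese": is_valid("chinese", "中文"),
    "email": is_valid("email", "user.name+tag@example.com"),
    "id18": is_valid("id18", "11010519900307123X"),  # made-up sample value
    "date": is_valid("date", "2024-02-30"),          # shape only, not a real date
    "ipv4": is_valid("ipv4", "192.168.1.1"),
    "url": is_valid("url", "http://example.com/path?x=1"),
}
print(checks)
```

These patterns test the shape of the input only. The date check accepts 2024-02-30, and the ID check does not verify the final check digit, so stricter validation still needs extra logic.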
360 Quality & Efficiency
360 Quality & Efficiency focuses on seamlessly integrating quality and efficiency in R&D, sharing 360’s internal best practices with industry peers to foster collaboration among Chinese enterprises and drive greater efficiency value.