A Beginner's Guide to Regular Expressions: Basics, Advanced Features, and Practical Code Samples
This article provides a clear, step‑by‑step tutorial on regular expressions—covering fundamental metacharacters, quantifiers, grouping, alternation, character classes, assertions, capturing, back‑references, greedy vs. lazy matching, and includes runnable code snippets in Java and JavaScript for common validation tasks.
Regular expressions (regex) are supported by virtually every programming language, from front‑end JavaScript to back‑end Java and C#. Despite their ubiquity, many learners never receive formal instruction, often encountering cryptic patterns that replace lengthy if‑else checks.
This guide presents the essential concepts in plain language, enabling readers to write and understand simple regex patterns.
1. Metacharacters
Metacharacters are the building blocks of a regex. Common ones include:
Metacharacter
Description . Matches any character except a newline \w Matches letters, digits, underscore, or Chinese characters \s Matches any whitespace character \d Matches a digit \b Matches a word boundary ^ Matches the start of a string $ Matches the end of a string
Using these, simple patterns can be built, for example: \babc // matches strings starting with "abc" ^\d{8}$ // matches an 8‑digit QQ number ^1\d{10}$ // matches an 11‑digit mobile number starting with 1
2. Quantifiers (Repetition Operators)
Quantifiers control how many times a preceding token may repeat:
Syntax
Description * Zero or more times + One or more times ? Zero or one time {n} Exactly n times {n,} At least n times {n,m} Between n and m times
Applying them refines earlier examples, e.g. ^\d{8}$ becomes ^\d{8}$ (exactly eight digits) or ^\d{14,18}$ for a 14‑to‑18 digit bank card number.
3. Grouping
Parentheses ( ) create a group, allowing the quantified token to apply to the whole sub‑pattern. For instance, ^(ab)* matches zero or more repetitions of "ab".
4. Escaping
Special characters lose their meta meaning when preceded by a backslash. To match literal parentheses, use \( and \), e.g. ^(\(ab\))* matches "(ab)(ab)".
5. Alternation (OR)
The pipe | provides a logical OR. A practical example for Chinese mobile carriers is:
^(130|131|132|155|156|185|186|145|176)\d{8}$6. Character Classes (Ranges)
Square brackets define a set or range: [0-9] matches any digit, [A-Z] any uppercase letter, and [165] matches 1, 6, or 5.
7. Assertions (Zero‑Width)
Assertions check surrounding context without consuming characters.
Positive look‑ahead: (?=pattern) Positive look‑behind: (?<=pattern) Negative look‑ahead: (?!pattern) Negative look‑behind: (?<!pattern) Example: extracting a read count from HTML – "\\d+(?=</span>)" captures the number before </span>.
8. Capturing and Non‑Capturing Groups
Capturing groups store matched substrings for later use (e.g., (\d{8})). Named groups use (?<name>...). Non‑capturing groups (?:...) group without storing.
9. Back‑References
Inside the same pattern, \1, \2, … refer to previously captured groups. Example to find repeated characters: (\w)\1 matches "aa", "bb", etc.
10. Greedy vs. Lazy Matching
Quantifiers are greedy by default, consuming as much as possible. Adding ? makes them lazy (minimal). Example: (\d{1,2}?)(\d{3,4}) matches the smallest possible left part and the largest possible right part.
11. Negation
Negated character classes and shortcuts ( \W, \S, \D, \B) match anything *except* the specified set.
Regular expressions are powerful but require practice. This guide equips beginners with the core syntax and demonstrates how to apply it in Java and JavaScript for common validation and extraction tasks.
Signed-in readers can open the original source through BestHub's protected redirect.
This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactand we will review it promptly.
Java Captain
Focused on Java technologies: SSM, the Spring ecosystem, microservices, MySQL, MyCat, clustering, distributed systems, middleware, Linux, networking, multithreading; occasionally covers DevOps tools like Jenkins, Nexus, Docker, ELK; shares practical tech insights and is dedicated to full‑stack Java development.
How this landed with the community
Was this worth your time?
0 Comments
Thoughtful readers leave field notes, pushback, and hard-won operational detail here.
