Mastering Regular Expressions: Numbers, Emails, Phone Numbers & HTML
This tutorial walks you through building and testing regular expressions for common patterns such as floating‑point numbers, optional fractions, exponents, phone numbers, email addresses, and simple HTML tags, explaining each step and showing practical examples.
Regular expressions provide a powerful tool for text processing and parsing across most programming languages and operating systems.
We start by reviewing basic concepts and then demonstrate how to construct regex patterns for common use cases.
Floating‑point Numbers
Begin with a simple pattern for integers: \d+ matches numbers without a decimal part or sign.
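A quick way to see what this starting point accepts is to test it against a few inputs. The sketch below uses Python's re.fullmatch, which requires the whole string to match. The sample inputs and the helper name is_match are illustrative choices, not part of the original article.

```python
import re

# Starting point from the text: one or more digits, nothing else.
INTEGER = re.compile(r"\d+")

def is_match(pattern: re.Pattern, text: str) -> bool:
    """Return True only if the entire string matches the pattern."""
    return pattern.fullmatch(text) is not None

# Illustrative inputs (assumed for this sketch).
for sample in ["42", "007", "3.14", "-5", ""]:
    print(f"{sample!r:8} -> {is_match(INTEGER, sample)}")
```

Only "42" and "007" pass. The rejected "3.14" is the case the next steps address.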
To include a fractional part, add \.\d+:
\d+\.\d+
Making the fractional part optional yields:
\d+(\.\d+)?
Allowing the integer part to be optional as well leads to:
(\d+)?(\.\d*)?
This version also matches the empty string, because both groups are optional. To avoid that, require at least one part to be present: either an integer part with an optional fraction, or a bare fraction such as .5:
(\d+(\.\d*)?|\.\d+)
Finally, add an optional exponent:
(\d+(\.\d*)?|\.\d+)(e[-+]?\d+)?
Phone Numbers
A basic pattern \d+ only matches pure digits, missing common separators.
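To make the limitation concrete, the sketch below runs \d+ against a formatted number in Python. The sample number is made up for illustration.

```python
import re

number = "555-123-4567"  # made-up sample input

# fullmatch fails because the hyphens are not digits.
whole = re.fullmatch(r"\d+", number)

# findall still finds the digit runs, but as three unrelated pieces.
pieces = re.findall(r"\d+", number)

print(whole)   # None
print(pieces)  # ['555', '123', '4567']
```

The pattern sees digits but has no notion of how the groups relate, so the structure of the number is lost.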
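A more realistic pattern handles optional separators between the area code, prefix, and line number:
(\d{3})[-. ]?(\d{3})[-. ]?(\d{4})
To support parentheses around the area code:
\(?(\d{3})\)?[-. ]?(\d{3})[-. ]?(\d{4})
Further allowing an optional country code with a leading plus sign:
(\+?\d+)?[-. ]?\(?(\d{3})\)?[-. ]?(\d{3})[-. ]?(\d{4})
Keep the parentheses outside the character class: inside a class, a hyphen placed between two characters, as in [\(-.], defines a range and would also admit characters such as * and +.
Email Addresses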
A simple email pattern:
\w+@\w+\.\w+
To allow plus and minus signs in the local part and the domain:
[-+\w]+@[-+\w]+\.[-+\w]+
To support one additional domain level, as in example.co.uk:
[-+\w]+@[-+\w]+\.[-+\w]+(\.[-+\w]+)?
Allowing a dot in the local part as well:
[-+\.\w]+@[-+\w]+\.[-+\w]+(\.[-+\w]+)?
HTML Parsing with Regex
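Basic tag matching without attributes, where \1 refers back to the tag name captured by (\w+):
<(\w+)>[^<]*</\1>
Allowing attributes inside the opening tag:
<(\w+)[^>]*>[^<]*</\1>
For a more permissive match that accepts any content between the matching tags, including other tags:
<(\w+)[^>]*>.*</\1>
Because .* is greedy, this match extends to the last closing tag of that name on the line, and it cannot balance nested tags of the same name.
Conclusion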
This article introduced common regex use cases, including floating‑point numbers, phone numbers, email addresses, and simple HTML tags, with an emphasis on how to craft patterns rather than merely copy ready‑made expressions.
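As a recap, the final pattern from each section can be collected into one table-driven check. This is a minimal sketch in Python: the sample inputs, the PATTERNS dictionary, and the check helper are assumptions made for illustration. The float pattern uses \.\d+ as its second alternative so that a bare fraction such as .5 is accepted, and the phone pattern matches the parentheses with \(? and \)? outside any character class.

```python
import re

# Final pattern from each section (names are hypothetical, chosen for this sketch).
PATTERNS = {
    "float": re.compile(r"(\d+(\.\d*)?|\.\d+)(e[-+]?\d+)?"),
    "phone": re.compile(r"(\+?\d+)?[-. ]?\(?(\d{3})\)?[-. ]?(\d{3})[-. ]?(\d{4})"),
    "email": re.compile(r"[-+\.\w]+@[-+\w]+\.[-+\w]+(\.[-+\w]+)?"),
    "html": re.compile(r"<(\w+)[^>]*>[^<]*</\1>"),
}

def check(kind: str, text: str) -> bool:
    """Return True if the whole string matches the named pattern."""
    return PATTERNS[kind].fullmatch(text) is not None

# Made-up samples: (pattern name, input, expected result).
SAMPLES = [
    ("float", "3.14", True),
    ("float", ".5", True),
    ("float", "6.02e23", True),
    ("float", "1e", False),
    ("phone", "555-123-4567", True),
    ("phone", "+1 (555) 123-4567", True),
    ("phone", "555-1234", False),
    ("email", "first.last+tag@example.co.uk", True),
    ("email", "user@localhost", False),
    ("html", '<a href="/docs">Docs</a>', True),
    ("html", "<b>bold</i>", False),
]

for kind, text, expected in SAMPLES:
    result = check(kind, text)
    status = "ok" if result == expected else "UNEXPECTED"
    print(f"{kind:6} {text!r:32} -> {result} ({status})")

# The permissive variant with .* is greedy: it spans both elements here.
greedy = re.search(r"<(\w+)[^>]*>.*</\1>", "<b>one</b> and <b>two</b>")
print(greedy.group(0))  # <b>one</b> and <b>two</b>
```

Using fullmatch keeps the checks strict. With search, the same patterns would also report a hit when a valid number or address is merely embedded in a longer string.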
21CTO
21CTO (21CTO.com) offers developers community, training, and services, making it your go‑to learning and service platform.