
Understanding Escape Characters and Regular Expressions in Python

This article explains Python's escape characters, raw strings, the fundamentals of regular expressions, their matching process, greedy versus non‑greedy quantifiers, backslash handling, and provides detailed examples of the re module's functions such as compile, match, search, split, findall, finditer, sub and subn.

Python Programming Learning Circle

1. Escape Characters

Regular expressions operate on strings, and Python uses the backslash \ as an escape character for special symbols. The following table lists common escape sequences such as \n for newline, \t for tab, \\ for a literal backslash, etc.

<code>\(at line end)           Line continuation
\\                      Backslash
\'                      Single quote
\"                      Double quote
\a                      Bell
\b                      Backspace
\x1b                    Escape (Python has no \e sequence)
\000                    Null
\n                      Newline
\v                      Vertical tab
\t                      Horizontal tab
\r                      Carriage return
\f                      Form feed
\ooo                    Octal character, up to three digits (e.g., \012 for newline)
\xhh                    Hex character, exactly two digits (e.g., \x0a for newline)
\other                  Unrecognized escape: the backslash is kept as-is (e.g., \w, \.); newer Python versions emit a warning</code>

To keep backslashes literal, prefix the string with r or R to create a raw string. Escape sequences are not interpreted inside it, so print(r'\t\r') outputs \t\r.
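The difference between a normal and a raw string literal can be checked directly. This is a minimal sketch; the variable names are illustrative only.

```python
# Compare a normal string literal with a raw string literal.
normal = '\t\r'   # two characters: TAB and carriage return
raw = r'\t\r'     # four characters: backslash, t, backslash, r

print(len(normal))  # 2
print(len(raw))     # 4
print(raw)          # \t\r

# The octal and hex escapes below name the same character as \n.
same_newline = '\012' == '\x0a' == '\n'
print(same_newline)  # True
```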

2. Understanding Regular Expressions

A regular expression is a logical formula built from predefined special characters and their combinations to create a pattern string that defines filtering logic for text.

It is a powerful tool for matching strings, widely used across programming languages, including Python, to extract desired content from text.

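The matching process walks through the pattern and the text together, comparing them element by element. The match succeeds only if every element of the pattern is satisfied; otherwise it fails. Quantifiers and boundary assertions alter this process: a quantifier changes how much text an element may consume, and a boundary constrains where a match may occur.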

3. Regex Syntax Rules

The following diagram (omitted) shows Python's supported regex metacharacters and syntax.

4. Regex Annotations

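1) Greedy vs. Non‑greedy Quantifiers – By default, quantifiers are greedy, matching as many characters as possible; adding ? makes them non‑greedy, matching as few as possible. For example, against 'abbbc' the pattern ab* matches 'abbb', while ab*? matches only 'a'. Non‑greedy mode is usually preferred when extracting text between delimiters.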

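2) Backslash Issues – The backslash is the escape character in both Python string literals and regex, so each level needs its own escaping. To match one literal backslash, the regex itself must be \\, which takes four backslashes in an ordinary Python string literal: '\\\\'. Raw strings simplify this: r'\\' matches a single backslash, and r'\d' matches a digit. Note that r'\' on its own is a syntax error, because a raw string cannot end in a single backslash.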
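Both notes above can be verified with a short sketch. The sample strings (an HTML-like fragment and a Windows-style path) are made up for illustration.

```python
import re

html = '<b>one</b><b>two</b>'

# Greedy: .* consumes as much as possible, so it runs to the last '</b>'.
greedy = re.findall(r'<b>.*</b>', html)
# Non-greedy: .*? stops at the first '</b>' it can reach.
lazy = re.findall(r'<b>.*?</b>', html)
print(greedy)  # ['<b>one</b><b>two</b>']
print(lazy)    # ['<b>one</b>', '<b>two</b>']

path = 'C:\\temp'  # the text C:\temp, containing one backslash

# Both calls use the same regex (an escaped backslash);
# only the Python spelling of the pattern differs.
plain = re.search('\\\\', path)
raw = re.search(r'\\', path)
print(plain.group() == raw.group() == '\\')  # True

# r'\d' reaches the regex engine unchanged and matches single digits.
digits = re.findall(r'\d', 'a1b22')
print(digits)  # ['1', '2', '2']
```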

5. Python re Module

The built‑in re module provides regex support. Common functions include:

<code># Return a compiled pattern object
re.compile(pattern[, flags])
# Matching functions
re.match(pattern, string[, flags])
re.search(pattern, string[, flags])
re.split(pattern, string[, maxsplit, flags])
re.findall(pattern, string[, flags])
re.finditer(pattern, string[, flags])
re.sub(pattern, repl, string[, count, flags])
re.subn(pattern, repl, string[, count, flags])</code>

Compile a pattern with re.compile(r'hello') to obtain a Pattern object, optionally passing flags such as re.I (ignore case) or re.S (dot matches newline).
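As a rough illustration of the two flags mentioned above, the sketch below compiles patterns with re.I and re.S and combines them with the bitwise OR operator. The sample text is invented for the example.

```python
import re

# re.I: letter case is ignored when matching.
ci = re.compile(r'hello', re.I)
print(ci.match('HeLLo world').group())  # HeLLo

text = 'line1\nline2'

# Without re.S, '.' does not match a newline, so this fails.
no_flag = re.match(r'line1.line2', text)
print(no_flag)  # None

# With re.S, '.' also matches the newline.
with_flag = re.match(r'line1.line2', text, re.S)
print(with_flag.group() == text)  # True

# Several flags can be combined with |.
both = re.compile(r'LINE1.LINE2', re.I | re.S)
print(bool(both.match(text)))  # True
```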

6. Regex Methods Examples

re.match matches from the start of the string:

<code># Import re module
import re

pattern = re.compile(r'hello')
tests = ['hello', 'helloo world!', 'helo world!', 'hello world!']

# Try each string in turn; re.match only succeeds at position 0
for i, text in enumerate(tests, start=1):
    result = re.match(pattern, text)
    if result:
        print(result.group())
    else:
        print(f'{i} match failed!')
</code>

Output:

<code>hello
hello
3 match failed!
hello</code>

re.search scans through the string and returns the first match found anywhere in it:

<code>import re
pattern = re.compile(r'world')
match = re.search(pattern, 'hello world!')
if match:
    print(match.group())
</code>

Output:

<code>world</code>
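To make the contrast with re.match concrete, here is a small sketch that runs both functions on the same input. It also prints the match position via span(), a Match method not used elsewhere in this article.

```python
import re

text = 'hello world!'

# re.match only tries at position 0, where the text is 'hello', so it fails.
missed = re.match(r'world', text)
print(missed)  # None

# re.search keeps scanning and finds 'world' further along.
found = re.search(r'world', text)
print(found.group(), found.span())  # world (6, 11)

# Only the first match is returned, even if more exist.
first_o = re.search(r'o', text)
print(first_o.start())  # 4
```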

re.split splits by the pattern:

<code>import re
pattern = re.compile(r'\d+')
print(re.split(pattern, 'one1two2three3four4'))
</code>

Output:

<code>['one', 'two', 'three', 'four', '']</code>
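The trailing empty string appears because the input ends with a match, leaving nothing after the last separator. The sketch below shows one way to drop empty pieces, plus the effect of the maxsplit argument from the signature list; the filtering step is a suggestion, not part of the original example.

```python
import re

text = 'one1two2three3four4'

# maxsplit=2 stops after two splits; the remainder is left untouched.
limited = re.split(r'\d+', text, maxsplit=2)
print(limited)  # ['one', 'two', 'three3four4']

# Filter out empty pieces produced by a match at the end of the string.
parts = [p for p in re.split(r'\d+', text) if p]
print(parts)  # ['one', 'two', 'three', 'four']
```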

re.findall returns all matches as a list:

<code>import re
pattern = re.compile(r'\d+')
print(re.findall(pattern, 'one1two2three3four4'))
</code>

Output:

<code>['1', '2', '3', '4']</code>

re.finditer yields an iterator of Match objects:

<code>import re
pattern = re.compile(r'\d+')
for m in re.finditer(pattern, 'one1two2three3four4'):
    print(m.group(), end=' ')
</code>

Output:

<code>1 2 3 4 </code>
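Because re.finditer yields Match objects rather than plain strings, each result also carries its position. A minimal sketch using span(), with the same input as above:

```python
import re

text = 'one1two2three3four4'

# Collect each matched text together with its (start, end) position.
spans = [(m.group(), m.span()) for m in re.finditer(r'\d+', text)]
print(spans)
# [('1', (3, 4)), ('2', (7, 8)), ('3', (13, 14)), ('4', (18, 19))]
```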

re.sub replaces matches:

<code>import re
pattern = re.compile(r'(\w+) (\w+)')
s = 'i say, hello world!'
print(re.sub(pattern, r'\2 \1', s))

def func(m):
    return m.group(1).title() + ' ' + m.group(2).title()
print(re.sub(pattern, func, s))
</code>

Output:

<code>say i, world hello!
I Say, Hello World!</code>

re.subn returns the new string and the number of substitutions:

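<code>import re
pattern = re.compile(r'(\w+) (\w+)')
s = 'i say, hello world!'

# Same replacement function as in the re.sub example
def func(m):
    return m.group(1).title() + ' ' + m.group(2).title()

print(re.subn(pattern, r'\2 \1', s))
print(re.subn(pattern, func, s))
</code>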

Output:

<code>('say i, world hello!', 2)
('I Say, Hello World!', 2)</code>
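The count argument from the signature list limits how many replacements are made. A short sketch, reusing the same input string and passing count by keyword:

```python
import re

s = 'i say, hello world!'

# count=1 replaces only the first match; the rest of the string is untouched.
result = re.subn(r'(\w+) (\w+)', r'\2 \1', s, count=1)
print(result)  # ('say i, hello world!', 1)
```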

7. Alternative Calling Style

Instead of re.match(pattern, string) , you can call pattern.match(string) after compiling the pattern, applying the same methods without passing the pattern each time.
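The method style might look like the following sketch, which reuses the digit pattern from the earlier examples. The pos argument in the last call is accepted by the compiled pattern's search method but not by the module-level re.search function.

```python
import re

pattern = re.compile(r'\d+')
text = 'one1two2three3four4'

# Each module-level function has a matching method on the compiled pattern.
print(pattern.match(text))           # None: the text does not start with a digit
print(pattern.search(text).group())  # 1
print(pattern.findall(text))         # ['1', '2', '3', '4']
print(pattern.split(text))           # ['one', 'two', 'three', 'four', '']
print(pattern.sub('-', text))        # one-two-three-four-

# Compiled patterns can also start searching from a given position.
print(pattern.search(text, 4).group())  # 2
```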

