Why Go’s Regex Is 25× Slower Than Python – And When It Actually Wins

A benchmark shows Go’s regexp engine running about 25 times slower than Python’s re on an input that matches. On worst-case inputs, however, Go stays in the microsecond range while Python can take seconds, because Go uses a linear-time Thompson NFA design and Python uses a backtracking engine that can take exponential time.

Radish, Keep Going!

Measurements on the same machine compare Go’s regexp engine with Python’s re for the pattern (a+)+b. With an input of 1000 a characters followed by b, Go needs 11,292 ns per operation, while Python finishes in 443.6 ns – roughly 25× faster for this best‑case match.

Benchmark details

Using Go 1.23.4 on an Intel Xeon Platinum (linux/amd64):

// Go 1.23.4, Intel Xeon Platinum, linux/amd64
strings.Contains(s, "b")    27.71 ns/op   ← string search baseline
regexp `b`                    90.20 ns/op   ← simple regex, 3.3× slower
regexp `(a+)+b`              11,292 ns/op   ← complex regex, 408× slower

Python 3.12 on the same hardware:

// Python 3.12
Complex regex (a+)+b: 443.6 ns/op   ← 25× faster than Go’s regexp

The Go numbers look bad, but these measurements reflect a best-case scenario: the input matches, so each engine can stop as soon as it reaches the b at the end of the input, and no backtracking is needed.
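The measurements above can be reproduced along the following lines. This is a minimal sketch, not the author's benchmark code: the helper names (matchingInput, nsPerOp, sink) are made up for illustration, and the absolute numbers will differ by machine and Go version.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
	"testing"
)

// Patterns are compiled once so compilation cost stays out of the timings.
var (
	simpleRe = regexp.MustCompile(`b`)
	nestedRe = regexp.MustCompile(`(a+)+b`)
)

// sink keeps the compiler from discarding the results (hypothetical helper).
var sink bool

// matchingInput builds n 'a' characters followed by a single 'b'.
func matchingInput(n int) string {
	return strings.Repeat("a", n) + "b"
}

// nsPerOp runs fn under the standard benchmark harness and returns ns/op.
func nsPerOp(fn func()) int64 {
	res := testing.Benchmark(func(b *testing.B) {
		for i := 0; i < b.N; i++ {
			fn()
		}
	})
	return res.NsPerOp()
}

func main() {
	s := matchingInput(1000)
	fmt.Println("strings.Contains:", nsPerOp(func() { sink = strings.Contains(s, "b") }), "ns/op")
	fmt.Println("regexp `b`:      ", nsPerOp(func() { sink = simpleRe.MatchString(s) }), "ns/op")
	fmt.Println("regexp `(a+)+b`: ", nsPerOp(func() { sink = nestedRe.MatchString(s) }), "ns/op")
}
```

testing.Benchmark is used here so the sketch runs as an ordinary program; in a real project the same loops would normally live in Benchmark functions run with go test -bench.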

Two regex engine families

Most regex engines fall into one of two families:

Thompson NFA – used by Go. It simulates all possible states simultaneously, guaranteeing O(m·n) time (m = pattern length, n = input length), which is linear in the input length for a fixed pattern.

Backtracking – used by Python, Perl, Java, Ruby, PCRE, etc. It tries one path, backtracks on failure, and can explode to O(2ⁿ) in the worst case, though average cases are fast.

Go chooses the former, sacrificing raw speed for predictable worst‑case behavior.

Worst‑case input

When the input contains no match (e.g., a string of only a characters), the performance gap widens dramatically:

// pattern: (a+)+b, input: pure "a"×n, no match → worst case
// Same machine, Python 3.12 measurements
input length n | Go time | Python time   | ratio
10             | 1.9 µs  | 96 µs         | 50×
20             | 2.3 µs  | 67,698 µs     | 29,000×
25             | 2.9 µs  | 2,169,774 µs  | 750,000×
29             | 3.0 µs  | 35,110,024 µs | 11,700,000×

For 29 characters, Go needs about 3 µs while Python takes 35 seconds – an 11.7 million-fold difference, and a classic ReDoS (Regular Expression Denial of Service) scenario.

Cloudflare suffered a real-world outage in July 2019 when a backtracking regex in its WAF caused global CPU saturation, dropping 80% of traffic for 27 minutes.

"A leader in our Solutions Engineering group told me we had lost 80% of our traffic." – Cloudflare post‑mortem, July 2 2019

Go’s three internal engines

The regexp package actually contains three engines that are chosen automatically:

onepass – fastest path for unambiguous patterns anchored at the start of the text; a single scan with no alternatives to track.

backtrack – medium path for patterns and inputs small enough to stay within its memory limits; it records the states it has already visited, so it cannot blow up exponentially.

NFA – full Thompson NFA simulation used as a fallback for everything else; guarantees linear time.

Benchmarks show that the simple pattern a+b and the nested pattern (a+)+b take almost the same time on a 100-character input (≈1.2 µs), suggesting that engine selection adds no noticeable cost in typical fast cases.

The real benefit is safety: the NFA path never exceeds O(m·n), preventing exponential blow-up even for pathological inputs.

Practical recommendations

Use plain string functions for fixed‑string matching instead of regex.

// ❌ Unnecessary regex
matched, _ := regexp.MatchString(`^https://`, url)
// ✅ Faster and clearer
matched := strings.HasPrefix(url, "https://")

In a test, a simple email‑validation regex took 312 ns, while a hand‑written string check took 7.4 ns – a 42× speedup.
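The article does not show the regex or the string check it timed, so the pair below is only an assumed example of what such a comparison might look like. Both the pattern and the function looksLikeEmail are hypothetical, and neither is a complete email validator.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// Hypothetical "simple email" pattern: one '@', no whitespace, and a dot
// inside the domain with at least one character on each side of it.
var emailRe = regexp.MustCompile(`^[^@\s]+@[^@\s]+\.[^@\s]+$`)

// looksLikeEmail is a hand-written check with the same shape as emailRe.
func looksLikeEmail(s string) bool {
	at := strings.IndexByte(s, '@')
	if at <= 0 || strings.IndexByte(s[at+1:], '@') >= 0 {
		return false // need exactly one '@' and a non-empty local part
	}
	if strings.ContainsAny(s, " \t\n\f\r") {
		return false // same characters that \s matches in Go's regexp
	}
	domain := s[at+1:]
	// A dot that is neither the first nor the last character of the domain.
	return len(domain) >= 3 && strings.Contains(domain[1:len(domain)-1], ".")
}

func main() {
	for _, s := range []string{
		"user@example.com", "user@example", "@example.com",
		"a@b@c.d", "user@.com", "user name@example.com",
	} {
		fmt.Printf("%-24q regex=%-5v strings=%v\n", s, emailRe.MatchString(s), looksLikeEmail(s))
	}
}
```

The string version is longer and easier to get subtly wrong, so it is worth doing only where the check sits on a hot path.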

Avoid recompiling the same pattern on every call. Compile once at package level.

// ❌ Recompiles each call – slower and allocates memory
func IsValid(s string) bool {
    matched, _ := regexp.MatchString(`^\d{4}-\d{2}-\d{2}$`, s)
    return matched
}
// ✅ Compile once – faster and zero allocations
var dateRe = regexp.MustCompile(`^\d{4}-\d{2}-\d{2}$`)
func IsValid(s string) bool { return dateRe.MatchString(s) }

If a regex becomes a proven bottleneck, consider the go-re2 library, which offers a drop-in API with 5–10× speed improvements for complex patterns while preserving linear time.

go get github.com/wasilibs/go-re2

For user-provided patterns, stick with the standard library to avoid exposing PCRE-style features that can be abused.

Scenario checklist

Fixed‑string matching → use strings.Contains or bytes.Index.

User‑supplied pattern → use the standard regexp (safe by default).

High‑frequency calls with a proven bottleneck → profile and consider go-re2.

Need PCRE features (backreferences, etc.) → use regexp2 but understand ReDoS risks.

Other cases → compile once and keep it simple.

Conclusion

Go’s regexp engine is 3–408× slower than plain string operations, and 25× slower than Python for a matching input, but it stays in the microsecond range even for pathological inputs where Python becomes millions of times slower. The trade-off is a linear-time Thompson NFA that guarantees safety at the cost of average speed. In practice, compile regexes once, prefer string functions when possible, and only reach for go-re2 when profiling shows a genuine performance problem.
