Fundamentals 13 min read

Which String Replacement Method Is Fastest? A Java Performance Comparison

This article examines various Java string‑replacement techniques—including simple replace, regex, Aho‑Corasick, and custom Trie implementations—by presenting their design, code samples, and detailed performance benchmarks to help developers choose the most efficient solution for large keyword sets.

JD Cloud Developers
JD Cloud Developers
JD Cloud Developers
Which String Replacement Method Is Fastest? A Java Performance Comparison

Background

The task is to replace a set of keywords in product names, which can involve thousands of keywords and cause CPU overload when using naive String.replace calls.

Because the problem is essentially the same as sensitive‑word filtering, multi‑pattern string‑matching algorithms such as Aho‑Corasick (AC automaton) are relevant.

Solution Overview

2.1 Simple String.replace Approach

Direct replacement works for a few keywords but scales poorly.

public interface Replacer {
    String replaceKeywords(String text);
}
public class StrReplacer implements Replacer {
    private final List<String> keyWordList;
    public StrReplacer(String keyWords) {
        this.keyWordList = Lists.newArrayList(keyWords.split(";"));
        keyWordList.sort((a, b) -> Integer.compare(b.length(), a.length()));
    }
    @Override
    public String replaceKeywords(String text) {
        String newTxt = text;
        for (String s : keyWordList) {
            newTxt = newTxt.replace(s, "");
        }
        return newTxt;
    }
}

2.2 Regex‑Based Replacement

Compiling a literal pattern for each keyword yields better performance than raw String.replace.

public String replace(CharSequence target, CharSequence replacement) {
    return Pattern.compile(target.toString(), Pattern.LITERAL)
                  .matcher(this)
                  .replaceAll(Matcher.quoteReplacement(replacement.toString()));
}

2.3 Aho‑Corasick (AC Automaton) Implementation

Using the existing org.ahocorasick library provides high‑speed multi‑pattern matching.

<dependency>
    <groupId>org.ahocorasick</groupId>
    <artifactId>ahocorasick</artifactId>
    <version>0.6.3</version>
</dependency>
public class AhoCorasickReplacer implements Replacer {
    private final Trie trie;
    public AhoCorasickReplacer(String keyWords) {
        Trie.TrieBuilder builder = Trie.builder().ignoreOverlaps().onlyWholeWords();
        for (String s : keyWords.split(";")) {
            builder.addKeyword(s);
        }
        this.trie = builder.build();
    }
    @Override
    public String replaceKeywords(String text) {
        Collection<Emit> emits = trie.parseText(text);
        StringBuilder result = new StringBuilder();
        int lastEnd = 0;
        for (Emit emit : emits) {
            int start = emit.getStart();
            int end = emit.getEnd();
            if (start > lastEnd) {
                result.append(text, lastEnd, start);
            }
            lastEnd = end + 1;
        }
        if (lastEnd <= text.length() - 1) {
            result.append(text.substring(lastEnd));
        }
        return result.toString();
    }
}

2.4 Custom Trie Implementation

A hand‑rolled Trie focuses solely on replacement, offering the best raw speed.

public class TrieKeywordReplacer implements Replacer {
    private final Trie trie;
    @Override
    public String replaceKeywords(String text) {
        return trie.replaceKeywords(text, "");
    }
    public TrieKeywordReplacer(String keyWords) {
        Trie trie = new Trie();
        for (String s : keyWords.split(";")) {
            trie.insert(s);
        }
        this.trie = trie;
    }
    // Trie and TrieNode classes omitted for brevity
}

Performance Comparison

Benchmarks (JDK 1.8, 400 keywords) show average execution times (nanoseconds) for single‑threaded loops of 10 000 iterations:

StrReplacer: ~21 843 ns

PatternReplacer: ~28 846 ns

TrieKeywordReplacer: ~532 ns

AhoCorasickReplacer: ~727 ns

Under concurrent load (2 threads) the custom Trie remains fastest, while Aho‑Corasick stays competitive and provides additional features such as match positions and counts.

Final Recommendations

1. For production use, the ready‑made Aho‑Corasick library offers the best balance of performance and stability; a custom Trie can be used when only simple replacement is needed.

2. In real scenarios, most product names contain only a few keywords. Pre‑filtering with a Bloom filter or a quick substring check (e.g., looking for the character “补”) can avoid unnecessary full scans.

3. Example of a pre‑filter wrapper:

public String replaceKeywords(String skuName) {
    Replacer replacer = new AhoCorasickReplacer(keyWords);
    if (skuName.contains("补")) {
        return replacer.replaceKeywords(skuName);
    } else {
        return skuName;
    }
}

Illustrations

Trie structure illustration
Trie structure illustration
Additional diagram
Additional diagram
Original Source

Signed-in readers can open the original source through BestHub's protected redirect.

Sign in to view source
Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactadmin@besthub.devand we will review it promptly.

algorithmJava performanceAho-CorasickTriestring-replacement
JD Cloud Developers
Written by

JD Cloud Developers

JD Cloud Developers (Developer of JD Technology) is a JD Technology Group platform offering technical sharing and communication for AI, cloud computing, IoT and related developers. It publishes JD product technical information, industry content, and tech event news. Embrace technology and partner with developers to envision the future.

0 followers
Reader feedback

How this landed with the community

Sign in to like

Rate this article

Was this worth your time?

Sign in to rate
Discussion

0 Comments

Thoughtful readers leave field notes, pushback, and hard-won operational detail here.