Tag

Aho-Corasick


JD Tech Talk
May 21, 2025 · Fundamentals

Aho-Corasick Automaton: Efficient Multi‑Pattern Text Search and Real‑Time Highlighting

This article explains the Aho‑Corasick automaton, a classic multi‑pattern matching algorithm that builds a Trie with fail pointers to achieve linear‑time search over massive keyword sets, and demonstrates a Java implementation for highlighting keywords in HTML documents.

Aho-Corasick · Code Example · Java
10 min read
JD Tech Talk
Apr 8, 2025 · Fundamentals

Performance Comparison of String Replacement Algorithms in Java

The article analyzes various Java string‑replacement techniques—including simple String.replace, compiled regular expressions, Aho‑Corasick automaton, and custom Trie implementations—by presenting their designs, object sizes, and benchmark results to guide developers in choosing the most efficient solution for large keyword sets.

Aho-Corasick · Java · Trie
13 min read
Sohu Tech Products
Nov 22, 2023 · Backend Development

Optimizing a Real‑Time Keyword Matching Service with Aho‑Corasick and Double‑Array Trie

The real‑time keyword‑matching service replaced its naïve double‑loop matcher with an Aho‑Corasick automaton built on a Double‑Array Trie. The system was also refactored into a layered name‑and‑data microservice architecture that shards the keyword dictionary and rebuilds the automaton only when the dictionary version changes. Together, these changes reduced latency from seconds to milliseconds, even at thousands of QPS.

Aho-Corasick · Java · Microservices
17 min read
Architecture Digest
Jul 8, 2022 · Fundamentals

Sensitive Word Matching in Vivo's Content Review System: Algorithm Selection and Practical Implementations

The article describes how DiTing, Vivo's content moderation platform, selects and combines matching algorithms (the Aho‑Corasick automaton, combination word matching, and pinyin‑based matching) to detect sensitive terms efficiently in large‑scale text streams, while addressing challenges such as homophones, multi‑character patterns, and performance constraints.

Aho-Corasick · Pinyin Matching · Sensitive Word Matching
14 min read
HomeTech
Oct 14, 2020 · Fundamentals

Dynamic Hyperlink Insertion in Automotive Articles Using HanLP Aho‑Corasick Double‑Array Trie

This article describes a Java‑based solution that dynamically adds brand and model hyperlinks to automotive articles by building multilingual keyword dictionaries with HanLP and employing an Aho‑Corasick Double‑Array Trie for efficient, context‑aware matching without altering the original content.

Aho-Corasick · Double-Array Trie · Dynamic Linking
12 min read