Tag

multi-pattern matching


JD Tech Talk
May 21, 2025 · Fundamentals

Aho-Corasick Automaton: Efficient Multi‑Pattern Text Search and Real‑Time Highlighting

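This article explains the Aho-Corasick automaton, a classic multi-pattern matching algorithm. It builds a Trie of the keywords and adds fail pointers, so a single pass over the text finds every keyword occurrence in time linear in the text length (plus the number of matches), even for very large keyword sets. The article then demonstrates a Java implementation for highlighting keywords in HTML documents.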
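
The Trie-plus-fail-pointer idea in this summary can be sketched in a few lines of Java. This is a minimal illustration under stated assumptions, not the article's own implementation: the names `AhoCorasickSketch`, `Node`, `Match`, and `search` are hypothetical, and the sketch only reports match offsets in plain text rather than handling HTML.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Queue;

// Hypothetical names throughout; a sketch of the classic algorithm only.
class AhoCorasickSketch {
    // One Trie node: child edges, fail pointer, and keywords ending here.
    private static final class Node {
        final Map<Character, Node> next = new HashMap<>();
        Node fail;
        final List<String> outputs = new ArrayList<>();
    }

    // A keyword occurrence: start offset in the text plus the keyword itself.
    static final class Match {
        final int start;
        final String keyword;
        Match(int start, String keyword) { this.start = start; this.keyword = keyword; }
    }

    private final Node root = new Node();

    AhoCorasickSketch(List<String> keywords) {
        // Step 1: insert every non-empty keyword into the Trie.
        for (String kw : keywords) {
            if (kw.isEmpty()) continue;
            Node node = root;
            for (char c : kw.toCharArray()) {
                node = node.next.computeIfAbsent(c, k -> new Node());
            }
            node.outputs.add(kw);
        }
        buildFailPointers();
    }

    // Step 2: breadth-first pass that links each node to the longest proper
    // suffix of its path that is also a path in the Trie.
    private void buildFailPointers() {
        Queue<Node> queue = new ArrayDeque<>();
        root.fail = root;
        for (Node child : root.next.values()) {
            child.fail = root;
            queue.add(child);
        }
        while (!queue.isEmpty()) {
            Node cur = queue.remove();
            for (Map.Entry<Character, Node> e : cur.next.entrySet()) {
                char c = e.getKey();
                Node child = e.getValue();
                Node f = cur.fail;
                while (f != root && !f.next.containsKey(c)) f = f.fail;
                Node target = f.next.get(c);
                child.fail = (target != null) ? target : root;
                // Inherit keywords that end at the fail target (suffix matches).
                child.outputs.addAll(child.fail.outputs);
                queue.add(child);
            }
        }
    }

    // Step 3: scan the text once, following fail pointers on mismatch.
    List<Match> search(String text) {
        List<Match> matches = new ArrayList<>();
        Node node = root;
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            while (node != root && !node.next.containsKey(c)) node = node.fail;
            Node nxt = node.next.get(c);
            node = (nxt != null) ? nxt : root;
            for (String kw : node.outputs) {
                matches.add(new Match(i - kw.length() + 1, kw));
            }
        }
        return matches;
    }
}
```

Building the automaton from `he`, `she`, `his`, and `hers` and calling `search("ushers")` would report `she` at offset 1 and both `he` and `hers` at offset 2. A highlighter could then wrap those offset ranges in markup, though overlapping matches and existing HTML tags would need extra handling that this sketch leaves out.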

Aho-Corasick · Code Example · Java
10 min read
Architecture Digest
Jul 8, 2022 · Fundamentals

Sensitive Word Matching in Vivo's Content Review System: Algorithm Selection and Practical Implementations

The article describes how DiTing, Vivo's content moderation platform, selects its matching algorithms (the Aho-Corasick automaton, combination word matching, and pinyin-based matching) to detect sensitive terms efficiently in large-scale text streams, while addressing challenges such as homophones, multi-character patterns, and performance constraints.

Aho-Corasick · Pinyin Matching · Sensitive Word Matching
14 min read