Tagged articles
29 articles
JavaGuide
Apr 7, 2026 · Information Security

Why Brute‑Force Won’t Cut It for Sensitive‑Word Filtering (And What Actually Works)

The article walks through the evolution of sensitive‑word filtering—from naïve brute‑force scanning to Trie, Aho‑Corasick automaton, Double‑Array Trie, and DFA implementations—detailing their algorithms, time/space complexities, concrete Java code examples, performance trade‑offs, high‑concurrency optimizations, and practical production advice for building a robust content‑moderation system.

Aho-Corasick · DFA · Double-Array Trie
0 likes · 26 min read
Code Wrench
Mar 3, 2026 · Artificial Intelligence

Unlocking High‑Performance Chinese Segmentation: Inside Go’s gse Library

This article takes an in‑depth look at the source code of Go’s high‑performance segmentation library gse, revealing its Double‑Array Trie, shortest‑path dynamic programming, and HMM‑Viterbi implementations, and demonstrates practical usage for Chinese tokenization, part‑of‑speech tagging, keyword extraction, and custom dictionary management.

Go · HMM · NLP
0 likes · 13 min read
dbaplus Community
Jan 8, 2026 · Backend Development

How Big Platforms Verify Username Availability in Milliseconds

This article walks through the layered architecture that large services like Instagram use to instantly check if a username is taken, starting from simple database queries, adding caching, employing Bloom filters, and finally using Trie structures for fast, memory‑efficient lookups.
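
The Bloom-filter layer mentioned in this summary can be sketched as follows. This is a minimal illustration, not the linked article's code: the `BloomFilter` class, the `is_username_taken` helper, the sizing defaults, and the use of a plain Python set as a stand-in for the database are all assumptions made for the example.

```python
import hashlib


class BloomFilter:
    """Small Bloom filter; size and hash count are illustrative defaults."""

    def __init__(self, size_bits=1 << 16, num_hashes=4):
        self.size = size_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(size_bits // 8 + 1)

    def _positions(self, item):
        # Derive k bit positions from one SHA-256 digest via double hashing.
        digest = hashlib.sha256(item.encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1
        return [(h1 + i * h2) % self.size for i in range(self.num_hashes)]

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, item):
        # False means "definitely absent"; True means "possibly present".
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(item)
        )


def is_username_taken(name, bloom, database):
    # A negative Bloom answer is definitive, so the database is skipped.
    if not bloom.might_contain(name):
        return False
    # A positive answer may be a false positive; confirm with the source of truth.
    return name in database


# Hypothetical data: a set stands in for the real user table.
taken = {"alice", "bob"}
bloom = BloomFilter()
for user in taken:
    bloom.add(user)
```

The point of the layering is that most lookups for unused names stop at the in-memory filter, and only possible hits pay for the database round trip.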

Backend Architecture · Scalability · Trie
0 likes · 10 min read
dbaplus Community
Jan 2, 2026 · Information Security

How We Built a High‑Performance, Low‑Cost Content Moderation System with Trie + Aho‑Corasick

Faced with minutes‑long posting delays and exploding review costs in a fast‑growing social app, the team introduced 24‑hour shift staffing, a local blacklist stored in MySQL, an in‑memory Trie + Aho‑Corasick matcher, Redis‑driven hot updates and a machine‑audit fallback with a feedback loop, dramatically cutting latency, cost and false‑positives.

Aho-Corasick · Go · Trie
0 likes · 33 min read
Java Companion
Dec 18, 2025 · Backend Development

Building a High‑Performance Sensitive‑Word Filter with SpringBoot and DFA

This article explains why traditional string‑search and regex methods struggle with large keyword sets, introduces the deterministic finite automaton (DFA) approach using a Trie structure for linear‑time matching, provides full Java implementations, and discusses real‑world applications and advanced optimizations such as double‑array Tries, Aho‑Corasick, and sharding with Bloom filters.

DFA · Java · SensitiveWordFilter
0 likes · 17 min read
Code Ape Tech Column
Nov 20, 2025 · Backend Development

Build a Millisecond‑Scale Sensitive Word Filter with DFA and Trie in Java

This article explains why traditional string matching and regex struggle with large keyword sets, introduces a DFA‑based solution using a Trie tree for linear‑time detection, provides full Java implementations, shows real‑world integration scenarios, and explores advanced optimizations such as double‑array tries, Aho‑Corasick automata, and sharding with Bloom filters.

DFA · Java · Sensitive Word Filtering
0 likes · 17 min read
Rare Earth Juejin Tech Community
Oct 12, 2025 · Backend Development

Building a High‑Performance Content Moderation System with Trie, Aho‑Corasick, Redis, and Go

This article details how to design and implement a scalable, low‑cost content moderation pipeline that combines a local Trie + Aho‑Corasick engine, Redis‑based hot‑updates, MySQL persistence, and third‑party machine‑review fallback to achieve millisecond‑level response, high accuracy, and controllable costs.

Aho-Corasick · Backend · Go
0 likes · 34 min read
Tech Freedom Circle
May 28, 2025 · Backend Development

Designing a 100k QPS Sensitive‑Word Filter with Real‑Time Updates

This article analyzes high‑throughput sensitive‑word filtering by comparing brute‑force, KMP, Trie, double‑array Trie and Aho‑Corasick algorithms, presents their time and space complexities, shows Java implementations for Trie and AC automata, evaluates Netty deployment options, and offers practical optimizations such as asynchronous detection, hot‑reloading, tiered responses, logging and fuzzy matching.
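
As a rough illustration of the Aho-Corasick approach compared in this summary, the sketch below builds the automaton (trie transitions plus failure links) and scans a text once. It is written in Python rather than the article's Java, and the function names `build_automaton` and `find_all` are made up for the example.

```python
from collections import deque


def build_automaton(keywords):
    # goto[s]: char -> next state; fail[s]: fallback state;
    # out[s]: keywords that end when the automaton is in state s.
    goto, fail, out = [{}], [0], [[]]
    for word in keywords:
        state = 0
        for ch in word:
            if ch not in goto[state]:
                goto.append({})
                fail.append(0)
                out.append([])
                goto[state][ch] = len(goto) - 1
            state = goto[state][ch]
        out[state].append(word)

    # Breadth-first pass: depth-1 states keep fail = 0 (the root).
    queue = deque(goto[0].values())
    while queue:
        state = queue.popleft()
        for ch, nxt in goto[state].items():
            queue.append(nxt)
            f = fail[state]
            while f and ch not in goto[f]:
                f = fail[f]
            fail[nxt] = goto[f].get(ch, 0)
            # Inherit matches that end at the failure target.
            out[nxt] = out[nxt] + out[fail[nxt]]
    return goto, fail, out


def find_all(text, automaton):
    """Return (start_index, keyword) for every occurrence, in scan order."""
    goto, fail, out = automaton
    state, hits = 0, []
    for i, ch in enumerate(text):
        while state and ch not in goto[state]:
            state = fail[state]
        state = goto[state].get(ch, 0)
        for word in out[state]:
            hits.append((i - len(word) + 1, word))
    return hits


automaton = build_automaton(["he", "she", "his", "hers"])
hits = find_all("ushers", automaton)
```

Unlike a plain Trie scan, the text pointer never moves backwards: on a mismatch the automaton follows failure links instead of restarting from the next character.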

Aho-Corasick · Algorithm Optimization · Java
0 likes · 37 min read
JD Tech Talk
Apr 8, 2025 · Fundamentals

Performance Comparison of String Replacement Algorithms in Java

The article analyzes various Java string‑replacement techniques—including simple String.replace, compiled regular expressions, Aho‑Corasick automaton, and custom Trie implementations—by presenting their designs, object sizes, and benchmark results to guide developers in choosing the most efficient solution for large keyword sets.

Aho-Corasick · Java · Trie
0 likes · 13 min read
JD Cloud Developers
Apr 8, 2025 · Fundamentals

Which String Replacement Method Is Fastest? A Java Performance Comparison

This article examines various Java string‑replacement techniques—including simple replace, regex, Aho‑Corasick, and custom Trie implementations—by presenting their design, code samples, and detailed performance benchmarks to help developers choose the most efficient solution for large keyword sets.

Aho-Corasick · Java performance · Trie
0 likes · 13 min read
Sohu Tech Products
Nov 22, 2023 · Backend Development

Optimizing a Real‑Time Keyword Matching Service with Aho‑Corasick and Double‑Array Trie

By replacing the naïve double‑loop matcher with a Double‑Array Trie‑based Aho‑Corasick automaton and refactoring the system into a layered name‑and‑data microservice architecture that shards the keyword dictionary and rebuilds the automaton only on version changes, the real‑time keyword‑matching service reduced latency from seconds to milliseconds even at thousands of QPS.

Aho-Corasick · Java · Microservices
0 likes · 17 min read
ELab Team
Nov 11, 2022 · Backend Development

Boost Node.js Routing Performance with Trie Prefix Trees

This article explains how to implement an efficient routing system for Node.js web frameworks using a Trie (prefix tree) data structure, covering static, dynamic, and regex route matching, code examples, performance considerations, and practical tips for optimizing route lookup.

Backend Development · Node.js · Prefix Tree
0 likes · 13 min read
MaGe Linux Operations
Oct 29, 2022 · Backend Development

How to Build a Go Trie for Real‑Time Sensitive Word Filtering

This article demonstrates how to implement a sensitive‑word detection system in Go using a prefix‑tree (Trie), covering brute‑force, regex, and optimized rune‑based methods, plus special‑character filtering, pinyin support, and complete source code examples.

Go · Text Filtering · Trie
0 likes · 19 min read
Xiao Lou's Tech Notes
Aug 31, 2022 · Fundamentals

Why My Simple Go Map Solution Timed Out and How I Fixed It

After struggling with a seemingly easy scoring problem in a regional programming contest, the author details multiple Go implementations—including a map, a 27‑base array, and a trie—examines their time and memory issues, discovers input handling pitfalls, and ultimately achieves an accepted solution.

Trie · algorithm · competitive programming
0 likes · 14 min read
JD Tech
Apr 1, 2022 · Fundamentals

Advanced Matching Algorithms and Graph Data Structures: KMP, Rabin‑Karp, Boyer‑Moore, Trie, Double‑Array Trie, and AC Automaton

This article introduces common graph concepts and several advanced string‑matching algorithms—including Brute‑Force, Rabin‑Karp, KMP, Boyer‑Moore, AC automaton, Trie, and Double‑Array Trie—explaining their principles, implementations, complexity analyses, and typical application scenarios for search systems.

Algorithms · KMP · Trie
0 likes · 20 min read
Baidu Geek Talk
Jul 19, 2021 · Backend Development

How Baidu Scales Sensitive Word Detection to Tens of Millions with a Trie‑Based Service

This article explains the design and evolution of Baidu's word‑list service for content moderation, covering its background, multi‑layer architecture, management platform, strategy loading, matching workflow, performance optimizations for large texts, and future enhancements such as special‑character support and per‑business‑line deployment.

BOS · Backend Architecture · Elasticsearch
0 likes · 16 min read
Java Captain
May 13, 2019 · Fundamentals

Implementing Sensitive Word Filtering with Trie Trees

This article explains how to use a trie (prefix tree) to efficiently filter sensitive words in a text, covering the basic concepts, construction steps, traversal algorithm, complexity analysis, and a Java implementation using HashMap.
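
The construction and traversal steps named in this summary might look roughly like the sketch below. It uses nested Python dicts where the article uses a Java HashMap, and the class name `TrieFilter`, the longest-match policy, and the `*` mask character are assumptions for illustration only.

```python
class TrieFilter:
    """Dict-based trie that masks stored words found in a text."""

    # Unique sentinel key marking the end of a word; it can never
    # collide with a character taken from the input text.
    _END = object()

    def __init__(self, words):
        self.root = {}
        for word in words:
            node = self.root
            for ch in word:
                node = node.setdefault(ch, {})
            node[self._END] = True

    def _longest_match(self, text, start):
        """Length of the longest stored word starting at `start`, or 0."""
        node = self.root
        best = 0
        for i in range(start, len(text)):
            node = node.get(text[i])
            if node is None:
                break
            if self._END in node:
                best = i - start + 1
        return best

    def mask(self, text, mask_char="*"):
        out = []
        i = 0
        while i < len(text):
            n = self._longest_match(text, i)
            if n:
                out.append(mask_char * n)
                i += n  # skip past the masked word
            else:
                out.append(text[i])
                i += 1
        return "".join(out)


word_filter = TrieFilter(["bad", "badge", "ugly"])
masked = word_filter.mask("a bad badge is ugly!")
```

Each start position walks at most as many trie nodes as the longest stored word, so the cost depends on text length and maximum word length rather than on the number of words in the list.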

Data Structure · Java · Sensitive Word Filtering
0 likes · 9 min read
High Availability Architecture
Mar 14, 2019 · Databases

SlimTrie: A Space‑Efficient Trie‑Based Index for Large‑Scale Storage Systems

This article presents SlimTrie, a trie‑based indexing structure that dramatically reduces memory consumption while maintaining fast query speeds, detailing its design, compression techniques, implementation steps, memory analysis, and performance comparisons with map and B‑Tree structures for large‑scale storage systems.

Go · Memory Optimization · SlimTrie
0 likes · 20 min read
MaGe Linux Operations
Jan 13, 2018 · Artificial Intelligence

How FlashText Cuts Keyword Search from Days to Minutes

FlashText is an open‑source Python library that dramatically speeds up keyword search and replacement in large text corpora, turning multi‑day regex operations into a fifteen‑minute task by using an algorithm inspired by Aho‑Corasick together with a Trie‑based keyword dictionary.

Aho-Corasick · FlashText · Python
0 likes · 8 min read
21CTO
Aug 5, 2017 · Backend Development

How I Reduced Log Keyword Counting from Hours to Minutes Using PHP, Grep, Regex & Trie

This article walks through solving a massive log‑keyword counting task—600,000 short messages and 50,000 keywords—by evolving from a simple grep‑based approach to regex optimizations, word‑splitting, a trie data structure, and finally a multi‑process Redis queue, achieving a performance boost from hours to under ten minutes.

Grep · Log Processing · Trie
0 likes · 15 min read
Nightwalker Tech
Mar 2, 2017 · Information Security

Techniques and Tools for Anti‑Spam Content Filtering in PHP

The discussion outlines practical anti‑spam strategies—including text length limits, keyword replacement, trie‑based data structures, AC automata, Bayesian and vector‑similarity algorithms, and PHP extensions such as libdatrie—while also sharing performance metrics and resource links for implementing robust content filtering systems.

PHP · Trie · content filtering
0 likes · 4 min read
dbaplus Community
Jun 19, 2016 · Backend Development

Beyond Cache+Hash: Real Strategies for Building High‑Concurrency Systems

This article demystifies the common belief that cache‑plus‑hash alone solves high‑concurrency challenges, explores essential techniques such as static resource serving, read‑write separation, advanced caching, hash‑based sharding, and especially the design trade‑offs of various Trie‑based data structures for search‑suggestion services, and offers practical optimization steps.

Backend Architecture · Hashing · Trie
0 likes · 28 min read