Tag

tokenizer

NetEase Cloud Music Tech Team
Apr 15, 2024 · Mobile Development

Implementation and Optimization of Local Private Domain Search in Cloud Music

The Cloud Music team integrated a lightweight on-device full-text search engine built on SQLite FTS5 with a simple tokenizer, replaced JavaScript matching with SQLite's bm25() ranking, and parallelized queries. Search latency fell by 75%, click-through rate (CTR) rose by 13%, and average playback time rose by 17 seconds, all while preserving user privacy.

FTS5 · Full-Text Search · Performance Optimization
15 min read

政采云技术
Dec 19, 2023 · Backend Development

Principles and Simple Implementation of a Search Engine in Go

This article explains the fundamental concepts of search engine technology—including forward and inverted indexes, tokenizers, stop words, synonym handling, ranking algorithms, and NLP integration—and provides a concise Go implementation with code examples and performance testing.

Go · Inverted Index · NLP
21 min read

Tencent Cloud Developer
Feb 20, 2023 · Mobile Development

iOS WeChat Full-Text Search Technology Upgrade: Selection and Optimization

iOS WeChat's full-text search was upgraded by selecting SQLite FTS5, creating a VerbatimTokenizer with multi-level delimiter support, optimizing table formats to cut index size by 30%, and improving batch index updates and parallel search logic, making queries 40-60% faster.

Database Optimization · Full-Text Search · Index Optimization
26 min read

Efficient Ops
Jun 23, 2021 · Backend Development

Why Can’t Elasticsearch Find My Logs? Uncovering Full‑Text Search Pitfalls and Tokenizer Tweaks

This article explains why large-scale Elasticsearch clusters may miss log entries during keyword searches, covers the fundamentals of inverted indexes and tokenization, and demonstrates practical index-time and query-time tokenizer optimizations, including custom analyzers for English and Chinese, to improve search recall and precision.

Elasticsearch · Full-Text Search · Inverted Index
13 min read

System Architect Go
Sep 3, 2018 · Fundamentals

Understanding Elasticsearch Analyzer, Tokenizer, and Token Filters

This article explains the core components of Elasticsearch's full-text search analysis (Analyzers, Tokenizers, and Token Filters), detailing their roles, building blocks, and built-in types, and how they combine to customize text processing for effective indexing and querying.

Analyzer · Elasticsearch · Full-Text Search
5 min read