iOS WeChat Full-Text Search Technology Upgrade: Selection and Optimization
The iOS WeChat team upgraded full-text search by adopting SQLite FTS5, building a VerbatimTokenizer, adding multi-level delimiter support through a SubstringMatchInfo auxiliary function, and optimizing the table format, batch index updates, and parallel search logic. Index size fell by about 30% and search latency by 40-60%.
This article details the upgrade of iOS WeChat's full-text search technology. It covers four main areas: search engine selection and optimization (sections 1 to 3), database table format optimization (section 4), index update logic optimization (section 5), and search logic optimization (section 6).
1. Search Engine Selection and Optimization
The team compared SQLite FTS3/FTS5 and Lucene, ultimately selecting SQLite FTS5 due to its superior transaction capabilities, lower technical risk, comprehensive search syntax, and acceptable read/write performance. They implemented FTS5's automatic segment merge mechanism in WCDB, moving merge operations to a separate background thread to avoid impacting business performance.
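The idea of moving segment merges off the write path can be sketched with plain SQLite, outside WCDB. The following is a minimal illustration, not the team's implementation: it assumes Python's `sqlite3` module is built with FTS5, and the table name `msg_fts` and the helper names are hypothetical. It uses FTS5's documented `automerge` and `merge` special commands.

```python
import os
import sqlite3
import tempfile
import threading


def open_index(path):
    """Open (or create) the index and switch off FTS5's inline automerge."""
    conn = sqlite3.connect(path, timeout=5.0)
    conn.execute("CREATE VIRTUAL TABLE IF NOT EXISTS msg_fts USING fts5(content)")
    # With automerge set to 0, ordinary writes no longer merge segments.
    conn.execute("INSERT INTO msg_fts(msg_fts, rank) VALUES('automerge', 0)")
    conn.commit()
    return conn


def merge_worker(path):
    """Merge index segments in small steps on a separate connection."""
    conn = sqlite3.connect(path, timeout=5.0)
    try:
        while True:
            before = conn.total_changes
            # Ask FTS5 to do a bounded amount of merge work.
            conn.execute("INSERT INTO msg_fts(msg_fts, rank) VALUES('merge', 16)")
            conn.commit()
            # Per the FTS5 docs, a change count below 2 means nothing was merged.
            if conn.total_changes - before < 2:
                break
    finally:
        conn.close()


def demo():
    """Write eight small segments, merge them off-thread, then search."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "index.db")
        conn = open_index(path)
        for i in range(8):
            conn.execute("INSERT INTO msg_fts(content) VALUES(?)",
                         (f"hello message {i}",))
            conn.commit()  # each commit leaves a new segment behind
        worker = threading.Thread(target=merge_worker, args=(path,))
        worker.start()
        worker.join()
        (hits,) = conn.execute(
            "SELECT count(*) FROM msg_fts WHERE msg_fts MATCH 'hello'").fetchone()
        conn.close()
        return hits
```

Working in small bounded steps keeps each write transaction short, so a merge thread of this kind would hold the database lock only briefly between business writes.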
2. Tokenizer Optimization
A new tokenizer called VerbatimTokenizer was developed. It uses basic character-based tokenization and adds no redundant content to the index. It supports five extension capabilities: traditional-to-simplified Chinese conversion, Unicode normalization, symbol filtering, Porter stemming for English, and case-insensitive matching.
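As a rough illustration of character-based tokenization with some of these extensions, here is a small pure-Python sketch. It is not the VerbatimTokenizer itself, which is an FTS5 tokenizer. It assumes that each CJK character becomes its own token while runs of letters and digits stay whole, uses a tiny hypothetical traditional-to-simplified table, and leaves out Porter stemming.

```python
import unicodedata

# Tiny hypothetical traditional-to-simplified table; a real one is far larger.
T2S = {"龍": "龙", "體": "体", "國": "国"}


def is_cjk(ch):
    """True for characters in the basic CJK Unified Ideographs block."""
    return "\u4e00" <= ch <= "\u9fff"


def verbatim_tokens(text):
    """Split text into single CJK characters and whole alphanumeric runs."""
    # Unicode normalization plus case folding (case-insensitive matching).
    text = unicodedata.normalize("NFKC", text).lower()
    tokens, word = [], []

    def flush():
        if word:
            tokens.append("".join(word))
            word.clear()

    for ch in text:
        ch = T2S.get(ch, ch)      # traditional-to-simplified conversion
        if is_cjk(ch):
            flush()
            tokens.append(ch)     # one token per Chinese character
        elif ch.isalnum():
            word.append(ch)       # letters and digits accumulate into one token
        else:
            flush()               # symbols and spaces are filtered out
    flush()
    return tokens
```

For example, `verbatim_tokens("WeChat 中國龍!")` yields `["wechat", "中", "国", "龙"]` under these assumptions.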
3. Index Content with Multi-Level Delimiters
A new FTS5 auxiliary function called SubstringMatchInfo was developed to support multi-level delimiters in index content, enabling more precise search result matching within specific attribute boundaries.
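To make the idea of multi-level delimiters concrete, the sketch below locates a hit inside a two-level delimited string. It is a pure-Python stand-in, not the SubstringMatchInfo function, which is an FTS5 auxiliary function. The delimiter characters (`;` between group members, `,` between a member's attributes) and the function name are assumptions chosen for illustration.

```python
def locate_match(content, keyword, delimiters=(";", ",")):
    """Return ([index at each delimiter level], innermost segment) for the
    first occurrence of keyword, or None if there is no occurrence."""
    pos = content.find(keyword)
    if pos < 0:
        return None
    path = []
    segment, offset = content, pos
    for delim in delimiters:
        # Number of delimiters before the hit gives the segment index.
        path.append(segment.count(delim, 0, offset))
        start = segment.rfind(delim, 0, offset) + 1   # 0 when no delimiter precedes
        end = segment.find(delim, offset)
        if end < 0:
            end = len(segment)
        # Narrow to the enclosing segment and continue at the next level.
        segment, offset = segment[start:end], offset - start
    return path, segment
```

With the content `"Al,Alice;Bob,Bobby;Carol,Caz"`, searching for `Bobby` returns `([1, 1], "Bobby")`: the hit lies in the second member's second attribute, so a caller can report exactly which attribute matched instead of the whole row.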
4. Database Table Format Optimization
The optimization includes storing non-searchable content directly in the FTS index table, using UNINDEXED constraints to avoid redundant indexing, and placing searchable content in the first column to reduce index size. This resulted in approximately 30% reduction in index file size.
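A table laid out along these lines might look like the following sketch. It assumes Python's `sqlite3` module is built with FTS5. The table and column names are hypothetical, not the ones WeChat uses.

```python
import sqlite3


def create_index_table(conn):
    """Searchable text comes first; the other columns are stored but not indexed."""
    conn.execute(
        "CREATE VIRTUAL TABLE msg_fts USING fts5("
        "content, msg_id UNINDEXED, chat_id UNINDEXED, create_time UNINDEXED)"
    )


def add_message(conn, msg_id, chat_id, create_time, content):
    conn.execute(
        "INSERT INTO msg_fts(content, msg_id, chat_id, create_time) "
        "VALUES(?, ?, ?, ?)",
        (content, msg_id, chat_id, create_time),
    )


def search(conn, query):
    """Return (msg_id, create_time) pairs for matching rows, newest first."""
    rows = conn.execute(
        "SELECT msg_id, create_time FROM msg_fts "
        "WHERE msg_fts MATCH ? ORDER BY create_time DESC",
        (query,),
    )
    return rows.fetchall()
```

Because `chat_id` is declared `UNINDEXED`, its value never enters the full-text index: a query for a word that appears only in `chat_id` returns no rows, while the value remains readable from the same table without a join.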
5. Index Update Logic Optimization
Index and data are kept consistent by tracking indexing progress: by rowid for chat records, by updateSequence for favorites, and by flag marking for contacts. Indexing in batches of 100 records improves efficiency, while in-memory caching keeps search results complete.
6. Search Logic Optimization
Parallel execution within single search tasks is achieved through table splitting for large datasets and parallel independent search logic for complex searches. A CancelFlag mechanism enables search task interruption. Content reading is minimized by only retrieving business IDs and sorting attributes during search, with on-demand content loading for display.
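The combination of split tables, a cancel flag, and ID-only results can be sketched like this. It is an illustrative stand-in rather than WeChat's code: in-memory lists of `(msg_id, timestamp, text)` tuples play the role of the split index tables, a `threading.Event` plays the role of the CancelFlag, and all names are hypothetical.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


def search_shard(shard, keyword, cancel):
    """Scan one shard, returning only (msg_id, timestamp) for each hit."""
    hits = []
    for msg_id, timestamp, text in shard:
        if cancel.is_set():       # stop early once the task is cancelled
            return []
        if keyword in text:
            hits.append((msg_id, timestamp))
    return hits


def parallel_search(shards, keyword, cancel):
    """Search all shards concurrently and merge the hits, newest first."""
    with ThreadPoolExecutor(max_workers=len(shards) or 1) as pool:
        parts = list(pool.map(
            lambda shard: search_shard(shard, keyword, cancel), shards))
    if cancel.is_set():
        return []
    merged = [hit for part in parts for hit in part]
    merged.sort(key=lambda hit: hit[1], reverse=True)
    return merged
```

The result holds only business IDs and the sorting attribute. The full message content would be loaded later, and only for the rows that are actually displayed.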
After optimization, search latency improved by 40-60% across chat records, contacts, and favorites searches, while index file size decreased by approximately 30%.
Tencent Cloud Developer