Optimizing Small‑Scale Product Title Search with MySQL FULLTEXT and Ngram Indexing
This article explains how to efficiently search product titles in a few‑million‑row MySQL table by using built‑in FULLTEXT indexes, jieba‑php tokenization, and the ngram parser, avoiding unnecessary external search engines and providing practical code examples.
When querying product titles in a relatively small dataset (e.g., a few million rows), many developers instinctively reach for Elasticsearch or XunSearch, but MySQL’s native FULLTEXT index can handle the workload without extra infrastructure.
The author demonstrates a workflow that uses the jieba-php library to tokenize Chinese titles, stores the tokens in a separate product_title table linked to product , and creates a FULLTEXT index on the title column.
Example of initializing jieba and tokenizing a title:
Jieba::init();
Finalseg::init();
$tokens = Jieba::cut($model->title);MySQL can create a FULLTEXT index directly:
CREATE TABLE test (
id INT AUTO_INCREMENT PRIMARY KEY,
title TEXT,
FULLTEXT(title)
);To combine LIKE and FULLTEXT search, the following query can be used:
$searchTerm = '关键词';
$results = YourModel::where('title', 'like', "%{$searchTerm}%")
->orWhereRaw("MATCH(title) AGAINST(? IN NATURAL LANGUAGE MODE)", [$searchTerm])
->get();Because MySQL’s default tokenization does not handle Chinese, the ngram parser is required for multilingual support. Adding an ngram FULLTEXT index looks like this:
ALTER TABLE 表名 ADD FULLTEXT(title) WITH PARSER ngram;After tokenization, searching for a term such as “饼干” will quickly locate the corresponding document IDs via the index.
For compound queries like “饼干包装”, the author suggests storing each token separately and using a BOOLEAN MODE query:
SELECT * FROM test_fulltext WHERE MATCH(title) AGAINST('饼干+包装' IN BOOLEAN MODE);Additional performance tips include adjusting MySQL’s index buffer size and avoiding in‑memory middlewares when the index resides on disk.
The article concludes with a humorous note that most developers will never need such complex setups, but the presented solution is a lightweight alternative to heavyweight search stacks.
Selected Java Interview Questions
A professional Java tech channel sharing common knowledge to help developers fill gaps. Follow us!
How this landed with the community
Was this worth your time?
0 Comments
Thoughtful readers leave field notes, pushback, and hard-won operational detail here.