
Mastering Search SQL: Advanced Full‑Text Queries with CONTAINS, NEAR, and FUZZY

This tutorial demonstrates how to create a table with a Chinese analyzer, insert multilingual records, and leverage Search SQL's CONTAINS, NEAR, and FUZZY operators to perform precise, high‑performance full‑text searches beyond simple LIKE patterns.

StarRing Big Data Open Lab

Creating a Table with a Chinese Analyzer

To store Chinese text for semantic search, Max creates an internal table with a column that uses the ZH analyzer (IK tokenizer). The DDL is:

create table news_analyze_zh(
  key1 string,
  content string append analyzer 'ZH' 'ik'
) stored as ES
with shard number 10
replication 1;

The append clause keeps the column usable for both standard SQL pattern matching (such as LIKE) and analyzer‑based full‑text search. 'ZH' selects the Chinese analyzer, and ik is the Chinese word‑segmentation tokenizer it uses.

Inserting Chinese Text

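After the table is created, Max inserts seven news headlines. Six of them concern the company 星环科技 (StarRing Technology); row 6 is an unrelated headline that happens to contain the same two characters: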

insert into news_analyze_zh(key1, content) values ('1', '全国人大财经委莅临星环科技');
insert into news_analyze_zh(key1, content) values ('2', '星环信息科技(上海)有限公司');
insert into news_analyze_zh(key1, content) values ('3', '星环科技荣获2017电信大数据司马奖“优秀成果”奖');
insert into news_analyze_zh(key1, content) values ('4', '星环前沿科技论坛九城巡展顺利落下帷幕');
insert into news_analyze_zh(key1, content) values ('5', '腾讯领投星环科技2.35亿元');
insert into news_analyze_zh(key1, content) values ('6', '明星环卫工的坚守');
insert into news_analyze_zh(key1, content) values ('7', '星环科技荣列大数据企业50强');

Analyzing with LIKE vs. CONTAINS

A traditional LIKE query:

select * from news_analyze_zh where content like '%星环%';

returns all seven rows, because every headline contains the characters “星环”. That includes row 6, where the two characters merely span the boundary between “明星” and “环卫工” in “明星环卫工” (“star sanitation worker”) and have nothing to do with the company.

Switching to the semantic CONTAINS operator eliminates this false positive:

select * from news_analyze_zh where contains(content, '星环');

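The result set includes only rows where “星环” appears as an independent token, so row 6 is excluded. Precision improves, and because CONTAINS works on analyzed tokens rather than scanning raw strings with a leading wildcard, it is also expected to run faster than LIKE.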
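The difference between the two queries can be sketched in a few lines of Python. This is only an illustration of substring matching versus whole‑token matching: the segmentations in TOKENS are written by hand and are an assumption, not real IK output, and like_match and token_match are hypothetical helper names rather than part of Search SQL.

```python
# Hand-written, hypothetical segmentations for three of the headlines.
# A real IK analyzer may split these strings differently.
TOKENS = {
    "1": ["全国人大", "财经委", "莅临", "星环", "科技"],
    "6": ["明星", "环卫工", "的", "坚守"],
    "7": ["星环", "科技", "荣列", "大数据", "企业", "50", "强"],
}


def like_match(text, pattern):
    """Substring test, the behaviour of LIKE '%pattern%'."""
    return pattern in text


def token_match(tokens, term):
    """Whole-token test, a simplified stand-in for CONTAINS."""
    return term in tokens


# Rebuild each headline by joining its tokens, then apply both tests.
like_hits = sorted(k for k, toks in TOKENS.items() if like_match("".join(toks), "星环"))
token_hits = sorted(k for k, toks in TOKENS.items() if token_match(toks, "星环"))

print(like_hits)   # row 6 is included: the characters appear across a token boundary
print(token_hits)  # row 6 is excluded: no token equals "星环"
```

Under these assumed segmentations, row 6 matches the substring test but not the token test, which is the false positive the CONTAINS query avoids.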

Using the NEAR Operator

To find texts where the words “电信” (telecom) and “数据” (data) appear close together, Max uses the NEAR operator with a distance of 1:

select * from news_analyze_zh where contains(content, 'near((电信,数据),1,false)');

This returns only the records where the two terms occur within one token of each other. In the sample data, row 3 is the only headline that contains “电信” at all, so it is the only candidate for a match.
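A proximity test of this kind can be sketched as follows. The sketch rests on several assumptions: the distance is interpreted as the maximum number of tokens allowed between the two terms, the third argument is assumed to toggle whether term order matters (the text does not explain it), and the token lists are hand‑written rather than real IK output. The function near is a hypothetical helper, not the engine's implementation.

```python
def near(tokens, first, second, max_gap, ordered=False):
    """Return True if `first` and `second` occur with at most `max_gap`
    tokens between them. With ordered=True, `first` must precede `second`."""
    pos_first = [i for i, tok in enumerate(tokens) if tok == first]
    pos_second = [i for i, tok in enumerate(tokens) if tok == second]
    for a in pos_first:
        for b in pos_second:
            if a == b:
                continue  # same position, only possible if first == second
            if ordered and b < a:
                continue  # wrong order when order is enforced
            if abs(a - b) - 1 <= max_gap:
                return True
    return False


# Hypothetical segmentations of headlines 3 and 7 (assumed, not IK output).
row3 = ["星环", "科技", "荣获", "2017", "电信", "大", "数据", "司马", "奖", "优秀", "成果", "奖"]
row7 = ["星环", "科技", "荣列", "大", "数据", "企业", "50", "强"]

print(near(row3, "电信", "数据", 1))                # one token ("大") lies between them
print(near(row7, "电信", "数据", 1))                # "电信" never occurs in row 7
print(near(row3, "数据", "电信", 1, ordered=True))  # order enforced, terms reversed
```

With the assumed token lists, only the first call succeeds: row 7 lacks “电信”, and the third call fails because “数据” does not precede “电信” in row 3.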

Using the FUZZY Operator

For flexible matching of the phrase “星环科技”, including variations like “星环…科技” or “科技…星环”, Max applies the FUZZY operator:

select * from news_analyze_zh where contains(content, 'fuzzy(星环科技,2)');

The query returns rows containing the exact phrase as well as those where the two words appear within a distance of two tokens, enabling broader yet controlled retrieval.
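The matching rule described above (exact phrase, or the two words within two tokens of each other in either order) can be approximated with a small sketch. It assumes the phrase “星环科技” is segmented into the two tokens “星环” and “科技”, uses hand‑written token lists instead of real IK output, and introduces the hypothetical helpers min_gap and fuzzy_phrase. It mirrors the behaviour described in the text and is not the engine's actual algorithm.

```python
from itertools import product


def min_gap(tokens, a, b):
    """Smallest number of tokens between any occurrence of `a` and `b`,
    or None if either term is missing."""
    pos_a = [i for i, tok in enumerate(tokens) if tok == a]
    pos_b = [i for i, tok in enumerate(tokens) if tok == b]
    gaps = [abs(i - j) - 1 for i, j in product(pos_a, pos_b)]
    return min(gaps) if gaps else None


def fuzzy_phrase(tokens, phrase_tokens, max_gap):
    """True if the two phrase tokens occur, in either order, with at most
    `max_gap` tokens between them. A gap of 0 is the exact phrase."""
    a, b = phrase_tokens
    gap = min_gap(tokens, a, b)
    return gap is not None and gap <= max_gap


# Hypothetical segmentations of four headlines (assumed, not IK output).
ROWS = {
    "2": ["星环", "信息", "科技", "上海", "有限公司"],
    "4": ["星环", "前沿", "科技", "论坛", "九城", "巡展", "顺利", "落下帷幕"],
    "5": ["腾讯", "领投", "星环", "科技", "2.35", "亿元"],
    "6": ["明星", "环卫工", "的", "坚守"],
}

matches = sorted(k for k, toks in ROWS.items() if fuzzy_phrase(toks, ["星环", "科技"], 2))
print(matches)
```

Under these assumptions, rows 2 and 4 match because a single token separates the two words, row 5 matches as the exact phrase, and row 6 does not match because neither token occurs.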

Conclusion

Search SQL extends traditional SQL with powerful full‑text capabilities. By leveraging CONTAINS together with advanced operators such as NEAR and FUZZY, users can achieve precise, high‑performance semantic searches on unstructured Chinese data, reduce migration effort, and simplify implementation compared with custom API solutions.
