Techniques for Performing Fuzzy Search on Encrypted Data

This article examines why encrypted data is hard to query with fuzzy matching, groups the available approaches into three categories (naïve, conventional, and advanced), and evaluates their security, performance, and storage trade-offs, with pointers to reference resources.

Java Architect Essentials

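When sensitive fields such as phone numbers or bank details are stored encrypted, a fuzzy query such as like '%partial%' no longer works directly, because the ciphertext of a fragment is not a fragment of the ciphertext. Passwords are a separate case: they are normally stored as one-way hashes and are never searched. This article therefore focuses on enabling fuzzy queries over reversibly encrypted data.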

How to Perform Fuzzy Search on Encrypted Data

The approaches can be grouped into three categories:

Naïve methods that ignore performance considerations.

Conventional methods that balance security and query efficiency.

Advanced methods that redesign algorithms to support secure fuzzy matching.

Naïve Methods

Load all encrypted records into memory, decrypt them, and perform fuzzy matching in application code.

Create a plaintext mapping (tag) table for encrypted values and query the tag table.

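Both are workable only for very small datasets. The first must load and decrypt every row for each query, and ciphertext is larger than the plaintext it protects: encrypting the 11-digit number 13800138000 with DES yields the 24-character Base64 string HE9T75xNx6c5yLmS5l4r6Q== (16 bytes of raw ciphertext), so holding millions of rows in memory quickly becomes impractical. The second keeps a searchable plaintext copy of the data, which largely defeats the purpose of encrypting it.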
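The size of that example ciphertext can be checked with a little arithmetic. The sketch below assumes a block mode with PKCS#5 padding and no stored IV (the article does not state the mode); the helper names are illustrative only.

```python
import base64

DES_BLOCK = 8  # DES block size in bytes


def padded_len(n: int, block: int = DES_BLOCK) -> int:
    """Ciphertext length in bytes for n plaintext bytes under PKCS#5 padding."""
    # Padding always adds between 1 and `block` bytes, so an input that is
    # already a multiple of the block size still grows by one full block.
    return (n // block + 1) * block


def base64_len(n: int) -> int:
    """Length in characters of the padded Base64 text for n raw bytes."""
    return 4 * ((n + 2) // 3)


plain = b"13800138000"                # 11 bytes
raw_bytes = padded_len(len(plain))    # 16 bytes of raw ciphertext
text_chars = base64_len(raw_bytes)    # 24 Base64 characters

# Cross-check the formula against the standard library and the article's example.
assert text_chars == len(base64.b64encode(bytes(raw_bytes)))
assert text_chars == len("HE9T75xNx6c5yLmS5l4r6Q==")

# Rough lower bound for holding 10 million such values as Base64 text,
# before any per-object overhead in the application runtime.
rows = 10_000_000
print(rows * text_chars, "characters of ciphertext text")
```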

Conventional Methods

Implement decryption functions in the database and use expressions like decode(key) like '%partial%' for fuzzy matching.

Tokenize the plaintext into fixed-length fragments, encrypt each token, store the encrypted tokens in an auxiliary column, and query with key like '%partial%', where partial is the ciphertext of the search fragment.

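The first variant is easy to adopt, but wrapping the column in a function prevents index use, so every query scans and decrypts the whole table. It also requires the application and the database to implement exactly the same algorithm, mode, and padding, or decryption will fail. The second variant costs extra storage for the encrypted tokens but matches directly on stored ciphertext, with no per-row decryption, and can take advantage of database indexes. A token length of at least four ASCII characters or two Chinese characters keeps the storage overhead reasonable; search terms shorter than one token cannot be matched.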
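The first variant can be illustrated with SQLite's user-defined functions. This is a minimal sketch, not a production setup: Base64 stands in for a real cipher (it is an encoding, not encryption), and the table, column, and function names are assumptions made for the example.

```python
import base64
import sqlite3


def toy_encrypt(plain: str) -> str:
    # Placeholder only: Base64 is NOT encryption. A real system would call
    # its actual cipher here and in toy_decrypt.
    return base64.b64encode(plain.encode("utf-8")).decode("ascii")


def toy_decrypt(cipher: str) -> str:
    return base64.b64decode(cipher).decode("utf-8")


conn = sqlite3.connect(":memory:")
# Register the decryption routine so SQL can call it as decode(...).
conn.create_function("decode", 1, toy_decrypt)
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, phone_enc TEXT)")
conn.execute("CREATE INDEX idx_phone_enc ON users (phone_enc)")
for phone in ["13800138000", "13912345678", "15000138111"]:
    conn.execute("INSERT INTO users (phone_enc) VALUES (?)", (toy_encrypt(phone),))


def search(partial: str):
    """Fuzzy match by decrypting every row inside the database."""
    sql = ("SELECT id, decode(phone_enc) FROM users "
           "WHERE decode(phone_enc) LIKE ? ORDER BY id")
    return conn.execute(sql, (f"%{partial}%",)).fetchall()


# The query plan shows a scan: the index on phone_enc cannot help because
# the predicate is applied to decode(phone_enc), not to the stored value.
plan = " ".join(
    row[-1] for row in conn.execute(
        "EXPLAIN QUERY PLAN "
        "SELECT id FROM users WHERE decode(phone_enc) LIKE '%0138%'"
    )
)

print(search("0138"))
print(plan)
```

Because decode runs once per row, the cost of each query grows linearly with the table size regardless of how selective the search term is.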

Several e‑commerce platforms (Taobao, Alibaba, Pinduoduo, JD) use the token‑based approach.
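The token-based variant can be sketched as follows. Two things here are assumptions rather than the article's method: the tokens are produced with a sliding window of four characters, and each token is blinded with a truncated keyed HMAC instead of being encrypted, because Python's standard library has no block cipher and the tokens are only ever compared, never decrypted. The key, table, and function names are hypothetical, and the column holding the reversible ciphertext is omitted for brevity.

```python
import hashlib
import hmac
import sqlite3

TOKEN_LEN = 4                        # window size in characters (assumed)
TOKEN_KEY = b"demo-key-not-secret"   # hypothetical key, for illustration only


def windows(text: str, n: int = TOKEN_LEN):
    """Every run of n consecutive characters, in order."""
    return [text[i:i + n] for i in range(len(text) - n + 1)]


def blind(token: str) -> str:
    """Deterministic keyed transform of one token: 16 lowercase hex characters."""
    digest = hmac.new(TOKEN_KEY, token.encode("utf-8"), hashlib.sha256).digest()
    return digest[:8].hex()


def token_column(plain: str) -> str:
    """Value stored in the auxiliary column: blinded tokens, concatenated."""
    return "".join(blind(t) for t in windows(plain))


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, phone_tokens TEXT)")
for phone in ["13800138000", "13912345678", "15000138111"]:
    conn.execute("INSERT INTO users (phone_tokens) VALUES (?)", (token_column(phone),))


def search(partial: str):
    """Return ids whose token column contains the blinded search fragment."""
    if len(partial) < TOKEN_LEN:
        raise ValueError(f"search term must be at least {TOKEN_LEN} characters")
    # A substring of the plaintext yields a contiguous run of its windows,
    # so the concatenated blinded tokens appear inside the stored column.
    pattern = "%" + token_column(partial) + "%"
    rows = conn.execute(
        "SELECT id FROM users WHERE phone_tokens LIKE ? ORDER BY id", (pattern,)
    )
    return [row[0] for row in rows]


print(search("0138"))    # four-character fragment: one token
print(search("00138"))   # longer fragment: two adjacent tokens
```

In this sketch an 11-digit number produces 8 windows and therefore 128 hex characters in the auxiliary column, which shows where the storage overhead comes from. Matches are best treated as candidates and confirmed after decrypting the main column, and because equal fragments always produce equal tokens, the column reveals which rows share a fragment.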

Advanced Methods

These involve designing new encryption schemes that preserve order or enable direct ciphertext fuzzy matching, often drawing on research such as Hill cipher‑based FMES, Bloom‑filter‑enhanced searchable encryption, or Lucene‑based encrypted search.
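To give a feel for the Bloom-filter idea only, the sketch below builds one small filter per value from keyed hashes of its two-character fragments. It is a simplified illustration under assumed parameters, not an implementation of FMES or of any scheme in the references; the filter size, hash count, key, and function names are all hypothetical.

```python
import hashlib
import hmac

FILTER_BITS = 256   # bits per record filter (assumed)
NUM_HASHES = 3      # bit positions set per fragment (assumed)
GRAM_LEN = 2        # fragment length in characters (assumed)


def grams(text: str, n: int = GRAM_LEN):
    """Every run of n consecutive characters."""
    return [text[i:i + n] for i in range(len(text) - n + 1)]


def positions(key: bytes, gram: str):
    """Bit positions for one fragment, derived from a keyed hash."""
    for i in range(NUM_HASHES):
        digest = hmac.new(key, bytes([i]) + gram.encode("utf-8"),
                          hashlib.sha256).digest()
        yield int.from_bytes(digest[:4], "big") % FILTER_BITS


def build_filter(key: bytes, text: str) -> int:
    """Bloom filter for one value, held as an integer bit set."""
    bits = 0
    for gram in grams(text):
        for pos in positions(key, gram):
            bits |= 1 << pos
    return bits


def may_contain(key: bytes, record_filter: int, query: str) -> bool:
    """True if every fragment of the query is (probably) in the record."""
    if len(query) < GRAM_LEN:
        raise ValueError(f"query must be at least {GRAM_LEN} characters")
    return all(
        (record_filter >> pos) & 1
        for gram in grams(query)
        for pos in positions(key, gram)
    )


key = b"demo-key-not-secret"  # hypothetical key, for illustration only
stored = build_filter(key, "13800138000")
print(may_contain(key, stored, "0138"))
```

A filter of this kind never misses a true substring, but it can report false positives, either from bit collisions or because the query's fragments occur in the value without being adjacent, so candidates still need to be confirmed after decryption.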

References include:

https://www.jiamisoft.com/blog/6542-zifushujumohupipeijiamifangfa.html

http://kzyjc.cnjournals.com/html/2019/1/20190112.htm

https://www.cnblogs.com/arthurqin/p/6307153.html

Conclusion

Naïve approaches are discouraged. Conventional methods, especially the token-based second variant, offer a good trade-off between security, performance, and implementation cost. Advanced algorithmic solutions are suitable when dedicated security experts are available.
