How to Perform Fuzzy Search on Encrypted Phone Numbers in MySQL
This article explores multiple strategies for enabling fuzzy search on encrypted phone numbers, including in‑memory caching, database decryption functions, segment‑wise storage, and dedicated fuzzy‑search fields, comparing their trade‑offs in performance, memory usage, and data consistency.
Preface
When personal privacy data such as passwords, ID numbers, bank cards, or phone numbers is encrypted for security, performing fuzzy queries on the encrypted values becomes challenging because the ciphertexts differ significantly.
1 Load Data into Memory
A simple approach is to load all encrypted privacy data into memory, decrypt it, and then perform fuzzy matching in code. This is easy and low‑cost but can cause OOM errors and data‑consistency problems when the data volume grows or when updates are not synchronized across caches.
2 Use Database Functions
MySQL provides encryption/decryption functions (e.g.,
DES_ENCRYPTand
DES_DECRYPT) that can be used to decrypt ciphertexts on the fly. While this avoids inconsistencies, each fuzzy search requires decrypting every row, leading to poor performance on large datasets.
<code>SELECT DES_DECRYPT('U2FsdGVkX1+q7g9npbydGL1HXzaZZ6uYYtXyug83jHA=', '123');</code>3 Segment Storage
Split the original string into smaller segments, encrypt each segment, and store them in a mapping table. For a phone number, split every three digits, encrypt each part, and insert the segments together with a reference to the main record.
<code>CREATE TABLE `encrypt_value_mapping` (
`id` bigint NOT NULL COMMENT 'system id',
`ref_id` bigint NOT NULL COMMENT 'related system id',
`encrypt_value` varchar(255) NOT NULL COMMENT 'encrypted string'
) ENGINE=InnoDB CHARSET=utf8mb4 COLLATE=utf8mb4_bin COMMENT='segment encryption mapping table';
</code>When searching, query the mapping table for matching encrypted segments, retrieve the corresponding reference IDs, and then fetch the full user records.
<code>select s2.id,s2.name,s2.phone
from encrypt_value_mapping s1
inner join `user` s2 on s1.ref_id=s2.id
where s1.encrypt_value = 'U2FsdGVkX19Se8cEpSLVGTkLw/yiNhcB'
limit 0,20;
</code>4 Extend to Other Fields
The same segment‑mapping technique can be extended to other privacy fields such as ID numbers or bank cards by adding a
typecolumn to distinguish the data type. However, storing many segments per record can cause the mapping table to grow dramatically and degrade query performance.
5 Add Fuzzy Search Column
A more practical solution for large datasets is to add an extra column (e.g.,
encrypt_phone) to the main user table that stores the concatenated encrypted segments separated by a delimiter. This column can be queried directly with a
LIKEclause.
<code>CREATE TABLE `user` (
`id` int NOT NULL,
`code` varchar(20) NOT NULL,
`age` int NOT NULL DEFAULT '0',
`name` varchar(30) NOT NULL,
`height` int NOT NULL DEFAULT '0',
`address` varchar(30) DEFAULT NULL,
`phone` varchar(11) DEFAULT NULL,
`encrypt_phone` varchar(255) DEFAULT NULL,
PRIMARY KEY (`id`)
) ENGINE=InnoDB CHARSET=utf8mb4 COLLATE=utf8mb4_bin COMMENT='user table';
</code> <code>select id,name,phone
from user
where encrypt_phone like '%U2FsdGVkX19Se8cEpSLVGTkLw/yiNhcB%'
limit 0,20;
</code>Using a delimiter (such as a comma) prevents accidental matches caused by simple string concatenation. The same idea can be applied to other encrypted fields. Ultimately, the choice of method depends on data volume, performance requirements, and consistency guarantees.
macrozheng
Dedicated to Java tech sharing and dissecting top open-source projects. Topics include Spring Boot, Spring Cloud, Docker, Kubernetes and more. Author’s GitHub project “mall” has 50K+ stars.
How this landed with the community
Was this worth your time?
0 Comments
Thoughtful readers leave field notes, pushback, and hard-won operational detail here.