Data Masking (Desensitization) Techniques: Static and Dynamic Approaches
This article explains what data masking is and why it matters for protecting sensitive information, then covers static and dynamic masking along with six common techniques (nullification, randomization, substitution, symmetric encryption, mean value, and offset rounding), with practical examples and implementation considerations.
Hello everyone, I'm Chen~
After receiving strange scam calls claiming to be from a high‑end private club, I realized my personal data had likely been leaked by an internal source, prompting a discussion on how developers can prevent privacy breaches through data masking.
What Is Data Masking
Data masking, also called data de-identification, transforms sensitive fields such as phone numbers and bank card numbers according to predefined rules, so that the data cannot be used directly in insecure environments.
Government agencies, healthcare, financial institutions, and telecom operators were early adopters because they handle highly confidential user data.
In everyday life, data masking appears in e‑commerce platforms like Taobao, where order details hide merchant information with * to protect privacy.
Static Data Masking (SDM)
Static data masking is used to extract production data, mask it, and then distribute the sanitized version to testing, development, training, or analytics environments.
When copying production data to non‑production databases, sensitive information must be masked first to ensure security while still supporting business needs.
The masked data remains isolated from the production environment, satisfying functional requirements without exposing real data.
As shown in the diagram, real user fields such as name, phone number, ID number, and bank card number are transformed using techniques like replacement, nullification, shuffling, and symmetric encryption.
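The static flow described above (extract, mask, then load into a separate environment) can be sketched roughly as follows. This is only a minimal illustration using two in-memory SQLite databases; the `users` table, its columns, and the helpers `mask_phone` and `copy_masked` are all hypothetical names, not part of any particular masking product.

```python
import sqlite3

def mask_phone(phone: str) -> str:
    """Keep the first 3 and last 4 digits and hide the middle with '*'."""
    if len(phone) < 7:
        return "*" * len(phone)
    return phone[:3] + "*" * (len(phone) - 7) + phone[-4:]

def mask_name(name: str) -> str:
    """Keep the first character and hide the rest."""
    return name[:1] + "*" * (len(name) - 1)

def copy_masked(prod_db: sqlite3.Connection, test_db: sqlite3.Connection) -> None:
    """Copy users from production to test, masking sensitive columns on the way."""
    test_db.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, phone TEXT)")
    for user_id, name, phone in prod_db.execute("SELECT id, name, phone FROM users"):
        test_db.execute(
            "INSERT INTO users VALUES (?, ?, ?)",
            (user_id, mask_name(name), mask_phone(phone)),
        )
    test_db.commit()

# Stand-in for the production database (assumed schema).
prod_db = sqlite3.connect(":memory:")
prod_db.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, phone TEXT)")
prod_db.execute("INSERT INTO users VALUES (1, 'Alice', '13651301234')")

test_db = sqlite3.connect(":memory:")
copy_masked(prod_db, test_db)
print(test_db.execute("SELECT * FROM users").fetchall())
# [(1, 'A****', '136****1234')]
```

The key property is that the masked values are persisted in the test database, while the production rows are never modified.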
Dynamic Data Masking (DDM)
Dynamic data masking operates in real‑time on production systems, masking sensitive data on the fly based on roles, permissions, or other contextual rules.
Note: While removing sensitive content, the masking must preserve data characteristics, business rules, and relationships so that development, testing, and analytics continue to function correctly.
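By contrast with the static approach, dynamic masking leaves the stored data alone and decides what to show at read time. Here is a small sketch of that idea, assuming a simple role model; the role names, the `UNMASKED_ROLES` policy, and the `read_row` function are made up for illustration.

```python
from typing import Any, Dict

# Hypothetical policy: which roles may see each sensitive column in the clear.
UNMASKED_ROLES = {
    "phone": {"admin", "support_lead"},
    "id_number": {"admin"},
}

def hide_middle(value: str, keep_start: int = 3, keep_end: int = 4) -> str:
    """Hide everything except the first and last few characters."""
    if len(value) <= keep_start + keep_end:
        return "*" * len(value)
    hidden = len(value) - keep_start - keep_end
    return value[:keep_start] + "*" * hidden + value[-keep_end:]

def read_row(row: Dict[str, Any], role: str) -> Dict[str, Any]:
    """Return a view of the row for the given role; the stored row is untouched."""
    view = {}
    for column, value in row.items():
        allowed = UNMASKED_ROLES.get(column)
        if allowed is not None and role not in allowed and value is not None:
            view[column] = hide_middle(str(value))
        else:
            # Either the column is not sensitive or the role is allowed to see it.
            view[column] = value
    return view

row = {"id": 7, "phone": "13651301234", "id_number": "110101199003071234"}
print(read_row(row, "admin"))    # full values
print(read_row(row, "analyst"))  # phone and id_number partially hidden
```

Because the masked view keeps the length and the leading and trailing characters, downstream code that only checks the format of these fields should keep working.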
Data Masking Solutions
Masking systems allow custom rule definitions per table and column, enabling non‑persistent masking of sensitive fields.
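One plausible way to express per-table, per-column rules is a lookup table keyed by `(table, column)`. The sketch below assumes that shape; the table names, column names, and the `RULES` mapping are hypothetical and only meant to show the idea.

```python
from typing import Any, Callable, Dict, Tuple

def hide_middle(value: str) -> str:
    """Keep the first 3 and last 4 characters, star out the rest."""
    if len(value) <= 7:
        return "*" * len(value)
    return value[:3] + "*" * (len(value) - 7) + value[-4:]

def nullify(value: str) -> str:
    """Replace every character with '*'."""
    return "*" * len(value)

# Hypothetical rule table: (table, column) -> masking function.
RULES: Dict[Tuple[str, str], Callable[[str], str]] = {
    ("users", "phone"): hide_middle,
    ("users", "bank_card"): hide_middle,
    ("orders", "receiver_name"): nullify,
}

def mask_row(table: str, row: Dict[str, Any]) -> Dict[str, Any]:
    """Return a masked copy of the row; columns without a rule pass through."""
    masked = {}
    for column, value in row.items():
        rule = RULES.get((table, column))
        if rule is not None and value is not None:
            masked[column] = rule(str(value))
        else:
            masked[column] = value
    return masked

print(mask_row("users", {"id": 1, "phone": "13651301234"}))
# {'id': 1, 'phone': '136****1234'}
```

Note that the same column name can be masked in one table and left alone in another, since the rule is looked up by the table and column together.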
1. Nullification
Nullification truncates, encrypts, or hides field values (for example, replacing characters with *). The result is useless to an attacker, but it also obscures the original format.
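Two of the simplest variants, hiding and truncation, might look like this in Python. The function names are my own and the `keep` length is an arbitrary choice.

```python
def nullify(value: str, mask_char: str = "*") -> str:
    """Hide every character; only the length of the original survives."""
    return mask_char * len(value)

def truncate(value: str, keep: int = 3) -> str:
    """Keep only the first `keep` characters and drop the rest."""
    return value[:keep]

print(nullify("13651301234"))   # ***********
print(truncate("13651301234"))  # 136
```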
2. Random Values
Random value replacement swaps letters with random letters and digits with random digits, preserving the original data format while making the changes hard to detect.
For example, the name and idnumber fields can be randomized, though name randomization may require a surname dictionary.
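A rough sketch of format-preserving randomization follows. The `SURNAMES` and `GIVEN_NAMES` lists stand in for the surname dictionary mentioned above and are purely illustrative. Python's `random` module is not cryptographically secure, so for real masking you would probably pass in `secrets.SystemRandom()` instead of a seeded generator.

```python
import random
import string

# Hypothetical name dictionaries; a real one would be much larger.
SURNAMES = ["Wang", "Li", "Zhang", "Liu"]
GIVEN_NAMES = ["Wei", "Fang", "Lei", "Min"]

def randomize_format(value: str, rng: random.Random) -> str:
    """Swap digits for random digits and letters for random letters of the same case."""
    out = []
    for ch in value:
        if ch in string.digits:
            out.append(rng.choice(string.digits))
        elif ch in string.ascii_lowercase:
            out.append(rng.choice(string.ascii_lowercase))
        elif ch in string.ascii_uppercase:
            out.append(rng.choice(string.ascii_uppercase))
        else:
            out.append(ch)  # keep separators and other characters as they are
    return "".join(out)

def random_name(rng: random.Random) -> str:
    """Build a plausible-looking name from the dictionaries."""
    return f"{rng.choice(SURNAMES)} {rng.choice(GIVEN_NAMES)}"

rng = random.Random(42)  # seeded only so the demo is repeatable
print(randomize_format("AB-1234x", rng))
print(random_name(rng))
```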
3. Data Substitution
Substitution uses a predefined dummy value instead of special characters; for instance, all phone numbers could be set to "13651300000".
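In code, substitution is little more than a dictionary lookup. A minimal sketch, where the `DUMMY_VALUES` mapping and the column names are assumptions for illustration:

```python
from typing import Any, Dict

# Hypothetical fixed replacements per column.
DUMMY_VALUES = {"phone": "13651300000"}

def substitute(row: Dict[str, Any]) -> Dict[str, Any]:
    """Replace each configured column with its dummy value; keep other columns."""
    return {col: DUMMY_VALUES.get(col, val) for col, val in row.items()}

print(substitute({"id": 1, "phone": "13651301234"}))
# {'id': 1, 'phone': '13651300000'}
```

Every row ends up with the same phone number, so this does not suit columns that need to stay unique.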
4. Symmetric Encryption
Symmetric encryption is a reversible masking method that encrypts sensitive data with a key, producing ciphertext that follows the same logical format as the original; key management is critical.
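General-purpose ciphers such as AES produce binary output, so keeping the "same logical format" usually calls for a format-preserving scheme. The sketch below is a toy Feistel construction over digit strings built on HMAC-SHA256 from the standard library. It is not a standardized algorithm and is here only to show the reversible, keyed, format-preserving idea; `encrypt_digits`, `decrypt_digits`, and the round count are my own choices, and a production system should use a vetted library.

```python
import hashlib
import hmac

ROUNDS = 8  # arbitrary choice for this sketch

def _round_value(key: bytes, round_no: int, data: int, modulus: int) -> int:
    """Keyed pseudo-random value derived from the other half of the input."""
    digest = hmac.new(key, f"{round_no}:{data}".encode(), hashlib.sha256).digest()
    return int.from_bytes(digest, "big") % modulus

def _feistel(digits: str, key: bytes, decrypt: bool) -> str:
    if not (digits.isascii() and digits.isdigit()) or len(digits) < 2:
        raise ValueError("expected a string of at least two ASCII digits")
    a = len(digits) // 2          # length of the left half
    b = len(digits) - a           # length of the right half
    mod_l, mod_r = 10 ** a, 10 ** b
    left, right = int(digits[:a]), int(digits[a:])
    rounds = range(ROUNDS - 1, -1, -1) if decrypt else range(ROUNDS)
    sign = -1 if decrypt else 1
    for i in rounds:
        if i % 2 == 0:   # even rounds update the left half
            left = (left + sign * _round_value(key, i, right, mod_l)) % mod_l
        else:            # odd rounds update the right half
            right = (right + sign * _round_value(key, i, left, mod_r)) % mod_r
    return f"{left:0{a}d}{right:0{b}d}"

def encrypt_digits(digits: str, key: bytes) -> str:
    return _feistel(digits, key, decrypt=False)

def decrypt_digits(digits: str, key: bytes) -> str:
    return _feistel(digits, key, decrypt=True)

key = b"demo-key-do-not-use"  # placeholder; real keys belong in a key manager
masked = encrypt_digits("13651301234", key)
print(masked, decrypt_digits(masked, key))
```

The ciphertext is still an 11-digit string, and anyone holding the key can recover the original, which is exactly why key management matters here.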
5. Mean Value
For numeric fields, the mean value approach calculates the average and then distributes masked values around that mean, keeping the total sum unchanged (e.g., masking price values around 60).
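One way to get this behavior is to add random noise to the mean and then re-center the noise so it sums to zero. The sketch below takes that route; the `spread` parameter and the function name are assumptions of mine, and the sum is preserved only up to floating-point rounding.

```python
import random
from statistics import fmean
from typing import List, Optional, Sequence

def mask_around_mean(values: Sequence[float], spread: float = 0.2,
                     rng: Optional[random.Random] = None) -> List[float]:
    """Replace each value with mean + noise, where the noise sums to zero."""
    rng = rng or random.Random()
    mean = fmean(values)
    # Raw noise is within +/- spread * mean of zero.
    noise = [rng.uniform(-spread, spread) * mean for _ in values]
    noise_mean = fmean(noise)
    # Subtracting the noise mean re-centers the noise, so the total is unchanged.
    return [mean + n - noise_mean for n in noise]

prices = [40, 55, 60, 70, 75]  # mean is 60
masked = mask_around_mean(prices, rng=random.Random(7))
print([round(v, 2) for v in masked], round(sum(masked), 6))
```

Individual prices no longer match the originals, but aggregate reports based on the sum or the average still come out the same.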
6. Offset and Rounding
This method shifts numeric data randomly and rounds it, preserving approximate ranges while enhancing security, useful in big‑data analytics (e.g., changing create_time from "2020-12-08 15:12:25" to "2018-01-02 15:00:00").
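For timestamps, a possible sketch is to shift by an offset and then truncate to the hour. The helper names and the three-year maximum offset are my own assumptions; an offset of 1071 days back happens to reproduce the example above.

```python
import random
from datetime import datetime, timedelta
from typing import Optional

def shift_and_round(ts: datetime, offset: timedelta) -> datetime:
    """Apply the offset, then round down to the start of the hour."""
    return (ts + offset).replace(minute=0, second=0, microsecond=0)

def random_shift_and_round(ts: datetime, max_days: int = 1095,
                           rng: Optional[random.Random] = None) -> datetime:
    """Shift back by a random number of days (1..max_days), then round down."""
    rng = rng or random.Random()
    return shift_and_round(ts, timedelta(days=-rng.randint(1, max_days)))

created = datetime(2020, 12, 8, 15, 12, 25)
print(shift_and_round(created, timedelta(days=-1071)))
# 2018-01-02 15:00:00
print(random_shift_and_round(created))
```

The hour of day survives, so analyses such as "orders by hour" should still give roughly the same picture even though the dates are no longer real.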
In practice, multiple masking techniques are combined to achieve higher security levels.
Conclusion
Both static and dynamic masking aim to prevent internal misuse of private data and ensure that unmasked data does not leave the organization, making data protection a fundamental responsibility for developers.
Code Ape Tech Column
Former Ant Group P8 engineer, pure technologist, sharing full‑stack Java, job interview and career advice through a column. Site: java-family.cn