Designing Practical Encrypted Storage for Sensitive User Data

This article explains why encrypting sensitive fields like passwords, phone numbers, and ID numbers is essential, details a balanced approach using AES‑GCM encryption, HMAC indexes, and masked fields, and shares real‑world lessons from implementing the solution in a production system.

dbaplus Community
dbaplus Community
dbaplus Community
Designing Practical Encrypted Storage for Sensitive User Data

When the team attempted to launch a social app, the platform required a public security filing number, and the review highlighted that storing sensitive fields (passwords, phone numbers, ID numbers, etc.) in plaintext would cause the application to be rejected or even shut down.

Initial Mistakes and Realizations

The original design stored passwords with a hash but left phone numbers, ID numbers, and emails in plaintext. The team assumed that because the database was not publicly exposed, the risk was low, and they relied on front‑end masking to hide data from staff. However, this approach failed compliance checks and later caused many issues during the security review.

Design Goals

Encrypt fields that must be protected.

Maintain queryability for common business scenarios.

Support fuzzy queries where needed without exposing full plaintext.

Allow key rotation without massive data re‑encryption.

Phone Number Encryption Strategy

The team created a dedicated user_phone table with the following columns:

CREATE TABLE `user_phone` (
  `user_id` BIGINT UNSIGNED NOT NULL COMMENT 'User ID, primary key, one‑to‑one with user table',
  `phone_cipher` VARBINARY(128) NOT NULL COMMENT 'Encrypted phone number',
  `phone_iv` VARBINARY(12) NOT NULL COMMENT 'Random IV for AES‑GCM',
  `phone_tag` VARBINARY(16) NOT NULL COMMENT 'AES‑GCM authentication tag',
  `phone_idx` BINARY(32) NOT NULL COMMENT 'HMAC index for equality queries',
  `phone_prefix` CHAR(3) NOT NULL COMMENT 'First three digits for prefix search',
  `phone_last4` CHAR(4) NOT NULL COMMENT 'Last four digits for suffix search',
  `key_version` SMALLINT UNSIGNED NOT NULL DEFAULT 1 COMMENT 'Key version used for encryption',
  `created_at` DATETIME NOT NULL DEFAULT CURRENT_TIMESTAMP COMMENT 'Creation time',
  `updated_at` DATETIME NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP COMMENT 'Update time',
  PRIMARY KEY (`user_id`),
  UNIQUE KEY `uk_phone_idx` (`phone_idx`),
  KEY `idx_phone_prefix_last4` (`phone_prefix`,`phone_last4`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COMMENT='User phone table with encrypted storage and query support';

Key points: phone_cipher, phone_iv, and phone_tag store the AES‑GCM ciphertext, IV, and authentication tag. phone_idx is an HMAC of the plaintext phone number, enabling exact matches without revealing the number. phone_prefix and phone_last4 keep a small portion of the plaintext to support prefix and suffix queries (e.g., find users with phone numbers starting with 138 or ending with 5678). key_version records which encryption key was used, facilitating key rotation.

Name Encryption Strategy

Names are less frequently queried, so the design is simpler: a ciphertext group ( name_cipher, name_iv, name_tag) plus an HMAC index ( name_idx) for exact matches. No prefix or suffix fields are stored because fuzzy name searches are prohibited for security reasons.

ID Card Encryption Strategy

Identity numbers require additional derived fields for business logic (birthdate, gender, area code). The user_identity table stores:

Encrypted ciphertext fields ( pid_cipher, pid_iv, pid_tag).

HMAC index ( pid_idx) for equality queries.

Derived plain fields ( pid_birth, pid_gender, pid_area_code) extracted once during verification, allowing analytics without decrypting the full ID.

Email Encryption Strategy

Email follows a similar pattern: ciphertext fields plus an HMAC index ( email_idx) for exact lookups. Additionally, the domain part is stored in clear text ( email_domain) to enable domain‑based filtering (e.g., all @qq.com users).

Masked Fields for Display

To avoid costly joins and decryption when rendering UI lists, the main user_info table includes masked versions of each sensitive field ( phone_masked, email_masked, pid_masked, name_masked). These are generated at the time of encryption and are read‑only, used only for display in admin panels, customer‑service tools, and user profile pages.

Operational Lessons

Key Management: Storing keys in config files is risky; the team moved to a centralized KMS and wrapped encryption logic in a utility class.

Query Limitations: The design deliberately disallows arbitrary fuzzy searches (e.g., %88% in phone) to reduce leakage risk.

Decryption Controls: Decryption is centralized in a service with audit logging to prevent misuse.

Historical Data Migration: Legacy plaintext data was encrypted via a back‑fill script that processed records in batches, handling encoding issues and missing values.

BI Impact: Because raw data is encrypted, a separate “masked analytics” table was created containing only the derived plain fields needed for reporting.

Export Policies: Default exports provide only masked data; full‑plaintext exports require special approval and are logged.

Conclusion

The solution balances compliance, security, and operational practicality. By encrypting sensitive data with AES‑GCM, using HMAC indexes for searchable fields, storing minimal plaintext fragments for required queries, and adding masked display columns, the team achieved a system that meets regulatory requirements without sacrificing performance or developer productivity.

database designencryptiondata maskingHMACsensitive dataAES-GCMkey rotation
dbaplus Community
Written by

dbaplus Community

Enterprise-level professional community for Database, BigData, and AIOps. Daily original articles, weekly online tech talks, monthly offline salons, and quarterly XCOPS&DAMS conferences—delivered by industry experts.

0 followers
Reader feedback

How this landed with the community

Sign in to like

Rate this article

Was this worth your time?

Sign in to rate
Discussion

0 Comments

Thoughtful readers leave field notes, pushback, and hard-won operational detail here.