Transparent Data Masking with AutoProxy Middleware at AutoHome
This article describes AutoHome's data security challenges in the big‑data era and explains how the self‑developed AutoProxy encryption middleware provides transparent, compliant data masking across legacy and new sensitive data, reducing cost, improving performance, and enabling automated masking workflows.
With the rise of the big‑data era, information security and personal privacy protection have become critical concerns, especially for large online platforms like AutoHome that store massive amounts of sensitive user data such as ID numbers and phone numbers.
Traditional API‑based data masking at AutoHome suffered from high integration costs, lack of SQL compatibility, and invasive code changes, making it difficult to meet the growing compliance requirements imposed by laws such as the Cybersecurity Law and the Personal Information Protection Law.
To address these issues, AutoHome's database team developed a transparent encryption middleware called AutoProxy. AutoProxy intercepts database traffic and automatically encrypts and decrypts sensitive fields without requiring changes to application code. It supports configurable algorithms (MD5, AES, RC4, SM3/SM4) as well as the algorithm used by the original API, so data masked by the old solution remains compatible.
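The idea of field-level transparent masking can be illustrated with a minimal Python sketch. It is not AutoProxy's implementation: the `MaskingProxy` class, the `SENSITIVE` rule table, the `KEY` value, and the `toy_encrypt`/`toy_decrypt` helpers are all hypothetical, and the XOR-plus-base64 codec is only a reversible stand-in for a real cipher such as AES or SM4 (it is not secure). An in-memory SQLite database stands in for the backend.

```python
import base64
import sqlite3
from itertools import cycle

KEY = b"demo-key"  # hypothetical key; a real deployment would fetch it from a key service


def toy_encrypt(plain: str) -> str:
    """Reversible stand-in for AES/SM4: XOR with a repeating key, then base64. NOT secure."""
    raw = bytes(b ^ k for b, k in zip(plain.encode("utf-8"), cycle(KEY)))
    return base64.b64encode(raw).decode("ascii")


def toy_decrypt(cipher: str) -> str:
    """Inverse of toy_encrypt."""
    raw = base64.b64decode(cipher)
    return bytes(b ^ k for b, k in zip(raw, cycle(KEY))).decode("utf-8")


# Hypothetical masking rules: table name -> set of sensitive columns.
SENSITIVE = {"users": {"phone"}}


class MaskingProxy:
    """Sits between the application and the database connection."""

    def __init__(self, conn: sqlite3.Connection):
        self.conn = conn

    def insert(self, table: str, row: dict) -> None:
        # Encrypt only the columns that the rules mark as sensitive.
        cols = list(row)
        values = [
            toy_encrypt(row[c]) if c in SENSITIVE.get(table, ()) else row[c]
            for c in cols
        ]
        placeholders = ", ".join("?" for _ in cols)
        sql = f"INSERT INTO {table} ({', '.join(cols)}) VALUES ({placeholders})"
        self.conn.execute(sql, values)

    def select_all(self, table: str) -> list:
        # Decrypt sensitive columns before handing rows back to the caller.
        cur = self.conn.execute(f"SELECT * FROM {table}")
        names = [d[0] for d in cur.description]
        rows = []
        for rec in cur.fetchall():
            row = dict(zip(names, rec))
            for c in SENSITIVE.get(table, ()):
                row[c] = toy_decrypt(row[c])
            rows.append(row)
        return rows


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER, name TEXT, phone TEXT)")
proxy = MaskingProxy(conn)
proxy.insert("users", {"id": 1, "name": "Li", "phone": "13800000000"})

# What the database actually stores is ciphertext...
stored = conn.execute("SELECT phone FROM users").fetchone()[0]
print(stored)
# ...while the application still sees plaintext through the proxy.
print(proxy.select_all("users"))
```

The application code only ever calls the proxy with plaintext values; which columns are masked is decided by the rule table, which is why masking can be enabled without touching business logic.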
The team selected the open‑source Sharding‑Proxy as the underlying proxy framework, extending it to implement the required encryption logic. This choice balanced functionality, stability, operational cost, and ecosystem support.
AutoProxy enables two main masking scenarios:
Application‑level transparent masking: By simply changing the database address to point to AutoProxy, all newly written data is automatically encrypted, and existing applications continue to operate unchanged.
Historical data masking: A step‑by‑step process adds ciphertext columns, routes writes through AutoProxy, refreshes existing plaintext data into ciphertext, and finally switches to ciphertext‑only mode, all without downtime.
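The historical data masking steps can be sketched roughly as follows. This is a simplified illustration on an in-memory SQLite table, not AutoHome's actual procedure: the `phone_cipher` column name, the batch size, the dual-write behaviour in step 2, and the clearing of the plaintext column in step 4 are assumptions, and `toy_encrypt`/`toy_decrypt` are an insecure reversible stand-in for a real cipher.

```python
import base64
import sqlite3
from itertools import cycle

KEY = b"demo-key"  # hypothetical key


def toy_encrypt(plain: str) -> str:
    """Reversible stand-in for a real cipher. NOT secure."""
    raw = bytes(b ^ k for b, k in zip(plain.encode("utf-8"), cycle(KEY)))
    return base64.b64encode(raw).decode("ascii")


def toy_decrypt(cipher: str) -> str:
    raw = base64.b64decode(cipher)
    return bytes(b ^ k for b, k in zip(raw, cycle(KEY))).decode("utf-8")


# Legacy table holding plaintext phone numbers.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, phone TEXT)")
conn.executemany(
    "INSERT INTO users (id, phone) VALUES (?, ?)",
    [(i, f"1380000{i:04d}") for i in range(1, 6)],
)

# Step 1: add a ciphertext column next to the plaintext one.
conn.execute("ALTER TABLE users ADD COLUMN phone_cipher TEXT")


# Step 2: writes now go through the proxy, which (in this sketch) fills both columns.
def proxied_insert(user_id: int, phone: str) -> None:
    conn.execute(
        "INSERT INTO users (id, phone, phone_cipher) VALUES (?, ?, ?)",
        (user_id, phone, toy_encrypt(phone)),
    )


proxied_insert(6, "13900000006")


# Step 3: refresh existing rows into ciphertext in small batches to limit lock time.
def backfill(batch_size: int = 2) -> int:
    total = 0
    while True:
        rows = conn.execute(
            "SELECT id, phone FROM users WHERE phone_cipher IS NULL LIMIT ?",
            (batch_size,),
        ).fetchall()
        if not rows:
            return total
        conn.executemany(
            "UPDATE users SET phone_cipher = ? WHERE id = ?",
            [(toy_encrypt(phone), user_id) for user_id, phone in rows],
        )
        conn.commit()
        total += len(rows)


refreshed = backfill()

# Step 4: switch to ciphertext-only mode: drop the plaintext values and let
# the proxy decrypt on read.
conn.execute("UPDATE users SET phone = NULL")


def proxied_phone(user_id: int) -> str:
    (cipher,) = conn.execute(
        "SELECT phone_cipher FROM users WHERE id = ?", (user_id,)
    ).fetchone()
    return toy_decrypt(cipher)


print(refreshed, proxied_phone(1), proxied_phone(6))
```

Because new writes already populate the ciphertext column before the backfill starts, the backfill only has to touch rows that predate the switch, which is what allows the final cutover to happen without stopping the application.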
Performance testing showed that AutoProxy’s masking throughput is roughly twice that of the previous API‑based solution, while maintaining full SQL compatibility.
Beyond the middleware, AutoHome built an automated masking platform that handles component deployment, high‑availability, work‑order processing, and monitoring, significantly improving operational efficiency.
In 2022, using AutoProxy, AutoHome masked nearly 10,000 tables containing sensitive data, so that user information is now stored entirely as ciphertext, strengthening overall data security.
The article concludes with a summary of the achieved benefits and outlines future directions, including automated sensitive‑data discovery and containerization of AutoProxy.
HomeTech
HomeTech tech sharing