
How to Build an Efficient, Secure Test Data Management System for DevOps

This article examines the challenges of test data generation in fast‑moving software projects and presents a self‑developed Test Data Management solution that balances efficiency, usability, security, quality, generality, and cost through adaptive concurrency control, masking policies, and a multi‑master architecture.

SQB Blog

Introduction

Google's DORA DevOps research identifies test data management as a key capability for software delivery and organizational performance, supporting manual and automated testing as well as full‑link load testing in production environments.

Challenges

Rapid business iteration leads to an ever‑growing number of databases and schema changes, making test data creation and management increasingly difficult. Creating data by hand is slow and often incomplete. Copying production data raises privacy risks, and the data can lose relevance once it has been masked or when stale historical data is reused.

Test Data Management (TDM) vs Test Data Generation (TDG)

Test Data Generation (TDG)

TDG synthesizes data on demand from modeled characteristics of production data, without using real records. This is safe, but building accurate models at scale is complex, and the generated data may not preserve the relationships between related records (data association).
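To make the trade-off concrete, here is a minimal sketch of one simple form of TDG: each column is modeled as a frequency table and synthetic rows are sampled from those tables. The function names `fit_column_model` and `generate` and the sample columns are hypothetical, not part of any system described here. Because every column is drawn independently, a generated row can pair values that never occur together in production, which is the data association problem mentioned above.

```python
import random
from collections import Counter


def fit_column_model(rows: list[dict], column: str) -> Counter:
    """Model one column as the frequency of each observed value."""
    return Counter(row[column] for row in rows)


def generate(models: dict[str, Counter], n: int, seed: int = 0) -> list[dict]:
    """Sample n synthetic rows, drawing each column independently from its model."""
    rng = random.Random(seed)  # seeded so the output is reproducible
    synthetic = []
    for _ in range(n):
        row = {}
        for column, counts in models.items():
            values = list(counts.keys())
            weights = list(counts.values())
            row[column] = rng.choices(values, weights=weights, k=1)[0]
        synthetic.append(row)
    return synthetic


# Hypothetical sample: in the source data, city always determines country.
sample = [
    {"country": "CN", "city": "Shanghai"},
    {"country": "US", "city": "Boston"},
    {"country": "CN", "city": "Shanghai"},
]
models = {col: fit_column_model(sample, col) for col in ("country", "city")}
rows = generate(models, 5)
# Independent sampling may yield pairs such as ("US", "Shanghai").
```

A frequency table only suits non-sensitive categorical columns, since it replays observed values. Sensitive fields would need a generator that never emits real values.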

Test Data Management (TDM)

TDM systems manage copies of production databases: they synchronize multiple sources, version the data, mask or obfuscate sensitive fields, and distribute data across multi‑cloud environments. This supports agile development, automated testing, and performance testing, but because the data originates in production, it comes with strict security requirements.

Our Approach

To avoid the high cost of commercial TDM tools, we built an in‑house TDM system evaluated against six metrics: efficiency, usability, security, quality, generality, and cost.

Efficiency

Data preparation time depends on sync speed, which is tied to table size. We implemented adaptive concurrency control based on the AIMD algorithm, using the following parameters:

initialConcurrency: initial concurrency level

maxConcurrency: maximum concurrency

minConcurrency: minimum concurrency

timeout: MRT threshold indicating overload

backoffRatio: factor to reduce concurrency on overload (0.5‑0.9)

Algorithm steps:

Initialize with initialConcurrency.

Increase concurrency by 1 each adjustment cycle until maxConcurrency or overload occurs.

On overload (MRT > timeout), multiply the current concurrency by backoffRatio without dropping below minConcurrency, then resume the additive increase once the overload clears.

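Additive‑increase/multiplicative‑decrease (AIMD) is a feedback control algorithm widely used in TCP congestion control. It combines a linear increase while the system is healthy with a multiplicative decrease when congestion is detected, which becomes an exponential reduction under repeated congestion.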

Overload signals are based on CPU, IOPS, disk, memory, bandwidth usage, and cluster sync latency.
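The steps above can be sketched as a small limiter class. This is an illustrative sketch, not the production implementation: the class name `AIMDLimiter`, the `adjust` method, and the `overloaded` flag (standing in for the CPU, IOPS, and sync-latency signals) are assumptions, and MRT is assumed to be a response time measured in seconds.

```python
from dataclasses import dataclass


@dataclass
class AIMDLimiter:
    """Concurrency limit with additive increase / multiplicative decrease."""
    initial_concurrency: int
    min_concurrency: int
    max_concurrency: int
    timeout: float        # MRT threshold (seconds); above this counts as overload
    backoff_ratio: float  # multiplier applied on overload, e.g. 0.5 to 0.9

    def __post_init__(self) -> None:
        if not 0 < self.backoff_ratio < 1:
            raise ValueError("backoff_ratio must be between 0 and 1")
        # Clamp the starting limit into [min_concurrency, max_concurrency].
        self.limit = max(self.min_concurrency,
                         min(self.initial_concurrency, self.max_concurrency))

    def adjust(self, mrt: float, overloaded: bool = False) -> int:
        """Run one adjustment cycle and return the new concurrency limit."""
        if overloaded or mrt > self.timeout:
            # Multiplicative decrease, never below min_concurrency.
            self.limit = max(self.min_concurrency,
                             int(self.limit * self.backoff_ratio))
        else:
            # Additive increase by 1, never above max_concurrency.
            self.limit = min(self.max_concurrency, self.limit + 1)
        return self.limit


limiter = AIMDLimiter(initial_concurrency=10, min_concurrency=2,
                      max_concurrency=12, timeout=0.5, backoff_ratio=0.5)
history = [limiter.adjust(mrt) for mrt in (0.1, 0.2, 0.3, 0.9, 0.1)]
print(history)  # [11, 12, 12, 6, 7]
```

In the sample run the limit climbs by one per cycle, stops at the maximum of 12, halves to 6 when the measured MRT of 0.9 exceeds the 0.5 threshold, and then starts climbing again.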

Results: with a peak concurrency of 1,750, 5.5 million rows were synchronized in about one hour (roughly 1,500 rows per second), with CPU and IOPS usage stabilizing after a 6% increase.

Usability

The platform offers visual masking rule versioning, standardized data request workflows, and snapshot views, making data acquisition intuitive and reducing user errors.

Usability UI

Security

We follow strict data security regulations, implementing compliance checks, appropriate masking algorithms, security audits, access controls, regular evaluations, and team training to protect privacy.

Quality

Stability is achieved through a multi‑master, multi‑worker architecture with etcd‑based bidirectional synchronization between the control plane (CP) and the data plane (DP). The system remains operational through node failures and upgrades.

High‑availability architecture

Domain‑Driven Design (DDD) separates task, monitoring, flow‑control, and data‑source domains into independent service, domain, and infrastructure layers, enhancing maintainability and scalability.

DDD model

To preserve data association after masking, we define “business fields” and apply consistent rules across related tables.

Example: the field Params may need the masked value of Cellphone.

Pseudo‑code for contextual offset:

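maskedCellphone = Masking(ctx, Cellphone)
// Store the masked value in the context, keyed by the original field
ctx = context.WithValue(ctx, Cellphone, maskedCellphone)
// Masking Params can now look up and reuse the masked Cellphone
maskedParams = Masking(ctx, Params)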
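The same idea as a runnable sketch in Python, where a plain dictionary plays the role of the context. The functions `mask_cellphone` and `mask_params`, and the assumption that Params is a free-form string embedding the phone number, are illustrative only. The unsalted hash is a placeholder, not a recommended masking algorithm.

```python
import hashlib


def mask_cellphone(ctx: dict, cellphone: str) -> str:
    """Mask a phone number deterministically and record the mapping in ctx."""
    digest = hashlib.sha256(cellphone.encode()).hexdigest()
    # Keep the first 3 digits; derive the remaining digits from the hash.
    tail_len = max(len(cellphone) - 3, 0)
    tail = "".join(str(int(ch, 16) % 10) for ch in digest[:tail_len])
    masked = cellphone[:3] + tail
    ctx[cellphone] = masked  # later fields can reuse this exact value
    return masked


def mask_params(ctx: dict, params: str) -> str:
    """Replace every value recorded in ctx wherever it appears inside params."""
    for original, masked in ctx.items():
        params = params.replace(original, masked)
    return params


ctx: dict = {}
masked_cellphone = mask_cellphone(ctx, "13800001234")
masked_params = mask_params(ctx, '{"notify": "13800001234", "channel": "sms"}')
# masked_params now contains masked_cellphone, so the two fields still agree.
```

Because the mapping is recorded once and reused, a join or lookup between the two fields gives the same result on masked data as on the original.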

Cost

We reduce sync volume by filtering data based on primary and foreign key indexes, balancing data availability with database performance.
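One way to picture this kind of filtering: pick a range of parent rows by primary key, then copy only the child rows whose foreign key points into that range, so both queries can be served from indexes. The sketch below uses SQLite for brevity. The `users` and `orders` tables, the `copy_subset` function, and the range-based selection rule are all hypothetical, since the actual filter rules are not described here.

```python
import sqlite3

SCHEMA = """
CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE orders (
    id INTEGER PRIMARY KEY,
    user_id INTEGER REFERENCES users(id),
    amount REAL
);
CREATE INDEX idx_orders_user_id ON orders(user_id);
"""


def new_db() -> sqlite3.Connection:
    """Create an in-memory database with the hypothetical schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn


def copy_subset(src: sqlite3.Connection, dst: sqlite3.Connection,
                max_user_id: int) -> None:
    """Copy users in a primary-key range plus only the orders that reference them."""
    users = src.execute(
        "SELECT id, name FROM users WHERE id <= ?", (max_user_id,)
    ).fetchall()
    dst.executemany("INSERT INTO users (id, name) VALUES (?, ?)", users)
    # The foreign-key index lets the source find matching orders without a full scan.
    orders = src.execute(
        "SELECT id, user_id, amount FROM orders WHERE user_id <= ?", (max_user_id,)
    ).fetchall()
    dst.executemany(
        "INSERT INTO orders (id, user_id, amount) VALUES (?, ?, ?)", orders
    )
    dst.commit()
```

Copying parents before children keeps every foreign key in the target resolvable, so the smaller data set stays internally consistent.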

Generality

The high‑quality test data set supports unit, integration, functional, regression, smoke, and performance testing, as well as pre‑production environment simulation, improving overall software development reliability.

Testing scenarios

Conclusion

Our self‑developed TDM system, designed with cost awareness, enhances efficiency, usability, security, quality, and generality, providing a practical solution to eliminate test data bottlenecks, reduce security risks, and achieve long‑term cost savings.

References

DevOps tech: Test data management – https://cloud.google.com/architecture/devops/devops-tech-test-data-management

What is Test Data Management? – https://www.delphix.com/glossary/what-is-test-data-management

Additive increase/multiplicative decrease (Chinese Wikipedia) – https://zh.wikipedia.org/wiki/%E5%92%8C%E6%80%A7%E5%A2%9E%E9%95%BF/%E4%B9%98%E6%80%A7%E9%99%8D%E4%BD%8E

TCP congestion control (Chinese Wikipedia) – https://zh.wikipedia.org/wiki/TCP%E6%8B%A5%E5%A1%9E%E6%8E%A7%E5%88%B6

AIMDLimit – https://github.com/Netflix/concurrency-limits/blob/master/concurrency-limits-core/src/main/java/com/netflix/concurrency/limits/limit/AIMDLimit.java
