
Private Deployment Architecture, Challenges, and Solutions for Volcano Engine A/B Testing (DataTester)

This article describes the private-deployment architecture of Volcano Engine A/B Testing (DataTester) and the three main challenges it raises: version management, performance optimization, and stability. It then explains the solutions adopted for each (a branching model and release pipeline, a redesigned report data model with pre-aggregation, and a fault-tolerant traffic-splitting service) so that small on-premises clusters can run reliably, with low resource usage and SaaS-like behavior.

DataFunTalk

Volcano Engine A/B Testing (DataTester) is a B2B product that must support private deployment to meet customers' data‑security and compliance requirements.

The system is deployed with Ansible and Bash and can run on clusters as small as three nodes. It is divided into three layers: (1) business services, which provide the UI and APIs such as experiment management and OpenAPI; (2) infrastructure services, which support the business layer, hide the differences between SaaS and private deployments, and include the computation engines and metadata services; and (3) the underlying infrastructure platform, minibase, which combines host machines and Kubernetes to abstract away OS and hardware details.

Challenge 1 – Version Management: Unlike SaaS, a private deployment needs a baseline version plus per-service sub-versions to guarantee that environments are equivalent. This extends the release cycle to a bi-monthly cadence.

Solution: a branching model in which both SaaS and private releases originate from master. For each private release cycle a dedicated private branch is created and, after the release, merged back into master, so that master always works in both environments.
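The baseline-plus-sub-version idea from Challenge 1 can be illustrated with a small drift check. This is only a sketch under assumptions: the service names, version strings, and the `Baseline` and `find_drift` helpers are hypothetical and are not taken from DataTester's release tooling.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Baseline:
    """A baseline release that pins one sub-version per service (names are hypothetical)."""
    name: str
    services: dict[str, str]


def find_drift(baseline: Baseline, deployed: dict[str, str]) -> dict[str, tuple[str | None, str | None]]:
    """Return {service: (expected, actual)} for every service that differs from the baseline."""
    drift = {}
    for service in sorted(set(baseline.services) | set(deployed)):
        expected = baseline.services.get(service)
        actual = deployed.get(service)
        if expected != actual:
            drift[service] = (expected, actual)
    return drift


# Hypothetical baseline and the versions found in one customer environment.
baseline = Baseline(
    name="2.4.0",
    services={"experiment-api": "2.4.3", "report-engine": "2.4.1", "split-service": "2.4.0"},
)
deployed = {"experiment-api": "2.4.3", "report-engine": "2.3.9", "split-service": "2.4.0"}

drift = find_drift(baseline, deployed)
print(drift)
```

An empty result would mean the environment matches the baseline; any entry names a service whose sub-version has to be reconciled before the environments can be treated as equivalent.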

Challenge 2 – Performance Optimization: The reporting engine relies on ClickHouse for real-time analysis. Private clusters are smaller, so latency rises and queries contend for resources when many experiments run concurrently.

Solution: redesign the experiment report model. First, define a clear experiment‑report pipeline; then optimize the data model by moving exposure events into a user‑level attribute table, reducing event‑table size and join cost. A daily user_agg table aggregates per‑user metrics, cutting query time by over 50% for long‑running experiments.
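A minimal in-memory sketch of this data model follows. It assumes toy rows, a hypothetical `exposure` mapping that stands in for the user-level attribute table, and a rule that only activity on or after the first exposure day is counted; none of these details come from the article, and the real system runs on ClickHouse rather than Python dictionaries.

```python
from collections import defaultdict

# Raw event rows: (user_id, day, event_name, value). Toy data for illustration only.
events = [
    ("u1", 1, "purchase", 10.0),
    ("u1", 2, "purchase", 5.0),
    ("u2", 1, "purchase", 7.0),
    ("u2", 3, "purchase", 1.0),
    ("u3", 2, "purchase", 4.0),
]

# User-level attribute table: exposure is stored once per user instead of as event rows.
# user_id -> (variant, first_exposure_day)  (assumed layout)
exposure = {"u1": ("A", 1), "u2": ("B", 2), "u3": ("A", 2)}


def build_user_agg(event_rows):
    """Daily job: collapse events into one row per (day, user_id)."""
    agg = defaultdict(lambda: {"count": 0, "total": 0.0})
    for user_id, day, _name, value in event_rows:
        row = agg[(day, user_id)]
        row["count"] += 1
        row["total"] += value
    return dict(agg)


def total_per_variant(user_agg, exposure_table):
    """Sum post-exposure value per variant with a key lookup instead of an event join."""
    totals = defaultdict(float)
    for (day, user_id), row in user_agg.items():
        attr = exposure_table.get(user_id)
        if attr is None:
            continue  # user never entered the experiment
        variant, first_day = attr
        if day >= first_day:  # assumption: ignore activity before first exposure
            totals[variant] += row["total"]
    return dict(totals)


user_agg = build_user_agg(events)
result = total_per_variant(user_agg, exposure)
print(result)
```

Because the report reads the small per-user daily rows and resolves the variant by key, a long-running experiment does not have to rescan and join the full event history for every query.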

Pre‑aggregation further reduces resource consumption by scanning the event table once per day and generating a compact intermediate table (1/100–1/500 of the original size) used for all subsequent metric calculations.
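The single-scan idea can be sketched as follows, assuming a hypothetical intermediate layout keyed by `(user_id, event_name)`. The toy data only shows that several metrics can be derived from the compact table without touching the raw events again; its compression ratio is not meant to reproduce the figures quoted above.

```python
from collections import Counter


def pre_aggregate(event_rows):
    """Scan the day's events once and keep one row per (user_id, event_name)."""
    counts = Counter()
    sums = Counter()
    for user_id, event_name, value in event_rows:
        counts[(user_id, event_name)] += 1
        sums[(user_id, event_name)] += value
    # Intermediate row: (event count, summed value)
    return {key: (counts[key], sums[key]) for key in counts}


def users_with_event(intermediate, event_name):
    """Distinct users who fired the event, read from the intermediate table only."""
    return {user for (user, name) in intermediate if name == event_name}


def total_value(intermediate, event_name):
    """Summed value of an event, read from the intermediate table only."""
    return sum(total for (_user, name), (_count, total) in intermediate.items() if name == event_name)


# Toy day of traffic: 3 users with 200 page views each, plus a few purchases.
raw = [(f"u{i}", "page_view", 0.0) for i in range(3) for _ in range(200)]
raw += [("u0", "purchase", 20.0), ("u0", "purchase", 5.0), ("u2", "purchase", 8.0)]

intermediate = pre_aggregate(raw)  # the only pass over the raw events

# Two different metrics, both computed from the compact table.
conversion = len(users_with_event(intermediate, "purchase")) / len(users_with_event(intermediate, "page_view"))
revenue = total_value(intermediate, "purchase")
print(len(raw), len(intermediate), conversion, revenue)
```

The design choice is to pay the scan cost once per day and let every later metric query work on the much smaller intermediate table.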

Challenge 3 – Stability: Operating services in a private deployment involves complex operational channels, which makes high availability essential, especially for the traffic-splitting service.

Solution: a three-tier storage architecture (in-memory, Redis cache, relational database). Configuration changes are synchronized through a message queue, with a periodic full refresh as a fallback, and Redis acts as a standby for MySQL, so the traffic-splitting service keeps working even if one component fails.
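The fallback behavior can be sketched with plain Python stand-ins. Everything here is an assumption made for illustration: `DictStore` replaces both Redis and MySQL, `SplitConfigReader` and its method names are hypothetical, and the read order (memory, then cache, then database) and refresh order (database, then cache) are one plausible reading of the design rather than the documented one.

```python
class TierUnavailable(Exception):
    """Raised by a storage tier that cannot be reached."""


class DictStore:
    """Stand-in for Redis or the relational DB; setting `up = False` simulates an outage."""

    def __init__(self, data=None):
        self.data = dict(data or {})
        self.up = True

    def get(self, key):
        if not self.up:
            raise TierUnavailable(key)
        return self.data.get(key)

    def all(self):
        if not self.up:
            raise TierUnavailable("full scan")
        return dict(self.data)


class SplitConfigReader:
    """Serves split configs from memory, falling back to cache and then database."""

    def __init__(self, cache, database):
        self.memory = {}
        self.cache = cache
        self.database = database

    def get(self, experiment_id):
        if experiment_id in self.memory:
            return self.memory[experiment_id]
        for tier in (self.cache, self.database):
            try:
                value = tier.get(experiment_id)
            except TierUnavailable:
                continue  # try the next tier
            if value is not None:
                self.memory[experiment_id] = value
                return value
        return None

    def on_config_message(self, experiment_id, config):
        """Message-queue push: apply one changed config immediately."""
        self.memory[experiment_id] = config

    def full_refresh(self):
        """Periodic fallback: reload from the database, or from the cache if the DB is down."""
        for tier in (self.database, self.cache):
            try:
                self.memory = tier.all()
                return True
            except TierUnavailable:
                continue
        return False  # both tiers down: keep serving the last in-memory snapshot


config = {"A": 50, "B": 50}
database = DictStore({"exp-1": config})
cache = DictStore({"exp-1": config})
reader = SplitConfigReader(cache, database)

database.up = False                          # simulate a database outage
during_db_outage = reader.get("exp-1")       # answered by the cache stand-in
cache.up = False                             # now the cache is gone as well
during_full_outage = reader.get("exp-1")     # answered from memory
refreshed = reader.full_refresh()            # fails, but the snapshot is kept
```

In this sketch a single failed tier never blocks a read, and even with both backing stores down the reader keeps answering from its last in-memory snapshot.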

In summary, the private deployment of DataTester combines Ansible-driven provisioning, disciplined branch management, automated CI/CD pipelines, optimized data models, and a resilient traffic-splitting service to deliver SaaS-level functionality on on-premises clusters.

Volcano Engine A/B Testing (DataTester) originated from ByteDance's internal tools, incorporates extensive B2B experience, and continues to evolve through feedback from private deployments to create value for both internal and external customers.

Tags: backend, performance optimization, architecture, A/B testing, Version Management, Private Deployment