Traffic Replay for Functional and Regression Testing: A Case Study of the CaiHuoXia Marketplace
This article explains how replaying real online traffic offline can reduce regression testing costs, help verify functional correctness, and improve test efficiency, using the backend development of the CaiHuoXia marketplace as a case study.
Background
Functional testing and regression testing are both essential during product iteration: functional testing checks that new changes match the intended product logic, while regression testing checks that modifications do not break existing features. Manually written test cases are subjective and cannot cover every user behavior, and full regression testing is costly.
A practical solution is to use real online traffic for verification.
Recording online traffic and replaying it in a sandbox or test environment helps discover whether new code branches keep the system functional, reducing risk.
Offline regression testing with online traffic maintains comprehensive coverage while saving test costs.
Introduction
Traffic replay records real online traffic and replays it offline as test input, which lowers regression testing costs, helps safeguard code quality, and reduces online incidents.
Traffic Platform Construction Cost
The full‑link traffic platform should be able to:
Record real online traffic.
Handle massive concurrent requests.
Support common protocols.
Be non‑intrusive to the online system.
Offer simple tools that satisfy various replay scenarios.
Building such a platform incurs considerable cost, but for a single project a lightweight RPC‑based traffic replay solution can be implemented quickly.
Application of Traffic Replay
1. Project Background
The CaiHuoXia marketplace added a new backend configuration feature and changed order‑generation logic, requiring verification of the new functionality and regression testing of multiple order scenarios.
Large regression workload demands efficient testing methods.
Order logic must be 100% reliable.
2. Solution
Traffic replay is used for both regression and functional testing.
Regression: replay real online traffic offline and compare order results for non‑bargain items.
Functional: replay traffic with the new backend configuration, keeping online Apollo settings unchanged, to validate subsidy and bargaining logic.
Advantages
Highly customizable traffic cloning aligned with business logic.
Reduces manual regression effort and improves efficiency.
Provides objective test cases, enhancing code quality and reducing incidents.
Core Functions
Traffic Collection
Obtain online product information, highest bid, and order result via cloud window.
Write the collected data into corresponding database fields.
Traffic Replay
The process maps online product floor prices to offline test items, simulates real user behavior in the test marketplace, and writes offline product data and order results to the database, using multithreading and batch inserts for efficiency.
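The replay step described above can be sketched roughly as follows. This is a minimal illustration, not the team's actual implementation: `place_offline_order`, the `replay_result` table, and its columns are hypothetical names, the rule "an order is generated when the highest bid reaches the floor price" is an assumed simplification of the business logic, and an in-memory SQLite database stands in for the real store.

```python
import sqlite3
from concurrent.futures import ThreadPoolExecutor


def place_offline_order(item_id, floor_price, highest_bid):
    """Hypothetical stand-in for the RPC call that bids on the offline test item.

    Assumption: an order is generated when the highest bid reaches the floor price.
    """
    ordered = highest_bid >= floor_price
    return (item_id, floor_price, highest_bid, int(ordered))


def replay(online_records, conn, workers=8, batch_size=500):
    """Replay recorded traffic concurrently and store the results in batches."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS replay_result ("
        "item_id TEXT, floor_price INTEGER, highest_bid INTEGER, ordered INTEGER)"
    )
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Each record is (item_id, floor_price, highest_bid) captured online.
        results = list(pool.map(lambda r: place_offline_order(*r), online_records))
    # Only the simulated RPC calls run in parallel; writes stay on the calling thread.
    for start in range(0, len(results), batch_size):
        conn.executemany(
            "INSERT INTO replay_result VALUES (?, ?, ?, ?)",
            results[start:start + batch_size],
        )
    conn.commit()
    return len(results)
```

Keeping database writes on one thread and batching them with `executemany` avoids sharing a connection across workers while still cutting the number of round trips.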
Result Storage
Online and offline data are stored side‑by‑side in the database, facilitating clear analysis.
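One possible side‑by‑side layout is a single row per product with paired online and offline columns. The table name, column names, and integer subsidy unit below are assumptions for illustration only; the article does not describe the real schema.

```python
import sqlite3

# Hypothetical schema: one row per product, online and offline values in adjacent columns.
SCHEMA = """
CREATE TABLE IF NOT EXISTS replay_compare (
    item_id          TEXT PRIMARY KEY,
    online_ordered   INTEGER,  -- 1 if the online item produced an order
    online_subsidy   INTEGER,  -- subsidy amount observed online
    offline_ordered  INTEGER,  -- filled in after replay
    offline_subsidy  INTEGER   -- filled in after replay
)
"""


def save_online(conn, item_id, ordered, subsidy):
    """Store the outcome collected from the online system."""
    conn.execute(
        "INSERT INTO replay_compare (item_id, online_ordered, online_subsidy) "
        "VALUES (?, ?, ?)",
        (item_id, ordered, subsidy),
    )


def save_offline(conn, item_id, ordered, subsidy):
    """Attach the replayed outcome to the same row."""
    conn.execute(
        "UPDATE replay_compare SET offline_ordered = ?, offline_subsidy = ? "
        "WHERE item_id = ?",
        (ordered, subsidy, item_id),
    )
```

With this layout, a single `SELECT` over one row shows both environments for a product without a join.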
Result Analysis
The development branch is replayed in the test environment, and the replayed results are compared with the online results for the same product, focusing on order outcome and subsidy amount.
Regression verification showed that offline and online order results matched, confirming that the new code did not affect the existing non‑bargain order logic. The replay approach also made regression testing more than three times faster.
Functional verification compared online and offline order results and subsidy amounts, confirming correctness of the new configuration feature.
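The comparison of order outcome and subsidy amount can be expressed as a small diff over the two result sets. The sketch below is illustrative only: `Outcome` and `find_mismatches` are hypothetical names, and subsidy is assumed to be an integer amount.

```python
from typing import NamedTuple


class Outcome(NamedTuple):
    ordered: bool  # whether an order was generated
    subsidy: int   # subsidy amount (integer unit is an assumption)


def find_mismatches(online, offline):
    """Return {item_id: (online_outcome, offline_outcome)} for every differing item."""
    mismatches = {}
    for item_id, expected in online.items():
        actual = offline.get(item_id)  # None when the item was never replayed
        if actual != expected:
            mismatches[item_id] = (expected, actual)
    return mismatches
```

An empty result means the replayed branch reproduced the online behavior for every recorded product; any entry points directly at the product to investigate.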
Some discrepancies were discovered:
Issue 1: Offline items remained in "selling" state – caused by failed auction completion; resolved by retrying after failure.
Issue 2: Online items were ordered while offline items were unsold – caused by mismatched price‑segment settings; resolved by aligning subsidy and bargaining amounts across environments.
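The retry fix for Issue 1 might look like the helper below. It is a generic sketch under assumptions: `with_retry` is a hypothetical name, and the article does not state how many attempts or what delay the team used.

```python
import time


def with_retry(action, attempts=3, delay_seconds=0.0):
    """Call `action` until it succeeds or `attempts` calls have failed."""
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            return action()
        except Exception as error:  # narrow this to the RPC error type in real code
            last_error = error
            if attempt < attempts:
                time.sleep(delay_seconds)
    # Every attempt failed: surface the last error to the caller.
    raise last_error
```

Wrapping the auction‑completion call this way lets a transient failure recover on a later attempt instead of leaving the offline item in the "selling" state.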
After fixing these issues and ensuring configuration parity, replay produced identical order and subsidy results, confirming the correctness of the backend configuration and order logic.
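Configuration parity can be checked before each replay run with a simple diff of the two environments' settings. The keys shown in the usage note are hypothetical; the article does not list the real price‑segment configuration names.

```python
def config_diff(online_cfg, offline_cfg):
    """Return keys whose values differ, mapped to (online_value, offline_value)."""
    keys = online_cfg.keys() | offline_cfg.keys()
    return {
        key: (online_cfg.get(key), offline_cfg.get(key))
        for key in sorted(keys)
        if online_cfg.get(key) != offline_cfg.get(key)
    }
```

For example, `config_diff({"segment_1.subsidy": 50}, {"segment_1.subsidy": 30})` reports the mismatched subsidy, which is the kind of difference behind Issue 2.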
Summary
(1) Traffic replay requirements vary, so flexible validation methods are needed.
(2) Because the company's underlying code routes traffic and randomizes IPs, test services must be adjusted so that they run on dedicated test machines.
(3) RPC‑based traffic replay offers rapid short‑term validation, but building a full traffic platform remains important for long‑term development.
Outlook
Beyond the two scenarios described, traffic replay can serve many purposes: persisting traffic for load‑testing platforms, forwarding traffic to developers' environments for feature testing, generating QPS for stress tests, or recording problematic flows for debugging. Future work will continue to expand these applications.
转转QA