How Huolala Built a Robust Backend Testing Framework to Cut Defects by 40%
This article details Huolala's comprehensive server‑side quality assurance strategy—covering code‑branch checks, change testing, regression, canary releases, and monitoring—to improve stability, reduce financial loss, and achieve a 40% defect‑rate reduction across its microservice architecture.
Background and Challenges
Server‑side quality assurance ensures that backend applications meet performance and reliability expectations throughout development, deployment, and operation. A comprehensive testing strategy improves system stability, security, and user satisfaction.
Huolala’s backend uses a microservice architecture with long, highly branched call chains between services.
The complexity brings three main challenges:
Support rapid iteration with high efficiency and high quality.
Maintain high stability: performance meets its targets and the system stays robust under failures.
Guarantee zero financial loss: pricing stays accurate and data stays consistent across the financial chain.
Where Do Bugs Come From?
Analyzing how changes are made during development helps identify which kinds of code changes are most likely to introduce bugs.
Testing Strategy Overview
The overall strategy consists of five major categories that interlock to form a complete assurance loop.
1. Code Branch Testing
Goal
All services deployed to test or production must contain code from the master branch.
The master branch code always matches the code running in production.
Strategy
Define branch‑management standards and enforce them with an automated code‑branch detection tool integrated into the CI/CD pipeline, creating a sustainable, closed‑loop guarantee.
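As a minimal sketch of such a check (assuming the pipeline can run git against the repository; the actual detection tool is not public), a CI step might verify that the branch being deployed already contains the tip of master:

```java
import java.io.IOException;

// Minimal sketch of a CI branch check (hypothetical; Huolala's actual
// detection tool is not public). Fails the pipeline when the branch
// being deployed does not contain the tip of origin/master.
public class BranchCheck {
    public static void main(String[] args) throws IOException, InterruptedException {
        // `git merge-base --is-ancestor A B` exits 0 iff A is an ancestor of B,
        // i.e. the current branch already includes everything on master.
        Process p = new ProcessBuilder(
                "git", "merge-base", "--is-ancestor", "origin/master", "HEAD")
                .inheritIO()
                .start();
        int exit = p.waitFor();
        if (exit != 0) {
            System.err.println("Branch is missing commits from master; rebase or merge before deploying.");
            System.exit(1); // non-zero exit fails the CI stage
        }
        System.out.println("Branch contains master; check passed.");
    }
}
```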
2. Change Testing
Goal
Ensure every backend feature works as expected.
Strategy
Change testing is divided into three parts: functional testing, loss‑prevention testing, and stability testing.
2.1 Functional Testing
Integration & API tests are designed by testers based on business changes and code diffs, aided by the Precise Test tool. Unit tests ( Huolala’s unit‑test practice ) are written by developers for changed classes or modules.
Unit test: verify a single function works correctly.
Integration test: check that multiple functions cooperate properly.
API test: validate request parameters, responses, and error handling, assisted by the Data Factory and mock tools.
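For illustration, here is what a developer-written unit test for a changed pricing method might look like, shown with TestNG (which Huolala also uses for its E2E cases); the PricingService class and its fare rules are invented for this sketch, not Huolala's actual code.

```java
import org.testng.annotations.Test;
import static org.testng.Assert.assertEquals;
import static org.testng.Assert.assertThrows;

// Hypothetical unit test for a changed pricing method; names and fare
// rules are illustrative only. Verifies a single function in isolation.
public class PricingServiceTest {

    // Toy implementation standing in for the class under test.
    static class PricingService {
        /** Base fare plus a per-kilometre rate; rejects negative distance. */
        long basePriceCents(double distanceKm) {
            if (distanceKm < 0) throw new IllegalArgumentException("distance < 0");
            return 1000 + Math.round(distanceKm * 150); // 10.00 base + 1.50/km
        }
    }

    private final PricingService service = new PricingService();

    @Test
    public void zeroDistanceChargesBaseFareOnly() {
        assertEquals(service.basePriceCents(0), 1000);
    }

    @Test
    public void perKilometreRateIsApplied() {
        assertEquals(service.basePriceCents(10), 2500);
    }

    @Test
    public void negativeDistanceIsRejected() {
        assertThrows(IllegalArgumentException.class, () -> service.basePriceCents(-1));
    }
}
```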
2.2 Loss‑Prevention Testing
Pricing chain: use the traffic‑replay tool (see Huolala’s traffic replay system) to record online pricing traffic, replay it after code or configuration changes, and compare the results, ensuring behavior still matches business expectations and avoiding price‑related complaints (first sketch below).
Financial data: use the Data Verification tool to model relationships (DB‑DB, DB‑ES, DB‑API, etc.). The tool monitors binlog changes and validates data consistency in real time, preventing mismatches and accounting errors (second sketch below).
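As a minimal sketch of the replay-and-compare idea, the harness below replays recorded pricing requests against a candidate build and diffs the responses; the sample format, endpoint, and fields are hypothetical, and the real traffic-replay system also handles recording, sampling, and normalization.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.util.List;

// Minimal replay-and-compare sketch (hypothetical; the real traffic-replay
// system records production traffic and handles normalization, sampling, etc.).
public class PricingReplay {

    /** A recorded production sample: request body plus the response seen online. */
    record Sample(String requestJson, String recordedResponseJson) {}

    public static void main(String[] args) throws Exception {
        HttpClient client = HttpClient.newHttpClient();
        // In the real system these come from recorded online traffic.
        List<Sample> samples = List.of(
                new Sample("{\"distanceKm\":10,\"city\":\"SZ\"}", "{\"priceCents\":2500}"));

        int mismatches = 0;
        for (Sample s : samples) {
            HttpRequest req = HttpRequest.newBuilder()
                    .uri(URI.create("http://pricing-candidate.test/quote")) // candidate build
                    .header("Content-Type", "application/json")
                    .POST(HttpRequest.BodyPublishers.ofString(s.requestJson()))
                    .build();
            String actual = client.send(req, HttpResponse.BodyHandlers.ofString()).body();
            // A real diff would normalize timestamps, trace IDs and field order first.
            if (!actual.equals(s.recordedResponseJson())) {
                mismatches++;
                System.err.printf("MISMATCH for %s%n  recorded: %s%n  replayed: %s%n",
                        s.requestJson(), s.recordedResponseJson(), actual);
            }
        }
        System.out.printf("%d of %d samples diverged%n", mismatches, samples.size());
    }
}
```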
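Similarly, a stripped-down DB-to-ES consistency rule might compare a freshly changed row against its search-index document; the event shape and field names here are invented, while the real Data Verification tool subscribes to binlogs and supports many relation types.

```java
import java.util.List;
import java.util.Map;
import java.util.Objects;

// Stripped-down DB-to-ES consistency check (hypothetical shapes; the real
// Data Verification tool consumes MySQL binlog events in real time).
public class OrderConsistencyRule {

    /** A row-change event as delivered by a binlog listener (simplified). */
    record RowChange(String table, long orderId, Map<String, Object> columns) {}

    interface SearchIndex {
        Map<String, Object> getDocument(long orderId); // e.g. a lookup against ES
    }

    private final SearchIndex index;

    OrderConsistencyRule(SearchIndex index) { this.index = index; }

    /** Returns true when the indexed document matches the freshly changed row. */
    boolean verify(RowChange event) {
        Map<String, Object> doc = index.getDocument(event.orderId());
        if (doc == null) {
            alert(event, "document missing from index");
            return false;
        }
        for (String field : List.of("amountCents", "status")) { // fields under contract
            if (!Objects.equals(event.columns().get(field), doc.get(field))) {
                alert(event, "field mismatch: " + field);
                return false;
            }
        }
        return true;
    }

    private void alert(RowChange event, String reason) {
        System.err.printf("order %d inconsistent: %s%n", event.orderId(), reason);
    }
}
```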
2.3 Stability Testing
Performance : Conduct load and stress tests with JMeter; Huolala is automating full‑link performance testing to reduce test time and human error.
Robustness: apply the fault‑drill system to simulate failures in interfaces, services, and middleware and verify that the system recovers quickly, covering:
Network communication handling: manage connection drops, timeouts, and corrupted packets.
Data consistency & integrity: support transactions, perform strict input validation.
System stability & fault tolerance: implement comprehensive exception handling, redundant deployment, and real‑time monitoring with alerts.
Service degradation & rate limiting: downgrade non‑critical services and limit traffic during overload (a minimal rate-limiter sketch follows this list).
Dependency management: monitor third‑party services and manage version conflicts.
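For the degradation and rate-limiting point above, a minimal token-bucket sketch (plain Java, not Huolala's actual middleware) shows the admit-or-shed behavior that robustness drills exercise under overload:

```java
// Minimal token-bucket rate limiter (illustrative only; production services
// would usually rely on gateway or middleware support instead).
public class TokenBucket {
    private final double capacity;       // maximum burst, in tokens
    private final double refillPerNano;  // steady-state refill rate
    private double tokens;               // currently available tokens
    private long lastRefill = System.nanoTime();

    public TokenBucket(double capacity, double tokensPerSecond) {
        this.capacity = capacity;
        this.refillPerNano = tokensPerSecond / 1_000_000_000.0;
        this.tokens = capacity;
    }

    /** Returns true if the request may proceed, false if it should be shed. */
    public synchronized boolean tryAcquire() {
        long now = System.nanoTime();
        tokens = Math.min(capacity, tokens + (now - lastRefill) * refillPerNano);
        lastRefill = now;
        if (tokens >= 1.0) {
            tokens -= 1.0;
            return true;  // admit the request
        }
        return false;     // overload: degrade non-critical work or reject
    }
}
```

A gateway filter would call tryAcquire() once per request and return a degraded response, or shed the request, when it fails.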
3. Regression Testing
Goal
Prevent incremental code changes from breaking existing service logic.
Strategy
Two parts: per‑service replay testing and end‑to‑end (E2E) testing.
3.1 Replay Testing
During change testing of a service, record its full traffic, extract effective samples with a code‑coverage tool, and reuse them for regression whenever the service changes again (a toy sketch of the extraction step follows).
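A toy version of that extraction step might keep only recorded requests that add new coverage; the coverage signatures here are hypothetical, since the real pipeline derives them from a code-coverage tool:

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Toy version of "extract effective samples": keep a recorded request only
// if it exercises code paths not already covered by earlier samples.
// Coverage signatures are hypothetical; the real pipeline derives them
// from a code-coverage tool.
public class SampleExtractor {

    record Recorded(String requestJson, Set<String> coveredBranches) {}

    static List<Recorded> effectiveSamples(List<Recorded> recordedTraffic) {
        Set<String> covered = new HashSet<>();
        List<Recorded> kept = new ArrayList<>();
        for (Recorded r : recordedTraffic) {
            if (covered.addAll(r.coveredBranches())) { // true iff anything new
                kept.add(r); // this request increases total coverage
            }
        }
        return kept;
    }
}
```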
3.2 End‑to‑End (E2E) Testing
Simulate real user flows by sending requests through the full stack—from client to network to backend—and verify correct responses, covering the entire backend workflow. Combined with traffic replay, this replaces legacy interface automation while reducing manual effort.
Test cases are automated with TestNG; high‑priority (P0) modules serve as health checks for both online and offline environments, while other modules can be triggered quickly via bots for smoke, regression, or emergency verification (a sample P0 case is sketched below).
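As a sketch of such a P0 case (the endpoint, payload, and group names are illustrative), a TestNG health check might look like:

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import org.testng.annotations.Test;
import static org.testng.Assert.assertEquals;
import static org.testng.Assert.assertTrue;

// Illustrative P0 health-check case; the endpoint, payload, and group name
// are hypothetical. Running the "P0" group from a bot or scheduler gives a
// quick health check of the target environment.
public class OrderFlowE2ETest {

    private final HttpClient client = HttpClient.newHttpClient();

    @Test(groups = "P0", description = "Core order placement stays healthy")
    public void placeOrderEndToEnd() throws Exception {
        HttpRequest req = HttpRequest.newBuilder()
                .uri(URI.create("https://env-under-test.example/api/orders"))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(
                        "{\"from\":\"A\",\"to\":\"B\",\"vehicleType\":\"van\"}"))
                .build();
        HttpResponse<String> resp = client.send(req, HttpResponse.BodyHandlers.ofString());

        assertEquals(resp.statusCode(), 200);
        assertTrue(resp.body().contains("\"orderId\""), "response should carry an order id");
    }
}
```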
4. Gray (Canary) Testing
Goal
Prevent a new feature that draws poor user feedback or fails at runtime from reaching all users at once.
Strategy
Deploy multiple versions in the “Small Huolala” environment, selecting specific cities, driver groups, or app versions for gradual rollout, enabling rapid error detection and controlled risk.
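A simplified version of such a rollout rule (field names are hypothetical) decides per request whether a user falls into the canary cohort:

```java
import java.util.Set;

// Simplified canary-routing rule (hypothetical fields): a request is sent to
// the new version only if it matches the configured cities, driver groups,
// or minimum app version for this rollout stage.
public class CanaryRule {
    private final Set<String> canaryCities;
    private final Set<String> canaryDriverGroups;
    private final int minAppVersion;

    public CanaryRule(Set<String> cities, Set<String> driverGroups, int minAppVersion) {
        this.canaryCities = cities;
        this.canaryDriverGroups = driverGroups;
        this.minAppVersion = minAppVersion;
    }

    /** True when this request should be served by the canary version. */
    public boolean routeToCanary(String city, String driverGroup, int appVersion) {
        return appVersion >= minAppVersion
                && (canaryCities.contains(city) || canaryDriverGroups.contains(driverGroup));
    }
}
```

Widening the rollout then becomes a configuration change: add cities or driver groups to the rule while watching error rates and user feedback.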
5. Monitoring and Alerting
Goal
Detect and resolve faults quickly (e.g., interface errors, service crashes).
Predict and prevent potential risks such as resource bottlenecks or performance degradation.
Strategy
Beyond the CI team’s monitoring platform, the testing team built a Log Sentinel tool that watches for code exceptions in real time, matches them against a knowledge graph built from accumulated troubleshooting experience, and delivers precise, timely alerts (a stripped-down matching step is sketched below).
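A stripped-down version of that matching step, with invented patterns and knowledge entries, might fingerprint exception lines and look them up before alerting:

```java
import java.util.Map;
import java.util.regex.Pattern;

// Stripped-down Log Sentinel matching step (illustrative): exception lines
// are fingerprinted by regex and looked up in a knowledge base of past
// incidents so the alert can carry a likely cause.
public class LogSentinel {

    private static final Pattern EXCEPTION = Pattern.compile("\\b([A-Za-z.]+Exception)\\b");

    // Invented knowledge entries; the real tool matches against a knowledge graph.
    private static final Map<String, String> KNOWN_ISSUES = Map.of(
            "java.sql.SQLTransientConnectionException", "DB pool exhausted; check for connection leaks",
            "java.net.SocketTimeoutException", "downstream timeout; inspect dependency latency");

    public void onLogLine(String line) {
        var m = EXCEPTION.matcher(line);
        if (!m.find()) return;                       // not an exception line
        String fingerprint = m.group(1);
        String diagnosis = KNOWN_ISSUES.getOrDefault(fingerprint, "unknown issue; route to triage");
        alert(fingerprint, diagnosis, line);
    }

    private void alert(String fingerprint, String diagnosis, String evidence) {
        System.err.printf("[SENTINEL] %s: %s%n  evidence: %s%n", fingerprint, diagnosis, evidence);
    }
}
```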
Results and Benefits
The five testing strategies have been largely automated, turning quality assurance from a “nanny” model into a “coach” model, delivering measurable gains:
40% reduction in defect rate per thousand lines of code – developers can close the loop themselves using unit tests, branch checks, and Log Sentinel.
Online defect ratio reduced to ≤1% – the share of defects found in production, computed as online/(online+offline), drops dramatically.
Testing efficiency up by 30% – developers handle more business guarantees within fixed time, especially in mid‑platform services.
[Pie chart: distribution of defect sources for Huolala freight, Feb–Jul 2024]
Future Plans and Reflections
Future Plans
1.1 Continue automating test methods with tools and pipelines wherever possible.
Example: the code‑branch testing described in Section 1.
1.2 Leverage AI for recommendations and decision support where tools fall short.
Examples: intelligent issue localization, test‑case recommendation.
Reflection
Will testing be replaced?
The author believes testing will not be eliminated. New technologies create new testing needs; the core competitive edge remains the testing methodology, with code and AI serving as supporting pillars. Continuous enrichment of testing methods is essential.
Thank you for reading; further detailed introductions of the tools mentioned will be shared in upcoming articles.
This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contact us and we will review it promptly.
