How Yidun Automates Performance Testing to Overcome Real‑World Pain Points
This article covers the fundamentals of performance testing and why it matters, the specific challenges Yidun faced (a complex execution process, human‑dependent monitoring, lack of data isolation, and wasted testing costs), and the automated, gradient‑based testing platform Yidun built in response, with quantified monitoring and plans for future visualisation.
Performance Testing Overview
Performance testing applies load to a system and measures metrics such as throughput (queries per second, QPS) and response time (RT), to verify that the system can meet user demand after launch.
Why Perform Performance Testing
Real‑world incidents such as the 12306 ticketing site crash during Spring Festival and a Weibo outage during a celebrity news surge illustrate the necessity of verifying system capacity under heavy load.
Yidun Performance Testing Pain Points
Complex Execution Process
Yidun uses gradient testing: load starts at a low QPS (e.g., 20) and increases step by step to the final target (e.g., 200). Each step runs for about 10 minutes. If a step fails to reach its QPS target, testing stops so the problem can be diagnosed; otherwise the next gradient begins.
Human‑Dependent Monitoring
Effective monitoring requires expert knowledge of the system. On‑call staff cannot watch all metrics continuously, leading to missed alerts and delayed issue detection.
Lack of Data Isolation
Test traffic mixes with real traffic, causing Kafka backlog and affecting real‑user experience; therefore data storage is often disabled during tests.
Wasted Testing Costs
The impact of test traffic on external vendors (suppliers) was not evaluated thoroughly enough beforehand, which led to unnecessary expenses during testing.
Automation Practices
One‑Click Test Execution
Creating a test task automatically generates multiple gradient sub‑tasks (e.g., QPS targets 40, 80, 120, 160, 200), which run sequentially. If any step fails, the whole test is aborted.
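The gradient logic can be sketched in a few lines of Python. This is a minimal illustration, not Yidun's implementation: the function names, the evenly spaced steps, and the `run_step` callback (which stands in for whatever load tool drives the traffic and reports the QPS actually achieved) are all assumptions.

```python
from typing import Callable, List


def gradient_targets(target_qps: int, steps: int) -> List[int]:
    """Split a final QPS target into evenly spaced gradient targets."""
    if target_qps <= 0 or steps <= 0:
        raise ValueError("target_qps and steps must be positive")
    return [target_qps * i // steps for i in range(1, steps + 1)]


def run_gradients(targets: List[int], run_step: Callable[[int], float]) -> List[int]:
    """Run sub-tasks in order and stop at the first one that misses its target.

    run_step is a hypothetical callback: it applies load at the given QPS
    target and returns the QPS that was actually achieved.
    Returns the list of targets that completed successfully.
    """
    completed: List[int] = []
    for qps in targets:
        achieved = run_step(qps)
        if achieved < qps:
            break  # abort the whole test; later gradients are never started
        completed.append(qps)
    return completed


if __name__ == "__main__":
    targets = gradient_targets(200, 5)
    print(targets)
    # Simulated system that saturates at 130 QPS.
    print(run_gradients(targets, lambda qps: min(qps, 130)))
```

With a final target of 200 and five steps, the sub-task targets match the example in the text; the simulated run aborts at the 160 QPS step because the fake system cannot exceed 130.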
Quantified Monitoring & Analysis
Testers define the applications and metrics to monitor. The platform collects the data via the Sentinel API, stores it in a database, and flags abnormal supplier request volumes so that the test can be terminated immediately.
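The store-then-check part of this flow might look like the sketch below. It makes several assumptions: metric samples have already been fetched (the Sentinel API call itself is deliberately left out), SQLite stands in for the unspecified database, and the table name, metric names, and limits are hypothetical.

```python
import sqlite3
from typing import Dict, Iterable, Optional, Tuple

Sample = Tuple[str, str, float]  # (application, metric, value)


def store_samples(conn: sqlite3.Connection, samples: Iterable[Sample]) -> None:
    """Persist collected metric samples (table layout is an assumption)."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS metric_sample (app TEXT, metric TEXT, value REAL)"
    )
    conn.executemany("INSERT INTO metric_sample VALUES (?, ?, ?)", list(samples))
    conn.commit()


def first_violation(
    conn: sqlite3.Connection, limits: Dict[Tuple[str, str], float]
) -> Optional[Sample]:
    """Return the first (app, metric, peak value) above its limit, or None."""
    for (app, metric), limit in limits.items():
        row = conn.execute(
            "SELECT MAX(value) FROM metric_sample WHERE app = ? AND metric = ?",
            (app, metric),
        ).fetchone()
        if row[0] is not None and row[0] > limit:
            return (app, metric, row[0])
    return None


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    store_samples(conn, [
        ("gateway", "rt_ms", 35.0),
        ("gateway", "supplier_requests", 1200.0),  # hypothetical metric name
    ])
    limits = {("gateway", "rt_ms"): 100.0, ("gateway", "supplier_requests"): 500.0}
    violation = first_violation(conn, limits)
    if violation is not None:
        print("terminate test:", violation)
```

Checking against explicit limits is what makes the monitoring "quantified": the decision to stop no longer depends on someone watching a dashboard.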
Full‑Link Test Component Integration
A full‑link testing component isolates test traffic from real traffic, and a shadow‑queue switch stops Kafka consumption of test data when needed.
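The article does not describe how the component marks or routes test traffic, so the following is only one plausible shape for the idea. The `is_test` flag, the `shadow_` topic prefix, and the function names are hypothetical, and no real Kafka client is involved.

```python
from dataclasses import dataclass

SHADOW_PREFIX = "shadow_"  # hypothetical naming convention for shadow queues


@dataclass
class Message:
    topic: str
    payload: bytes
    is_test: bool = False  # assumed to be set from a test-traffic marker on the request


def route_topic(msg: Message) -> str:
    """Send test messages to a shadow topic so they never mix with real data."""
    return SHADOW_PREFIX + msg.topic if msg.is_test else msg.topic


def should_consume(topic: str, shadow_enabled: bool) -> bool:
    """Real topics are always consumed; shadow topics only while the switch is on."""
    if topic.startswith(SHADOW_PREFIX):
        return shadow_enabled
    return True


if __name__ == "__main__":
    real = Message("detect_result", b"{}")
    test = Message("detect_result", b"{}", is_test=True)
    print(route_topic(real), route_topic(test))
    print(should_consume(route_topic(test), shadow_enabled=False))
```

Keeping the switch on the consumer side means test data can pile up in the shadow queue, or be dropped, without creating backlog on the queues that serve real users.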
Automation Platform Architecture
Yidun built a dedicated performance testing automation platform that orchestrates gradient tasks, monitoring, and data isolation.
Results and Future Plans
Initial low‑traffic runs show clear advantages over conventional tests. Future work includes visual dashboards, smarter monitoring that collects data only from the test‑affected topology, and statistical confidence intervals for alert thresholds.
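As an illustration of the last item, one simple way to derive an alert threshold from baseline data is a normal-approximation confidence interval for the baseline mean. This is a sketch of the general statistical idea, not a description of what Yidun plans to build; the function names and the 95% default are assumptions.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev
from typing import Sequence, Tuple


def mean_confidence_interval(
    samples: Sequence[float], confidence: float = 0.95
) -> Tuple[float, float]:
    """Normal-approximation confidence interval for the mean of the samples."""
    if len(samples) < 2:
        raise ValueError("need at least two samples")
    z = NormalDist().inv_cdf(0.5 + confidence / 2)  # about 1.96 for 95%
    centre = mean(samples)
    half_width = z * stdev(samples) / sqrt(len(samples))
    return centre - half_width, centre + half_width


def outside_interval(value: float, interval: Tuple[float, float]) -> bool:
    """True if an observed mean falls outside the baseline interval."""
    low, high = interval
    return value < low or value > high


if __name__ == "__main__":
    baseline_rt_ms = [100.0, 102.0, 98.0, 101.0, 99.0]  # made-up baseline values
    interval = mean_confidence_interval(baseline_rt_ms)
    print(interval)
    print(outside_interval(105.0, interval))
```

For small baselines a Student's t interval would be more appropriate than the normal approximation used here.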
