How Red‑Blue Drills Boost Securities Ops: From Capacity Testing to Full‑Scale Automation
Lin Ying, a senior test manager at Guoxin Securities, shares insights from his GOPS 2021 talk on the securities industry's digital transformation, current IT challenges, and a comprehensive red‑blue exercise strategy that combines full‑link load testing, automated workflows, and proactive monitoring to ensure system stability during market peaks.
1. Digital Transformation of Securities
In recent years, the securities industry has undergone a digital transformation driven by internet disruption and regulatory pressure, moving toward self‑service, intelligent, and agile solutions that improve convenience, efficiency, and personalized wealth management.
2. IT Landscape in Securities
Regulators and users both demand high stability: even a five‑minute outage can trigger severe penalties. Market volatility creates unpredictable load spikes, teams have limited awareness of how much load their systems can actually carry, and complex architectures make fault isolation difficult. Together, these pressures are prompting a shift to a dual‑state IT system.
3. Red‑Blue Exercise for Application Operations
In a red‑blue drill, the red side simulates peak market traffic to uncover the system's maximum capacity and its bottlenecks, while the blue side builds comprehensive monitoring and fault‑location tools to improve emergency response.
Red Team Design
The design focuses on automated, platform‑based, and standardized load‑testing processes, using full‑link traffic replay derived from big‑data user distribution analysis. The solution consists of four parts:
Data construction using a big‑data platform to model user distribution.
Test cluster with two‑site three‑center deployment supporting dynamic horizontal scaling.
Application flow covering access layer, channel, application, middleware, and backend.
Monitoring system built on Prometheus and big‑data for full‑stack visibility.
Security measures include a hardware whitelist and configurable application‑layer interception. Automation chains together environment preparation, environment recovery, and hardware checks.
Data for load generation is extracted from production, desensitized, and used to build realistic traffic models.
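The extract, desensitize, and model steps can be sketched roughly as follows. This is a minimal illustration, not the pipeline described in the talk: the record fields (`account_id`, `name`, `action`), the `SECRET_KEY` value, and the helper names are all assumptions made for the example, and the sample records are synthetic.

```python
import hashlib
import hmac
from collections import Counter

# Hypothetical key used only to pseudonymize identifiers for drills.
SECRET_KEY = b"drill-only-secret"


def pseudonymize(account_id: str) -> str:
    """Map a real account ID to a stable token that cannot be reversed without the key."""
    digest = hmac.new(SECRET_KEY, account_id.encode("utf-8"), hashlib.sha256)
    return "u_" + digest.hexdigest()[:16]


def desensitize(record: dict) -> dict:
    """Keep only the fields needed for replay and mask the identifier."""
    return {
        "user": pseudonymize(record["account_id"]),
        "action": record["action"],
    }


def traffic_mix(records: list) -> dict:
    """Return the share of each action type, used to weight generated load."""
    counts = Counter(r["action"] for r in records)
    total = sum(counts.values())
    return {action: n / total for action, n in counts.items()}


# Synthetic stand-in for records extracted from production.
extracted = [
    {"account_id": "A001", "name": "user-a", "action": "quote"},
    {"account_id": "A002", "name": "user-b", "action": "quote"},
    {"account_id": "A001", "name": "user-a", "action": "order"},
    {"account_id": "A003", "name": "user-c", "action": "login"},
]

clean = [desensitize(r) for r in extracted]
mix = traffic_mix(clean)
```

Using a keyed hash keeps the same user mapped to the same token across the data set, so per-user request sequences survive desensitization while names and raw account IDs do not.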
Tool selection criteria: high concurrency, multi‑protocol support (HTTPS, HTTP, private TCP), extensibility, low learning curve, and open‑source cost‑effectiveness. Deployment follows a two‑site three‑center architecture with container‑based dynamic scaling.
During testing, mechanisms such as traffic tagging, middleware adaptation, and shadow databases isolate test data from production.
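As a rough sketch of how traffic tagging and a shadow database can work together: a tag set at the entry point travels with every downstream call, and the storage layer routes tagged writes to a shadow table. The header name `X-Load-Test` and the classes `Request` and `OrderStore` are hypothetical, since the talk does not specify how the tag is carried or how the shadow store is implemented.

```python
from dataclasses import dataclass, field

# Hypothetical header name for the load-test tag.
TEST_TAG_HEADER = "X-Load-Test"


@dataclass
class Request:
    headers: dict
    payload: dict


@dataclass
class OrderStore:
    """In-memory stand-in for a database with a production and a shadow table."""
    production: list = field(default_factory=list)
    shadow: list = field(default_factory=list)

    def table_for(self, request: Request) -> list:
        # Tagged traffic goes to the shadow table, everything else to production.
        if request.headers.get(TEST_TAG_HEADER) == "1":
            return self.shadow
        return self.production

    def save(self, request: Request) -> None:
        self.table_for(request).append(request.payload)


def forward(request: Request, payload: dict) -> Request:
    """Build a downstream request that carries the upstream test tag, if any."""
    headers = {}
    if TEST_TAG_HEADER in request.headers:
        headers[TEST_TAG_HEADER] = request.headers[TEST_TAG_HEADER]
    return Request(headers=headers, payload=payload)


store = OrderStore()
real = Request(headers={}, payload={"order": 1})
drill = Request(headers={TEST_TAG_HEADER: "1"}, payload={"order": 2})

# Each request passes through one intermediate hop before reaching storage.
store.save(forward(real, real.payload))
store.save(forward(drill, drill.payload))
```

The important property is that the tag is propagated at every hop. If any middleware drops it, test writes fall through to production, which is why middleware adaptation is listed alongside tagging.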
Blue Team Design
The blue side focuses on fault prevention, detection, localization, and remediation through lifecycle capacity management, continuous baseline updates, and alerting.
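One simple way to picture a continuously updated baseline with alerting is a rolling window of recent samples plus a threshold derived from it. The sketch below is an assumption for illustration only: the class name `BaselineAlert`, the window size, and the three-sigma rule are not from the talk.

```python
from collections import deque
from statistics import mean, pstdev


class BaselineAlert:
    """Rolling baseline that flags samples far above recent normal values."""

    def __init__(self, window: int = 5, sigmas: float = 3.0):
        self.samples = deque(maxlen=window)
        self.sigmas = sigmas

    def observe(self, value: float) -> bool:
        """Return True if value breaches the baseline; otherwise fold it into the baseline."""
        breach = False
        if len(self.samples) == self.samples.maxlen:
            base = mean(self.samples)
            # Guard against a zero spread when all samples are identical.
            spread = max(pstdev(self.samples), 1e-9)
            breach = value > base + self.sigmas * spread
        if not breach:
            self.samples.append(value)
        return breach


monitor = BaselineAlert(window=5, sigmas=3.0)
warmup = [monitor.observe(v) for v in [100, 102, 98, 101, 99]]
normal = monitor.observe(101)
spike = monitor.observe(500)
```

Breaching samples are deliberately left out of the window here, so a single spike does not inflate the baseline and hide the next one. A production system would more likely compute this in the monitoring platform than in application code.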
Monitoring spans four layers: data collection, processing, big‑data aggregation, and event handling, enabling rapid incident response and post‑mortem analysis.
Fault localization proceeds from system‑level metrics to component‑level details and finally to individual user transactions.
When unexpected traffic surges occur, strategies include auto‑scaling, cost reduction, traffic limiting via custom frameworks, and rapid service degradation.
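Traffic limiting combined with service degradation can be sketched with a token bucket: requests within the limit get the full service, and requests over the limit receive a cheaper fallback instead of an error. This is a generic technique, not the custom framework mentioned in the talk; `TokenBucket`, `handle_quote`, and the cached fallback are hypothetical names chosen for the example.

```python
import time


class TokenBucket:
    """Allow up to `rate` requests per second with bursts of up to `capacity`."""

    def __init__(self, rate: float, capacity: int, clock=time.monotonic):
        self.rate = rate
        self.capacity = capacity
        self.clock = clock
        self.tokens = float(capacity)
        self.last = clock()

    def allow(self) -> bool:
        # Refill in proportion to elapsed time, capped at the bucket capacity.
        now = self.clock()
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False


def handle_quote(bucket: TokenBucket, fetch_live, cached):
    """Serve live data while under the limit; otherwise degrade to the cached value."""
    if bucket.allow():
        return fetch_live()
    return cached


# A controllable clock makes the behaviour easy to follow step by step.
fake_time = [0.0]
bucket = TokenBucket(rate=1, capacity=2, clock=lambda: fake_time[0])

first = handle_quote(bucket, lambda: "live", "cached")
second = handle_quote(bucket, lambda: "live", "cached")
third = handle_quote(bucket, lambda: "live", "cached")  # bucket is empty here
fake_time[0] = 1.0  # one second later, one token has been refilled
fourth = handle_quote(bucket, lambda: "live", "cached")
```

Returning a cached or simplified response keeps the user-facing path alive during a surge, which is usually preferable to rejecting the request outright.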
4. Outcomes at Guoxin Securities
Full‑link load testing is used in semi‑annual production drills and monthly disaster‑recovery exercises, covering over 10 application systems, hundreds of components, and thousands of hosts. System tuning increased primary data‑center throughput threefold, and new data centers achieved 4‑5× performance gains, successfully handling the 2020 market peak.
The capacity‑awareness journey progressed through four stages (ignorance, desperation, enlightenment, and a stable plateau), culminating in proactive capacity baselines, automated testing pipelines, and refined fault‑location expertise.