Red‑Blue Technical Attack‑Defense Exercises and SRE Practices at Ant Financial
Ant Financial’s internal red‑blue technical attack‑defense program pits a dedicated blue team against an SRE‑based red team. The program continuously probes system weaknesses, refines fault‑injection tools such as Awatch, and evolves high‑availability and self‑healing mechanisms to strengthen risk control and operational reliability.
Ant Financial (Alipay) runs a continuous red‑blue technical attack‑defense exercise. The blue team actively searches for system vulnerabilities and launches realistic attacks, while the red team, made up of SRE and business‑line engineers, builds and maintains the defensive platforms.
The initiative began with quality‑focused collaboration in 2013, evolved into a dedicated Technical Risk Department in 2015, and was reorganized as an SRE team in 2016. The SRE team is responsible for automated fault location, adaptive disaster recovery, anti‑shake mechanisms, and fine‑grained high availability at the transaction level.
In 2017 the blue team released Awatch, a bytecode‑level fault‑injection system that can inject arbitrary faults into running services in real time, dramatically expanding the scope of attack scenarios.
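The idea of injecting a fault into a running service can be illustrated with a minimal Python sketch. This is not how Awatch works internally (the article only says it operates at the bytecode level); the sketch swaps a method on a live object for a wrapper that adds latency or raises an error, then restores the original. The names `inject_fault`, `InjectedFault`, and `PaymentService` are hypothetical and exist only for this illustration.

```python
import contextlib
import random
import time


class InjectedFault(RuntimeError):
    """Error raised by the injector instead of running the real call (hypothetical)."""


@contextlib.contextmanager
def inject_fault(target, method_name, *, error_rate=1.0, delay_s=0.0, rng=None):
    """Temporarily replace target.method_name with a wrapper that misbehaves.

    error_rate: probability in [0, 1] that a call raises InjectedFault.
    delay_s:    extra latency added before every call.
    """
    rng = rng or random.Random()
    original = getattr(target, method_name)

    def faulty(*args, **kwargs):
        if delay_s:
            time.sleep(delay_s)
        if rng.random() < error_rate:
            raise InjectedFault(f"injected fault in {method_name}")
        return original(*args, **kwargs)

    setattr(target, method_name, faulty)
    try:
        yield
    finally:
        # Always restore the real method, even if the drill itself fails.
        setattr(target, method_name, original)


class PaymentService:
    """Stand-in for a running service (assumption for the example)."""

    def pay(self, amount):
        return f"paid {amount}"


service = PaymentService()
with inject_fault(service, "pay", error_rate=1.0):
    try:
        service.pay(10)
    except InjectedFault as exc:
        print("drill observed:", exc)
print(service.pay(10))  # the real method is back after the drill
```

Restoring the original in a `finally` block reflects a design concern any injection tool has to address: a drill must never leave the fault in place after it ends.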
The red team responded by developing a real‑time verification platform that detects anomalies within minutes. It later extended the platform with an AI‑enhanced "four‑layer defense" that covers more than 80% of Ant’s business, and built specialized verification tools for individual domains.
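The article does not describe the detection algorithm, so the following is only a rough sketch of what minute‑level anomaly detection could look like: each per‑minute metric value (for example, a count of failed transactions) is compared against a rolling baseline using a z‑score. The class name `MinuteAnomalyDetector` and the parameters `window`, `threshold`, and `min_samples` are assumptions made for this example.

```python
from collections import deque
from statistics import mean, pstdev


class MinuteAnomalyDetector:
    """Flags a per-minute value that deviates sharply from recent history (hypothetical)."""

    def __init__(self, window=30, threshold=3.0, min_samples=5):
        self.history = deque(maxlen=window)  # most recent normal minutes
        self.threshold = threshold           # z-score above which we alert
        self.min_samples = min_samples       # warm-up before alerting

    def observe(self, value):
        """Return True if value is anomalous; normal values join the baseline."""
        anomalous = False
        if len(self.history) >= self.min_samples:
            mu = mean(self.history)
            # Floor sigma so a perfectly flat baseline does not divide by zero.
            sigma = max(pstdev(self.history), 1e-9)
            anomalous = abs(value - mu) / sigma > self.threshold
        if not anomalous:
            # Keep anomalies out of the baseline so one spike does not mask the next.
            self.history.append(value)
        return anomalous


detector = MinuteAnomalyDetector()
for per_minute_value in [100, 102, 98, 101, 99, 100]:
    detector.observe(per_minute_value)

spike_flagged = detector.observe(500)   # far outside the baseline
normal_flagged = detector.observe(101)  # within normal variation
```

A production platform would presumably combine many signals and business rules; the point here is only that a one‑minute aggregation window bounds detection latency to roughly a minute.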
Weekly full‑stack red‑blue drills now generate more than 200 fault scenarios, driving continuous improvement of both offensive and defensive capabilities. The blue team’s aggressive probing forces the red team to harden systems, while the red team’s rapid response and self‑healing architectures reduce human effort and improve reliability.
These practices have been formalized into a mature risk‑control framework that includes disaster‑recovery simulations, automated fault‑scenario generation (up to 500 scenarios in five minutes), and a metrics platform for visualizing attack‑defense outcomes.
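One plausible way to generate fault scenarios in bulk is to enumerate combinations of targets, fault types, and blast radii. The article does not say how Ant's generator works; the sketch below simply takes a Cartesian product, and every name in it (`FaultScenario`, `generate_scenarios`, the service and fault labels) is hypothetical.

```python
from dataclasses import dataclass
from itertools import islice, product


@dataclass(frozen=True)
class FaultScenario:
    """One drill scenario: what to break, how, and how widely (hypothetical)."""

    service: str
    fault: str
    scope: str


def generate_scenarios(services, faults, scopes, limit=None):
    """Enumerate service x fault x scope combinations, optionally capped at limit."""
    combos = (FaultScenario(s, f, sc) for s, f, sc in product(services, faults, scopes))
    return list(islice(combos, limit))


scenarios = generate_scenarios(
    services=["payment", "ledger"],
    faults=["latency", "exception", "timeout"],
    scopes=["single-host", "single-zone"],
)
print(len(scenarios))  # 2 * 3 * 2 = 12
```

Because the count grows multiplicatively, a handful of input dimensions quickly reaches hundreds of scenarios, which is why a `limit` (or some prioritization) is useful in practice.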
Ant Financial also shares several of these risk‑control products publicly, such as a disaster‑recovery platform, end‑to‑end stress testing, and fund‑security monitoring, with more tools slated for release.
AntTech
Technology is the core driver of Ant's future creation.