Tag

system-outage


Efficient Ops
Nov 14, 2024 · Operations

Why Alipay Crashed: Lessons on Backup and Disaster Recovery

The recent Alipay outage during Double‑11 exposed a partial failure in its system message database that left users with payment errors, duplicate charges, and delayed withdrawals. The company’s response highlighted the importance of comprehensive backup, redundancy, disaster‑recovery planning, monitoring, and security measures for service continuity.

Alipay · Operations · SRE
10 min read
IT Services Circle
Sep 27, 2024 · Operations

Analysis of the Shanghai Stock Exchange Outage and System Design Lessons

The article recounts the Shanghai Stock Exchange’s sudden P0 outage that halted trading, analyzes likely causes such as massive order volume and system bottlenecks, and discusses how distributed architectures and message‑queue‑based queuing can mitigate similar high‑concurrency failures.

Distributed Systems · Operations · high concurrency
6 min read
Java Captain
Nov 30, 2023 · Operations

Analysis of Didi's November 2023 System Outage and Potential Technical Causes

The article reviews Didi's late‑November 2023 service disruption, detailing the timeline of failures, the official apologies, and expert analyses of six possible technical causes (software bugs, server issues, third‑party failures, DDoS, other attacks, and ransomware), while highlighting the role of a Kubernetes upgrade and cost‑cutting pressures.

Didi · Incident Analysis · Kubernetes
7 min read