Uncovering the Split‑Lock Chaos that Crashed AMD Servers During Double‑11
A detailed post‑mortem of a high‑priority fault on AMD servers shows how a split‑lock triggered by Python UDFs caused CPI spikes, CPU overload, and bus‑lock events, and explains the investigation steps, code analysis, reproduction tests, and mitigation measures taken to restore stability.
