Tag

service level objectives


Efficient Ops
Dec 20, 2023 · Operations

How Bilibili Implements SLO Engineering to Boost Service Reliability

This article details Bilibili's practical approach to SLO engineering: its foundational components, SLI selection, application-level and business-level SLIs, alerting strategies, SLO-driven quality operations, and the GOC framework for rapid fault discovery, localization, and recovery. Together these show how the team improves reliability systematically.

Alerting · Monitoring · Reliability Engineering
16 min read
Efficient Ops
Oct 16, 2016 · Operations

Balancing Reliability and Innovation: Google’s SRE Risk Management Explained

This article explores how Google's Site Reliability Engineers manage service reliability by balancing risk, cost, and business goals. It covers metrics such as unplanned downtime, availability formulas, and risk tolerance, and shows how they are used to set realistic SLOs for both consumer and infrastructure services.

Google · SRE · availability
21 min read