
An Overview of the Google SRE Workbook and Core SRE Foundations

The article introduces the Google SRE Workbook as a practical supplement to the original SRE book, explains the five core SRE foundations—including SLO, SLI, SLA, monitoring, and real‑world case studies from Google and Kingsoft Office—while also promoting an upcoming SRE‑DevOps live session.

The first Google SRE book, published in October 2016, attracted great interest but offered few concrete examples. In July 2018 Google therefore released a second book, The Site Reliability Workbook (Chinese title: 《Google SRE工作手册》), which provides numerous real‑world cases from Google and other companies.

The workbook is organized into three major parts (foundations, practices, and processes) and shows how Google applies engineering thinking to solve operational problems.

1. The Five Foundations of SRE

The foundations part begins by explaining why implementing Service Level Objectives (SLOs) is essential: they enable data‑driven decisions about reliability work, help prioritize tasks, and keep the service reliable enough for its users.

The three key concepts are defined:

Service Level Indicator (SLI): a quantitative metric of how well the service is performing, for example latency, traffic, errors, or saturation.

Service Level Objective (SLO): the target value for an SLI over a specific time window (e.g., 99.9% success rate for API calls).

Service Level Agreement (SLA): a contract with users that specifies the consequences, such as compensation, when an SLO is not met.
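
To make the relationship between an SLI and an SLO concrete, here is a minimal Python sketch. It assumes the SLI is expressed as the fraction of good events among all events, which is one common way to define it. The names `Slo`, `compute_sli`, and `is_slo_met`, the 28‑day window, and the request counts are all hypothetical and chosen only for illustration; they are not taken from the workbook or from any monitoring library.

```python
from dataclasses import dataclass


@dataclass
class Slo:
    """A target value for an SLI over a time window (hypothetical helper type)."""
    name: str
    target: float     # 0.999 means 99.9% of events must be good
    window_days: int  # length of the measurement window, in days


def compute_sli(good_events: int, total_events: int) -> float:
    """Return the SLI as the fraction of good events among all events."""
    if total_events < 0 or good_events < 0 or good_events > total_events:
        raise ValueError("event counts are inconsistent")
    if total_events == 0:
        # Assumption: a window with no traffic is treated as fully healthy.
        return 1.0
    return good_events / total_events


def is_slo_met(sli: float, slo: Slo) -> bool:
    """An SLO is met when the measured SLI reaches or exceeds the target."""
    return sli >= slo.target


# Illustrative numbers only: 1,000,000 API calls in the window.
api_slo = Slo(name="api-call-success", target=0.999, window_days=28)

healthy_sli = compute_sli(good_events=999_500, total_events=1_000_000)
degraded_sli = compute_sli(good_events=998_000, total_events=1_000_000)

print(is_slo_met(healthy_sli, api_slo))   # True: 99.95% is above the 99.9% target
print(is_slo_met(degraded_sli, api_slo))  # False: 99.8% is below the 99.9% target
```

An SLA would sit on top of a check like this as a contractual layer: the code only reports whether the objective was met, while the agreement decides what happens when it is not.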

When defining an SLO, ask three questions: does it measure service stability, is it achievable with current resources, and will meeting it improve user satisfaction?

2. Monitoring as a Core Foundation

Monitoring data is used to assess expected changes, dependencies, saturation, and traffic status, and to implement SLOs effectively.

3. SRE Practice at Kingsoft Office

The article shares how Kingsoft Office applies SRE principles in its operations, illustrating the practical adoption of the workbook’s guidance.

4. Event Promotion

Finally, the piece invites readers to join an evening live stream titled “How SRE Practices DevOps,” featuring Liu Feng from the China SRE community, with details provided via a QR code.

Written by

DevOps
