Applying the VALET Pattern Language for SRE Transformation at Home Depot (THD)

The article explains how Home Depot (THD) adopted the VALET pattern language—Volume, Availability, Latency, Error, and Ticket—to unify service‑level objectives, automate data collection, build dashboards, and improve SRE practices across its massive retail and e‑commerce infrastructure.

Continuous Delivery 2.0

Sep 27, 2023

This article, originally from Chapter 3 of the English edition of the SRE Handbook, describes how Home Depot (THD), the world’s largest home‑improvement retailer, used the VALET pattern language to drive its SRE transformation.

VALET stands for Volume, Availability, Latency, Error, and Ticket. Each is an SLO dimension, posed as a question that the users of a service ask about it:

Volume – how much traffic the service can handle

Availability – is the service up when it is needed

Latency – does the service respond quickly enough

Error – does the service raise errors

Ticket – does the service require manual intervention
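
The five dimensions above can be written down as one record of targets per service and compared against measured values. The following is a minimal sketch, not THD's implementation: the class names, field names, and units are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ValetTargets:
    """Hypothetical SLO targets for one service, one field per VALET dimension."""
    min_volume_rps: float     # Volume: requests per second the service must sustain
    min_availability: float   # Availability: fraction of time the service is up (0..1)
    max_latency_ms: float     # Latency: upper bound on response time
    max_error_rate: float     # Error: fraction of requests allowed to fail (0..1)
    max_tickets: int          # Ticket: manual interventions allowed per period


@dataclass(frozen=True)
class ValetMeasured:
    """Measured values for the same period, in the same units as the targets."""
    volume_rps: float
    availability: float
    latency_ms: float
    error_rate: float
    tickets: int


def violations(targets: ValetTargets, measured: ValetMeasured) -> list[str]:
    """Return the names of the VALET dimensions whose target was missed."""
    missed = []
    if measured.volume_rps < targets.min_volume_rps:
        missed.append("volume")
    if measured.availability < targets.min_availability:
        missed.append("availability")
    if measured.latency_ms > targets.max_latency_ms:
        missed.append("latency")
    if measured.error_rate > targets.max_error_rate:
        missed.append("error")
    if measured.tickets > targets.max_tickets:
        missed.append("ticket")
    return missed


targets = ValetTargets(500.0, 0.999, 300.0, 0.001, 2)
measured = ValetMeasured(620.0, 0.9995, 340.0, 0.0004, 3)
print(violations(targets, measured))
```

Keeping all five targets in one record means a dependent team can read a service's whole SLO in one place, which is the point of having a common language.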

Initially, THD’s monitoring tools and dashboards were fragmented, making incident diagnosis time‑consuming and causing miscommunication between development and operations teams.

To create a common language, THD introduced a unified SLO framework based on VALET, incorporated it into developers’ OKRs, and built an automated data‑collection pipeline called the “TPS report.” This pipeline captures VALET metrics from logs stored in BigQuery, integrates data from other monitoring systems (e.g., Stackdriver), and stores the results in a Cloud SQL database.
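
To make the roll-up step concrete, here is a rough sketch of how request log records might be aggregated into per-service, per-hour metrics. It runs on in-memory dictionaries rather than BigQuery, and the record keys (`service`, `ts`, `latency_ms`, `status`) and the rule that a status of 500 or above counts as an error are assumptions, not details from the source. Availability and ticket counts would come from other systems and are left out.

```python
from collections import defaultdict
from datetime import datetime


def hourly_valet(records):
    """Roll request log records up into per-service, per-hour metrics.

    Each record is a dict with hypothetical keys:
    service (str), ts (ISO 8601 string), latency_ms (float), status (int).
    """
    buckets = defaultdict(list)
    for rec in records:
        # Truncate the timestamp to the start of its hour to form the bucket key.
        hour = datetime.fromisoformat(rec["ts"]).replace(minute=0, second=0, microsecond=0)
        buckets[(rec["service"], hour)].append(rec)

    rows = []
    for (service, hour), recs in sorted(buckets.items()):
        errors = sum(1 for r in recs if r["status"] >= 500)  # assumed error rule
        rows.append({
            "service": service,
            "hour": hour.isoformat(),
            "volume": len(recs),
            "error_rate": errors / len(recs),
            "mean_latency_ms": sum(r["latency_ms"] for r in recs) / len(recs),
        })
    return rows


records = [
    {"service": "cart", "ts": "2023-09-27T10:05:00", "latency_ms": 100.0, "status": 200},
    {"service": "cart", "ts": "2023-09-27T10:45:00", "latency_ms": 300.0, "status": 500},
    {"service": "cart", "ts": "2023-09-27T11:01:00", "latency_ms": 50.0, "status": 200},
]
rows = hourly_valet(records)
for row in rows:
    print(row)
```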

The VALET dashboard visualizes these metrics, allowing users to register new services, set SLO targets for any VALET category, and add custom metric types (e.g., P99 latency, daily transaction volume). The dashboard also supports slicing and dicing data across services, generating weekly or monthly SLO reports, and feeding alerts to chat bots.
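
A custom metric such as P99 latency has to be computed from raw samples before it can be compared with a target. The sketch below uses the nearest-rank definition of a percentile, which is one of several common definitions; the source does not say which one THD's dashboard uses, and the function name is hypothetical.

```python
import math


def percentile(values, pct):
    """Nearest-rank percentile: the smallest sample with at least pct% of samples at or below it."""
    if not values:
        raise ValueError("percentile of an empty sample is undefined")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in (0, 100]")
    ordered = sorted(values)
    rank = math.ceil(pct * len(ordered) / 100)  # 1-based rank into the sorted samples
    return ordered[rank - 1]


latencies_ms = list(range(1, 101))  # 100 synthetic samples: 1 ms .. 100 ms
p99 = percentile(latencies_ms, 99)
print(p99)
```

Unlike a mean, a high percentile is not pulled down by a large number of fast requests, so it tracks the slow requests that a latency SLO is meant to bound.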

THD extended VALET to batch processing workloads by redefining the five categories (e.g., “Capacity” for record volume, “Availability” as percentage of successful runs, “Latency” as job runtime, “Error” as failed records, and “Ticket” as manual fixes). This adaptation enables SLO‑driven reliability for both real‑time services and batch jobs.
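
The batch redefinitions above map naturally onto a summary over a set of job runs. This is a minimal sketch under stated assumptions: the `BatchRun` fields are hypothetical, latency is reported as the longest runtime, and errors as the share of failed records. The source does not specify these details.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BatchRun:
    """One execution of a batch job; all field names are hypothetical."""
    records_total: int    # records the run was asked to process
    records_failed: int   # records that could not be processed
    runtime_s: float      # wall-clock duration of the run
    succeeded: bool       # whether the run completed successfully
    manual_fixes: int     # times an operator had to step in


def batch_valet(runs):
    """Summarise a non-empty list of runs along the five batch categories."""
    if not runs:
        raise ValueError("need at least one run")
    total = sum(r.records_total for r in runs)
    failed = sum(r.records_failed for r in runs)
    return {
        "capacity_records": total,
        "availability": sum(r.succeeded for r in runs) / len(runs),
        "latency_max_s": max(r.runtime_s for r in runs),
        "error_rate": failed / total if total else 0.0,
        "tickets": sum(r.manual_fixes for r in runs),
    }


runs = [
    BatchRun(1000, 10, 120.0, True, 0),
    BatchRun(1000, 0, 90.0, True, 0),
    BatchRun(500, 15, 300.0, False, 2),
    BatchRun(1500, 15, 150.0, True, 0),
]
summary = batch_valet(runs)
print(summary)
```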

The article concludes with an open organizational challenge: translating VALET metrics into business terms that product managers can readily understand, thereby aligning product and engineering goals around shared SLOs.
