Google Production Environment: Network, Data Center, Cluster Management, Storage, Monitoring, and Deployment Workflow
The article walks through Google's end-to-end production infrastructure: the edge network, the data-center hierarchy, Borg-based cluster management, storage systems such as Colossus and Spanner, monitoring with Borgmon, inter-task RPC via Stubby, and the code-to-production pipeline built on Piper, Blaze, Rapid, and Sisyphus. It also traces how a request travels from a user to a service in milliseconds.
This is a summary of a Google SRE engineer’s May 2018 talk that outlines Google’s overall website infrastructure and code‑release process; for deeper details see the book *Site Reliability Engineering*.
Original video link: https://youtu.be/dhTVVWzpc4Q
Network Layer
Google's edge network connects its globally distributed data centers over the B4 backbone.
Inside a data center, the Jupiter fabric combines hundreds of physical switches into a single logical switch with about 1.3 Pb/s (petabits per second) of bisection bandwidth.
Incoming web requests reach a Google Front End (GFE) which acts as a reverse proxy and forwards the request to internal services.
The Global Software Load Balancer (GSLB) performs load balancing on three levels:
Geographic level (e.g., requests to google.com).
Product level (e.g., Maps, YouTube).
RPC level.
Thus a user request traverses the ISP, Google edge network, GFE, GSLB, Jupiter, and finally reaches the target service within a few milliseconds.
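As a rough illustration of the geographic level of load balancing, the sketch below picks the lowest-latency site that still has spare capacity. It is not GSLB's actual algorithm, which the talk does not describe; the `Site` record, the `pick_site` function, and the latency and capacity fields are all hypothetical.

```python
from typing import Dict, NamedTuple, Optional


class Site(NamedTuple):
    """Hypothetical view of one serving location."""
    latency_ms: float  # estimated latency from the user to this site
    load: int          # requests currently being served
    capacity: int      # maximum requests the site should serve


def pick_site(sites: Dict[str, Site]) -> Optional[str]:
    """Return the lowest-latency site that still has spare capacity."""
    candidates = [(site.latency_ms, name) for name, site in sites.items()
                  if site.load < site.capacity]
    # min() compares latency first, then name as a deterministic tie-breaker.
    return min(candidates)[1] if candidates else None
```

When the nearest site is full, the request falls through to the next-closest one, which is the basic trade-off any geographic balancer has to make between proximity and capacity.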
Data Center
The data‑center hierarchy is:
Campus – may contain multiple data‑centers (similar to a region or availability zone).
Data‑center – contains multiple clusters.
Cluster – consists of many rows.
Row – holds many racks.
Rack – holds many machines.
Campus > Data‑center > Cluster > Row > Rack > Machine
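The containment hierarchy above can be modelled as a six-part location. The following is a minimal sketch; the `MachineLocation` class and its `shares` helper are hypothetical names used only for illustration.

```python
from dataclasses import dataclass

# Levels of the hierarchy, from largest to smallest, as listed in the article.
LEVELS = ("campus", "datacenter", "cluster", "row", "rack", "machine")


@dataclass(frozen=True)
class MachineLocation:
    """Hypothetical record locating one machine in the hierarchy."""
    campus: str
    datacenter: str
    cluster: str
    row: str
    rack: str
    machine: str

    def path(self) -> str:
        # Join the six levels in containment order.
        return " > ".join(getattr(self, level) for level in LEVELS)

    def shares(self, other: "MachineLocation") -> str:
        """Return the smallest level both locations have in common, or ''."""
        common = ""
        for level in LEVELS:
            if getattr(self, level) != getattr(other, level):
                break
            common = level
        return common
```

Two machines that share a rack are exposed to the same rack-level failure, while machines that share only a campus are not, so a helper like `shares` is one way to reason about how far apart two tasks are placed.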
Cluster Management
Work on a cluster is organized into jobs, each made up of one or more tasks (similar to Netflix Titus). Engineers submit jobs via an internal tool; Borg, Google's cluster manager, schedules the tasks onto machines. If a task fails, Borg automatically restarts it.
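The restart-on-failure behaviour can be sketched with a toy loop. This is not Borg's scheduling algorithm; the function name, the round-robin placement, and the attempt limit are assumptions made for illustration.

```python
from typing import Callable, Dict, List


def run_with_restarts(task: Callable[[str], bool], machines: List[str],
                      max_attempts: int = 3) -> Dict[str, object]:
    """Try a task on successive machines until it succeeds or attempts run out."""
    attempts: List[str] = []
    for attempt in range(max_attempts):
        # Round-robin placement; a real scheduler weighs resources and constraints.
        machine = machines[attempt % len(machines)]
        attempts.append(machine)
        if task(machine):
            return {"ok": True, "attempts": attempts}
    return {"ok": False, "attempts": attempts}
```

For example, a task that fails on machine `m1` but succeeds on `m2` is reported as successful after two attempts, and the caller never has to know which machine ran it.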
Borg Name Service (BNS) provides a hierarchical naming scheme:
/bns/<cluster>/<user>/<job name>/<task number>
Each name maps to an IP:port address, and the mapping is kept synchronized.
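To make the naming scheme concrete, here is a small sketch that splits a BNS-style path into its four fields and looks the result up in a table. The `BnsPath` type, the `parse_bns` and `resolve` functions, and the in-memory dictionary standing in for the real name service are all hypothetical.

```python
from typing import Dict, NamedTuple, Tuple


class BnsPath(NamedTuple):
    cluster: str
    user: str
    job: str
    task: int


def parse_bns(path: str) -> BnsPath:
    """Split /bns/<cluster>/<user>/<job name>/<task number> into its fields."""
    parts = path.split("/")
    # A valid path splits into ['', 'bns', cluster, user, job, task].
    if len(parts) != 6 or parts[0] != "" or parts[1] != "bns":
        raise ValueError(f"not a BNS path: {path!r}")
    cluster, user, job, task = parts[2:]
    if not (cluster and user and job and task.isdecimal()):
        raise ValueError(f"not a BNS path: {path!r}")
    return BnsPath(cluster, user, job, int(task))


def resolve(path: str, table: Dict[BnsPath, Tuple[str, int]]) -> Tuple[str, int]:
    """Look up the (ip, port) for a path in an in-memory stand-in table."""
    return table[parse_bns(path)]
```

With a table such as `{BnsPath("cc", "alice", "websearch", 7): ("10.0.0.5", 8080)}` (made-up values), `resolve("/bns/cc/alice/websearch/7", table)` returns the address pair. Clients address the task by name, so the entry can change when a task is rescheduled without the clients changing.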
Lock Service
The BNS‑to‑IP mapping is stored in Chubby, a distributed lock service that also offers a file‑system‑like API and achieves consensus via the Paxos algorithm.
Storage System
HDD + SSD – physical storage devices.
D – a disk service that provides temporary storage for running jobs.
Colossus – Google's cluster-wide distributed file system and the successor to GFS, offering durability, replication, and encryption.
Bigtable – a NoSQL database that keeps data sorted by key and replicates it across clusters with eventual consistency.
Spanner – a NewSQL database that combines relational semantics with NoSQL‑scale horizontal scalability.
Together, Borg and these storage layers let a cluster behave like a single large machine that runs jobs and stores data, but software must still handle failures (e.g., a job interrupted by a machine crash).
Monitoring
Borgmon, the monitoring tool for Borg, collects status from tasks, aggregates it through several levels up to a global Borgmon, and forwards metrics to Google's time-series database and alert manager. High error rates trigger alerts.
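The aggregate-then-alert idea can be sketched in a few lines. This is not Borgmon's rule language; the counter layout, the function names, and the threshold are assumptions for illustration.

```python
from typing import Dict, List, Tuple

# (requests, errors) counters reported by one task.
Counters = Tuple[int, int]


def aggregate(tasks: List[Counters]) -> Counters:
    """Sum per-task counters into one cluster-level pair."""
    return (sum(r for r, _ in tasks), sum(e for _, e in tasks))


def error_rate(counters: Counters) -> float:
    """Errors divided by requests; 0.0 when there were no requests."""
    requests, errors = counters
    return errors / requests if requests else 0.0


def firing_alerts(clusters: Dict[str, List[Counters]], threshold: float) -> List[str]:
    """Return the names of clusters whose aggregated error rate exceeds the threshold."""
    return sorted(name for name, tasks in clusters.items()
                  if error_rate(aggregate(tasks)) > threshold)
```

Aggregating before computing the rate matters: a single task with a handful of requests and one error would otherwise look far worse than it is for the cluster as a whole.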
Prober periodically sends requests to task‑hosting servers to measure response latency, providing another health‑check perspective; its data also feeds into Borgmon.
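A probe of this kind can be sketched as a function that times one request from the outside. The `probe` function, the `ProbeResult` type, and the timeout value are hypothetical; the request itself is passed in as a callable so the sketch does not assume any particular network library.

```python
import time
from typing import Callable, NamedTuple


class ProbeResult(NamedTuple):
    ok: bool
    latency_s: float


def probe(send_request: Callable[[], bool], timeout_s: float = 1.0) -> ProbeResult:
    """Time one request; count it as failed if it errors or is too slow.

    This does not abort a slow request. It only marks it failed afterwards.
    """
    start = time.perf_counter()
    try:
        succeeded = bool(send_request())
    except Exception:
        succeeded = False
    latency = time.perf_counter() - start
    return ProbeResult(succeeded and latency <= timeout_s, latency)
```

Because the probe only sees what a client would see, it can catch problems that a task's own self-reported status would miss, such as a server that reports healthy but answers slowly.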
Inter‑Task Communication
Tasks communicate via Stubby, Google's internal RPC framework (gRPC is its open-source counterpart). Requests and responses are serialized as protocol buffer (protobuf) messages.
From Code to Production
Google uses a monolithic repository managed by Piper, its internal version control system (it plays the role Git does elsewhere). Figures published in 2016 put the repository at about 1 billion files, 2 billion lines of code across 9 million source files, about 86 TB of data, and 35 million commits.
To submit code, an engineer creates a changelist and sends it for review via Rietveld. After approval, presubmit checks run static analysis; code that passes is committed.
Blaze (open‑source counterpart Bazel) builds the code into binaries, requiring engineers to specify outputs and dependencies.
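Declaring dependencies explicitly is what lets a build tool work out a safe build order. The sketch below shows that idea with Python's standard `graphlib` module; it is not how Blaze or Bazel is implemented, and the target names are made up.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+
from typing import Dict, List, Set


def build_order(deps: Dict[str, Set[str]]) -> List[str]:
    """Return targets ordered so each one comes after its dependencies.

    `deps` maps a target to the set of targets it depends on.
    """
    # TopologicalSorter takes {node: predecessors} and raises CycleError on cycles.
    return list(TopologicalSorter(deps).static_order())
```

Given `{"server": {"rpc", "storage"}, "rpc": {"proto"}, "storage": {"proto"}, "proto": set()}`, the result starts with `proto` and ends with `server`; the relative order of `rpc` and `storage` is not fixed, since neither depends on the other.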
Continuous testing runs automated tests on the latest repo revision. Upon success, the Rapid tool invokes Blaze to produce an MPM (Midas Package Manager) package, which includes name, version, and a signature for authenticity.
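The talk does not say how MPM packages are signed, so the following only illustrates the general idea of a package that carries a name, a version, and a signature that can be checked before deployment. Using HMAC-SHA256 with a shared key is an assumption, as are the `Package` type and the function names.

```python
import hashlib
import hmac
from typing import NamedTuple


class Package(NamedTuple):
    name: str
    version: str
    payload: bytes
    signature: str


def _digest(key: bytes, name: str, version: str, payload: bytes) -> str:
    # Sign the metadata together with the payload so neither can be swapped.
    message = name.encode() + b"\0" + version.encode() + b"\0" + payload
    return hmac.new(key, message, hashlib.sha256).hexdigest()


def make_package(key: bytes, name: str, version: str, payload: bytes) -> Package:
    """Build a package and attach its signature."""
    return Package(name, version, payload, _digest(key, name, version, payload))


def verify(key: bytes, pkg: Package) -> bool:
    """Recompute the signature and compare it in constant time."""
    expected = _digest(key, pkg.name, pkg.version, pkg.payload)
    return hmac.compare_digest(expected, pkg.signature)
```

A deployment step would call `verify` first and refuse any package whose payload, name, or version no longer matches its signature.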
Sisyphus, Google’s deployment system, takes the MPM package and performs the actual deployment, supporting strategies such as immediate or canary roll‑outs.
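A canary roll-out can be sketched as "update a few tasks, check them, then continue or stop". This is not Sisyphus's interface; the function, its parameters, and the health check are hypothetical.

```python
from typing import Callable, List


def canary_rollout(tasks: List[str], deploy: Callable[[str], None],
                   healthy: Callable[[str], bool], canary_count: int = 1) -> List[str]:
    """Deploy to a few tasks first and continue only if those stay healthy.

    Returns the list of tasks that were updated.
    """
    canaries, rest = tasks[:canary_count], tasks[canary_count:]
    for task in canaries:
        deploy(task)
    if not all(healthy(task) for task in canaries):
        return canaries  # stop here; the remaining tasks keep the old version
    for task in rest:
        deploy(task)
    return canaries + rest
```

An immediate roll-out is the degenerate case where every task is updated without the intermediate health check; the canary variant trades speed for a smaller blast radius when a release is bad.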
Summary
Piper for code submission, Blaze for building.
Rapid creates signed MPM packages.
Borg (with Chubby) runs jobs and stores data.
User requests travel through the edge network, GFE, GSLB, and Jupiter to reach a server; B4 links the data centers.
Task communication uses protobuf and Stubby.
Spanner serves as the primary database.
Borgmon and Prober provide monitoring and alerting.
New features are released through continuous testing, Rapid, and Sisyphus.
Source: http://allenlsy.com/google-production-environment