Linearizability, Serializability, and TrueTime in Google Spanner
This article explains the concept of linearizability, contrasts it with serializability, and describes how Google Spanner uses the TrueTime API and commit‑wait mechanisms to provide external consistency and reliable snapshot reads in a globally distributed database system.
As data volumes and computational demands exceed the capacity of a single machine, distributed systems become necessary, yet they must preserve correctness as if all operations executed on a single node. Linearizability guarantees that concurrent operations appear to take effect instantaneously, in an order consistent with real time, while serializability concerns the ordering of whole transactions under concurrency control.
The article illustrates linearizability with two scenarios: a history where operations can be reordered into a sequential history that respects real‑time order (linearizable), and a history that cannot be reordered without violating that order (non‑linearizable). Example histories are shown as code blocks:
start[W(1)]A
start[R(0)]B
end[W(1)]A
end[R(0)]B
start[R(1)]B
end[R(1)]B

It then discusses how a globally deployed system must assign a single, monotonically increasing timestamp to each transaction, a task complicated by clock skew across datacenters. Simple local‑clock approaches can invert the real‑world order of timestamps, leading to inconsistency.
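The inversion problem can be seen with a small sketch: two datacenters stamp transactions with their local clocks, and one clock runs slightly behind. All skew and timestamp values below are hypothetical.

```python
# Two transactions stamped with local datacenter clocks. DC2's clock
# runs 10 ms behind true time, so a transaction that commits *later*
# in real time can receive an *earlier* timestamp.

def local_timestamp(true_time, skew):
    # A datacenter's local clock reading is true time plus its skew.
    return true_time + skew

t1 = local_timestamp(1000.000, 0.0)     # T1 commits first, at DC1
t2 = local_timestamp(1000.005, -0.010)  # T2 commits 5 ms later, at DC2 (skewed)

# Real-time order is T1 before T2, but the timestamps say the opposite.
inverted = t2 < t1
```

Any observer comparing these timestamps would conclude T2 happened first, contradicting what actually occurred; this is exactly the inconsistency TrueTime is designed to rule out.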
Google Spanner solves this problem with the TrueTime API, which combines GPS and atomic clocks to provide a bounded time interval TT.now() = [earliest, latest]. Spanner assigns each transaction a timestamp from the upper bound of this interval and enforces a commit‑wait until that timestamp is guaranteed to be in the past, ensuring external consistency (linearizability).
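The commit‑wait rule can be sketched in a few lines. The `TrueTime` class below is a stand‑in for the real API, simulating the uncertainty interval around a local monotonic clock; the 4 ms default uncertainty is an assumption for illustration.

```python
import time

class TrueTime:
    """Hypothetical stand-in for the TrueTime API: now() returns a
    bounded interval (earliest, latest) around the unknown true time."""

    def __init__(self, epsilon=0.004):  # assumed ~4 ms clock uncertainty
        self.epsilon = epsilon

    def now(self):
        t = time.monotonic()
        return (t - self.epsilon, t + self.epsilon)

def commit(tt):
    # Take the commit timestamp from the upper bound of the interval...
    _, s = tt.now()                 # s = TT.now().latest
    # ...then commit-wait until s is certainly in the past, i.e. until
    # TT.now().earliest has moved beyond s. This costs roughly 2*epsilon.
    while tt.now()[0] <= s:
        time.sleep(0.001)
    return s
```

After `commit` returns, any later transaction anywhere will be assigned a strictly larger timestamp, which is what makes the timestamps externally consistent.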
TrueTime’s architecture includes multiple time masters per datacenter and time‑slave processes that synchronize every 30 seconds, using Marzullo’s algorithm to discard outliers. The clock uncertainty ε typically varies between 1 ms and 7 ms, and Spanner’s commit‑wait lasts roughly twice the average error (about 8 ms).
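Marzullo’s algorithm finds the smallest interval consistent with the largest number of sources, which lets a slave discard masters whose intervals disagree with the majority. A minimal sketch, using a sweep over interval endpoints:

```python
def marzullo(intervals):
    """Return (count, interval): the largest number of sources whose
    intervals mutually overlap, and the tightest interval they agree on.
    Sources outside that interval are treated as outliers."""
    # Encode each interval as two events: -1 at its start, +1 at its end.
    # Sorting puts starts before ends at equal offsets.
    events = sorted([(lo, -1) for lo, hi in intervals] +
                    [(hi, +1) for lo, hi in intervals])
    best, count, best_interval = 0, 0, None
    for i, (t, typ) in enumerate(events):
        count -= typ  # entering an interval raises count, leaving lowers it
        if count > best and i + 1 < len(events):
            best = count
            # The agreed region runs from this event to the next one.
            best_interval = (t, events[i + 1][0])
    return best, best_interval
```

For example, given three masters reporting `(8, 12)`, `(11, 13)`, and `(10, 12)`, all three agree on `(11, 12)`; replace the third with a faulty `(14, 15)` and only two sources agree, with the outlier discarded.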
Spanner’s storage layer consists of zones, spanservers, tablets, and Paxos groups. Each key is stored as `(key, timestamp) → value`, enabling multi‑version storage. Transactions spanning multiple Paxos groups use two‑phase commit, while single‑group transactions rely on lock tables for serializability.
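The multi‑version mapping can be sketched with a tiny in‑memory store: each write appends a `(timestamp, value)` version, and a read at timestamp `t` returns the value with the largest timestamp not exceeding `t`. This is an illustrative model, not Spanner’s actual tablet format.

```python
import bisect

class MVStore:
    """Minimal multi-version store: (key, timestamp) -> value."""

    def __init__(self):
        self.versions = {}  # key -> sorted list of (timestamp, value)

    def write(self, key, ts, value):
        # Keep each key's versions sorted by timestamp.
        bisect.insort(self.versions.setdefault(key, []), (ts, value))

    def read(self, key, ts):
        # Return the value of the latest version with timestamp <= ts.
        rows = self.versions.get(key, [])
        i = bisect.bisect_right([t for t, _ in rows], ts)
        return rows[i - 1][1] if i > 0 else None
```

Because old versions are retained, a read at any past timestamp sees a stable value regardless of later writes, which is the property snapshot reads build on.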
Snapshot reads in Spanner allow clients to read a consistent view of the database at a chosen timestamp `t`, provided `t ≤ t_safe`. If `t` exceeds `t_safe`, the system waits until the timestamp becomes safe.
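The safe‑time gate reduces to a simple wait loop. In this sketch, `read_at` and `t_safe_fn` are hypothetical callables standing in for the replica’s versioned read path and its current safe time:

```python
import time

def snapshot_read(read_at, key, t, t_safe_fn, poll=0.001):
    # A replica may only serve the read once t <= t_safe; until then,
    # block and re-check as the replica's safe time advances.
    while t > t_safe_fn():
        time.sleep(poll)
    return read_at(key, t)
```

When `t` is already at or below the safe time, the read is served immediately; otherwise the client simply observes extra latency rather than an inconsistent snapshot.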
In summary, Spanner’s TrueTime API and commit‑wait mechanism give the system a globally comparable timestamp space, making linearizability achievable across continents while supporting external consistency, snapshot reads, and high availability.
High Availability Architecture
Official account for High Availability Architecture.