A Timeline Review of Optimistic Concurrency Control (OCC) from Theory to Production Systems
This article presents a chronological overview of Optimistic Concurrency Control (OCC), covering its early theoretical foundations, key research papers, prototype designs such as MVCC+OCC+2PC, and its adoption in in‑memory and distributed systems such as Hekaton, Megastore, F1, MaaT, and Centiman, highlighting both advantages and challenges.
The article begins by introducing Optimistic Concurrency Control (OCC) as a concurrency‑control philosophy that validates serializability only at transaction commit time, contrasting it with pessimistic locking.
Theoretical research – centralized OCC traces the first OCC proposal to the 1981 paper by H.T. Kung and John T. Robinson, describing its three phases (read, validation, write) and the five drawbacks of lock‑based protocols that motivated the optimistic approach.
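The three phases can be illustrated with a minimal single-process sketch. It assumes serial validation (one transaction commits at a time) and a backward check against transactions that committed after the validating one began; the `Store` and `Txn` classes are hypothetical names for illustration, not code from the paper or the article.

```python
from dataclasses import dataclass, field


class Store:
    """Hypothetical in-memory store; commits are assumed to run one at a time."""

    def __init__(self):
        self.data = {}
        self.committed_write_sets = []  # write set of each committed transaction, in commit order

    def begin(self):
        # Remember how many transactions had committed when this one started.
        return Txn(store=self, start=len(self.committed_write_sets))


@dataclass
class Txn:
    store: Store
    start: int
    read_set: set = field(default_factory=set)
    writes: dict = field(default_factory=dict)

    # Read phase: reads go to the store, writes stay in a private buffer.
    def read(self, key):
        if key in self.writes:
            return self.writes[key]
        self.read_set.add(key)
        return self.store.data.get(key)

    def write(self, key, value):
        self.writes[key] = value

    def commit(self):
        # Validation phase: fail if a transaction that committed after we
        # started wrote any key that we read.
        for write_set in self.store.committed_write_sets[self.start:]:
            if write_set & self.read_set:
                return False  # the caller is expected to retry the transaction
        # Write phase: publish the buffered writes.
        self.store.data.update(self.writes)
        self.store.committed_write_sets.append(set(self.writes))
        return True


store = Store()
t1, t2 = store.begin(), store.begin()
t1.read("x")                 # t1 reads x during its read phase
t2.write("x", 1)
t2_ok = t2.commit()          # t2 commits first and overwrites x
t1.write("y", 2)
t1_ok = t1.commit()          # t1 fails validation because x changed under it
```

No lock is taken during the read phase; the cost of a conflict is paid only at commit, as a retry.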
Theoretical research – distributed OCC summarizes early attempts to extend OCC to distributed databases, noting the need for global timestamps and the split of validation into per‑node sub‑transactions, as discussed in 1981‑1982 papers.
The author notes that the survey focuses on high‑level ideas and may contain omissions or errors.
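BOCC and FOCC distinguishes two validation schemes: BOCC (backward‑oriented OCC) checks the committing transaction's read set against the write sets of transactions that committed while it was running, while FOCC (forward‑oriented OCC) checks the committing transaction's write set against the read sets of transactions that are still active; each has its own pros and cons.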
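Both checks reduce to set-disjointness tests, differing only in which sets are compared: BOCC looks backward at transactions that have already committed, FOCC looks forward at transactions that are still running. The function names below are illustrative assumptions, and the sketch ignores how a real system tracks which transactions overlap.

```python
def bocc_validate(read_set, committed_write_sets):
    """Backward check: my reads vs. write sets of transactions that
    committed while I was in my read phase."""
    return all(read_set.isdisjoint(ws) for ws in committed_write_sets)


def focc_validate(write_set, active_read_sets):
    """Forward check: my writes vs. the current read sets of transactions
    that are still running."""
    return all(write_set.isdisjoint(rs) for rs in active_read_sets)


# A transaction that read {a, b} and wants to write {c}.
reads, writes = {"a", "b"}, {"c"}

bocc_ok = bocc_validate(reads, [{"x"}, {"b"}])    # False: a committed txn wrote b
focc_ok = focc_validate(writes, [{"a"}, {"d"}])   # True: no active txn has read c
```

One consequence visible in the signatures: under FOCC a read-only transaction has an empty write set and always passes, and on conflict the system can choose which side to abort, whereas BOCC can only abort the validating transaction because its counterparts have already committed.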
Prototype systems – MVCC+OCC+2PC outlines a design that combines multi‑version storage, OCC validation, and two‑phase commit, using global timestamps to achieve consistent reads and serialized writes.
Dynamic timestamp adjustment describes a method that narrows transaction commit‑timestamp intervals during validation to reduce unnecessary aborts.
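A minimal sketch of the interval idea, under simplifying assumptions: timestamps are integers, each read contributes the timestamp at which the version it saw was written and (if known) the timestamp at which that version was overwritten, and each written key contributes the timestamp of its last committed access. The `Interval` class and `choose_commit_ts` function are hypothetical names, not taken from any specific system.

```python
from dataclasses import dataclass

INF = float("inf")


@dataclass
class Interval:
    """Inclusive range of commit timestamps still allowed for a transaction."""
    lower: float = 0
    upper: float = INF

    def must_follow(self, ts):
        # The transaction has to serialize after ts.
        self.lower = max(self.lower, ts + 1)

    def must_precede(self, ts):
        # The transaction has to serialize before ts.
        self.upper = min(self.upper, ts - 1)

    def is_empty(self):
        return self.lower > self.upper


def choose_commit_ts(reads, writes):
    """reads: list of (written_at, overwritten_at or None) for each version read.
    writes: list of last committed access timestamps for each key written.
    Returns a commit timestamp, or None if the transaction must abort."""
    iv = Interval()
    for written_at, overwritten_at in reads:
        iv.must_follow(written_at)            # we saw this version, so we come after its writer
        if overwritten_at is not None:
            iv.must_precede(overwritten_at)   # we did not see the newer version
    for last_access in writes:
        iv.must_follow(last_access)           # our write comes after earlier accesses
    return None if iv.is_empty() else iv.lower


# Read a version written at 10 and overwritten at 20; write a key last accessed at 12.
ts = choose_commit_ts(reads=[(10, 20)], writes=[12])
```

Here the interval narrows to [13, 19] and the transaction can commit at 13, even though the version it read has since been overwritten. A scheme that assigns the commit timestamp only at commit time (after 20) would have to abort it.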
Production systems – Megastore details Google’s Megastore implementation, which uses per‑entity‑group OCC with Paxos‑based commit, serializing transactions within an entity group but limiting throughput.
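The per-entity-group behavior can be approximated as a compare-and-append on a log: a transaction notes the log position it read at, and its commit succeeds only if no other transaction has appended since. In the sketch below a local length check stands in for the Paxos round, which is a deliberate simplification; `EntityGroupLog` and its methods are hypothetical names.

```python
class EntityGroupLog:
    """Hypothetical write-ahead log for one entity group."""

    def __init__(self):
        self.entries = []

    def read_position(self):
        # Position of the next free log slot, observed when a transaction starts.
        return len(self.entries)

    def try_append(self, expected_position, mutations):
        # Stand-in for the Paxos-based commit: only one writer can win a slot.
        if len(self.entries) != expected_position:
            return False
        self.entries.append(mutations)
        return True


log = EntityGroupLog()
pos_a = log.read_position()
pos_b = log.read_position()                      # two transactions start at the same position
a_ok = log.try_append(pos_a, {"k": "from A"})    # A wins the slot
b_ok = log.try_append(pos_b, {"k": "from B"})    # B loses and must re-read and retry
```

Because every commit in the group competes for the same next slot, concurrent writers to one entity group are serialized, which is the throughput limit the summary mentions.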
Production systems – Hekaton explains Microsoft’s in‑memory database that implements OCC with per‑transaction read/write sets, version visibility checks, and a prepare phase that validates read‑set consistency before committing.
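A simplified sketch of the visibility check and read-set validation follows. It assumes each version carries plain begin and end timestamps and ignores the case where those fields temporarily hold the ID of an uncommitted transaction; the names `Version`, `is_visible`, and `validate_reads` are illustrative, not Hekaton's actual interfaces.

```python
from dataclasses import dataclass

INF = float("inf")


@dataclass
class Version:
    value: object
    begin_ts: float        # commit timestamp of the transaction that created the version
    end_ts: float = INF    # commit timestamp of the transaction that replaced it


def is_visible(version, read_ts):
    """A version is visible to a read at read_ts if its lifetime covers read_ts."""
    return version.begin_ts <= read_ts < version.end_ts


def validate_reads(read_set, commit_ts):
    """Prepare-phase check: every version read must still be visible at commit time."""
    return all(is_visible(v, commit_ts) for v in read_set)


v = Version(value="a", begin_ts=10)
seen = is_visible(v, 15)                  # visible to a transaction reading at 15
v.end_ts = 20                             # another transaction replaces it at 20
ok_early = validate_reads([v], commit_ts=18)
ok_late = validate_reads([v], commit_ts=25)
```

The transaction that tries to commit at 25 fails validation because the version it read ended at 20, while a commit at 18 would still pass.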
Production systems – F1 notes that Google’s F1 adopts row‑level timestamps for optimistic transactions, offering benefits such as tolerance for long transactions and server‑side retry, while still facing phantom‑read challenges.
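The row-timestamp check, and the reason phantoms slip past it, can be shown with a small sketch. It assumes each row stores a last-modified timestamp that the client hands back at commit; the `Table` class and its methods are hypothetical and not F1's API.

```python
class Table:
    """Hypothetical table where each row stores (value, last_modified_ts)."""

    def __init__(self):
        self.rows = {}
        self.clock = 0

    def read(self, key):
        # Missing rows are reported with timestamp 0.
        return self.rows.get(key, (None, 0))

    def commit(self, observed, updates):
        """observed: key -> timestamp seen when the row was read.
        updates: key -> new value. Returns False if any observed row changed."""
        for key, seen_ts in observed.items():
            if self.read(key)[1] != seen_ts:
                return False
        self.clock += 1
        for key, value in updates.items():
            self.rows[key] = (value, self.clock)
        return True


table = Table()
table.commit({}, {"a": 1})
_, ts_a = table.read("a")               # a client reads row a and keeps its timestamp
table.commit({}, {"a": 2})              # meanwhile another writer updates a
stale_ok = table.commit({"a": ts_a}, {"a": 3})   # rejected: a's timestamp has moved

# Phantom: a client scanned for rows with prefix "b" and found none, so it has
# no row timestamps to present; a concurrent insert of "b1" goes undetected.
table.commit({}, {"b1": "new"})
phantom_ok = table.commit({}, {"summary": "no b rows"})
```

Since the check needs only the timestamps collected at read time, no server-side lock is held while the client works, which is why long-running transactions and server-side retries are tolerated. The last commit succeeds despite the insert, which is the phantom-read gap.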
Production systems – MaaT presents a cloud‑native OCC design that eliminates 2PC locks, uses global timestamp intervals, and employs BOCC validation across shards to achieve high scalability.
Production systems – Centiman describes a KV‑store based OCC system that uses watermarks and per‑shard validation to provide serializable transactions in a distributed environment.
The article concludes by emphasizing that OCC has evolved from a theoretical concept to a practical technique in in‑memory and distributed NewSQL databases, offering reduced synchronization overhead and better resource utilization, especially in low‑conflict workloads, while still requiring careful handling of phantom reads, long‑transaction abort rates, and distributed validation.
AntTech
Technology is the core driver of Ant's future creation.