How MVCC Beats Pessimistic Locks for Distributed Key‑Value Stores
This article examines a distributed system concurrency problem, compares lock‑based transactions with multiversion concurrency control (MVCC), and explains why MVCC often outperforms pessimistic locking in scenarios demanding high read responsiveness and low contention.
Problem
Consider a distributed system consisting of a central data center D (a key‑value store with an HTTP CRUD API) and multiple business processing centers L1, L2 … Ln. Each L performs three steps: it reads a set of keys from D, processes the data to produce an updated key‑value set (which may involve different keys), and atomically updates multiple keys in D. Without transaction support, concurrent Ls can lose updates. For example, two Ls both read the value 100 for key 123; one adds 1 and writes 101, the other adds 2 and writes 102. The final value is whichever write lands last (102 in this ordering) instead of the expected 103.
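The lost update described above can be reproduced with a minimal sketch. The in‑memory `store` and the `read`/`write` helpers are hypothetical stand‑ins for D's HTTP API, and the interleaving is written out by hand rather than produced by real concurrency.

```python
# Hypothetical in-memory stand-in for data center D (not a real HTTP client).
store = {"123": 100}

def read(key):
    return store[key]

def write(key, value):
    store[key] = value

# L1 and L2 both read before either one writes: the problematic interleaving.
seen_by_l1 = read("123")
seen_by_l2 = read("123")

write("123", seen_by_l1 + 1)   # L1 adds 1 and writes 101
write("123", seen_by_l2 + 2)   # L2 adds 2 and writes 102, overwriting L1

lost_update_result = read("123")
serial_result = 100 + 1 + 2    # what either serial order would produce
```

Here `lost_update_result` is 102 while `serial_result` is 103: L1's increment has been silently discarded.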
Solution 1: Lock‑Based Transaction
To achieve serializability, the simplest approach is to add lock‑based transactions to D: L locks D before reading and releases the lock once its update completes. D should also enforce a lock timeout so that no L can hold the lock indefinitely. This method is easy to implement, but it locks the entire dataset, so the lock granularity is coarse and the lock is held for the whole read‑process‑update cycle. Even if granularity is reduced to per‑key locks, problems remain: the set of keys to be updated may not be known in advance, which can lead to deadlocks, and locks are still held for a long time.
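A global lock with a timeout could look roughly like the following sketch. `LeaseLock` and its methods are hypothetical names, not part of any real store, and the clock is injectable only so the timeout can be demonstrated without sleeping.

```python
import time

class LeaseLock:
    """Hypothetical global lock kept by D, with an expiry (lease)."""

    def __init__(self, timeout_s, clock=time.monotonic):
        self.timeout_s = timeout_s
        self.clock = clock
        self.owner = None
        self.expires_at = 0.0

    def acquire(self, client):
        now = self.clock()
        if self.owner is not None and now < self.expires_at:
            return False              # someone else still holds a live lease
        self.owner = client
        self.expires_at = now + self.timeout_s
        return True

    def release(self, client):
        # Only the current owner with an unexpired lease can release.
        if self.owner == client and self.clock() < self.expires_at:
            self.owner = None
            return True
        return False

fake_now = [0.0]                      # controllable clock for the demo
lock = LeaseLock(timeout_s=5.0, clock=lambda: fake_now[0])

got_l1 = lock.acquire("L1")           # L1 takes the lock
got_l2 = lock.acquire("L2")           # refused while L1's lease is live
fake_now[0] = 6.0                     # L1 has overrun the timeout
got_l2_later = lock.acquire("L2")     # lease expired, L2 takes over
l1_release = lock.release("L1")       # L1 no longer owns the lock
```

While L1 holds the lease, every other L is shut out, which is the coarse‑granularity cost the text describes. A real D would also have to reject writes from a client whose lease has already expired.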
Solution 2: Multiversion Concurrency Control
To achieve serializability without the drawbacks of locking, an optimistic MVCC‑based lock‑free transaction mechanism can be used. In MVCC, reads do not block writes and vice versa; conflicts are detected at commit time, improving concurrency performance. A simple MVCC implementation uses Conditional Update (compare‑and‑swap): an update includes a condition set that must be satisfied for the update to succeed, otherwise an error is returned. This leads to a Try → Conditional Update → (Retry) processing pattern.
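The Try → Conditional Update → (Retry) pattern can be sketched as follows. `DataCenter`, `ConflictError`, and `run_transaction` are hypothetical names for illustration. The store is an in‑memory model, and a real D would need to make the check‑and‑apply step atomic.

```python
class ConflictError(Exception):
    """Raised when a condition no longer matches the stored value."""

class DataCenter:
    """Hypothetical in-memory model of D with a conditional multi-key update."""

    def __init__(self, data):
        self._data = dict(data)

    def read(self, keys):
        return {k: self._data.get(k) for k in keys}

    def conditional_update(self, conditions, updates):
        # Apply `updates` only if every key in `conditions` is unchanged.
        for key, expected in conditions.items():
            if self._data.get(key) != expected:
                raise ConflictError(key)
        self._data.update(updates)

def run_transaction(d, keys, process, max_retries=5):
    for attempt in range(1, max_retries + 1):
        snapshot = d.read(keys)                      # Try
        updates = process(snapshot)
        try:
            d.conditional_update(snapshot, updates)  # Conditional Update
            return attempt
        except ConflictError:
            continue                                 # Retry with fresh data
    raise RuntimeError("gave up after %d attempts" % max_retries)

d = DataCenter({"123": 100})
calls = {"n": 0}

def l2_process(snapshot):
    calls["n"] += 1
    if calls["n"] == 1:
        # Simulate L1 committing while L2 is still processing.
        run_transaction(d, ["123"], lambda s: {"123": s["123"] + 1})
    return {"123": snapshot["123"] + 2}

attempts = run_transaction(d, ["123"], l2_process)
final_value = d.read(["123"])["123"]
```

In this run L2's first commit is rejected because L1 changed key 123 in the meantime; its second attempt succeeds, so `attempts` is 2 and `final_value` is 103, the serial result.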
Although an individual L may fail and have to retry, the system as a whole always makes progress, because a conditional update fails only when another L has committed in the meantime. Conditional Update avoids coarse‑grained, long‑held locks and offers good concurrency when resource contention is low. However, the condition set must travel with every update, which adds network overhead, especially when the condition is much larger than the actual update data.
To mitigate condition size, each data item can carry an integer version number maintained by D; L includes the version in the condition instead of the full value.
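A version‑based condition could look like the sketch below. `VersionedStore` and `VersionConflict` are hypothetical names; the assumptions are that a missing key counts as version 0 and that D increments the version on every successful write.

```python
class VersionConflict(Exception):
    """Raised when a key's stored version differs from the expected one."""

class VersionedStore:
    """Hypothetical model of D that keeps an integer version per key."""

    def __init__(self):
        self._items = {}                      # key -> (value, version)

    def read(self, key):
        return self._items.get(key, (None, 0))

    def update(self, expected_versions, updates):
        # The condition carries only small integers, not full values.
        for key, expected in expected_versions.items():
            if self.read(key)[1] != expected:
                raise VersionConflict(key)
        for key, value in updates.items():
            _, version = self.read(key)
            self._items[key] = (value, version + 1)

vs = VersionedStore()
vs.update({"123": 0}, {"123": 100})           # create the key at version 1
value, version = vs.read("123")

vs.update({"123": version}, {"123": value + 1})      # first writer commits
try:
    vs.update({"123": version}, {"123": value + 2})  # second writer is stale
    stale_rejected = False
except VersionConflict:
    stale_rejected = True
```

Besides shrinking the condition, a version check also catches the case where a value was changed and then changed back, which a plain value comparison would miss.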
If D does not support Conditional Update, a proxy P can be inserted between L and D to perform the condition check, caching data to improve performance. This adds a layer, but cache management stays simple because P is the sole client of D, so its cache cannot be invalidated by anyone else. Some key‑value stores already provide equivalent primitives, for example Redis (optimistic check‑and‑set via WATCH with MULTI/EXEC) and Amazon SimpleDB (conditional put and delete).
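The proxy idea can be sketched as follows, assuming a backend that offers only unconditional get and put. `PlainStore` and `Proxy` are hypothetical names, and a single in‑process mutex stands in for whatever serialization a real P would use.

```python
import threading

class PlainStore:
    """Stand-in for a D that offers only unconditional get/put."""

    def __init__(self):
        self.data = {}
        self.gets = 0                 # counts backend reads for the demo

    def get(self, key):
        self.gets += 1
        return self.data.get(key)

    def put(self, key, value):
        self.data[key] = value

class Proxy:
    """Hypothetical proxy P: sole client of D, adds the condition check."""

    def __init__(self, backend):
        self._backend = backend
        self._cache = {}
        self._mutex = threading.Lock()

    def _load(self, key):
        if key not in self._cache:
            self._cache[key] = self._backend.get(key)
        return self._cache[key]

    def read(self, keys):
        with self._mutex:
            return {k: self._load(k) for k in keys}

    def conditional_update(self, conditions, updates):
        with self._mutex:             # check and apply as one step
            if any(self._load(k) != v for k, v in conditions.items()):
                return False
            for k, v in updates.items():
                self._backend.put(k, v)
                self._cache[k] = v
            return True

backend = PlainStore()
backend.put("123", 100)
p = Proxy(backend)
ok_first = p.conditional_update({"123": 100}, {"123": 101})
ok_stale = p.conditional_update({"123": 100}, {"123": 102})
```

The second call is rejected from the cache without touching D, so `backend.gets` stays at 1. This sketch does not address what happens if P fails partway through a multi‑key write, which a real proxy would have to handle.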
Pessimistic Lock vs MVCC Comparison
Typical scenarios:
Low read latency requirement: In systems such as stock trading, pessimistic locks block reads during writes, which degrades read latency, whereas MVCC lets reads proceed without blocking, giving faster and more stable read performance.
Read‑heavy workloads: When reads far outnumber writes, locks force many reads to wait, while MVCC maintains high, stable read concurrency.
Frequent write conflicts: If write conflicts are common, evaluate the cost of retrying. Suppose two conflicting transactions L1 and L2 take t1 and t2 to run, and L1 commits first. With pessimistic locks they execute serially, so total time ≈ t1 + t2. With MVCC, L2 fails at commit and reruns once, so total time ≈ 2 × t2. If the retry is cheap, MVCC is preferable; if it is expensive (e.g., regenerating a costly report), locking may be better.
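The retry trade‑off can be checked with a few lines of arithmetic. The function names and the sample durations are made up for illustration; only the two formulas come from the comparison above.

```python
def lock_total(t1, t2):
    """Pessimistic locking: the two transactions run one after the other."""
    return t1 + t2

def mvcc_total(t2, retries=1):
    """MVCC: L2 runs once, fails at commit, then reruns `retries` times."""
    return (1 + retries) * t2

# Cheap retry: L2 is short, so rerunning it costs little.
cheap_retry = (lock_total(t1=5.0, t2=1.0), mvcc_total(t2=1.0))

# Costly retry: L2 is an expensive job, so rerunning it dominates.
costly_retry = (lock_total(t1=1.0, t2=30.0), mvcc_total(t2=30.0))
```

With these numbers `cheap_retry` is (6.0, 2.0), favoring MVCC, while `costly_retry` is (31.0, 60.0), favoring locks.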
Thus, MVCC suits scenarios demanding high read responsiveness and concurrency, while pessimistic locking is better when retry cost is prohibitive.
Conclusion
This article presented an MVCC‑based Conditional Update solution for concurrency control in distributed systems. It avoids coarse‑grained, long‑held locks and fits scenarios that require fast read responses and high concurrency.
ITFLY8 Architecture Home
