Understanding Throughput, Concurrency, and Lock Contention in System Design
Throughput measures the rate at which an application processes tasks and is distinct from concurrency. It can be improved by reducing per-task latency, increasing parallelism, and optimizing lock usage through finer granularity and lower lock cost, as well as by techniques such as buffering, merging, and batch processing, which mitigate contention and enhance scalability.
What Is Throughput
Throughput refers to the rate at which an application processes tasks, describing how many tasks can be handled per unit of time.
For example, if an application can process 3 tasks in 1 second, its throughput is 3 tps.
Note that throughput is a rate metric; it does not imply concurrency. An application that processes 3 tps may handle the tasks sequentially rather than simultaneously.
If the system has two workers, each capable of processing 3 tasks per second, the overall throughput becomes 6 tps.
Ways to increase throughput include:
Shortening the processing time of each task.
Allowing more tasks to be processed concurrently – increasing parallelism.
For example, if a system whose single worker achieves 3 tps (1/3 second per task) can handle three tasks simultaneously and also halves the per-task processing time, its throughput becomes 3 × 3 × 2 = 18 tps.
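The arithmetic above can be written out directly; the figures are the hypothetical ones used in the text:

```python
# Hypothetical figures from the example: one sequential worker at 3 tps.
base_rate = 3        # tasks per second for one worker (1/3 s per task)
parallelism = 3      # three tasks processed simultaneously
speedup = 2          # per-task processing time is halved

throughput = base_rate * parallelism * speedup
print(throughput)    # 18 tps
```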
Shared Resources and Locks in Parallel Execution
When tasks need to read or modify shared resources—such as inventory counts—concurrent access can cause consistency problems. Two tasks might read the same stock value of 1, both succeed in validation, and after one decrements the stock to 0, the other also decrements, resulting in a negative stock.
Locks are commonly used to isolate operations on shared resources, ensuring data correctness.
Typical lock usage: acquire the lock before accessing the resource, perform the operation, then release the lock so other tasks can proceed.
While other mechanisms (e.g., atomic operations) exist, the goal remains to isolate resource access.
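The acquire-operate-release pattern above can be sketched in Python; the `Inventory` class and its `try_decrement` method are hypothetical names, not from any particular library:

```python
import threading

class Inventory:
    """Minimal sketch: a lock isolates the check-then-decrement on shared stock."""
    def __init__(self, stock: int):
        self.stock = stock
        self._lock = threading.Lock()

    def try_decrement(self) -> bool:
        # Acquire the lock, operate on the shared resource, then release.
        with self._lock:
            if self.stock > 0:      # validation and decrement are now atomic
                self.stock -= 1
                return True
            return False            # sold out; stock never goes negative

inv = Inventory(stock=1)
results = []
threads = [threading.Thread(target=lambda: results.append(inv.try_decrement()))
           for _ in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(inv.stock, sorted(results))   # 0 [False, True]
```

Because validation and decrement happen inside one critical section, exactly one of the two tasks succeeds and stock cannot become negative.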
Lock Contention and Wait Time
Contention creates wait time because tasks must wait for the lock to be released.
This waiting increases latency and reduces the effective parallelism of the system.
Consequences of wait time:
Task processing time is extended, manifesting as higher latency.
More parallel tasks lead to longer wait times, degrading parallel processing capability.
Reducing Lock Cost and Contention
Lowering the Expense of Using Locks
Lock creation, acquisition, release, and destruction incur overhead. Switching from database locks to Redis locks or even in‑memory locks can reduce this cost.
Example: move inventory data to Redis and perform decrements there.
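A rough sketch of the idea, using an in-process counter as a stand-in for Redis (with real Redis, redis-py's `r.decr("stock")` performs the decrement atomically on the server; the `AtomicCounter` class here is a hypothetical illustration):

```python
import threading

class AtomicCounter:
    """Stand-in for Redis's atomic DECR: a single counter guarded by one
    cheap in-process lock instead of a database transaction and row lock."""
    def __init__(self, value: int):
        self._value = value
        self._lock = threading.Lock()

    def decr(self) -> int:
        with self._lock:        # far cheaper than acquiring a DB row lock
            self._value -= 1
            return self._value

stock = AtomicCounter(5)
remaining = stock.decr()
print(remaining)                # 4
```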
Shortening Lock Hold Time
After acquiring a lock, release it as quickly as possible; avoid performing long-running operations, such as HTTP calls, expensive SQL queries, or calls that may time out, while holding the lock.
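The difference can be sketched as follows; `fetch_price` and the function names are hypothetical placeholders for whatever expensive operation is involved:

```python
import threading

lock = threading.Lock()
order_log = []

def place_order_slow(item, fetch_price):
    # Anti-pattern: the slow call runs while the lock is held,
    # blocking every other task for its full duration.
    with lock:
        price = fetch_price(item)       # e.g. an HTTP call or heavy SQL query
        order_log.append((item, price))

def place_order_fast(item, fetch_price):
    # Better: do the expensive work first, then hold the lock only
    # for the short critical section that touches shared state.
    price = fetch_price(item)
    with lock:
        order_log.append((item, price))

place_order_fast("sku-1", lambda item: 9.99)   # hypothetical price lookup
print(order_log)                               # [('sku-1', 9.99)]
```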
Even with shorter hold times, however, increasing parallelism raises the number of tasks competing for the same lock, so wait time may still grow.
Using Finer‑Grained Locks
Employing more granular locks reduces the probability of contention and thus the amount of waiting.
Example: split inventory into multiple buckets (e.g., 10 × 10). Tasks randomly or strategically pick a bucket, reducing the chance that many tasks contend for the same resource.
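A sketch of the bucketing idea, assuming stock can be split freely across buckets (the `BucketedInventory` class is a hypothetical illustration):

```python
import random
import threading

class BucketedInventory:
    """Split total stock across N buckets, each with its own lock,
    so concurrent tasks rarely contend for the same lock."""
    def __init__(self, total: int, buckets: int = 10):
        base, extra = divmod(total, buckets)
        self._stock = [base + (1 if i < extra else 0) for i in range(buckets)]
        self._locks = [threading.Lock() for _ in range(buckets)]

    def try_decrement(self) -> bool:
        start = random.randrange(len(self._stock))
        # Try a random bucket first; fall back to the others if it is empty.
        for i in range(len(self._stock)):
            idx = (start + i) % len(self._stock)
            with self._locks[idx]:
                if self._stock[idx] > 0:
                    self._stock[idx] -= 1
                    return True
        return False

inv = BucketedInventory(total=100, buckets=10)
sold = sum(inv.try_decrement() for _ in range(100))
print(sold, sum(inv._stock))    # 100 0
```

Two tasks now only wait on each other when they happen to pick the same bucket, so with 10 buckets the chance of contention drops roughly tenfold.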
Lowering lock cost, shortening hold time, and using finer granularity all aim to reduce contention and wait time, thereby improving parallel processing capacity.
Buffer‑Merge‑Process (Batching) Technique
Example for inventory decrement:
Enqueue decrement requests.
Periodically dequeue multiple requests and merge them into a single batch operation.
Process the merged batch.
This increases individual task latency because tasks wait in the queue, but it raises overall throughput by trading latency for higher processing rate.
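The three steps above can be sketched with an in-memory queue; the function names and batch size are hypothetical, and a production version would run the drain step on a timer or dedicated consumer:

```python
import queue
import threading

stock = 100
stock_lock = threading.Lock()
requests: "queue.Queue[int]" = queue.Queue()

def enqueue_decrement(amount: int = 1):
    # Step 1: producers only enqueue; no lock contention here.
    requests.put(amount)

def drain_and_apply(max_batch: int = 50):
    # Steps 2-3: periodically merge queued requests into one batch
    # and apply them under a single lock acquisition.
    global stock
    merged = 0
    for _ in range(max_batch):
        try:
            merged += requests.get_nowait()
        except queue.Empty:
            break
    if merged:
        with stock_lock:        # one lock round-trip for the whole batch
            stock -= merged

for _ in range(30):
    enqueue_decrement()
drain_and_apply()
print(stock)                    # 70
```

Thirty decrement requests cost one lock acquisition instead of thirty, which is exactly the latency-for-throughput trade described above.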
Overall Approach
1. Identify contention points.
2. Reduce contention and wait time.
3. Apply Buffer‑Merge‑Process only after the first two steps.
These methods each have costs and side effects and are often combined, but priority should be given to resolving lock contention.
Shared‑resource contention is a serious quality signal; even if its impact seems minor now, it can become a hidden risk.
It makes application performance fragile: reducing lock cost and hold time improves performance, while increased lock overhead or an unexpectedly expensive operation inside a critical section can cause dramatic throughput drops.
Scalability Issues Under Resource Contention
When developing applications, throughput requirements fall into three categories:
Just enough to meet current demand.
More than sufficient: design for a high target (e.g., design for 1000 tps when only 30 tps are needed).
Current demand is met, but the system must be able to scale cost‑effectively to higher throughput in the future.
The third scenario reflects a scalability requirement: the system need not be ultra‑fast now, but it must be able to increase throughput quickly when business needs grow or environments change (e.g., new features, cloud migration).
Thus, addressing lock contention early is crucial for maintaining performance, reducing latency, and ensuring the system can scale smoothly.
Fulu Network R&D Team
Sharing technical articles from Fulu Holdings' engineering team: experience summaries, technology roundups, and innovation notes.