How to Tackle Common Cache Problems in Distributed Systems

This article explores typical cache challenges in distributed systems—including data consistency, high availability, cache avalanche, and cache penetration—explaining their causes, real‑world scenarios, and practical mitigation strategies to ensure reliable and efficient caching.

ITFLY8 Architecture Home

Outline

Data Consistency

Cache High Availability

Cache Avalanche

Cache Penetration

Reference Materials

Summary

Data Consistency

Cache sits in front of persistent storage, keeping hot data close to users for faster access and lower latency.

Because the cache holds a copy of persisted data, the two can drift apart, producing dirty reads or missing data. The usual triggers are network instability and node failures, and the order in which the cache and the database are written determines which kind of inconsistency appears.

2.1 Scenario Introduction

(1) Write cache first, then write database.

If the cache write succeeds but the database write fails or is delayed, concurrent reads served from the cache return data that has not been persisted, that is, dirty data.

(2) Write database first, then write cache.

If the database write succeeds but the cache write fails, subsequent reads from the cache will not find the newly written data.

(3) Asynchronous cache refresh.

This scenario concerns the lag between a data write and the corresponding cache refresh, for example how long the refresh may take before users notice stale data.

2.2 Solutions

Scenario 1: Writing the cache before persistence is incorrect; write to the persistent store first, then update the cache.

Scenario 2:

Roll back the database write if the cache write fails (this adds complexity and is not recommended).

If reading the cache fails, read from the database and then write back to the cache.
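The write order from Scenario 1 and the read fallback from Scenario 2 can be sketched roughly as follows. This is a minimal illustration, not a production design: the `Store` class, its in-memory dictionaries, and the `cache_write_fails` flag are all hypothetical stand-ins for a real database client, cache client, and failure condition.

```python
from typing import Dict, Optional


class Store:
    """Toy model: write the database first, then the cache; reads fall back."""

    def __init__(self) -> None:
        self.db: Dict[str, str] = {}     # stands in for the persistent store
        self.cache: Dict[str, str] = {}  # stands in for the cache tier

    def write(self, key: str, value: str, cache_write_fails: bool = False) -> None:
        # Persist first, so the cache never holds data the database lacks.
        self.db[key] = value
        if not cache_write_fails:
            self.cache[key] = value
        # On a (simulated) cache failure nothing is rolled back; the next
        # read repairs the cache instead.

    def read(self, key: str) -> Optional[str]:
        value = self.cache.get(key)
        if value is not None:
            return value
        # Cache miss: read from the database and write the result back.
        value = self.db.get(key)
        if value is not None:
            self.cache[key] = value
        return value
```

In this sketch a failed cache write costs one extra database read on the next lookup, which is the trade-off that makes the rollback option unnecessary.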

Scenario 3:

Identify which data suits asynchronous refresh.

Determine an acceptable inconsistency window based on experience and user‑visible refresh intervals.

2.3 Other Methods

Set reasonable expiration (timeout) values so that stale entries age out of the cache on their own.

Periodically refresh data within a defined range (by time or version).
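As a rough sketch of the timeout idea, the following cache attaches an expiration time to each entry and treats expired entries as misses. The `TTLCache` name and the injectable `clock` parameter are assumptions made for illustration; the clock is injectable only so the behavior can be exercised without waiting.

```python
import time
from typing import Any, Callable, Dict, Optional, Tuple


class TTLCache:
    """In-memory cache whose entries expire ttl_seconds after being set."""

    def __init__(self, ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.ttl = ttl_seconds
        self.clock = clock
        self._entries: Dict[str, Tuple[Any, float]] = {}  # key -> (value, expiry)

    def set(self, key: str, value: Any) -> None:
        self._entries[key] = (value, self.clock() + self.ttl)

    def get(self, key: str) -> Optional[Any]:
        entry = self._entries.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self.clock() >= expires_at:
            # Expired: drop the entry so the caller reloads from the database.
            del self._entries[key]
            return None
        return value
```

The TTL bounds how long a stale value can survive, so it should be chosen to match the inconsistency window the business can tolerate.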

In practice, consistency concerns appear at three levels: between cache and database, among multi‑level caches, and among cache replicas.

Cache High Availability

Industry opinions differ: some view cache as a temporary store that need not be highly available, while others treat it as a critical storage layer requiring high availability.

Whether cache must be highly available depends on the impact on the backend database.

Decision factors include cluster size, cost, and system performance metrics such as concurrency, throughput, and response time.

3.1 Solutions

High availability is typically achieved through distribution and replication. Distributed caching provides massive capacity; replication ensures node‑level availability.

Distribution often uses consistent hashing; replication can be asynchronous.
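A minimal sketch of consistent hashing is shown below. The `HashRing` class, the use of SHA-256, and the default of 100 virtual points per node are illustrative assumptions, not a description of any particular cache client.

```python
import bisect
import hashlib
from typing import Iterable, List, Tuple


def _hash(value: str) -> int:
    """Map a string to a large integer position on the ring."""
    return int(hashlib.sha256(value.encode("utf-8")).hexdigest(), 16)


class HashRing:
    """Consistent-hash ring; each node owns several points on the ring."""

    def __init__(self, nodes: Iterable[str], vnodes: int = 100) -> None:
        self.vnodes = vnodes
        self._ring: List[Tuple[int, str]] = []  # sorted (position, node) pairs
        for node in nodes:
            self.add_node(node)

    def add_node(self, node: str) -> None:
        for i in range(self.vnodes):
            bisect.insort(self._ring, (_hash(f"{node}#{i}"), node))

    def remove_node(self, node: str) -> None:
        self._ring = [(pos, n) for pos, n in self._ring if n != node]

    def get_node(self, key: str) -> str:
        if not self._ring:
            raise LookupError("hash ring is empty")
        # First point clockwise from the key's position, wrapping at the end.
        index = bisect.bisect_left(self._ring, (_hash(key), ""))
        if index == len(self._ring):
            index = 0
        return self._ring[index][1]
```

When a node is removed, only the keys that node owned move to other nodes; keys owned by the surviving nodes keep their placement, which limits the number of cache misses caused by a failure.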

3.2 Other Methods

Dual‑write replication: both replicas must succeed before the operation is considered successful.

Virtual layer: add a layer of virtual nodes in front of the hash ring, so that the keys of a failed node are spread across the remaining nodes instead of skewing onto a single neighbor.

Multi‑level caching: e.g., local cache → distributed cache → distributed cache with local persistence.

Choose the approach based on specific business scenarios.
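As one illustration of the multi-level option listed above, the sketch below checks a local cache, then a distributed cache, and only then the backing store. The `TwoLevelCache` class and its `loader` callback are hypothetical; plain dictionaries stand in for both cache tiers.

```python
from typing import Callable, Dict, Optional


class TwoLevelCache:
    """Lookup order: local cache, then distributed cache, then the loader."""

    def __init__(self, loader: Callable[[str], Optional[str]]) -> None:
        self.local: Dict[str, str] = {}   # stands in for an in-process cache
        self.remote: Dict[str, str] = {}  # stands in for a distributed cache
        self.loader = loader              # stands in for a database query

    def get(self, key: str) -> Optional[str]:
        if key in self.local:
            return self.local[key]
        if key in self.remote:
            # Promote the value so the next lookup stays in-process.
            self.local[key] = self.remote[key]
            return self.local[key]
        value = self.loader(key)
        if value is not None:
            self.remote[key] = value
            self.local[key] = value
        return value
```

If the local tier is lost, for example after a process restart, the distributed tier still shields the database, which is the availability benefit of layering.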

Cache Avalanche

An avalanche occurs when many cache entries expire simultaneously, flooding the database with requests and potentially overwhelming it.

Mitigation strategies include:

Plan cache expiration times so that large groups of entries do not all expire at the same moment.

Assess database load capacity.

Implement overload protection or rate limiting at the application layer.

Design multi‑level caches to improve availability.
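One common way to plan expiration times is to add random jitter to each entry's TTL. The helper below is a sketch under that assumption; the function name, the 20% default spread, and the optional `rng` parameter are illustrative choices rather than anything prescribed by the article.

```python
import random
from typing import Optional


def jittered_ttl(base_seconds: float, spread: float = 0.2,
                 rng: Optional[random.Random] = None) -> float:
    """Return base_seconds adjusted by a random factor within +/- spread."""
    if not 0.0 <= spread < 1.0:
        raise ValueError("spread must be in [0, 1)")
    source = rng if rng is not None else random
    # Entries written together get different lifetimes, so they do not
    # all expire (and hit the database) in the same instant.
    return base_seconds * (1.0 + source.uniform(-spread, spread))
```

With a base of 600 seconds and the default spread, each entry receives a TTL of roughly 480 to 720 seconds, which spreads reloads over about four minutes instead of a single instant.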

Cache Penetration

When a non‑existent key is repeatedly queried, each miss hits the database, causing unnecessary load.

Solutions:

Cache empty results temporarily and purge them when data becomes available.

Use a Bloom filter or bitmap to pre‑filter keys that are known to be absent.
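Both solutions can be combined, as in the sketch below. The `BloomFilter` class, its sizing defaults, and the `store` and `lookup` helpers are hypothetical; dictionaries stand in for the cache and the database, and the expiration that a real cache would put on empty results is omitted for brevity.

```python
import hashlib
from typing import Dict, Iterator, Optional


class BloomFilter:
    """Small Bloom filter: no false negatives, occasional false positives."""

    def __init__(self, size_bits: int = 8192, num_hashes: int = 4) -> None:
        self.size = size_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((size_bits + 7) // 8)

    def _positions(self, key: str) -> Iterator[int]:
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{key}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.size

    def add(self, key: str) -> None:
        for pos in self._positions(key):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, key: str) -> bool:
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(key))


def store(key: str, value: str, bloom: BloomFilter,
          cache: Dict[str, Optional[str]], db: Dict[str, str]) -> None:
    db[key] = value
    bloom.add(key)
    cache.pop(key, None)  # purge any cached empty result for this key


def lookup(key: str, bloom: BloomFilter,
           cache: Dict[str, Optional[str]], db: Dict[str, str]) -> Optional[str]:
    if not bloom.might_contain(key):
        return None  # definitely absent: skip both cache and database
    if key in cache:
        return cache[key]  # may be a cached empty result (None)
    value = db.get(key)
    cache[key] = value  # cache None too, so repeated misses stop here
    return value
```

The filter rejects most keys that were never written, while the cached empty results absorb the false positives that slip through.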

Reference Materials

MemCache detailed analysis: http://www.mamicode.com/info-detail-1120932.html

Cache‑database consistency guarantees: http://www.36dsj.com/archives/43950

Hash ring and virtual nodes: http://www.111cn.net/sys/linux/58748.htm

Making memcached distributed: http://blog.csdn.net/cutesource/article/details/5848253

Summary

This article covered four common cache issues in distributed systems (data consistency, high availability, cache avalanche, and cache penetration) and outlined practical techniques for addressing each.
