
Understanding Caching: Types, Use‑Cases, and Implementation in Backend Systems

This article explains the concept of caching, contrasts it with buffering, surveys common cache types — memory, SSD, static, distributed, and local — discusses their advantages and limitations, and walks through a Java Guava Cache example for practical backend performance optimization.


In large‑scale e‑commerce systems, after database sharding and master‑slave separation, disk I/O often becomes the bottleneck, so a faster component—cache—is introduced to reduce response time and improve overall performance.

What Is a Cache

A cache is a component that stores data to accelerate read requests. It is usually placed in memory, though SSDs can be used for cold data (e.g., Pika). Any structure that bridges a speed gap between two hardware layers can be considered a cache.

Typical latency: a memory access takes on the order of 100 ns, while a disk seek takes ~10 ms — roughly a 100,000× gap — so memory‑based caches can improve read performance by several orders of magnitude.

Typical Cache Scenarios

The operating system's TLB caches recent virtual‑to‑physical address translations to avoid costly page‑table walks.

Video streaming apps cache portions of upcoming videos to achieve near‑instant start‑up and smooth playback.

Web browsers use HTTP caching (ETag/If‑None‑Match) to avoid re‑downloading unchanged resources, reducing network traffic and page load time.
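The server side of that conditional-request handshake can be sketched in a few lines. This is a simplified illustration, not a real HTTP server: the class name `ConditionalGet`, the `currentEtags` map, and the `respond` method are hypothetical, and a plain Map stands in for resource storage.

```java
import java.util.HashMap;
import java.util.Map;

// Sketch of server-side conditional-GET logic: if the client's If-None-Match
// header equals the resource's current ETag, answer 304 and skip the body;
// otherwise answer 200 and send the full resource again.
public class ConditionalGet {
    // Maps resource path -> its current ETag (typically a content hash).
    public static final Map<String, String> currentEtags = new HashMap<>();

    /** Returns the HTTP status the server would send: 304 (reuse cache) or 200. */
    public static int respond(String path, String ifNoneMatch) {
        String etag = currentEtags.get(path);
        if (etag != null && etag.equals(ifNoneMatch)) {
            return 304; // unchanged: the browser reuses its cached copy
        }
        return 200;     // changed or unknown: the body is downloaded again
    }
}
```

A 304 response carries no body, which is exactly the traffic saving the browser cache is after.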

Cache vs. Buffer

A cache speeds up repeated reads and avoids repeating expensive computations, while a buffer is a temporary staging area that smooths data flow between a fast producer and a slow consumer (e.g., kernel dirty‑page buffers that batch writes before flushing them to disk).
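The buffering side of that distinction can be demonstrated with the standard library's `BufferedOutputStream`: many tiny writes are absorbed in memory and handed to the slow device in one large batch. The `CountingStream` wrapper below is a hypothetical helper added only to count how many writes actually reach the "device".

```java
import java.io.BufferedOutputStream;
import java.io.ByteArrayOutputStream;
import java.io.FilterOutputStream;
import java.io.IOException;
import java.io.OutputStream;

public class BufferDemo {
    /** Counts how many bulk writes actually reach the underlying "device". */
    public static class CountingStream extends FilterOutputStream {
        public int deviceWrites = 0;
        public CountingStream(OutputStream out) { super(out); }
        @Override public void write(byte[] b, int off, int len) throws IOException {
            deviceWrites++;
            out.write(b, off, len);
        }
    }

    /** Performs smallWrites one-byte writes through an 8 KiB buffer. */
    public static int deviceWrites(int smallWrites) {
        try {
            CountingStream device = new CountingStream(new ByteArrayOutputStream());
            try (BufferedOutputStream buffer = new BufferedOutputStream(device, 8192)) {
                for (int i = 0; i < smallWrites; i++) {
                    buffer.write(new byte[]{1}); // absorbed by the in-memory buffer
                }
            } // close() flushes the accumulated bytes to the device in one batch
            return device.deviceWrites;
        } catch (IOException e) {
            throw new RuntimeException(e);
        }
    }
}
```

A thousand one-byte writes reach the device as a single flush — the buffer smooths the flow but, unlike a cache, does nothing to speed up reads.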

Cache Classification

Static Cache

In the Web 1.0 era, pages were pre‑rendered to static HTML (often from Velocity templates) and served directly by Nginx or Squid, dramatically reducing backend load for read‑heavy sites.

Distributed Cache

Memcached and Redis are classic distributed caches; clustering them removes single‑node limits and makes them essential for high‑traffic dynamic requests.

Hot‑spot local caches are used when a particular piece of data receives extreme query volume; they reside in the application process to shield both the distributed cache and the database.
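The shielding read path can be sketched as a two-level lookup: local cache first, distributed cache second, database last. This is a minimal illustration under stated assumptions — `HotSpotLookup` is a hypothetical class, and plain Maps stand in for a real Redis client and database.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hot-spot read path: every hit at an earlier layer shields all layers below it.
public class HotSpotLookup {
    public final Map<String, String> localCache = new ConcurrentHashMap<>();
    public final Map<String, String> distributedCache; // stand-in for a Redis client
    public final Map<String, String> database;         // stand-in for the database
    public int dbReads = 0;

    public HotSpotLookup(Map<String, String> remote, Map<String, String> db) {
        this.distributedCache = remote;
        this.database = db;
    }

    public String get(String key) {
        String v = localCache.get(key);
        if (v != null) return v;                // local hit: no network round trip
        v = distributedCache.get(key);
        if (v == null) {                        // miss in both caches: hit the DB
            dbReads++;
            v = database.get(key);
            if (v != null) distributedCache.put(key, v);
        }
        if (v != null) localCache.put(key, v);  // warm the local layer for next time
        return v;
    }
}
```

After the first request for a hot key, all subsequent reads are served from the application process itself.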

Local Cache

Examples include HashMap, Guava Cache, or Ehcache. They run in the same JVM, offering nanosecond‑level access. A typical use case is caching product recommendation data for 30 seconds and refreshing it from the database periodically.

import java.util.List;
import java.util.concurrent.TimeUnit;

import com.google.common.cache.CacheBuilder;
import com.google.common.cache.CacheLoader;
import com.google.common.cache.LoadingCache;

// Build a LoadingCache that caps its size, records hit/miss statistics,
// and refreshes entries 30 seconds after they were written.
LoadingCache<String, List<Product>> cache = CacheBuilder.newBuilder()
        .maximumSize(maxSize)
        .recordStats()
        .refreshAfterWrite(30, TimeUnit.SECONDS)
        .build(new CacheLoader<String, List<Product>>() {
            @Override
            public List<Product> load(String key) throws Exception {
                return productService.loadAll(); // reload all products from the database
            }
        });

When fetching product data, the application first queries the local cache; if a miss occurs, the loader pulls data from the database.
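That miss-then-load behavior can be reproduced without the Guava dependency. The sketch below is a minimal stdlib stand-in for `refreshAfterWrite` under simplified assumptions (reload happens on the reading thread, not asynchronously as Guava does); the class name `TtlCache` and the `loads` counter are hypothetical.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Each entry remembers when it was loaded; a read past the TTL reloads it
// from the loader function, which stands in for the database query.
public class TtlCache<V> {
    private static final class Entry<V> {
        final V value; final long loadedAt;
        Entry(V v, long t) { value = v; loadedAt = t; }
    }
    private final Map<String, Entry<V>> entries = new ConcurrentHashMap<>();
    private final Function<String, V> loader; // stand-in for the DB query
    private final long ttlMillis;
    public int loads = 0;                     // how often the "database" was hit

    public TtlCache(Function<String, V> loader, long ttlMillis) {
        this.loader = loader;
        this.ttlMillis = ttlMillis;
    }

    public V get(String key) {
        Entry<V> e = entries.get(key);
        long now = System.currentTimeMillis();
        if (e == null || now - e.loadedAt >= ttlMillis) { // miss or stale entry
            loads++;
            e = new Entry<>(loader.apply(key), now);
            entries.put(key, e);
        }
        return e.value;
    }
}
```

Within the TTL window every read is served from memory; only the first read (or the first after expiry) touches the loader.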

Cache Limitations

Effective mainly for read‑heavy, hotspot‑prone data; unsuitable for write‑heavy or uniformly random access patterns.

Introduces system complexity and risk of stale data; short TTLs or explicit invalidation are required.

Memory is finite and costly; large‑scale caching must be carefully sized.

Operational overhead: teams need expertise to monitor, troubleshoot, and maintain cache clusters.

Summary

Caching can be layered: static cache at the load‑balancer level, distributed cache between application and database layers, and local cache inside the application process. The goal is to intercept requests as early as possible because deeper layers handle less concurrency. Cache hit rate is the most critical metric, and caching—whether via faster media or by storing computed results—should be the first consideration when confronting performance bottlenecks.

Written by

Architect's Guide

Dedicated to sharing programmer-architect skills—Java backend, system, microservice, and distributed architectures—to help you become a senior architect.
