How Nginx proxy_cache Can Reduce Response Time from 800 ms to Sub‑5 ms
This guide explains how to use Nginx's proxy_cache to dramatically cut page latency by caching repeated responses, covering background, cache architecture, configuration steps, best‑practice key design, common pitfalls, security tips, performance tuning, monitoring, and backup strategies for high‑traffic web services.
Overview
In early 2024 a PHP‑rendered content platform suffered ~800 ms average response time and frequent backend alerts during traffic spikes. Adding more application servers did not help because the bottleneck was in database queries and page rendering. Adding proxy_cache at the Nginx layer reduced response time to <5 ms and cut backend request volume by about 90 %.
What is Nginx proxy_cache?
proxy_cache is a reverse‑proxy cache built into the standard Nginx distribution. On the first request Nginx forwards the request to the upstream and stores the full HTTP response on local disk (or in memory); subsequent identical requests are served directly from the cache, eliminating the backend round‑trip. It is ideal for read‑heavy, write‑light workloads such as news sites, product detail pages, static assets, and API data.
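In its simplest form only a handful of directives are involved; a minimal sketch, assuming an upstream group named backend (the full production configuration follows later):
# Minimal illustration: one cache zone, one cached location
http {
    proxy_cache_path /var/cache/nginx levels=1:2 keys_zone=demo:10m max_size=1g;
    server {
        listen 80;
        location / {
            proxy_cache demo;            # look up responses in the zone
            proxy_cache_valid 200 10m;   # keep 200 responses for 10 minutes
            proxy_pass http://backend;   # only cache misses reach the upstream
        }
    }
}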
Key technical advantages
High performance: local storage gives millisecond‑level latency.
Flexible configuration: caching can be controlled by URL, header, cookie, etc.
Bandwidth saving: supports conditional requests (304 Not Modified).
Hierarchical storage: memory + disk layers keep hot data in RAM and long‑tail data on disk.
Cache validation: optional backend revalidation to keep data fresh.
Native support: part of the standard Nginx distribution, no extra modules required.
Typical use cases
Static pages – long‑term cache, response <1 ms.
API endpoints – short‑term cache, up to 80 % fewer backend calls.
Images / videos – long‑term cache, saves bandwidth and CPU.
Product detail pages – balanced freshness and performance.
Search results – parameterized cache for instant repeat queries.
User feeds – cache disabled or private to preserve personalization.
Environment requirements
Nginx 1.26.x / 1.27.x (stable or mainline)
Rocky Linux 9 or Ubuntu 24.04 LTS (examples use Rocky 9)
SSD strongly recommended for cache I/O
At least 8 GB RAM; hot cache can be mounted on tmpfs
ngx_cache_purge optional for active purge
Implementation steps
1. Preparation
Create cache directories
# Create main cache directory
mkdir -p /var/cache/nginx/proxy_cache
# Optional memory cache for hot data
mkdir -p /var/cache/nginx/memory_cache
mount -t tmpfs -o size=2G tmpfs /var/cache/nginx/memory_cache
# Persist tmpfs entry (optional)
echo "tmpfs /var/cache/nginx/memory_cache tmpfs size=2G,mode=0755 0 0" >> /etc/fstab
# Set ownership and permissions
chown -R nginx:nginx /var/cache/nginx
chmod -R 755 /var/cache/nginx
Check disk performance
# Test write speed (SSD >200 MB/s, HDD >50 MB/s)
dd if=/dev/zero of=/var/cache/nginx/test.bin bs=1M count=1024 conv=fdatasync
rm -f /var/cache/nginx/test.bin
Plan cache capacity
# Example: 50 KB average response × 100 k URLs ≈ 5 GB
# Recommended settings
# max_size slightly larger than expected usage, e.g. 8 GB
# inactive 1‑7 days depending on update frequency
# keys_zone 1 MB ≈ 8 000 keys
2. Core configuration
Define cache zones (add to the http block)
# /etc/nginx/nginx.conf (excerpt)
http {
# General cache zones
proxy_cache_path /var/cache/nginx/proxy_cache \
levels=1:2 keys_zone=my_cache:100m max_size=10g inactive=7d use_temp_path=off;
# Memory cache for hot data
proxy_cache_path /var/cache/nginx/memory_cache \
levels=1:2 keys_zone=hot_cache:50m max_size=2g inactive=1h use_temp_path=off;
# API‑specific short‑lived cache
proxy_cache_path /var/cache/nginx/api_cache \
levels=1:2 keys_zone=api_cache:50m max_size=5g inactive=10m use_temp_path=off;
# Global cache settings
proxy_cache_lock on;
proxy_cache_lock_timeout 5s;
proxy_cache_use_stale error timeout updating http_500 http_502 http_503 http_504;
proxy_cache_background_update on;
proxy_cache_revalidate on;
}
Cache‑key design principles
# Include all factors that affect the response
proxy_cache_key "$host$uri$arg_page$arg_pageSize";
# Exclude volatile parameters (e.g., timestamps)
proxy_cache_key "$host$uri$arg_id$arg_type";
Cache rules (example server block)
# /etc/nginx/conf.d/cache.example.com.conf
proxy_cache_key "$scheme$host$request_uri";
proxy_cache_valid 200 301 302 1h;
proxy_cache_valid 404 1m;
proxy_cache_valid any 10s;
server {
listen 80;
server_name example.com;
# Enable cache for the main location
proxy_cache my_cache;
proxy_cache_lock on;
proxy_cache_min_uses 2; # cache after 2 requests
proxy_cache_use_stale error timeout updating;
proxy_cache_background_update on;
add_header X-Cache-Status $upstream_cache_status always;
location / {
proxy_pass http://backend;
}
# API with short TTL
location /api/ {
proxy_cache api_cache;
proxy_cache_valid 200 1m;
proxy_cache_bypass $http_authorization;
proxy_no_cache $http_authorization;
proxy_pass http://api_backend;
}
# Static assets – long TTL
location ~* \.(jpg|jpeg|png|gif|css|js|woff2|woff|ttf)$ {
proxy_cache my_cache;
proxy_cache_valid 200 30d;
expires 30d;
proxy_pass http://static_backend;
}
# Cache purge endpoint (requires ngx_cache_purge)
location ~ /purge(/.*) {
allow 127.0.0.1;
allow 10.0.0.0/8;
deny all;
proxy_cache_purge my_cache "$scheme$host$1";
}
}
3. Startup and verification
Configuration check
# Test syntax
nginx -t
# Show cache‑related directives
nginx -T | grep -E "proxy_cache|proxy_cache_path"
Reload
# Graceful reload
nginx -s reload
# Verify cache directories exist
ls -la /var/cache/nginx/
Verify cache behavior
# First request – expect MISS
curl -I http://example.com/api/
# Subsequent request – expect HIT (after min_uses)
curl -I http://example.com/api/
# Observe X-Cache-Status header (MISS / HIT / EXPIRED / STALE)
Performance test
# Without cache (note: the client's no-cache header only bypasses the cache
# when the config sets proxy_cache_bypass $http_cache_control or similar)
ab -n 1000 -c 50 -H "Cache-Control: no-cache" http://example.com/api/
# With cache
ab -n 1000 -c 50 http://example.com/api/
# Compare requests‑per‑second and average latency; cache‑hit QPS is typically 10‑100× higher.
Real‑world examples
News site
Home page updates frequently – cache for 1 minute with background update. Article pages rarely change – cache for 1 hour. Images are cached permanently.
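A sketch of how those TTLs could map onto locations, reusing the my_cache zone defined earlier (news_backend is an illustrative upstream name):
# News site sketch: per-location TTLs
location = / {
    proxy_cache my_cache;
    proxy_cache_valid 200 1m;           # home page: 1 minute
    proxy_cache_background_update on;   # refresh asynchronously
    proxy_cache_use_stale updating;     # serve stale while refreshing
    proxy_pass http://news_backend;
}
location /articles/ {
    proxy_cache my_cache;
    proxy_cache_valid 200 1h;           # article pages: 1 hour
    proxy_pass http://news_backend;
}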
E‑commerce
Product detail pages cached for 5 minutes. Real‑time price/stock API cached for 10 seconds. Search results cached for 5 minutes with min_uses 2 to avoid cache pollution.
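The same idea for the e‑commerce case (paths and the shop_backend upstream are illustrative):
# E-commerce sketch: freshness tiers per endpoint
location /product/ {
    proxy_cache my_cache;
    proxy_cache_valid 200 5m;                    # product detail: 5 minutes
    proxy_pass http://shop_backend;
}
location /api/stock/ {
    proxy_cache api_cache;
    proxy_cache_valid 200 10s;                   # price/stock: 10 seconds
    proxy_pass http://shop_backend;
}
location /search {
    proxy_cache my_cache;
    proxy_cache_key "$host$uri$arg_q$arg_page";  # parameterized key
    proxy_cache_valid 200 5m;
    proxy_cache_min_uses 2;                      # skip one-off queries
    proxy_pass http://shop_backend;
}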
API gateway
Cache only GET/HEAD requests; bypass when Authorization header is present. TTL can be derived from upstream Cache‑Control header.
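A sketch of that policy (api_backend as defined earlier); with proxy_cache_valid omitted, Nginx's default of honoring the upstream's Cache‑Control/Expires headers provides the header‑derived TTL:
# API gateway sketch: GET/HEAD only, credentialed requests bypass
location /api/ {
    proxy_cache api_cache;
    proxy_cache_methods GET HEAD;             # never cache other methods
    proxy_cache_bypass $http_authorization;   # skip lookup when authorized
    proxy_no_cache $http_authorization;       # and never store the response
    proxy_pass http://api_backend;
}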
Best practices & pitfalls
Cache‑key design
Include every variable that changes the response (e.g., pagination, sorting).
Exclude volatile parameters such as timestamps or cache‑busting query strings.
Keep the key length reasonable; Nginx stores an MD5 hash (32 chars).
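To see exactly where a key lands on disk, the hash is easy to reproduce (a sketch assuming the "$scheme$host$request_uri" key and the levels=1:2 zone from earlier):
# Reproduce the on-disk path for one cache key
key='httpexample.com/api/?id=1'               # "$scheme$host$request_uri"
hash=$(printf '%s' "$key" | md5sum | awk '{print $1}')
# levels=1:2 → last hex char, then the two characters before it
file="/var/cache/nginx/proxy_cache/${hash: -1}/${hash: -3:2}/${hash}"
echo "$file"
grep -a -m1 '^KEY:' "$file"   # cache files embed their key in a text header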
Update strategies
TTL expiration with automatic background refresh.
Use proxy_cache_use_stale updating to serve stale content while a fresh copy is fetched.
Active purge via ngx_cache_purge for immediate invalidation (see the example after this list).
Respect upstream Cache‑Control and ETag with proxy_cache_revalidate on.
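For example, with the purge location defined earlier, invalidating one page from a trusted host is a single request (the path is illustrative):
# Active purge of one cached URL via ngx_cache_purge
curl -s -H 'Host: example.com' http://127.0.0.1/purge/articles/123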
Tiered cache architecture
Level 1: memory cache for hot data (hot_cache).
Level 2: SSD cache for warm data.
Level 3: HDD for cold data.
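Nginx does not promote entries between tiers automatically, so the split is usually expressed per location, pointing the hottest paths at the tmpfs‑backed zone (a sketch using the zones defined earlier):
# Tiered sketch: hot path → RAM zone, everything else → SSD zone
location = / {
    proxy_cache hot_cache;       # tmpfs-backed, inactive=1h
    proxy_cache_valid 200 1m;
    proxy_pass http://backend;
}
location / {
    proxy_cache my_cache;        # SSD-backed, inactive=7d
    proxy_pass http://backend;
}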
CDN integration
When Nginx sits behind a CDN, set appropriate Cache‑Control and Vary headers so the CDN respects the same TTLs.
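A sketch of aligning the edge with the local TTLs (the max-age mirrors the 30‑day static TTL above):
# Tell the CDN to cache static assets as long as Nginx does
location ~* \.(jpg|jpeg|png|gif|css|js)$ {
    proxy_cache my_cache;
    proxy_cache_valid 200 30d;
    add_header Cache-Control "public, max-age=2592000";   # 30 days
    add_header Vary "Accept-Encoding";   # keep compressed variants separate
    proxy_pass http://static_backend;
}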
Common errors & fixes
Cache always MISS: the backend sends Set‑Cookie or Cache‑Control: private. Add proxy_ignore_headers Set-Cookie or proxy_ignore_headers Cache-Control (see the snippet after this list).
Cache content mixing: poorly designed key. Ensure the key includes all distinguishing parameters.
User data leakage: user‑specific endpoints were cached. Disable caching for them or include the user ID in the key.
Cache directory fills up: tune max_size and inactive, or add a cleanup script.
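The "always MISS" fix in context; when ignoring Set‑Cookie you should usually hide it as well, so one user's cookie is not replayed to everyone (sketch, only for responses you are sure are shareable):
# Cache despite cookie/cache-suppressing upstream headers
location / {
    proxy_cache my_cache;
    proxy_ignore_headers Set-Cookie Cache-Control Expires;
    proxy_hide_header Set-Cookie;   # don't serve the stored cookie to others
    proxy_pass http://backend;
}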
Security considerations
Never cache responses that contain Authorization headers or sensitive cookies (see the map sketch after this list).
Restrict purge endpoint to trusted IPs.
Avoid caching pages that embed personal data.
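One way to enforce the first two rules mechanically is a map that flags credentialed requests (a sketch; sessionid is a hypothetical cookie name):
# http context: flag requests that carry credentials
map "$http_authorization$cookie_sessionid" $skip_cache {
    ""      0;   # anonymous request: cacheable
    default 1;   # Authorization header or session cookie present
}
# location context: skip both the lookup and the store
proxy_cache_bypass $skip_cache;
proxy_no_cache     $skip_cache;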
Performance tuning
Enable sendfile, tcp_nopush, tcp_nodelay for static files.
Use aio threads and directio for large files.
Configure open_file_cache to reduce file‑open overhead.
Allocate sufficient keys_zone memory (e.g., 1 MB ≈ 8 000 keys).
Mount hot cache on tmpfs for ultra‑low latency.
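The directives above in one place (values are illustrative starting points, not benchmarks):
# http or server context
sendfile on;
tcp_nopush on;
tcp_nodelay on;
aio threads;                  # offload blocking reads to a thread pool
directio 8m;                  # files larger than 8 MB bypass the page cache
open_file_cache max=10000 inactive=30s;
open_file_cache_valid 60s;
open_file_cache_min_uses 2;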
Monitoring & alerting
Custom log format to capture cache status:
log_format cache_log '$remote_addr - [$time_local] "$request" $status $body_bytes_sent "$http_referer" "$http_user_agent" cache:$upstream_cache_status rt:$request_time upt:$upstream_response_time';
Parse logs to compute hit/miss ratios.
Expose metrics to Prometheus (e.g., nginx_cache_hit_total, nginx_cache_miss_total) and alert when hit rate < 70 %.
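Before wiring up Prometheus, a one‑liner gives the ratio straight from the cache_log format above (a sketch; the access‑log path is an assumption):
# Count HIT/MISS/EXPIRED/... and print each status share
awk -F'cache:' '/cache:/ {split($2, a, " "); s[a[1]]++; n++}
    END {for (k in s) printf "%-8s %6d (%.1f%%)\n", k, s[k], 100*s[k]/n}' \
    /var/log/nginx/access.log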
Backup, warm‑up & migration
Warm‑up script reads a list of URLs and pre‑populates the cache (a sketch follows this list).
Cleanup script can purge all cache, old files, or trim to a target size.
When moving to a new server, rsync the cache directory but re‑warm because cache files contain host‑specific metadata.
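A minimal version of such a warm‑up script (a sketch; urls.txt is a hypothetical one‑URL‑per‑line file):
#!/usr/bin/env bash
# Pre-populate the cache by requesting each URL once
while IFS= read -r url; do
    # -s silent, -o discard body; the side effect is a cache fill
    curl -s -o /dev/null -w '%{http_code} %{url_effective}\n' "$url"
done < urls.txt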
Key takeaways
Define cache zones with appropriate levels, keys_zone, max_size, and inactive values.
Design cache keys that fully represent the request while staying concise.
Set TTLs per content type: static assets long, dynamic API short.
Use background update and stale‑serve to keep services available during backend failures.
Apply security filters to prevent caching of private or sensitive data.
Monitor cache health with logs, custom scripts, or Prometheus metrics.
Plan tiered storage (memory → SSD → HDD) for cost‑effective performance.
Further reading
Nginx official documentation – ngx_http_proxy_module
Nginx blog – “A Guide to Caching with NGINX”
MDN Web Docs – HTTP Caching
RFC 7234 – HTTP Caching