Unlock Docker Caching: Layer Strategies, BuildKit & Cache Optimization

This guide explains Docker's layered architecture, how to leverage build cache, cache mounts, external cache solutions, multi‑stage builds, RUN instruction optimization, .dockerignore usage, cache busting, custom cache paths, BuildKit features, and Docker Compose layer caching, providing best‑practice tips and code examples for faster, smaller images.

Code Mala Tang

Dec 20, 2024

1. Understanding Docker's Layered Architecture

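Docker images are built from stacked read-only layers. Instructions that change the filesystem (RUN, COPY, ADD) each add a layer, while instructions such as CMD or ENV only record metadata. Unchanged layers can be reused from previous builds, which speeds up builds considerably. Once one instruction's layer is invalidated, however, every instruction after it is rebuilt as well, which is why instruction order matters.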

How to Use Docker's Layered Architecture

Place the most stable and least‑changing instructions at the top of the Dockerfile so early layers remain unchanged and can be cached in future builds.

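FROM python:3.8
RUN apt-get update && apt-get install -y build-essential libssl-dev
# Copy only the dependency manifest first so the install layer stays cached
COPY requirements.txt /app/requirements.txt
RUN pip install -r /app/requirements.txt
# Application code changes most often, so it is copied last
COPY . /app
CMD ["python", "/app/main.py"]

In this example a change to the application code invalidates only the final COPY layer; Docker reuses the base image, the system packages, and the installed Python dependencies. The dependency layers are rebuilt only when requirements.txt changes.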

When to Use Docker's Layered Architecture

Use it for images that change frequently but have stable base dependencies; it is especially beneficial in development environments and CI/CD pipelines where fast builds are critical.

Best Practices for Docker's Layered Architecture

Put rarely‑changing instructions at the beginning of the Dockerfile.

Combine related commands into a single RUN to minimize the number of layers.

Avoid copying unnecessary files into the build context.

Use multi‑stage builds to keep the final image small.

2. Leveraging Build Cache

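Docker stores the results of each Dockerfile instruction in a cache. When rebuilding, Docker reuses cached results if the instruction’s inputs haven’t changed, reducing build time. For RUN the input is the command itself; for COPY and ADD it is also the content of the files being copied.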

How to Use Build Cache

Arrange the Dockerfile so that stable instructions appear first, maximizing cache reuse.

FROM node:14
WORKDIR /app
COPY package.json package-lock.json ./
RUN npm install
COPY . .
RUN npm run build

If only the source code changes, Docker can reuse the cached RUN npm install layer.

When to Use Build Cache

Use it when parts of the image rarely change, such as dependencies, to speed up builds in development and CI/CD pipelines.

Best Practices for Build Cache

Place stable instructions early.

Pin dependency versions to improve cache hits.

Avoid copying large or frequently changing directories with ADD or COPY early in the Dockerfile, since any change to them invalidates every later layer.

Use multi‑stage builds to reduce the number of layers that need caching.

3. Using Cache Mounts

Cache mounts (a BuildKit feature) let you share cache data between build steps, avoiding redundant work for expensive operations like downloading dependencies or compiling code.

How to Use Cache Mounts

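FROM node:14
WORKDIR /app
# The manifests must be present before npm install can run
COPY package.json package-lock.json ./
# Mount options are comma-separated: type=cache,target=<path>
RUN --mount=type=cache,target=/root/.npm \
    npm install
COPY . .
RUN npm run build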

The npm cache is mounted at /root/.npm, allowing subsequent builds to reuse downloaded packages.

When to Use Cache Mounts

Use them for resource‑intensive steps such as dependency installation or compilation, especially in CI/CD pipelines.

Best Practices for Cache Mounts

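Make sure BuildKit is in use (it is the default builder in Docker Engine 23.0 and later; on older versions set DOCKER_BUILDKIT=1), and add a # syntax=docker/dockerfile:1 directive at the top of the Dockerfile to select a Dockerfile frontend that supports cache mounts.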

Use cache mounts for steps that download or compile large dependencies.

Keep cache mount paths specific to the cache type to avoid conflicts.

Regularly clean and manage caches to prevent bloat.
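
The practices above can be sketched for a Python project. This is a minimal illustration rather than a definitive setup: it assumes a requirements.txt and a main.py in the build context and a build running as root, in which case pip keeps its download cache under /root/.cache/pip.

```dockerfile
# syntax=docker/dockerfile:1
FROM python:3.8
WORKDIR /app
COPY requirements.txt ./
# The cache mount targets pip's own cache directory and nothing else.
# Its contents persist between builds but are not written into the image layer.
RUN --mount=type=cache,target=/root/.cache/pip \
    pip install -r requirements.txt
COPY . .
CMD ["python", "main.py"]
```

Because the mount lives in the builder rather than in the image, it grows over time; docker builder prune removes unused build cache when it gets too large.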

4. External Cache Solutions

External caches store Docker layers in remote storage, making them reusable across different build environments or CI/CD pipelines.

How to Use External Caches

docker buildx create --use
docker buildx build --cache-to=type=local,dest=./my-cache --cache-from=type=local,src=./my-cache .
docker buildx build --cache-to=type=registry,ref=myrepo/myimage:cache --cache-from=type=registry,ref=myrepo/myimage:cache .

After docker buildx create --use sets up a builder, the first build command exports and imports cache through a local directory; the second pushes and pulls cache through a remote registry.

When to Use External Caches

Ideal for CI/CD pipelines that run on multiple machines or for distributed teams needing shared cache layers.

Best Practices for External Caches

Select a reliable remote backend (e.g., a private registry, or BuildKit's S3 or Azure Blob Storage cache backends).

Secure the cache with proper authentication.

Prune old cache data regularly to control storage costs.

Integrate cache push/pull steps into CI/CD workflows.

5. Multi‑Stage Builds

Multi‑stage builds use multiple FROM statements to separate build and runtime environments, producing smaller final images.

How to Use Multi‑Stage Builds

# Build stage
FROM golang:1.16 AS builder
WORKDIR /app
COPY . .
# CGO_ENABLED=0 produces a statically linked binary that runs on Alpine's musl libc
RUN CGO_ENABLED=0 go build -o myapp
# Final stage
FROM alpine:latest
COPY --from=builder /app/myapp /myapp
CMD ["/myapp"]

The builder stage compiles the Go app; the final stage copies only the binary into a lightweight Alpine image.

When to Use Multi‑Stage Builds

Use when you need to separate heavy build dependencies from the runtime image, reducing size and attack surface.

Best Practices for Multi‑Stage Builds

Keep the final stage minimal by only including runtime dependencies.

Name stages clearly for readability.

Combine related commands within each stage to minimize layers.

Regularly review Dockerfiles for optimization opportunities.

6. Optimizing RUN Instructions

Each RUN creates a new layer; merging commands reduces layer count and improves cache efficiency.

How to Optimize RUN

FROM ubuntu:20.04
RUN apt-get update && apt-get install -y \
    build-essential \
    curl \
    git && rm -rf /var/lib/apt/lists/*
COPY . /app
WORKDIR /app
RUN make
CMD ["./app"]

The package index update, the installation, and the cleanup run in a single RUN, so the apt lists never persist in a layer and the image stays small.

When to Optimize RUN

Use when you have several setup steps that can be combined, especially during initial environment configuration.

Best Practices for RUN

Combine related commands into a single RUN.

Clean temporary files within the same RUN to avoid layer bloat.

Use && to chain commands, ensuring each step succeeds.

Avoid mixing frequently changing commands with stable ones in the same RUN.

7. Caching Dependencies

Store downloaded dependencies in cache layers so they don’t need to be re‑downloaded on subsequent builds.

How to Cache Dependencies

FROM node:14
WORKDIR /app
# Copy only package files first
COPY package.json package-lock.json ./
RUN npm ci
# Then copy the rest of the source
COPY . .
RUN npm run build
CMD ["node", "dist/app.js"]

Only when package.json or package-lock.json changes does Docker reinstall dependencies.

When to Cache Dependencies

Beneficial for projects with heavy or frequently used dependencies, especially in CI/CD pipelines.

Best Practices for Dependency Caching

Separate copying of dependency manifests from source code.

Pin exact dependency versions.

Clean up unused files after installation.

Regularly update dependencies to avoid stale caches.

8. Using .dockerignore

The .dockerignore file excludes files and directories from the build context, similar to .gitignore, reducing context size and build time.

How to Use .dockerignore

node_modules
dist
.git
Dockerfile
.dockerignore

This excludes unnecessary directories and the Dockerfile itself from the context.

When to Use .dockerignore

Use whenever the project contains files that are not needed in the final image, such as local build artifacts or version‑control metadata.

Best Practices for .dockerignore

Include a .dockerignore in every project.

Review and update it regularly.

Use specific patterns to exclude only what’s unnecessary.

Test builds to ensure essential files aren’t accidentally ignored.

9. Cache Busting

Cache busting intentionally invalidates a cache layer so Docker re‑executes a step, useful for forcing updates or debugging.

How to Implement Cache Busting

FROM node:14
ARG CACHEBUST=1
RUN echo $CACHEBUST
WORKDIR /app
COPY package.json package-lock.json ./
RUN npm ci
COPY . .
RUN npm run build
CMD ["node", "dist/app.js"]

Passing a different value, for example docker build --build-arg CACHEBUST=2 ., forces Docker to re-run the RUN step that references the argument and every layer after it.

When to Use Cache Busting

Use when you need to ensure a step always runs with fresh data, such as pulling the latest dependencies.

Best Practices for Cache Busting

Target specific layers with build arguments rather than busting the entire Dockerfile.

Limit usage to critical steps to avoid unnecessary rebuild time.

Manage busting systematically in CI/CD pipelines.

Review and adjust busting strategies as build requirements evolve.
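
Targeting a specific layer comes down to where the ARG is declared. The sketch below is an illustration under stated assumptions: the jq package is a hypothetical stand-in for any expensive, stable setup step, and the project is assumed to have the same package.json layout as the earlier examples.

```dockerfile
FROM node:14
# Expensive and stable: declared before the ARG, so it stays cached
# regardless of the CACHEBUST value.
RUN apt-get update && apt-get install -y jq && rm -rf /var/lib/apt/lists/*
WORKDIR /app
COPY package.json package-lock.json ./
# Declared late on purpose: only the steps from here on can be invalidated by it.
ARG CACHEBUST=1
# Referencing the argument makes a new value force this step to run again.
RUN echo "cache bust: $CACHEBUST" && npm install
COPY . .
RUN npm run build
CMD ["node", "dist/app.js"]
```

A CI job could pass a changing value such as docker build --build-arg CACHEBUST=$(date +%s) . to refresh the dependency install while the system package layer is still reused.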

10. Automated Cache Management

Automate cache handling with tools like Docker BuildKit and CI/CD platforms (GitHub Actions, GitLab CI, Jenkins) to keep builds efficient without manual intervention.

Example with GitHub Actions

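name: Build and Cache Docker
on: [push]
jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Set up Docker Buildx
        uses: docker/setup-buildx-action@v3
      - name: Cache Docker layers
        uses: actions/cache@v4
        with:
          path: /tmp/.buildx-cache
          key: ${{ runner.os }}-buildx-${{ github.sha }}
          restore-keys: |
            ${{ runner.os }}-buildx-
      - name: Build and push
        run: |
          # --cache-from restores layers saved by an earlier run;
          # --cache-to saves them again for the next one.
          # --push needs a tag and a prior registry login step;
          # myrepo/myimage is the placeholder name from the external cache example.
          docker buildx build \
            --cache-from=type=local,src=/tmp/.buildx-cache \
            --cache-to=type=local,dest=/tmp/.buildx-cache \
            --tag myrepo/myimage:latest \
            --push .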

The workflow sets up BuildKit, caches layers locally, and reuses them in subsequent builds.

When to Automate Cache Management

Ideal for projects with frequent builds where manual cache handling would be error‑prone.

Best Practices for Automation

Integrate BuildKit into CI/CD pipelines.

Use cache keys and restore keys to store and retrieve layers correctly.

Monitor and prune caches regularly.

Version‑control CI/CD configuration for consistency.

11. Custom Cache Paths

Specify custom locations for cache data using --mount=type=cache to better manage and reuse caches for resource‑intensive steps.

How to Use Custom Cache Paths

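FROM golang:1.16
WORKDIR /app
# The source must be copied in before it can be compiled
COPY . .
# /go/bin is on PATH in the golang image, so CMD can find the binary
RUN --mount=type=cache,target=/root/.cache/go-build \
    go build -o /go/bin/myapp .
CMD ["myapp"]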

The Go build cache is directed to /root/.cache/go-build for reuse.

When to Use Custom Cache Paths

Use when specific build steps generate large temporary data that should be shared across builds.

Best Practices for Custom Cache Paths

Define clear, unique cache paths to avoid conflicts.

Monitor cache size and clean up regularly.

Combine with other caching strategies (e.g., multi‑stage builds).

Document cache usage in the Dockerfile for team awareness.
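
As a sketch of combining custom cache paths with a multi-stage build, the example below assumes a Go module with go.mod and go.sum at the root of the build context and a main package in that same directory. The two mount targets are the default locations of Go's module cache and build cache in the official golang image.

```dockerfile
# syntax=docker/dockerfile:1
FROM golang:1.16 AS builder
WORKDIR /app
# Copy the module manifests first so dependency download is cached
# separately from the source code.
COPY go.mod go.sum ./
# Module cache: downloaded dependencies are kept across builds.
RUN --mount=type=cache,target=/go/pkg/mod \
    go mod download
COPY . .
# Build cache: compiled packages are reused when their sources are unchanged.
# CGO_ENABLED=0 produces a static binary that runs on Alpine.
RUN --mount=type=cache,target=/go/pkg/mod \
    --mount=type=cache,target=/root/.cache/go-build \
    CGO_ENABLED=0 go build -o /out/myapp .

FROM alpine:latest
COPY --from=builder /out/myapp /myapp
CMD ["/myapp"]
```

Each cache type gets its own target path, and neither cache ends up in the final image because only the binary is copied out of the builder stage.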

12. Advanced Docker BuildKit Features

BuildKit improves performance, concurrency, and security, offering cache import/export, build secrets, and parallel builds.

How to Enable BuildKit

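BuildKit is the default builder in Docker Desktop and in Docker Engine 23.0 and later. On older versions, set the environment variable DOCKER_BUILDKIT=1. BuildKit-specific syntax such as cache mounts is then available.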

FROM node:14
WORKDIR /app
# The manifests must be present before npm install can run
COPY package.json package-lock.json ./
RUN --mount=type=cache,target=/root/.npm \
    npm install
COPY . .
RUN npm run build
CMD ["node", "dist/app.js"]

When to Use BuildKit

Use for large or complex applications where build speed and resource efficiency are critical, especially in CI/CD environments.

Best Practices for BuildKit

Enable it via DOCKER_BUILDKIT=1 on Docker versions where it is not already the default.

Leverage cache mounts, build secrets, and parallel builds.

Keep Dockerfiles up‑to‑date to exploit new features.

Integrate BuildKit into CI/CD pipelines.
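
Build secrets are mentioned above but not shown. A minimal sketch follows, assuming a private npm registry whose credentials live in an .npmrc file on the build host; the secret id npmrc is an illustrative name, not something the earlier examples define.

```dockerfile
# syntax=docker/dockerfile:1
FROM node:14
WORKDIR /app
COPY package.json package-lock.json ./
# The secret is mounted only for the duration of this RUN and is not
# stored in any image layer. Supply it at build time with:
#   docker build --secret id=npmrc,src=$HOME/.npmrc .
RUN --mount=type=secret,id=npmrc,target=/root/.npmrc \
    --mount=type=cache,target=/root/.npm \
    npm ci
COPY . .
RUN npm run build
CMD ["node", "dist/app.js"]
```

Compared with passing credentials through ARG or ENV, the secret mount keeps them out of the image history while the cache mount on the same RUN still speeds up the install.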

13. Docker Compose Layer Caching

Configure layer caching in docker-compose.yml using cache_from and build arguments to speed up multi‑service builds.

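version: '3.8'
services:
  app:
    build:
      context: .
      dockerfile: Dockerfile
      # cache_from imports a previously exported cache; the local type takes src=
      cache_from:
        - type=local,src=./my-cache
      # cache_to exports the cache so the next build has something to import
      cache_to:
        - type=local,dest=./my-cache
      args:
        - CACHEBUST=1
    volumes:
      - .:/app
  web:
    build:
      context: ./web
      dockerfile: Dockerfile
      cache_from:
        - type=local,src=./web-cache
      cache_to:
        - type=local,dest=./web-cache
      args:
        - CACHEBUST=1
    volumes:
      - ./web:/web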

Both services use local caches and can bust caches via the CACHEBUST argument.

When to Use Compose Layer Caching

Beneficial for applications with multiple services that are built frequently, reducing overall build time.

Best Practices for Compose Caching

Define separate cache paths per service.

Control cache invalidation with build arguments.

Monitor cache directories to prevent excessive storage use.

Version‑control compose files to keep caching consistent across environments.
