How to Shrink Docker Images by 70% and Harden Them with Trivy
This guide shows how to cut Docker image sizes by up to 70% using multi‑stage builds, Alpine or Distroless base images, merged layers, .dockerignore, and BuildKit. It also covers hardening: Trivy security scanning, non‑root users, SUID removal, and CI/CD automation, so that the deployed container is both lean and secure.
Applicable Scenarios & Prerequisites
Applies to container images that are larger than 500 MB, take more than 10 minutes to build, or carry high‑severity CVEs. Supported OS: RHEL/CentOS 7.9+ or Ubuntu 20.04+. Requires Docker 20.10+ (or Podman 3.0+) and Trivy 0.40+.
Anti‑Pattern Warnings
Debug environments need full toolchains; avoid over‑optimising.
Legacy apps that depend on specific glibc versions may break on Alpine (musl).
Workloads with extreme performance requirements can see a 5‑10 % slowdown on Alpine.
Removing log files can break compliance regimes that require audit logs.
Teams unfamiliar with Alpine should consider training or use Debian‑Slim.
Environment & Version Matrix
OS version : RHEL 8.7+ / CentOS Stream 9 or Ubuntu 22.04 LTS
Kernel version : 4.18.0‑425+ (RHEL) or 5.15.0‑60+ (Ubuntu)
Docker : 24.0.7 (official repo)
Podman : 4.6.1 (RHEL) or 4.3.1 (apt)
Trivy : 0.48.3
Quick Checklist
Check current image size: docker images | grep your-image
Back up the Dockerfile: cp Dockerfile Dockerfile.bak
Install Trivy (e.g., brew install aquasecurity/trivy/trivy or the install script)
Enable Docker BuildKit:
export DOCKER_BUILDKIT=1
Implementation Steps
Architecture Overview
A Docker image consists of a base layer, dependency layers, an application layer, and a configuration layer. Each Dockerfile instruction that changes the filesystem (RUN, COPY, ADD) creates a new read‑only layer; a writable top layer is added when a container starts.
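To see which layers carry the weight, the output of docker history can be ranked by size. The following is a minimal sketch rather than a finished tool: rank_layers is a hypothetical helper name, and it assumes docker history is called with the format string '{{.Size}} {{.CreatedBy}}' and prints sizes in Docker's usual decimal units (B, kB, MB, GB).

```shell
#!/bin/bash
# rank_layers (hypothetical helper): read "SIZE CREATED_BY" lines on stdin
# and print "BYTES<TAB>CREATED_BY", largest layer first.
rank_layers() {
  awk '
    {
      unit = $1; gsub(/[0-9.]/, "", unit)    # unit suffix, e.g. "MB"
      num  = $1; gsub(/[^0-9.]/, "", num)    # numeric part, e.g. "5.59"
      mult = 1                               # plain bytes ("B")
      if (unit == "kB" || unit == "KB") mult = 1000
      else if (unit == "MB") mult = 1000000
      else if (unit == "GB") mult = 1000000000
      $1 = ""                                # drop the size field ...
      sub(/^ +/, "")                         # ... and the separator left behind
      printf "%.0f\t%s\n", num * mult, $0
    }' | sort -rn
}
```

A typical call would be: docker history --format '{{.Size}} {{.CreatedBy}}' myapp:alpine | rank_layers | head -n 5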
Image = Base Layer + Dependency Layers + Application Layer + Config Layer
Optimization Strategies
Replace the base image (e.g., ubuntu:22.04 → alpine:3.18 → distroless → scratch).
Use multi‑stage builds to separate build‑time dependencies from runtime.
Merge multiple RUN commands and clean caches in the same layer.
Use a .dockerignore file to exclude source control, documentation, tests, and build artifacts.
Enable BuildKit for parallel builds and better caching.
Security hardening: run as a non‑root user, remove SUID binaries, set a read‑only root filesystem, and enable Docker Content Trust.
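The list mentions removing SUID binaries, but the sample Dockerfile in this guide does not show that step. One way to approach it, sketched here under assumptions: list_suid and strip_suid are hypothetical helper names, the find in use is assumed to support the -perm /MODE syntax (GNU find does), and clearing the bits is preferred over deleting the files.

```shell
#!/bin/bash
# list_suid (hypothetical helper): print regular files under a root
# directory (default "/") that carry the SUID or SGID bit.
list_suid() {
  find "${1:-/}" -xdev -type f -perm /6000 2>/dev/null
}

# strip_suid (hypothetical helper): clear both bits on every match, so the
# binaries stay in place but no longer run with their owner's or group's
# privileges.
strip_suid() {
  find "${1:-/}" -xdev -type f -perm /6000 -exec chmod a-s {} + 2>/dev/null
}
```

Inside a Dockerfile the same idea fits in a single instruction placed before the USER line, for example: RUN find / -xdev -type f -perm /6000 -exec chmod a-s {} +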
Sample Optimized Dockerfile (Python)
# ---------- Stage 1: Builder ----------
FROM python:3.11-alpine3.18 AS builder
WORKDIR /app
# Copy only the dependency manifest so this layer stays cached until it changes
COPY requirements.txt .
# Install build dependencies and create a virtual environment
RUN apk add --no-cache gcc musl-dev postgresql-dev libffi-dev && \
    python -m venv /opt/venv && \
    /opt/venv/bin/pip install --no-cache-dir -r requirements.txt
# ---------- Stage 2: Runtime ----------
FROM python:3.11-alpine3.18
WORKDIR /app
# Install only runtime dependencies (--no-cache leaves no apk cache to clean)
# and create the non‑root user before any COPY --chown that refers to it
RUN apk add --no-cache libpq libffi && \
    addgroup -S appuser && adduser -S appuser -G appuser
COPY --from=builder /opt/venv /opt/venv
# Put the virtual environment first on PATH so "python" resolves to it
ENV PATH="/opt/venv/bin:$PATH"
COPY --chown=appuser:appuser . .
USER appuser
EXPOSE 8000
HEALTHCHECK --interval=30s --timeout=3s \
    CMD wget --quiet --tries=1 --spider http://localhost:8000/health || exit 1
CMD ["python","app.py"]
Build and Compare
# Original image (Ubuntu), built from the Dockerfile backed up in the Quick Checklist
docker build -f Dockerfile.bak -t myapp:ubuntu .
# Optimized image (Alpine)
docker build -t myapp:alpine .
# Compare sizes
docker images | grep myapp
# Example output:
# myapp:ubuntu 1.2GB → myapp:alpine 350MB → myapp:multistage 120MB
Security Scanning with Trivy
# Scan the optimized image
trivy image --severity HIGH,CRITICAL myapp:alpine
# After fixing vulnerabilities
trivy image --severity HIGH,CRITICAL myapp:secure
Fix by upgrading the base image (e.g., python:3.11-alpine3.19) or updating vulnerable packages in requirements.txt.
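To compare the scans before and after a fix, it can help to reduce each report to a count per severity. This is a rough sketch, not part of Trivy itself: count_severity is a hypothetical helper, and it assumes the report was written with trivy image --format json --output report.json IMAGE and that each finding carries a "Severity" field. It counts every finding type in the report, not only package vulnerabilities.

```shell
#!/bin/bash
# count_severity (hypothetical helper): count findings of one severity
# in a Trivy JSON report.
# Usage: count_severity REPORT.json HIGH
count_severity() {
  local report="$1" severity="$2"
  # Match only the exact key "Severity", so "VendorSeverity" is not counted.
  grep -o "\"Severity\": *\"${severity}\"" "$report" | wc -l | tr -d ' '
}
```

For example, after scanning myapp:alpine and myapp:secure into two report files, count_severity can be run on each with CRITICAL and then HIGH to confirm that the numbers went down.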
CI/CD Integration
GitLab CI example:
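stages:
  - build
  - scan
  - deploy
build:
  stage: build
  script:
    # Authenticate with GitLab's predefined registry variables before pushing
    - docker login -u "$CI_REGISTRY_USER" -p "$CI_REGISTRY_PASSWORD" "$CI_REGISTRY"
    - docker build -t $CI_REGISTRY_IMAGE:$CI_COMMIT_SHA .
    - docker push $CI_REGISTRY_IMAGE:$CI_COMMIT_SHA
security_scan:
  stage: scan
  image:
    name: aquasec/trivy:latest
    # The image's default entrypoint is trivy itself; clear it so GitLab can run the script
    entrypoint: [""]
  variables:
    # Credentials Trivy uses to pull the image from the project registry
    TRIVY_USERNAME: $CI_REGISTRY_USER
    TRIVY_PASSWORD: $CI_REGISTRY_PASSWORD
  script:
    - trivy image --exit-code 1 --severity HIGH,CRITICAL $CI_REGISTRY_IMAGE:$CI_COMMIT_SHA
  allow_failure: false
deploy:
  stage: deploy
  script:
    - kubectl set image deployment/myapp myapp=$CI_REGISTRY_IMAGE:$CI_COMMIT_SHA
  only:
    - main
A GitHub Actions workflow using aquasecurity/trivy-action can likewise upload SARIF results to GitHub Security.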
Monitoring Image Size
Export image sizes for Prometheus node‑exporter:
#!/bin/bash
# /opt/monitoring/docker_image_size.sh
TEXTFILE="/var/lib/node_exporter/textfile_collector/docker_images.prom"
# docker prints decimal units (kB, MB, GB): drop the trailing "B" and parse as SI, not IEC
docker images --format "{{.Repository}}:{{.Tag}} {{.Size}}" | while read -r IMAGE HUMAN_SIZE; do
  SIZE=$(echo "${HUMAN_SIZE%B}" | tr 'k' 'K' | numfmt --from=si)
  echo "docker_image_size_bytes{image=\"$IMAGE\"} $SIZE"
done > "$TEXTFILE"
Common Issues & Troubleshooting
Alpine container fails at runtime – usually due to musl vs glibc incompatibility. Switch to Debian‑Slim or rebuild dependencies for musl.
Image size not reduced – check for unmerged RUN layers; combine commands and clean caches in a single layer.
Build cache invalidated – copy the dependency manifest and install dependencies before the COPY of the source files, so that source changes do not invalidate the dependency layer.
Permission errors in container – verify file ownership; use COPY --chown=appuser:appuser and run as non‑root.
Trivy scan timeout – pre‑download the vulnerability database (trivy image --download-db-only), raise the limit with --timeout, or use a local mirror.
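Both "Image size not reduced" and "Build cache invalidated" can also trace back to the build context: a COPY . . that drags in .git or test data grows the image and changes on every commit. A starter .dockerignore might look like the sketch below. The write_dockerignore helper and the directory names in the patterns (docs/, tests/, dist/, build/, .venv/) are assumptions for a typical Python project, not taken from any particular repository.

```shell
#!/bin/bash
# write_dockerignore (hypothetical helper): write a starter .dockerignore
# into the given build-context directory (default: current directory).
# Refuses to overwrite an existing file.
write_dockerignore() {
  local dir="${1:-.}"
  if [ -e "$dir/.dockerignore" ]; then
    echo "$dir/.dockerignore already exists; leaving it untouched" >&2
    return 1
  fi
  cat > "$dir/.dockerignore" <<'EOF'
# Source control
.git
.gitignore
# Documentation
**/*.md
docs/
# Tests
tests/
# Build artifacts and local environments
**/__pycache__
**/*.pyc
.venv/
dist/
build/
EOF
}
```

Running write_dockerignore in the project root and rebuilding should show a smaller "transferring context" figure in the BuildKit output if the excluded paths were significant.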
FAQ
Why does Alpine cause compatibility issues? Alpine uses musl libc; some compiled binaries expect glibc. Use Debian‑Slim or rebuild dependencies for musl.
Can I debug a multi‑stage image? The runtime image has no tools; keep a debug stage or use docker exec -u root / kubectl debug.
How to identify removable files? Use dive myapp:latest to inspect layer contents.
Distroless images have no shell – how to troubleshoot? Attach a temporary debug container with busybox or use kubectl debug.
Can Trivy miss vulnerabilities? It relies on CVE databases that may lag; consider additional scanners like Clair or Anchore.
Is a minimal image mandatory for production? Not required; Debian‑Slim (~100 MB) offers a good balance between size and debuggability.
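How to handle npm/yarn caches? Install and clean up in the same RUN layer (recent npm releases deprecate --only=production in favor of --omit=dev):
npm ci --only=production && npm cache clean --force && rm -rf /root/.npm /tmp/*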
How to build multi‑arch images? Use Docker Buildx:
docker buildx create --use && docker buildx build --platform linux/amd64,linux/arm64 -t myapp:latest --push .
Is image signing required? Not mandatory but strongly recommended for supply‑chain security.
How to automate image optimization? Tools like docker-slim (since renamed SlimToolkit) can analyze an image and produce a smaller one automatically.
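For the debugging questions above, a temporary BusyBox container can be attached to a running container's namespaces with plain docker run options. This is a sketch under assumptions: debug_shell and the DRY_RUN switch are hypothetical names introduced here, and the target is assumed to be a running container on the local Docker host.

```shell
#!/bin/bash
# debug_shell (hypothetical helper): start a throwaway BusyBox container
# that shares the target container's PID and network namespaces, so its
# processes and ports are visible without adding tools to the target image.
# Set DRY_RUN=1 to print the command instead of running it.
debug_shell() {
  local target="$1"
  if [ -z "$target" ]; then
    echo "usage: debug_shell CONTAINER" >&2
    return 2
  fi
  local cmd=(docker run --rm -it
             "--pid=container:${target}"
             "--network=container:${target}"
             busybox sh)
  if [ "${DRY_RUN:-0}" = "1" ]; then
    printf '%s\n' "${cmd[*]}"
  else
    "${cmd[@]}"
  fi
}
```

Because the debug container is removed on exit (--rm), nothing is added to the image under test. On Kubernetes, kubectl debug plays the same role.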
One‑Click Optimization Script
#!/bin/bash
set -e
IMAGE_NAME=${1:-myapp}
IMAGE_TAG=${2:-latest}
# Backup original Dockerfile
cp Dockerfile "Dockerfile.bak.$(date +%Y%m%d_%H%M%S)"
# Enable BuildKit
export DOCKER_BUILDKIT=1
# Build optimized image
time docker build -t "${IMAGE_NAME}:${IMAGE_TAG}" .
# Show image size and layer history
docker images "${IMAGE_NAME}:${IMAGE_TAG}"
docker history "${IMAGE_NAME}:${IMAGE_TAG}" --no-trunc
# Security scan (if Trivy is installed)
if command -v trivy >/dev/null 2>&1; then
  trivy image --severity HIGH,CRITICAL "${IMAGE_NAME}:${IMAGE_TAG}"
else
  echo "Trivy not installed – skipping scan"
fi
echo "Optimized image size: $(docker images "${IMAGE_NAME}:${IMAGE_TAG}" --format "{{.Size}}")"
Save the script as optimize_dockerfile.sh and run it with:
chmod +x optimize_dockerfile.sh && ./optimize_dockerfile.sh myapp v2.0
References
Docker best practices: https://docs.docker.com/develop/dev-best-practices/
Dockerfile reference: https://docs.docker.com/engine/reference/builder/
Trivy documentation: https://aquasecurity.github.io/trivy/
Hadolint linter: https://github.com/hadolint/hadolint
Distroless images: https://github.com/GoogleContainerTools/distroless
Docker Slim: https://github.com/docker-slim/docker-slim
Raymond Ops
Linux ops automation, cloud-native, Kubernetes, SRE, DevOps, Python, Golang and related tech discussions.
