
Persisting Docker Layer Cache in GitLab CI on Kubernetes with Docker‑in‑Docker

This guide explains how to persist Docker layer caches in GitLab CI pipelines on Kubernetes by deploying a standalone Docker‑in‑Docker service with a Local Persistent Volume, configuring a Service for access, updating CI jobs, and adding a CronJob to prune old images.


Previously we used the Docker-outside-of-Docker approach, mounting the host's /var/run/docker.sock into the build container. After upgrading the Kubernetes cluster to version 1.22.x, the container runtime switched from Docker to containerd, so the Docker socket is no longer available on the nodes.

To keep using Docker for image builds we adopt the Docker‑in‑Docker (DIND) pattern. For a typical GitLab CI job the Kubernetes executor creates a pod with three containers: the build container, a helper container, and a DIND service container. Because each pod starts a fresh Docker daemon, no image layers survive between jobs, leading to longer build times.

The solution is to run a single, independent DIND service and let all build containers connect to that daemon, allowing the Docker layer cache to be persisted.

First, create a StorageClass, a local PersistentVolume, and a PersistentVolumeClaim to hold Docker's data directory:

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: local-volume
provisioner: kubernetes.io/no-provisioner
reclaimPolicy: Retain  # no-provisioner volumes cannot be deleted automatically
volumeBindingMode: WaitForFirstConsumer
---
apiVersion: v1
kind: PersistentVolume
metadata:
  name: docker-pv
spec:
  capacity:
    storage: 5Gi
  accessModes:
  - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  storageClassName: local-volume
  local:
    path: /mnt/k8s/docker  # data storage directory
  nodeAffinity:
    required:
      nodeSelectorTerms:
      - matchExpressions:
        - key: kubernetes.io/hostname
          operator: In
          values:
          - node1  # runs on node1
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  labels:
    app: docker-dind
  name: docker-dind-data
  namespace: kube-ops
spec:
  accessModes:
  - ReadWriteOnce
  storageClassName: local-volume
  resources:
    requests:
      storage: 5Gi
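Before the PVC can bind, the backing directory must exist on node1, and the manifests need to be applied. A minimal sketch, assuming the YAML above is saved as `docker-dind-pv.yaml` (the filename is an assumption, not from the original):

```shell
# On node1: create the backing directory referenced by the local PV
mkdir -p /mnt/k8s/docker

# From a machine with cluster access: apply the manifests
# (docker-dind-pv.yaml is an assumed filename for the YAML above)
kubectl apply -f docker-dind-pv.yaml

# With volumeBindingMode: WaitForFirstConsumer the PVC stays Pending
# until the first pod that uses it is scheduled -- this is expected
kubectl get pvc docker-dind-data -n kube-ops
```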

Next, deploy the DIND service with a Deployment that mounts the PVC and runs the Docker daemon in privileged mode:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: docker-dind
  namespace: kube-ops
  labels:
    app: docker-dind
spec:
  selector:
    matchLabels:
      app: docker-dind
  template:
    metadata:
      labels:
        app: docker-dind
    spec:
      containers:
      - image: docker:dind
        name: docker-dind
        args:
        - --registry-mirror=https://ot2k4d59.mirror.aliyuncs.com/  # image accelerator
        env:
        - name: DOCKER_DRIVER
          value: overlay2
        - name: DOCKER_HOST
          value: tcp://0.0.0.0:2375
        - name: DOCKER_TLS_CERTDIR   # disable TLS
          value: ""
        volumeMounts:
        - name: docker-dind-data-vol  # persist Docker root directory
          mountPath: /var/lib/docker/
        ports:
        - name: daemon-port
          containerPort: 2375
        securityContext:
          privileged: true  # required privileged mode
      volumes:
      - name: docker-dind-data-vol
        persistentVolumeClaim:
          claimName: docker-dind-data

Create a Service so that build jobs can reach the daemon via a stable DNS name:

apiVersion: v1
kind: Service
metadata:
  name: docker-dind
  namespace: kube-ops
  labels:
    app: docker-dind
spec:
  ports:
  - port: 2375
    targetPort: 2375
  selector:
    app: docker-dind
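A quick way to confirm the Service is reachable is the Docker Engine API's `/_ping` endpoint, which returns `OK`. The short name `docker-dind` resolves only for pods in the kube-ops namespace; from other namespaces, use the fully qualified form `docker-dind.kube-ops.svc`. A sketch using a throwaway curl pod:

```shell
# Run a one-off pod and ping the Docker daemon through the Service;
# the Engine API answers "OK" on /_ping when the daemon is healthy
kubectl run dind-check -n kube-ops -i --rm --restart=Never \
  --image=curlimages/curl -- curl -s http://docker-dind:2375/_ping
```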

In the GitLab CI pipeline, point the Docker client at the DIND service and run the usual build and push steps (if the runner pods live outside kube-ops, use the fully qualified name docker-dind.kube-ops.svc instead of the short name):

stages:
  - image

build_image:
  stage: image
  image: docker:latest
  variables:
    DOCKER_HOST: tcp://docker-dind:2375  # connect via service DNS
  script:
    - docker info
    - docker build -t xxxx .
    - docker push xxxx
  only:
    - tags
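The `xxxx` placeholder stands for your image name. One common pattern (an assumption, not from the original pipeline) is to derive the tag from GitLab's predefined variables:

```yaml
build_image:
  stage: image
  image: docker:latest
  variables:
    DOCKER_HOST: tcp://docker-dind:2375
  script:
    - docker info
    # CI_REGISTRY_IMAGE and CI_COMMIT_TAG are GitLab predefined variables;
    # a docker login step for your registry may be needed before pushing
    - docker build -t "$CI_REGISTRY_IMAGE:$CI_COMMIT_TAG" .
    - docker push "$CI_REGISTRY_IMAGE:$CI_COMMIT_TAG"
  only:
    - tags
```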

Because the Docker layers are now cached on the PVC, subsequent builds are significantly faster. To prevent the cache from growing indefinitely, a CronJob is added to prune unused images on a weekly basis:

apiVersion: batch/v1
kind: CronJob
metadata:
  name: docker-dind-clear-cache
  namespace: kube-ops
spec:
  schedule: "0 0 * * 0"  # weekly, at midnight on Sunday
  jobTemplate:
    metadata:
      labels:
        app: docker-dind
      name: docker-dind-clear-cache
    spec:
      template:
        spec:
          restartPolicy: OnFailure
          containers:
          - name: clear-cache
            image: docker:latest
            command:
            - docker
            - system
            - prune
            - -af
            env:
            - name: DOCKER_HOST
              value: tcp://docker-dind:2375
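The cleanup can also be triggered on demand rather than waiting for the next Sunday run, and `docker system df` shows how much space the cache is consuming. A sketch using the names defined above (the manual job name is arbitrary):

```shell
# Run the prune once, immediately, from the CronJob's template
kubectl create job --from=cronjob/docker-dind-clear-cache manual-prune -n kube-ops

# Inspect the daemon's disk usage before and after the prune
kubectl exec -n kube-ops deploy/docker-dind -- docker system df
```

Note that `docker system prune -af` removes all unused images, including cached layers, so the first build after each cleanup will be slow again; this is the trade-off for bounding disk usage.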

With the persistent Docker daemon and regular cache cleanup, the CI/CD workflow becomes both faster and more maintainable.

Tags: Docker, cache, Kubernetes, GitLab CI, Persistent Volume, Docker-in-Docker
Written by DevOps Cloud Academy
Exploring industry DevOps practices and technical expertise.