Master Docker: From Basics to Advanced Core Principles Explained
This comprehensive guide walks ops engineers through Docker’s core concepts—images, containers, storage drivers, networking, security, image building, multi‑stage builds, volume management, resource limits, troubleshooting, and production deployment best practices—providing step‑by‑step commands, examples, and detailed explanations to master containerization from beginner to expert.
Core Concepts
Docker images are read‑only layered templates. Filesystem‑changing instructions in a Dockerfile (RUN, COPY, ADD) each create a new layer; instructions such as LABEL, ENV, EXPOSE, and CMD only add image metadata. Common base images include ubuntu, alpine, nginx, mysql, redis, python, and node. The storage driver (default overlay2) manages how these layers are stored on disk; overlay2 is recommended for production because of its performance and simplicity.
Architecture
Docker follows a client‑server model. The daemon dockerd implements image management, container lifecycle, networking, and storage. By default it listens on the Unix socket /var/run/docker.sock; for remote access it can also listen on TCP, conventionally port 2375 (unencrypted) or 2376 (TLS). The Docker CLI talks to the daemon over the local socket unless the DOCKER_HOST environment variable points it at another endpoint.
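The endpoint selection described above can be sketched as a small shell function. This is a simplified illustration, not the CLI's actual logic: the function name docker_endpoint is hypothetical, and only DOCKER_HOST and the default socket are considered (the real CLI also honours the -H flag and Docker contexts).

```shell
#!/bin/sh
# Sketch: which daemon endpoint a client would target.
# Simplification: only DOCKER_HOST and the default socket are modelled.
docker_endpoint() {
    if [ -n "${DOCKER_HOST:-}" ]; then
        printf '%s\n' "$DOCKER_HOST"
    else
        printf '%s\n' "unix:///var/run/docker.sock"
    fi
}

docker_endpoint
```

With DOCKER_HOST set to a value such as tcp://10.0.0.5:2376 (an example address), the function prints that value instead of the local socket.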
Container Runtime
Since Docker 1.11 the runtime is split: containerd handles container lifecycle, while runc creates the container process using Linux namespaces and cgroups. This split allows other orchestrators (e.g., Kubernetes) to talk directly to containerd via the CRI.
Copy‑on‑Write (CoW)
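When a container first modifies a file that lives in a read‑only image layer, the storage driver copies that file up into the container's writable layer and applies the change there (CoW). Only the first write pays the copy cost, and overlay2 keeps it low, but overlay2 copies the whole file, so write‑heavy workloads on large files still need attention and are usually better placed on a volume.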
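The copy‑up behaviour can be imitated with two ordinary directories. The following is a toy model only: the helper names cow_read and cow_append are made up for this sketch, and overlay2 performs the equivalent work inside the kernel rather than with cp.

```shell
#!/bin/sh
# Toy model of copy-on-write with two directories:
#   lower = read-only image layer, upper = writable container layer.

# Read a file: the upper layer wins, otherwise fall through to lower.
cow_read() {  # usage: cow_read LOWER UPPER NAME
    if [ -e "$2/$3" ]; then cat "$2/$3"; else cat "$1/$3"; fi
}

# Append to a file: copy it up on the first write, then modify the copy.
cow_append() {  # usage: cow_append LOWER UPPER NAME TEXT
    if [ ! -e "$2/$3" ] && [ -e "$1/$3" ]; then
        cp "$1/$3" "$2/$3"      # the one-time "copy up" cost
    fi
    printf '%s\n' "$4" >> "$2/$3"
}

demo=$(mktemp -d)
mkdir "$demo/lower" "$demo/upper"
printf 'from image\n' > "$demo/lower/app.conf"

cow_append "$demo/lower" "$demo/upper" app.conf "from container"
cow_read   "$demo/lower" "$demo/upper" app.conf   # prints both lines
cat "$demo/lower/app.conf"                        # lower layer is unchanged
rm -rf "$demo"
```

The lower directory never changes, which is why many containers can share one image while each sees its own modifications.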
Image Management
Typical Dockerfile:
FROM ubuntu:22.04
LABEL maintainer="[email protected]"
RUN apt-get update && \
    apt-get install -y nginx curl && \
    apt-get clean && \
    rm -rf /var/lib/apt/lists/*
COPY nginx.conf /etc/nginx/nginx.conf
WORKDIR /usr/share/nginx/html
EXPOSE 80
CMD ["nginx","-g","daemon off;"]
Key points:
Each RUN, COPY, ADD creates a new layer.
Combine multiple commands in a single RUN to reduce layer count.
Multi‑stage Builds
Example for a Go binary (final image ~20‑30 MB):
# Builder stage
FROM golang:1.21-alpine AS builder
WORKDIR /app
COPY . .
RUN CGO_ENABLED=0 GOOS=linux go build -o myapp .
# Runtime stage
FROM alpine:3.19
COPY --from=builder /app/myapp /myapp
CMD ["/myapp"]
Example for a Java/Maven project:
# Builder stage
FROM maven:3.9-eclipse-temurin-17 AS builder
WORKDIR /app
COPY pom.xml .
RUN mvn dependency:go-offline
COPY src ./src
RUN mvn package -DskipTests
# Runtime stage
FROM eclipse-temurin:17-jre-alpine
WORKDIR /app
COPY --from=builder /app/target/myapp.jar .
ENTRYPOINT ["java","-jar","myapp.jar"]
.dockerignore
Exclude unnecessary files from the build context to speed up builds and avoid large contexts:
# .dockerignore
.git
.gitignore
node_modules
*.log
.env
.env.*
dist
.DS_Store
*.md
README*
Best‑practice Checklist
Combine RUN commands to minimise layers.
Prefer lightweight base images (e.g., alpine ≈ 5 MB) unless glibc is required.
Place infrequently‑changed instructions (package installation) before copying source code to maximise cache reuse.
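The cache‑ordering advice comes down to simple arithmetic: once one build step misses the cache, every later step is rebuilt as well. A minimal sketch, assuming a hypothetical five‑step Dockerfile and a made‑up helper name steps_rebuilt:

```shell
#!/bin/sh
# Sketch: how many build steps re-run when step N changes.
# Assumption: every step after the first cache miss is rebuilt.
steps_rebuilt() {  # usage: steps_rebuilt TOTAL_STEPS FIRST_CHANGED_STEP
    echo $(( $1 - $2 + 1 ))
}

# Hypothetical 5-step Dockerfile where only the source code changed.
echo "source copied at step 2: $(steps_rebuilt 5 2) steps rebuilt"
echo "source copied at step 4: $(steps_rebuilt 5 4) steps rebuilt"
```

Copying the source late (step 4) leaves the package installation steps cached, so only two steps run again instead of four.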
Container Lifecycle
Common commands:
# Create and start a container (pulls image if missing)
docker run -d --name my_nginx -p 8080:80 nginx:1.25
# Stop (SIGTERM → SIGKILL after 10 s)
docker stop my_nginx
# Start a stopped container
docker start my_nginx
# Restart
docker restart my_nginx
# Remove (must stop first)
docker rm my_nginx
# Force‑remove a running container
docker rm -f my_nginx
# List running containers
docker ps
# List all containers (including stopped)
docker ps -a
# Inspect details
docker inspect my_nginx
# Real‑time resource usage
docker stats
# View logs (default json‑file driver)
docker logs -f --tail 100 my_nginx
Restart Policies
Four policies control automatic restarts:
no – never restart (default).
always – restart whenever the container exits, whatever the exit code.
unless-stopped – like always, except that a container stopped manually stays stopped after the daemon restarts.
on-failure[:max-retries] – restart only on a non‑zero exit code, optionally at most max-retries times.
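The policies above can be read as a decision taken each time the container's main process exits. The function below is a simplified model of that decision, not Docker's implementation: should_restart is a hypothetical name, and daemon restarts and manual stops (where always and unless-stopped differ) are deliberately left out.

```shell
#!/bin/sh
# Sketch of the restart decision taken when a container's main process exits.
should_restart() {  # usage: should_restart POLICY EXIT_CODE RESTARTS_SO_FAR
    policy=$1; code=$2; count=$3
    case "$policy" in
        no)                    echo no ;;
        always|unless-stopped) echo yes ;;
        on-failure*)
            max=${policy#on-failure}   # "" or ":N"
            max=${max#:}               # "" or "N"
            if [ "$code" -eq 0 ]; then
                echo no                # clean exit is not a failure
            elif [ -n "$max" ] && [ "$count" -ge "$max" ]; then
                echo no                # retry budget used up
            else
                echo yes
            fi ;;
        *) echo "unknown policy: $policy" >&2; return 2 ;;
    esac
}

should_restart on-failure:3 1 0   # yes
should_restart on-failure:3 1 3   # no
should_restart on-failure 0 0     # no
```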
Compose example:
services:
  nginx:
    image: nginx:1.25
    restart: unless-stopped
    ports:
      - "80:80"
ENTRYPOINT vs CMD
When both are present, CMD provides default arguments to ENTRYPOINT. Use the exec form (JSON array) to preserve signal handling:
ENTRYPOINT ["python","app.py"]
CMD ["--port","8080"]
Running docker run myapp --port 9090 overrides the CMD arguments while keeping the same ENTRYPOINT.
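How the two instructions combine can be sketched in a few lines. This is an illustration of the exec‑form rule only: final_command is a hypothetical helper, and arguments are modelled as plain space‑separated strings rather than real argument vectors.

```shell
#!/bin/sh
# Sketch of how the final command line is assembled (exec form).
final_command() {  # usage: final_command ENTRYPOINT DEFAULT_CMD [RUN_ARGS]
    entrypoint=$1; cmd=$2; run_args=${3:-}
    if [ -n "$run_args" ]; then
        echo "$entrypoint $run_args"   # docker run arguments replace CMD
    else
        echo "$entrypoint $cmd"        # CMD supplies the defaults
    fi
}

final_command "python app.py" "--port 8080"                # python app.py --port 8080
final_command "python app.py" "--port 8080" "--port 9090"  # python app.py --port 9090
```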
Logging
By default, container logs are written by the json-file driver to /var/lib/docker/containers/<container-id>/<container-id>-json.log. These files grow without bound unless rotation is configured, so set limits in /etc/docker/daemon.json to avoid disk exhaustion:
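{
  "log-driver": "json-file",
  "log-opts": {
    "max-size": "10m",
    "max-file": "3"
  }
}
Daemon‑level settings take effect after a daemon restart and only for containers created afterwards. To set the same limits for a single container, pass them at run time:
--log-driver json-file --log-opt max-size=10m --log-opt max-file=3
Alternatively, switch to journald or syslog for high‑volume services.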
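The rotation settings put a ceiling on disk use per container: max-size multiplied by max-file. A small sketch of that arithmetic follows; the helper names are hypothetical, and it assumes the k/m/g suffixes are binary multiples (1m = 1024 * 1024 bytes).

```shell
#!/bin/sh
# Sketch: worst-case disk use of one container's json-file logs.
size_to_bytes() {  # usage: size_to_bytes 10m
    num=${1%[kmgKMG]}                 # strip a trailing unit letter, if any
    case "$1" in
        *[kK]) echo $(( num * 1024 )) ;;
        *[mM]) echo $(( num * 1024 * 1024 )) ;;
        *[gG]) echo $(( num * 1024 * 1024 * 1024 )) ;;
        *)     echo "$1" ;;           # plain byte count
    esac
}

max_log_bytes() {  # usage: max_log_bytes MAX_SIZE MAX_FILE
    echo $(( $(size_to_bytes "$1") * $2 ))
}

max_log_bytes 10m 3   # 31457280 bytes, i.e. 30 MiB per container
```

Multiplying the result by the number of containers on a host gives a rough upper bound for log storage under these settings.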
Networking
Docker provides five network modes:
bridge (default) – creates virtual bridge docker0 with NAT.
host – container shares the host network namespace (no isolation).
overlay – virtual network across multiple Docker daemons (used by Swarm).
macvlan – assigns a real MAC address, making the container appear as a physical host on the LAN.
none – disables networking.
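Port mapping (-p host:container) is implemented with iptables NAT rules: a DNAT rule in the DOCKER chain rewrites traffic arriving on the host port to the container's IP and port, and on bridge networks a MASQUERADE rule in POSTROUTING rewrites the source address of outbound container traffic so that it appears to come from the host.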
# View Docker‑added NAT rules
iptables -t nat -L -n -v | grep DOCKER
# Verify mapping for a container
docker port my_nginx
Network Troubleshooting
List networks: docker network ls and inspect with docker network inspect <name>.
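Inspect a container’s network settings: docker inspect <container> --format '{{json .NetworkSettings.Networks}}'
Test connectivity from inside the container:
docker exec -it my_nginx ping -c 3 8.8.8.8
docker exec -it my_nginx nslookup google.com
If the image does not ship ping or nslookup, run them from a helper container that shares the target's network namespace, for example docker run --rm --network container:my_nginx busybox ping -c 3 8.8.8.8
Check DNS configuration: cat /etc/resolv.conf inside the container; override with --dns if needed.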
Verify port mapping on the host: curl http://localhost:8080.
Ensure IP forwarding is enabled on the host: cat /proc/sys/net/ipv4/ip_forward (should be 1).
Common network failure scenarios:
Container cannot reach the external network – check DNS servers and iptables NAT rules.
Port mapping does not work – verify that the host port is free and that firewall rules allow traffic.
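Cross‑host communication fails – requires an overlay network or a third‑party plugin (e.g., Calico, Flannel). For Swarm overlay networks, open 2377/tcp (cluster management), 7946/tcp and 7946/udp (node discovery), and 4789/udp (VXLAN data traffic) between hosts.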
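The IP forwarding check from the troubleshooting list can be wrapped in a function for use in a host pre‑flight script. This is a minimal sketch: ip_forward_status is a hypothetical name, and the file path is a parameter only so that the check can be tried against a test file.

```shell
#!/bin/sh
# Sketch: report whether IPv4 forwarding is on.
ip_forward_status() {  # usage: ip_forward_status [FILE]
    file=${1:-/proc/sys/net/ipv4/ip_forward}
    if [ ! -r "$file" ]; then
        echo unknown                  # not Linux, or /proc not readable
    elif [ "$(cat "$file")" = "1" ]; then
        echo enabled
    else
        echo disabled
    fi
}

ip_forward_status
```

If the result is disabled, forwarding can be switched on with sysctl -w net.ipv4.ip_forward=1 (as root).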
Storage and Volumes
Three volume types:
Named volume (managed by Docker):
docker volume create my_data
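# The mysql image will not start without a root password setting; change-me is a placeholder
docker run -d -e MYSQL_ROOT_PASSWORD=change-me -v my_data:/var/lib/mysql mysql:8.0
Stored under /var/lib/docker/volumes/<name>/_data and persists beyond container lifetimes.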
Bind mount (host directory):
docker run -d -e MYSQL_ROOT_PASSWORD=change-me -v /data/mysql:/var/lib/mysql mysql:8.0
Useful for direct host access but ties the container to a specific host path.
tmpfs (in‑memory):
docker run -d --tmpfs /tmp:rw,size=512m,mode=1777 myapp
Data disappears when the container stops; suited for transient or sensitive data.
Backup a named volume:
docker run --rm -v mysql_data:/data -v /backup:/backup alpine \
    tar czf /backup/mysql_data.tar.gz -C /data .
Volume management commands:
# List all volumes
docker volume ls
# Inspect a volume
docker volume inspect mysql_data
# Remove an unused volume (high risk – data loss)
docker volume rm mysql_data
# Prune dangling volumes
docker volume prune
Security Hardening
Run containers as a non‑root user:
# Dockerfile
RUN useradd -m -s /bin/bash appuser
USER appuser
Or at runtime: docker run --user 1000:1000 …
Drop unnecessary Linux capabilities and add only required ones:
# Remove all capabilities (strictest)
docker run --cap-drop all …
# Add a specific capability (e.g., NET_ADMIN)
docker run --cap-add NET_ADMIN …
Set hard resource limits to prevent DoS:
# Memory limit (no swap)
docker run --memory=512m --memory-swap=512m …
# CPU limit
docker run --cpus=1.5 …
# I/O throttling (example for /dev/sda)
docker run --device-read-bps=/dev/sda:10mb --device-write-bps=/dev/sda:10mb …
Scan images for known vulnerabilities (CI/CD integration):
docker run --rm -v /var/run/docker.sock:/var/run/docker.sock aquasec/trivy image nginx:1.25
Enable Docker Content Trust for signed images:
export DOCKER_CONTENT_TRUST=1
docker trust sign myapp:1.0
Common Fault Diagnosis
Container start failure: check docker logs, the exit code (docker inspect --format '{{.State.ExitCode}}' <container>), missing files, port conflicts, or OOM events (dmesg | grep -i killed).
Restart loops: inspect healthcheck status (docker inspect --format '{{.State.Health.FailingStreak}}' <container>) and exit codes. Common causes are failing healthchecks, missing dependencies, or configuration errors.
Disk space exhaustion:
# Overview
docker system df
# Detailed view
docker system df -v
# Remove stopped containers, unused networks, unused images, build cache, and unused volumes
docker system prune -a --volumes
Back up any data you still need before pruning.
DNS resolution failure: verify /etc/resolv.conf inside the container, test with nslookup, and use --dns 8.8.8.8 if the host DNS is unreliable.
Performance bottlenecks: monitor with docker stats, check cgroup I/O stats (/sys/fs/cgroup/blkio/docker/<id>/blkio.throttle.io_service_bytes on cgroup v1; on cgroup v2 hosts, io.stat in the container's cgroup directory), and verify file descriptor limits (ulimit -n inside the container).
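Exit codes are often the quickest clue in the diagnoses above. The lookup below is a sketch under stated assumptions: explain_exit_code is a hypothetical helper, values above 128 are read using the common shell convention of 128 plus the signal number, and 125 to 127 follow the meanings documented for docker run.

```shell
#!/bin/sh
# Sketch: translate a container exit code into a likely cause.
explain_exit_code() {  # usage: explain_exit_code CODE
    code=$1
    case "$code" in
        0)   echo "exited normally" ;;
        125) echo "docker run itself failed (daemon error)" ;;
        126) echo "command found but could not be invoked" ;;
        127) echo "command not found" ;;
        137) echo "killed by SIGKILL (128+9), e.g. OOM killer or docker kill" ;;
        143) echo "terminated by SIGTERM (128+15), e.g. docker stop" ;;
        *)
            if [ "$code" -gt 128 ]; then
                echo "terminated by signal $(( code - 128 ))"
            else
                echo "application error, check docker logs"
            fi ;;
    esac
}

explain_exit_code 137
explain_exit_code 1
```

It can be fed directly from the inspect command shown earlier, for example explain_exit_code "$(docker inspect --format '{{.State.ExitCode}}' my_nginx)".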
Docker Compose Advanced Usage
Typical docker-compose.yml for a web stack:
version: "3.9"
services:
  web:
    image: nginx:1.25-alpine
    ports:
      - "80:80"
    volumes:
      - ./html:/usr/share/nginx/html:ro
      - ./nginx.conf:/etc/nginx/nginx.conf:ro
    depends_on:
      - api
    networks:
      - frontend
    restart: unless-stopped
    deploy:
      resources:
        limits:
          cpus: "1.0"
          memory: 512M
  api:
    image: python:3.11-slim
    command: ["python","app.py"]
    env_file: .env
    depends_on:
      redis:
        condition: service_healthy
    networks:
      - frontend
      - backend
    restart: unless-stopped
  redis:
    image: redis:7-alpine
    command: ["redis-server","--requirepass","${REDIS_PASSWORD}"]
    volumes:
      - redis_data:/data
    networks:
      - backend
    healthcheck:
      test: ["CMD","redis-cli","ping"]
      interval: 5s
      timeout: 3s
      retries: 3
    restart: unless-stopped
networks:
  frontend:
    driver: bridge
  backend:
    driver: bridge
volumes:
  redis_data:
Store secrets in a .env file (excluded from VCS) and reference them with env_file or ${VAR} syntax.
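Because a variable referenced with ${VAR} syntax is easy to leave unset, a pre‑flight check on the .env file can catch the problem before the stack starts. The following is a minimal sketch: the helper names are hypothetical, and it assumes simple KEY=value lines (quoting and export prefixes are not handled).

```shell
#!/bin/sh
# Sketch: fail early if a required variable has no non-empty value in .env.
env_file_has() {  # usage: env_file_has FILE VAR
    grep -Eq "^$2=.+" "$1"
}

check_required() {  # usage: check_required FILE VAR...
    file=$1; shift
    missing=0
    for var in "$@"; do
        if ! env_file_has "$file" "$var"; then
            echo "missing or empty: $var" >&2
            missing=1
        fi
    done
    return "$missing"
}
```

A typical use would be check_required .env REDIS_PASSWORD && docker-compose up -d, so the stack only starts when the password is defined.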
Production Deployment Checklist
Tag images with immutable versions (avoid :latest).
Push to a private registry over TLS.
Run vulnerability scans (e.g., Trivy) and enforce a policy that blocks images with high‑severity CVEs.
Backup persistent volumes before upgrades.
Deploy new version alongside the old one, verify health, then switch traffic.
Record deployment timestamps for audit.
echo "$(date '+%Y-%m-%d %H:%M:%S') - Deployed myapp:2.0.0" >> /var/log/deploy.log
Rollback by redeploying the previous tag or restoring the volume backup.
# Simple rollback with Docker Compose (set the image tag in docker-compose.yml back to 1.0.0 first)
docker-compose down
docker pull registry.example.com/myapp:1.0.0
docker-compose up -d
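The "verify health, then switch traffic" step in the checklist can be automated with a retry loop around any health check command. This is a sketch, not a complete deployment script: wait_until_ok is a hypothetical helper, and the check command is whatever fits the service.

```shell
#!/bin/sh
# Sketch: poll a health check command until it succeeds or attempts run out.
wait_until_ok() {  # usage: wait_until_ok ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
    attempts=$1; delay=$2; shift 2
    i=1
    while [ "$i" -le "$attempts" ]; do
        if "$@" >/dev/null 2>&1; then
            return 0                  # healthy: safe to switch traffic
        fi
        i=$(( i + 1 ))
        if [ "$i" -le "$attempts" ]; then
            sleep "$delay"            # wait before the next attempt
        fi
    done
    return 1                          # never became healthy: roll back
}
```

For example, wait_until_ok 10 3 curl -fsS http://localhost:8080/ retries up to ten times, three seconds apart; the URL here is only an example and should point at the new version's health endpoint.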
Ops Community
A leading IT operations community where professionals share and grow together.