Container Runtime Internals and Latest Trends (2023)
This article reviews container fundamentals; explains Docker’s architecture and the Linux namespaces, cgroups, capabilities, and security mechanisms beneath it; and then surveys developments up to 2023, including containerd, CRI‑O, rootless containers, alternative runtimes such as Podman, Kata, and gVisor, and emerging WebAssembly‑based approaches.
The article introduces containers as lightweight isolation mechanisms for file systems, CPU, memory, and permissions, noting their efficiency compared to virtual machines and the lack of a strict definition.
Docker is presented as the most popular container engine, supporting Linux containers. A typical Docker run command is shown:
docker run -p 8080:80 -v .:/usr/share/nginx/html nginx:1.25
The command maps host port 8080 to container port 80 and mounts the current directory into the container’s /usr/share/nginx/html directory, using the official nginx:1.25 image from Docker Hub.
Building custom images with a Dockerfile is demonstrated:
FROM debian:12
RUN apt-get update && apt-get install -y openjdk-17-jre
COPY myapp.jar /myapp.jar
CMD ["java", "-jar", "/myapp.jar"]
Kubernetes is briefly described as a cluster manager that groups container hosts and provides load‑balancing and fault‑tolerance, interacting with Pods, Services, and other objects.
The history of containers is outlined, from early implementations such as FreeBSD Jail (1999) and Linux Virtuozzo (2000) to Docker (2013), highlighting the shift from monolithic VM‑like containers to single‑service, immutable containers.
Docker’s internal architecture consists of the docker CLI, the dockerd daemon, containerd, and the runc runtime, which leverages Linux kernel features like namespaces, cgroups, and capabilities. The article explains the role of each namespace (mount, network, PID, user, IPC, UTS, optional cgroup and time namespaces) and how they isolate resources.
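To make the namespace list concrete, the following is a minimal sketch of how a process's namespace membership can be inspected on Linux through the symbolic links under /proc/<pid>/ns. The function names (parse_ns_link, list_namespaces) are hypothetical helpers introduced here for illustration, not part of Docker or runc.

```python
import os


def parse_ns_link(target):
    """Split a link target such as 'pid:[4026531836]' into ('pid', 4026531836)."""
    kind, _, rest = target.partition(":")
    return kind, int(rest.strip("[]"))


def list_namespaces(pid="self", proc_root="/proc"):
    """Map each entry under /proc/<pid>/ns to the inode number of its namespace."""
    ns_dir = os.path.join(proc_root, str(pid), "ns")
    namespaces = {}
    for entry in sorted(os.listdir(ns_dir)):
        # Each entry is a symlink whose target names the namespace type and inode.
        _kind, inode = parse_ns_link(os.readlink(os.path.join(ns_dir, entry)))
        namespaces[entry] = inode
    return namespaces
```

On a Linux host, calling list_namespaces() inside and outside a container and comparing the inode numbers shows which namespaces differ: two processes are in the same namespace of a given type when the inode numbers match.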
Cgroups enforce resource quotas (CPU, memory, I/O, process count) and device access restrictions. Docker’s default configuration permits access to /dev/null, /dev/zero, and /dev/urandom while blocking devices such as /dev/sda and /dev/mem.
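As an illustration of how these quotas surface to a process, the sketch below parses the contents of cgroup v2 interface files such as cpu.max, memory.max, and pids.max. It assumes a cgroup v2 host; the helper names are hypothetical, and the sample values are the kind one would expect for a container limited to half a CPU and 512 MiB of memory, not output captured from a real system.

```python
def parse_cpu_max(text):
    """Parse cgroup v2 cpu.max ('<quota> <period>') into a CPU count.

    Returns None when the quota is 'max', meaning no CPU limit is set.
    """
    quota, period = text.split()
    if quota == "max":
        return None
    return int(quota) / int(period)


def parse_limit(text):
    """Parse single-value files such as memory.max or pids.max.

    Returns None for 'max' (unlimited), otherwise the integer limit.
    """
    value = text.strip()
    return None if value == "max" else int(value)


# Illustrative file contents (assumed, not measured):
cpus = parse_cpu_max("50000 100000\n")    # quota of 50 ms per 100 ms period
memory = parse_limit("536870912\n")       # 512 MiB expressed in bytes
```

Inside a container these files typically appear under /sys/fs/cgroup, so the same functions can be applied to their contents to discover the limits in force.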
Linux capabilities are discussed, noting that Docker removes broad capabilities like CAP_SYS_ADMIN but retains others such as CAP_CHOWN, CAP_NET_BIND_SERVICE, and CAP_NET_RAW. Optional security layers like Seccomp, AppArmor, and SELinux are also covered.
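The capability set of a process is exposed as a hexadecimal bitmask on the CapEff line of /proc/<pid>/status. The sketch below decodes such a mask for the four capabilities named above, using their bit positions from the Linux capability headers. The mask 00000000a80425fb is used as an example of what a default Docker container commonly reports; treat that value as an assumption, since it varies with Docker version and flags. The helper names are hypothetical.

```python
# Bit positions of the capabilities mentioned in the text (from linux/capability.h).
CAP_BITS = {
    "CAP_CHOWN": 0,
    "CAP_NET_BIND_SERVICE": 10,
    "CAP_NET_RAW": 13,
    "CAP_SYS_ADMIN": 21,
}


def read_effective_mask(status_text):
    """Extract the CapEff hex string from the text of /proc/<pid>/status."""
    for line in status_text.splitlines():
        if line.startswith("CapEff:"):
            return line.split()[1]
    return None


def has_capability(mask_hex, name):
    """Return True if the named capability's bit is set in the hex mask."""
    return bool((int(mask_hex, 16) >> CAP_BITS[name]) & 1)


# Assumed example mask for a default Docker container (not measured here).
docker_default = "00000000a80425fb"
summary = {name: has_capability(docker_default, name) for name in CAP_BITS}
```

With this example mask, the three retained capabilities decode as present and CAP_SYS_ADMIN as absent, which matches the behavior described in the paragraph above.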
Rootless containers place the runtime and containers inside a user namespace created by a non‑root user, reducing the impact of potential escapes. The evolution of rootless support—from LXC (2014) to Docker v20.10 (2020)—is summarized.
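The reason an escape from a rootless container is less damaging can be seen in the user namespace's ID mapping, which the kernel exposes in /proc/<pid>/uid_map as lines of the form "inside-ID outside-ID length". The sketch below translates a container UID to the host UID it corresponds to. The sample mapping (host user 1000 with a subordinate range starting at 100000) is an assumed, typical rootless setup, and the function names are hypothetical.

```python
def parse_id_map(text):
    """Parse uid_map/gid_map text into a list of (inside, outside, length) tuples."""
    entries = []
    for line in text.splitlines():
        if line.strip():
            inside, outside, length = (int(field) for field in line.split())
            entries.append((inside, outside, length))
    return entries


def to_host_id(entries, container_id):
    """Translate an ID inside the user namespace to the ID seen on the host.

    Returns None if the ID is not covered by any mapping entry.
    """
    for inside, outside, length in entries:
        if inside <= container_id < inside + length:
            return outside + (container_id - inside)
    return None


# Assumed mapping for a rootless container started by host UID 1000.
sample_map = parse_id_map("0 1000 1\n1 100000 65536\n")
root_on_host = to_host_id(sample_map, 0)    # container root is the unprivileged user
www_on_host = to_host_id(sample_map, 33)    # other IDs fall in the subordinate range
```

Under this mapping, UID 0 inside the container is only UID 1000 on the host, so a process that breaks out holds the privileges of an ordinary user rather than root.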
Recent trends include Kubernetes moving away from Docker (its built‑in Docker integration, dockershim, was removed in v1.24, 2022) toward runtimes such as containerd and CRI‑O, with major cloud providers adopting these alternatives. Alternative tools such as Podman, nerdctl, and Lima are mentioned for local development on macOS/Windows.
Docker’s image subsystem is being modernized: Docker v24 (2023) adds experimental support for the containerd snapshotter via /etc/docker/daemon.json:
{"features":{"containerd-snapshotter": true}}
Lazy‑pulling techniques (eStargz, SOCI, Nydus, OverlayBD) aim to reduce startup latency by fetching image layers on demand, with benchmark results showing up to a nine‑fold speedup.
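Regarding the daemon.json setting shown above: an existing configuration file usually contains other keys, so the feature flag should be merged in instead of overwriting the file. The following is a minimal sketch of such a merge operating on the file's text; the function name is hypothetical, and the "log-driver" key in the usage example stands in for whatever settings a real file might already hold.

```python
import json


def enable_containerd_snapshotter(config_text):
    """Return daemon.json text with features.containerd-snapshotter set to true.

    Existing keys, including other entries under "features", are preserved.
    An empty or whitespace-only input is treated as an empty configuration.
    """
    config = json.loads(config_text) if config_text.strip() else {}
    features = config.setdefault("features", {})
    features["containerd-snapshotter"] = True
    return json.dumps(config, indent=2)


# Usage with an assumed pre-existing setting:
updated = enable_containerd_snapshotter('{"log-driver": "json-file"}')
```

The function only transforms text; writing the result back to /etc/docker/daemon.json and restarting the daemon are left to the operator, since both require root privileges.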
Adoption of user namespaces is expected to grow as kernel support (e.g., id‑mapped mounts in v5.12) matures, and runc v1.2 will enable them by default.
Emerging security mechanisms such as Landlock (Linux v5.13) provide unprivileged, path‑based access control similar to AppArmor but without requiring root.
Alternative container runtimes like Kata Containers (VM‑based), gVisor (user‑mode kernel), and WebAssembly/WASI (via the runwasi plugin for containerd) are explored as potential successors or complements to traditional containers.
The article concludes with a summary of key points: containers are efficient but less secure than VMs, a growing ecosystem of Docker alternatives exists, and “non‑container” runtimes (Kata, gVisor, WebAssembly) represent notable trends.
System Architect Go