How to Securely Run Business Containers as Non‑Root: Practical Docker & Kubernetes Techniques

This article shares practical experience on converting business containers to run without root privileges, covering the importance of non‑root execution, essential security concepts, Dockerfile USER settings, entrypoint scripts, handling machine‑UUID access, and concrete examples for CoreDNS, Consul, MySQL, Redis, etc.

Open Source Linux

Dec 19, 2023

This article summarizes practical experience of converting business containers to non‑root startup, emphasizing its importance and basic knowledge.

Origin

The customer's security requirements mandate that business containers run without root. Many of these containers need to run ipset or iptables, so rootless Docker alone cannot solve the problem. The goal is therefore to run every business container process as a non-root user.

The previous article, "Container is fast but not safe, Rootless is the answer", introduced the risks of running Docker as root.

Transformation

Prerequisite Knowledge

Basic concepts include why using root is unsafe and examples of root risks.

Examples of Root Insecurity

Although Linux provides user namespaces, Docker does not support per-container UID mapping the way Podman does, so root inside a container can still modify files mounted from the host. A typical accident is a stray rm -rf *:

docker run --rm -v /mnt/sda1:/mnt/sda1 -it alpine
cp /mnt/sda1/somefile.tar.gz .
tar xzvf somefile.tar.gz
cd somefile-v1.0
ls
# ...
cd ..
rm -rf *

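Alpine's default working directory is /, so after cd .. the shell is back in / and rm -rf * behaves like rm -rf /*: it deletes everything it can reach, including the files under the mounted /mnt/sda1. Business containers must therefore run their processes with the least privilege they need.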
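One way to limit the damage in a scenario like this is to start the container as an unprivileged user, with all capabilities dropped and the host directory mounted read-only. The sketch below only prints such a command line instead of running it. The function name hardened_run_cmd is hypothetical, and /tmp is assumed to be writable by any user in the image, as it is in Alpine.

```shell
#!/bin/sh
# Print (do not run) a docker command that starts the image as the calling
# user, drops all capabilities, and mounts the host directory read-only.
# hardened_run_cmd is a hypothetical helper name used for illustration.
hardened_run_cmd() {
  host_dir=$1
  image=$2
  printf 'docker run --rm -it --user %s:%s --cap-drop ALL -w /tmp -v %s:%s:ro %s\n' \
    "$(id -u)" "$(id -g)" "$host_dir" "$host_dir" "$image"
}

hardened_run_cmd /mnt/sda1 alpine
```

With these options the same rm -rf * typed in / can neither remove root-owned files in the image nor touch the read-only mount.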

Choosing USER vs docker‑entrypoint.sh

For simple processes such as exporters, set USER in the Dockerfile or pass -u user:group at run time. Examples include:

danielqsj/kafka_exporter

ClickHouse/clickhouse_exporter

kubernetes addonresizer
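For a simple exporter, the USER approach might look like the sketch below, which writes a sample Dockerfile into a scratch directory. Everything in it is an assumption for illustration: the Alpine base image, the fixed IDs 10001:10001, the user name app, and the binary name my_exporter are hypothetical and not taken from the projects listed above.

```shell
#!/bin/sh
set -e
# Write an illustrative Dockerfile to a temporary directory.
workdir=$(mktemp -d)
cat > "$workdir/Dockerfile" <<'EOF'
FROM alpine:3.19
# Hypothetical fixed IDs so ownership is predictable across hosts.
RUN addgroup -g 10001 app && adduser -D -H -u 10001 -G app app
# my_exporter is a placeholder for the real exporter binary.
COPY my_exporter /usr/local/bin/my_exporter
# Every process in the container now starts as this user.
USER 10001:10001
ENTRYPOINT ["/usr/local/bin/my_exporter"]
EOF

# Show the line that drops root.
grep '^USER' "$workdir/Dockerfile"
```

For an image that does not set USER, passing -u 10001:10001 to docker run has the same effect at run time.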

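For containers with persistent data directories (e.g., MySQL, Redis), setting USER alone is not enough: the data directory must already be writable by the container's UID/GID when the process starts, so ownership has to match between the host and the image. The data directory can come from several sources: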

Direct -v mount or Docker volume

K8s hostPath

Fixed PV

PVC under a StorageClass

Deploying on another K8s cluster

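Adjusting directory permissions ahead of time for every one of these cases breaks automation, especially when an upstream image changes its UID/GID. An entrypoint script that starts as root, fixes ownership, and then drops to the non-root user is therefore preferred.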

The official MySQL image creates a dedicated mysql user with a fixed UID/GID and starts through ENTRYPOINT plus CMD (command and args in K8s), i.e. docker-entrypoint.sh mysqld. The Redis image works the same way; its entrypoint script (simplified):

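#!/bin/sh
set -e
# If the first argument is a flag or a .conf file, prepend redis-server.
if [ "${1#-}" != "$1" ] || [ "${1%.conf}" != "$1" ]; then
  set -- redis-server "$@"
fi
# When started as root: fix ownership of the working (data) directory,
# then re-run this script as the redis user.
if [ "$1" = 'redis-server' -a "$(id -u)" = '0' ]; then
  find . \! -user redis -exec chown redis '{}' +
  exec gosu redis "$0" "$@"
fi
# set appropriate umask
um="$(umask)"
if [ "$um" = '0022' ]; then
  umask 0077
fi
exec "$@"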

docker top shows the container's processes with the UID as seen from the host. gosu (or su-exec) switches to the non-root user and then execs the target command instead of forking it, so the service stays PID 1 and receives signals directly.
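The ownership check that such an entrypoint performs can be tried out without root. The sketch below uses the same find ... \! -user test as the Redis script; the helper name needs_chown and the sample file name are hypothetical, and a numeric UID is passed so the check does not depend on a passwd entry.

```shell
#!/bin/sh
# Exit 0 when at least one entry under DIR is NOT owned by the numeric UID,
# i.e. when an entrypoint running as root would still have to chown it.
# needs_chown is a hypothetical helper name used for illustration.
needs_chown() {
  dir=$1
  uid=$2
  [ -n "$(find "$dir" \! -user "$uid" 2>/dev/null | head -n 1)" ]
}

data_dir=$(mktemp -d)
touch "$data_dir/dump.rdb"

if needs_chown "$data_dir" "$(id -u)"; then
  echo "ownership fix needed"
else
  echo "ownership already correct"
fi
```

Checking before changing ownership keeps restarts cheap on large data directories, since only files with the wrong owner are touched.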

Case Studies

Key practices:

Place PID and socket files under /tmp

Grant write permission to /dev/std* if needed

Create users with a fixed uid:gid so that file ownership stays consistent between the host and the images

Avoid chmod -R 777 on directories
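To spot leftovers of an earlier chmod -R 777, a data directory can be scanned for world-writable entries. This is a minimal sketch; the helper name list_world_writable and the sample file names are hypothetical.

```shell
#!/bin/sh
# List entries under DIR that are writable by "other".
# Symlinks are skipped because their mode bits are not meaningful.
# list_world_writable is a hypothetical helper name used for illustration.
list_world_writable() {
  find "$1" \! -type l -perm -0002 2>/dev/null
}

scan_dir=$(mktemp -d)          # mktemp -d creates the directory with mode 0700
touch "$scan_dir/ok.conf" "$scan_dir/loose.pid"
chmod 0640 "$scan_dir/ok.conf"
chmod 0666 "$scan_dir/loose.pid"

# Only loose.pid is reported.
list_world_writable "$scan_dir"
```

An empty result means no file depends on world-writable permissions, so ownership by a fixed uid:gid is enough.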

Machine‑UUID Handling

Reading the machine UUID with dmidecode -s system-uuid fails in a container that runs without root. Instead, read /sys/firmware/dmi/tables/DMI after granting read permission on it, or use a Go library to parse the SMBIOS data:

package main

import (
  "fmt"
  "log"

  "github.com/digitalocean/go-smbios/smbios"
)

func main() {
  // Open the raw SMBIOS data stream.
  rc, _, err := smbios.Stream()
  if err != nil {
    log.Fatalf("failed to open stream: %v", err)
  }
  defer rc.Close()

  d := smbios.NewDecoder(rc)
  ss, err := d.Decode()
  if err != nil {
    log.Fatalf("failed to decode structures: %v", err)
  }
  for _, s := range ss {
    // Type 1 is System Information; skip structures too short to hold a UUID.
    if s.Header.Type != 1 || len(s.Formatted) < 20 {
      continue
    }
    // The first three UUID fields are stored little-endian, so their bytes
    // are swapped. %02X keeps the leading zero of each byte.
    fmt.Printf("UUID: %02X%02X%02X%02X-%02X%02X-%02X%02X-%02X%02X-%02X%02X%02X%02X%02X%02X\n",
      s.Formatted[7], s.Formatted[6], s.Formatted[5], s.Formatted[4],
      s.Formatted[9], s.Formatted[8], s.Formatted[11], s.Formatted[10],
      s.Formatted[12], s.Formatted[13], s.Formatted[14], s.Formatted[15],
      s.Formatted[16], s.Formatted[17], s.Formatted[18], s.Formatted[19])
  }
}

Mount the host /sys/firmware/dmi/tables into the container and adjust permissions before invoking the Go binary.
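The byte order used when printing the UUID can be checked without any hardware access. The sketch below takes the 16 raw UUID bytes as 32 hex characters and prints them in the same order as the Go program: the first three fields byte-swapped, the rest as stored. The helper name smbios_uuid_from_hex and the sample input are hypothetical.

```shell
#!/bin/sh
# Convert 16 raw SMBIOS UUID bytes (given as 32 hex characters) into the
# textual UUID, swapping the bytes of the first three fields.
# smbios_uuid_from_hex is a hypothetical helper name used for illustration.
smbios_uuid_from_hex() {
  hex=$1
  [ "${#hex}" -eq 32 ] || { echo "expected 32 hex characters" >&2; return 1; }
  # b N prints byte number N (0-based) of $hex as two hex characters.
  b() { printf '%s' "$hex" | cut -c "$(( $1 * 2 + 1 ))-$(( $1 * 2 + 2 ))"; }
  printf '%s%s%s%s-%s%s-%s%s-%s%s-%s%s%s%s%s%s\n' \
    "$(b 3)" "$(b 2)" "$(b 1)" "$(b 0)" \
    "$(b 5)" "$(b 4)" \
    "$(b 7)" "$(b 6)" \
    "$(b 8)" "$(b 9)" \
    "$(b 10)" "$(b 11)" "$(b 12)" "$(b 13)" "$(b 14)" "$(b 15)"
}

smbios_uuid_from_hex 00112233445566778899AABBCCDDEEFF
# prints 33221100-5544-7766-8899-AABBCCDDEEFF
```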

CoreDNS

CoreDNS 1.11.0 supports non-root, but the project uses 1.10.1. A custom image therefore runs the binary as a non-root user and sets the cap_net_bind_service file capability on it so that it can still bind port 53.

ARG DEBIAN_IMAGE=debian:stable-slim
ARG BASE=gcr.io/distroless/static-debian12:nonroot
FROM coredns/coredns:1.10.1 AS bin
FROM ${DEBIAN_IMAGE} AS build
SHELL ["/bin/sh", "-ec"]
RUN export DEBCONF_NONINTERACTIVE_SEEN=true DEBIAN_FRONTEND=noninteractive DEBIAN_PRIORITY=critical TERM=linux \
    && apt-get -qq update && apt-get -yyqq upgrade && apt-get -yyqq install ca-certificates libcap2-bin && apt-get clean
COPY --from=bin /coredns /coredns
RUN setcap cap_net_bind_service=+ep /coredns
FROM ${BASE}
COPY --from=build /etc/ssl/certs/ca-certificates.crt /etc/ssl/certs/
COPY --from=build /coredns /coredns
USER nonroot:nonroot
EXPOSE 53 53/udp
ENTRYPOINT ["/coredns"]

Build the image with BuildKit, which preserves the file capability set in the build stage; otherwise the capability is lost and the non-root binary cannot bind to port 53.
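Whether a port needs cap_net_bind_service at all depends on the kernel's unprivileged port threshold. The sketch below compares a port against that threshold. It assumes the sysctl file /proc/sys/net/ipv4/ip_unprivileged_port_start is present (it is on kernels that expose this setting) and falls back to the traditional 1024 otherwise. The helper name port_needs_cap is hypothetical.

```shell
#!/bin/sh
# Exit 0 when PORT lies below the unprivileged port threshold, i.e. when a
# non-root process needs cap_net_bind_service to bind it.
# An optional second argument overrides the threshold.
# port_needs_cap is a hypothetical helper name used for illustration.
port_needs_cap() {
  port=$1
  start=${2:-}
  if [ -z "$start" ]; then
    if [ -r /proc/sys/net/ipv4/ip_unprivileged_port_start ]; then
      start=$(cat /proc/sys/net/ipv4/ip_unprivileged_port_start)
    else
      start=1024
    fi
  fi
  [ "$port" -lt "$start" ]
}

if port_needs_cap 53 1024; then
  echo "port 53 needs cap_net_bind_service"
fi
if ! port_needs_cap 8053 1024; then
  echo "port 8053 can be bound without extra capabilities"
fi
```

Calling it without the second argument uses the threshold of the current environment, which can differ between the host and a container's network namespace.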

Consul

The official Consul image starts as root. To avoid root, patch the entrypoint so that chown runs with -R and dumb-init is removed from the first line of the script, so the PID 1 process runs as the non-root user.

ARG VER=1.8.3
FROM consul:${VER}
RUN sed -ri -e 's/(chown)(\s+consul:)/\1 -R\2/' \
    -e '1s@/usr/bin/dumb-init\s+@@' /usr/local/bin/docker-entrypoint.sh
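The effect of the two sed expressions can be previewed on sample input before building the image. The two input lines below are an assumed excerpt of the upstream docker-entrypoint.sh, used only for illustration, and the function name patch_entrypoint is hypothetical. The -r flag and \s are used exactly as in the Dockerfile, so this assumes GNU or BusyBox sed.

```shell
#!/bin/sh
# Apply the same two substitutions as the Dockerfile, reading from stdin.
# patch_entrypoint is a hypothetical helper name used for illustration.
patch_entrypoint() {
  sed -r -e 's/(chown)(\s+consul:)/\1 -R\2/' \
         -e '1s@/usr/bin/dumb-init\s+@@'
}

# Assumed sample lines, not the full upstream script.
printf '%s\n' \
  '#!/usr/bin/dumb-init /bin/sh' \
  'chown consul:consul /consul/data' | patch_entrypoint
# prints:
# #!/bin/sh
# chown -R consul:consul /consul/data
```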

Docker Socket Access

Processes needing the host Docker socket (e.g., cAdvisor) must run as a user that belongs to the socket's group. The entrypoint script adds the user to the appropriate group and then execs the target binary via gosu or su‑exec.

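#!/bin/sh
set -e
[ -z "$D_SOCK" ] && D_SOCK=/var/run/docker.sock
# If the first argument is a flag, prepend the cadvisor command.
if [ "${1#-}" != "$1" ]; then
  set -- cadvisor "$@"
fi
if [ "$1" = 'cadvisor' ]; then
  # Drop privileges only when started as root and RUN_USER is set.
  if [ "$(id -u)" = '0' ] && [ -n "$RUN_USER" ]; then
    if [ -S "$D_SOCK" ]; then
      # Make sure a group with the socket's GID exists in the container.
      group_id=$(stat -c "%g" "$D_SOCK")
      getent group | cut -d: -f3 | grep -qw "$group_id" || addgroup -g "$group_id" docker
      # Add RUN_USER to that group so it can access the socket.
      group_name=$(stat -c "%G" "$D_SOCK")
      id -nG "$RUN_USER" | grep -qw "$group_name" || adduser "$RUN_USER" "$group_name"
    fi
    exec gosu "$RUN_USER" "$@"
  fi
fi
exec "$@"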

Cron Replacement

Non‑root containers cannot use the traditional cron daemon; instead, go‑crond can be used.

References

V2EX discussion on dangerous rm -rf *

GitHub PRs for kafka_exporter, clickhouse_exporter, addonresizer

MySQL Docker entrypoint script

gosu and su‑exec projects

Kernel source for DMI permission

Wurstmeister Kafka Docker

go‑crond project
