How to Secure Containers: Complete Guide to Image Scanning and Zero‑Trust Runtime Protection

This guide walks through securing production container environments: image vulnerability scanning with Trivy, image signing with Cosign, Kubernetes hardening with SecurityContext and Pod Security Admission, runtime protection with Falco, network isolation via NetworkPolicy, and continuous monitoring with Prometheus. It includes scripts, CI/CD integration, and troubleshooting tips.

Raymond Ops

Scope and Prerequisites

Target environments: production containers, CI/CD pipelines, image supply chain, and runtime protection. Requirements: RHEL 8+/Ubuntu 20.04+ with kernel ≥ 5.4; Docker 20.10+, containerd 1.6+, or CRI‑O 1.24+; Kubernetes 1.25+ (for Pod Security Admission); root or admin privileges; network access to the NVD and GitHub Advisory databases; and the tools Trivy 0.48+, Cosign 2.0+, and optionally Falco 0.36+.

Tool Versions

Trivy ≥ 0.48 (DB auto‑update)

Grype ≥ 0.74 (Syft ≥ 0.100)

Cosign ≥ 2.0 (publishes to the Rekor transparency log by default)

Falco ≥ 0.36 (kernel module or eBPF)

Harbor ≥ 2.8 (integrated Trivy)

Implementation Overview

1. Install and configure Trivy

RHEL/CentOS:

# Install Trivy
sudo rpm -ivh https://github.com/aquasecurity/trivy/releases/download/v0.48.3/trivy_0.48.3_Linux-64bit.rpm

# Verify
trivy --version

# Update vulnerability database (offline mode optional)
trivy image --download-db-only
export TRIVY_CACHE_DIR=/opt/trivy-db   # if using custom cache

Ubuntu/Debian:

wget https://github.com/aquasecurity/trivy/releases/download/v0.48.3/trivy_0.48.3_Linux-64bit.deb
sudo dpkg -i trivy_0.48.3_Linux-64bit.deb
trivy image --download-db-only --cache-dir /opt/trivy-db

Typical scan commands:

# Scan an image and show only critical/high
trivy image --severity CRITICAL,HIGH nginx:latest

# JSON output for CI
trivy image -f json -o nginx-scan.json nginx:latest
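
When a finding is accepted rather than fixed, Trivy can suppress it via a .trivyignore file in the scan directory. A sketch (the CVE ID and ticket reference are placeholders):

```shell
# Hypothetical .trivyignore: track every suppressed CVE in a ticket so the
# exception is revisited, not forgotten
cat > .trivyignore <<'EOF'
# Accepted risk until the base image updates (TICKET-123)
CVE-2023-12345
EOF

# Subsequent scans run from this directory skip the listed IDs, e.g.:
#   trivy image --severity CRITICAL,HIGH nginx:latest
```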

2. Dockerfile security baseline

Use a multi‑stage build with a distroless or minimal base image, and run as a non‑root user.

# Build stage
FROM golang:1.21-alpine AS builder
WORKDIR /app
COPY go.mod go.sum ./
RUN go mod download
COPY . .
RUN CGO_ENABLED=0 GOOS=linux go build -a -ldflags '-extldflags "-static"' -o myapp

# Runtime stage (distroless)
FROM gcr.io/distroless/static-debian12:nonroot
USER nonroot:nonroot
WORKDIR /app
COPY --from=builder --chown=nonroot:nonroot /app/myapp .
EXPOSE 8080
ENTRYPOINT ["/app/myapp"]

Verification:

# Build image
docker build -t myapp:secure .

# Scan for high/critical issues
trivy image --severity HIGH,CRITICAL myapp:secure

# Confirm non‑root user
docker inspect myapp:secure | jq '.[0].Config.User'
# Expected: "nonroot:nonroot"
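
Since distroless images ship no shell, another quick check is that an interactive shell really is absent (the first command is expected to fail):

```shell
# Overriding the entrypoint with /bin/sh should fail on a distroless image
docker run --rm --entrypoint /bin/sh myapp:secure -c 'echo shell available' \
  || echo "no shell in image (expected for distroless)"
```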

3. CI/CD integration (GitLab CI example)

stages:
  - build
  - scan
  - sign
  - deploy

variables:
  IMAGE_NAME: myapp
  IMAGE_TAG: $CI_COMMIT_SHORT_SHA
  REGISTRY: registry.example.com

build:
  stage: build
  image: docker:24
  services:
    - docker:24-dind
  script:
    - docker build -t $REGISTRY/$IMAGE_NAME:$IMAGE_TAG .
    - docker push $REGISTRY/$IMAGE_NAME:$IMAGE_TAG

security-scan:
  stage: scan
  image:
    name: aquasec/trivy:latest
    entrypoint: [""]
  script:
    - trivy image --exit-code 1 --severity CRITICAL $REGISTRY/$IMAGE_NAME:$IMAGE_TAG
    - trivy image --severity HIGH,CRITICAL --format json --output scan-report.json $REGISTRY/$IMAGE_NAME:$IMAGE_TAG
  allow_failure: false
  artifacts:
    reports:
      container_scanning: scan-report.json
    expire_in: 30 days

sign-image:
  stage: sign
  image: gcr.io/projectsigstore/cosign:v2.2
  script:
    - cosign sign --key cosign.key $REGISTRY/$IMAGE_NAME:$IMAGE_TAG
  only:
    - main

Key CI settings: --exit-code 1 aborts the pipeline on critical findings.

Artifacts store the JSON report for later audit.

Signing runs only on the main branch.

4. Image signing with Cosign

# Generate a key pair (store private key in CI secrets)
cosign generate-key-pair

# Sign an image
cosign sign --key cosign.key registry.example.com/myapp:v1.0

# Verify the signature
cosign verify --key cosign.pub registry.example.com/myapp:v1.0

Kubernetes admission control (OPA Gatekeeper + Sigstore Policy Controller) can enforce that only signed images are allowed. Example ConstraintTemplate and ClusterImagePolicy are omitted for brevity.
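
To make the signature gate concrete, here is a minimal ClusterImagePolicy sketch for the Sigstore Policy Controller (the image glob and key data are placeholders, not a tested policy):

```shell
cat > clusterimagepolicy.yaml <<'EOF'
apiVersion: policy.sigstore.dev/v1beta1
kind: ClusterImagePolicy
metadata:
  name: require-cosign-signature
spec:
  images:
  - glob: "registry.example.com/**"
  authorities:
  - key:
      data: |
        -----BEGIN PUBLIC KEY-----
        <contents of cosign.pub>
        -----END PUBLIC KEY-----
EOF
# Apply, then opt namespaces in:
#   kubectl apply -f clusterimagepolicy.yaml
#   kubectl label namespace production policy.sigstore.dev/include=true
```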

5. Kubernetes SecurityContext

apiVersion: v1
kind: Pod
metadata:
  name: secure-app
spec:
  securityContext:
    runAsNonRoot: true
    runAsUser: 10000
    fsGroup: 10000
    seccompProfile:
      type: RuntimeDefault
  containers:
  - name: app
    image: registry.example.com/myapp:v1.0
    securityContext:
      allowPrivilegeEscalation: false
      readOnlyRootFilesystem: true
      capabilities:
        drop: ["ALL"]
        add: ["NET_BIND_SERVICE"]

Verification commands:

# Check running user
kubectl exec secure-app -- id

# Attempt to write to the root filesystem (should fail)
kubectl exec secure-app -- touch /test.txt

# List capabilities
kubectl exec secure-app -- capsh --print

6. NetworkPolicy – default deny and whitelist

# Default deny all traffic in the namespace
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-all
  namespace: production
spec:
  podSelector: {}
  policyTypes:
  - Ingress
  - Egress
---
# Example egress whitelist for a pod labeled app=myapp
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-app-egress
  namespace: production
spec:
  podSelector:
    matchLabels:
      app: myapp
  policyTypes:
  - Egress
  egress:
  - to:
    - podSelector:
        matchLabels:
          app: mysql
    ports:
    - protocol: TCP
      port: 3306
  - to:
    - namespaceSelector:
        matchLabels:
          name: kube-system
      podSelector:
        matchLabels:
          k8s-app: kube-dns
    ports:
    - protocol: UDP
      port: 53
  - to:
    - ipBlock:
        cidr: 203.0.113.0/24
    ports:
    - protocol: TCP
      port: 443
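
Note that the default‑deny policy above blocks Ingress as well, so inbound traffic must also be whitelisted or the app will receive no requests. A sketch allowing traffic from an ingress controller namespace (the ingress-nginx namespace name and port 8080 are assumptions):

```shell
cat > allow-app-ingress.yaml <<'EOF'
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-app-ingress
  namespace: production
spec:
  podSelector:
    matchLabels:
      app: myapp
  policyTypes:
  - Ingress
  ingress:
  - from:
    - namespaceSelector:
        matchLabels:
          kubernetes.io/metadata.name: ingress-nginx
    ports:
    - protocol: TCP
      port: 8080
EOF
# kubectl apply -f allow-app-ingress.yaml
```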

7. Pod Security Admission (Restricted)

# Enable Restricted PSA at the namespace level
kubectl label namespace production \
  pod-security.kubernetes.io/enforce=restricted \
  pod-security.kubernetes.io/audit=restricted \
  pod-security.kubernetes.io/warn=restricted

The Restricted profile blocks privileged containers, host namespaces, hostPath volumes and host ports, running as root, and privilege escalation; it requires a RuntimeDefault (or Localhost) seccomp profile and, after dropping ALL capabilities, permits only NET_BIND_SERVICE to be added back. Note that it does not enforce a read‑only root filesystem – set readOnlyRootFilesystem in the SecurityContext separately.
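
Before enforcing, a server‑side dry run previews which existing workloads would violate Restricted without changing the namespace:

```shell
# Prints PSA warnings for non-compliant pods; no label is actually applied
kubectl label --dry-run=server --overwrite namespace production \
  pod-security.kubernetes.io/enforce=restricted
```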

8. Runtime protection with Falco

Install Falco via Helm in eBPF mode:

helm repo add falcosecurity https://falcosecurity.github.io/charts
helm repo update
helm install falco falcosecurity/falco \
  --namespace falco --create-namespace \
  --set driver.kind=ebpf \
  --set falcosidekick.enabled=true \
  --set falcosidekick.webui.enabled=true

Custom rules (saved as /etc/falco/rules.d/custom.yaml) example:

- rule: UnauthorizedProcessInContainer
  desc: Detect shell or package manager execution in production containers
  condition: >
    spawned_process and container and
    (proc.name in (sh, bash, ash, zsh, apt, apt-get, yum, dnf)) and
    container.image.repository != "debug-tools"
  output: "Unauthorized process started (user=%user.name command=%proc.cmdline container=%container.name image=%container.image.repository)"
  priority: WARNING
  tags: [process, mitre_execution]

- rule: WriteToNonWhitelistedDirectory
  desc: Detect file writes outside /tmp or /app/cache
  condition: >
    open_write and container and
    not fd.directory in (/tmp, /app/cache) and
    not fd.name startswith /proc
  output: "File write to unexpected location (file=%fd.name command=%proc.cmdline container=%container.name)"
  priority: ERROR
  tags: [filesystem, mitre_persistence]

- rule: OutboundConnectionToSuspiciousIP
  desc: Detect connections to known malicious IPs or unusual ports
  condition: >
    outbound and container and
    (fd.sport in (22, 3389, 4444, 6667) or fd.sip in (198.51.100.0/24))
  output: "Suspicious outbound connection (ip=%fd.sip port=%fd.sport command=%proc.cmdline container=%container.name)"
  priority: CRITICAL
  tags: [network, mitre_command_and_control]

Trigger a rule by executing a shell in a pod and view logs:

kubectl exec -it myapp-pod -- /bin/sh
kubectl logs -n falco -l app.kubernetes.io/name=falco | grep "Unauthorized process"

9. Image registry hardening with Harbor

Enable automatic scanning and severity enforcement via the Harbor API:

curl -u admin:Harbor12345 -X PUT \
  "https://harbor.example.com/api/v2.0/projects/myproject" \
  -H "Content-Type: application/json" \
  -d '{
    "metadata": {
      "auto_scan": "true",
      "severity": "high",
      "prevent_vul": "true"
    }
  }'

Configure a webhook policy that calls a Cosign verifier to reject unsigned images, and optionally add a CVE allowlist.

10. Periodic scanning and remediation

Example Bash script that scans all images running in a namespace, reports critical/high findings to Slack, and cleans old reports:

#!/bin/bash
set -euo pipefail
NAMESPACE="production"
REPORT_DIR="/var/log/trivy"
SLACK_WEBHOOK="https://hooks.slack.com/services/XXX"

mkdir -p "$REPORT_DIR"
IMAGES=$(kubectl get pods -n "$NAMESPACE" -o jsonpath='{.items[*].spec.containers[*].image}' | tr ' ' '\n' | sort -u)

for IMAGE in $IMAGES; do
  SAFE=$(echo "$IMAGE" | tr '/:' '_')
  REPORT="$REPORT_DIR/${SAFE}_$(date +%Y%m%d).json"
  trivy image --severity CRITICAL,HIGH --format json --output "$REPORT" "$IMAGE"
  # "?" makes jq tolerate targets with no Vulnerabilities array
  CRIT=$(jq '[.Results[]?.Vulnerabilities[]? | select(.Severity=="CRITICAL")] | length' "$REPORT")
  HIGH=$(jq '[.Results[]?.Vulnerabilities[]? | select(.Severity=="HIGH")] | length' "$REPORT")
  if [[ $CRIT -gt 0 || $HIGH -gt 5 ]]; then
    MSG="⚠️ Image $IMAGE has $CRIT CRITICAL and $HIGH HIGH vulnerabilities."
    curl -X POST -H 'Content-type: application/json' --data "{\"text\":\"$MSG\"}" "$SLACK_WEBHOOK"
  fi
done

# Delete reports older than 30 days
find "$REPORT_DIR" -name "*.json" -mtime +30 -delete

Deploy the script as a Kubernetes CronJob that runs daily at 02:00, using a ServiceAccount with read access to pods.
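
The CronJob wrapper could look like this sketch (the image name, ServiceAccount, and ConfigMap are assumptions; the image must bundle kubectl, trivy, jq, and curl):

```shell
cat > trivy-scan-cronjob.yaml <<'EOF'
apiVersion: batch/v1
kind: CronJob
metadata:
  name: trivy-periodic-scan
  namespace: production
spec:
  schedule: "0 2 * * *"   # daily at 02:00
  jobTemplate:
    spec:
      template:
        spec:
          serviceAccountName: trivy-scanner   # needs get/list on pods
          restartPolicy: Never
          containers:
          - name: scan
            image: registry.example.com/tools/trivy-scanner:latest
            command: ["/bin/bash", "/scripts/scan.sh"]
            volumeMounts:
            - name: scripts
              mountPath: /scripts
          volumes:
          - name: scripts
            configMap:
              name: trivy-scan-script   # holds the scan script above
EOF
# kubectl apply -f trivy-scan-cronjob.yaml
```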

11. Monitoring and alerting

Prometheus can scrape Falco exporter metrics via a ServiceMonitor:

apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: falco
  namespace: falco
spec:
  selector:
    matchLabels:
      app.kubernetes.io/name: falco
  endpoints:
  - port: metrics
    interval: 30s

Sample alert rules (container‑security.yaml):

groups:
- name: container-security
  rules:
  - alert: HighSeverityVulnerabilitiesDetected
    expr: sum(falco_events_total{priority="Critical"}) > 10
    for: 5m
    labels:
      severity: critical
    annotations:
      summary: "High severity security events detected"
  - alert: UnauthorizedProcessExecution
    expr: rate(falco_events_total{rule="UnauthorizedProcessInContainer"}[5m]) > 0
    for: 2m
    labels:
      severity: warning
    annotations:
      summary: "Shell execution detected in production container"
  - alert: ImagePullFromUntrustedRegistry
    expr: kube_pod_container_info{image!~"registry.example.com/.*"} == 1
    for: 5m
    labels:
      severity: warning
    annotations:
      summary: "Pod using image from untrusted registry"
  - alert: PodRunningAsRoot
    # Note: kube-state-metrics does not expose runAsNonRoot by default; this
    # expression assumes a custom metric kube_pod_security_context_run_as_non_root
    expr: |
      kube_pod_container_status_running{} == 1 and
      on(namespace, pod, container) kube_pod_container_info{container_id!="", image!~".*debug.*"} unless on(namespace, pod, container) kube_pod_security_context_run_as_non_root == 1
    for: 10m
    labels:
      severity: warning
    annotations:
      summary: "Container running as root user"

12. Performance considerations

Typical Trivy scan times:

Alpine image ≈ 3‑5 s

Debian/Ubuntu ≈ 10‑20 s

Large > 1 GB ≈ 30‑60 s

Cache the vulnerability database on fast storage to reduce first‑run latency. Adjust Falco syscall_event_drops.threshold to limit CPU usage. Scale Harbor scan workers via max_job_workers according to CPU cores.

13. Compliance checks

Example mapping of standards to Kubernetes configuration:

CIS Docker 4.1 – runAsNonRoot: true

CIS Docker 5.7 – readOnlyRootFilesystem: true

NIST 800‑190 – CI Trivy scan before release

PCI‑DSS 6.2 – Weekly remediation of high‑severity CVEs

SOC 2 – Image signing with Cosign and audit logs

14. Common troubleshooting

Trivy timeout: network access to NVD blocked – use --skip-db-update or an offline DB mirror.

Cosign verification fails: mismatched key or tag – verify the key pair and reference the image by digest instead of a mutable tag.

Pod rejected by PSA: violates Restricted – adjust the SecurityContext or temporarily label the namespace baseline.

Falco high CPU: event sampling too aggressive – raise syscall_event_drops.threshold or switch to the eBPF driver.

NetworkPolicy blocks legitimate traffic: missing egress rule – add the required "to" entry.

Harbor scan queue backlog: insufficient workers – raise max_job_workers and add scan nodes.

15. Change and rollback playbook

Canary deployment with pre‑flight scan and signature:

# Scan new image; --exit-code 1 makes trivy itself fail when findings remain
# (grepping the table output for "Total: 0" is unreliable: it matches if any
# single target reports zero findings)
if ! trivy image --exit-code 1 --severity CRITICAL,HIGH myapp:v2.0; then
  echo "Vulnerabilities found – abort"
  exit 1
fi

# Sign image
cosign sign --key cosign.key registry.example.com/myapp:v2.0

# Start the rollout, then pause it so only the first replacement pods serve traffic
kubectl set image deployment/myapp app=registry.example.com/myapp:v2.0
kubectl rollout pause deployment/myapp
kubectl wait --for=condition=ready pod -l app=myapp,version=v2.0 --timeout=300s

# Simple health check loop (5 min)
for i in {1..10}; do
  ERR=$(kubectl logs -l app=myapp,version=v2.0 --tail=100 | grep -c ERROR || true)
  if [[ $ERR -gt 5 ]]; then
    echo "High error rate – rolling back"
    kubectl rollout undo deployment/myapp
    exit 1
  fi
  sleep 30
done

# Full rollout
kubectl rollout resume deployment/myapp
kubectl rollout status deployment/myapp --timeout=600s

An emergency rollback script that tags the current image, rolls back to the previous revision, rescans the rolled‑back image, and sends a Slack alert follows the same pattern.
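
A minimal sketch of that rollback flow (the deployment name and Slack webhook are assumptions carried over from the examples above):

```shell
#!/bin/bash
set -euo pipefail
SLACK_WEBHOOK="https://hooks.slack.com/services/XXX"

# Record the image currently live, then roll back one revision
CURRENT=$(kubectl get deployment myapp -o jsonpath='{.spec.template.spec.containers[0].image}')
kubectl rollout undo deployment/myapp
kubectl rollout status deployment/myapp --timeout=300s

# Re-scan the image we rolled back to, so its current exposure is on record
PREVIOUS=$(kubectl get deployment myapp -o jsonpath='{.spec.template.spec.containers[0].image}')
trivy image --severity CRITICAL,HIGH "$PREVIOUS" || true

curl -X POST -H 'Content-type: application/json' \
  --data "{\"text\":\"Rolled back myapp: $CURRENT -> $PREVIOUS\"}" "$SLACK_WEBHOOK"
```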

16. Best practices summary

Multi‑stage builds with Distroless or minimal base to reduce attack surface.

CI blocks CRITICAL, alerts HIGH, ignores MEDIUM unless required.

Enforce image signatures in production; use immutable digests.

Apply Restricted PSA, read‑only root filesystem, non‑root user.

Namespace‑level default‑deny NetworkPolicy with explicit egress whitelist.

Run Trivy on every build, daily full scan in production, real‑time Falco monitoring for critical services.

Vulnerability response SLA: CRITICAL ≤ 24 h, HIGH ≤ 7 d, MEDIUM assessed within 30 d.

Manage false positives with .trivyignore and ticket tracking.

Use dedicated ServiceAccount for image pulls; disable default SA.

Enable Kubernetes audit logs and Harbor access logs; retain ≥ 90 days.

17. References

Trivy releases: https://github.com/aquasecurity/trivy/releases

Cosign documentation: https://github.com/sigstore/cosign

Falco Helm chart: https://falcosecurity.github.io/charts

Harbor API v2.0: https://github.com/goharbor/harbor

Kubernetes Pod Security Admission: https://kubernetes.io/docs/concepts/security/pod-security-admission/

Written by Raymond Ops

Linux ops automation, cloud-native, Kubernetes, SRE, DevOps, Python, Golang and related tech discussions.