Master Container Security: Complete Guide to Image Scanning and Zero‑Trust Runtime Protection
This comprehensive guide walks you through securing container workloads by defining applicable scenarios, setting up prerequisites, installing Trivy and Falco, hardening Dockerfiles, integrating CI/CD scanning and signing, configuring Kubernetes security contexts, network policies, pod security admission, runtime protection, Harbor registry hardening, regular scanning, monitoring, troubleshooting, and best‑practice recommendations.
Applicable Scenarios & Prerequisites
Applicable scenarios: production container environment hardening, automated vulnerability detection in CI/CD, image supply‑chain security, and runtime attack defense.
Prerequisites: OS (RHEL 8+ with kernel 4.18+, or Ubuntu 20.04+ with kernel 5.4+), container runtime (Docker 20.10+, containerd 1.6+, or CRI‑O 1.24+), Kubernetes 1.25+, root or admin rights, read/write access to the image registry, network access to NVD and GitHub Advisory, and the tools Trivy 0.48+, Cosign 2.0+, and optionally Falco.
Environment & Version Matrix
Supported OS kernels, Docker, containerd, Kubernetes versions and minimum hardware specifications are listed (e.g., RHEL kernel 4.18+, Docker 20.10.23+, Trivy 0.48+, etc.).
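The minimum versions above can be checked mechanically before starting. The following is a minimal sketch, not part of the original guide: the helper names version_ge and check_min_version are hypothetical, and the comparison assumes GNU sort with the -V (version sort) option is available.

```shell
#!/bin/bash
# Sketch: compare an installed tool version against a required minimum.
# Assumes GNU coreutils `sort -V`; helper names are illustrative only.

version_ge() {
  # Succeeds when version $1 is greater than or equal to version $2.
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

check_min_version() {
  # $1 = tool name, $2 = installed version, $3 = minimum version
  if version_ge "$2" "$3"; then
    echo "OK: $1 $2 >= $3"
  else
    echo "TOO OLD: $1 $2 < $3"
    return 1
  fi
}
```

A call such as check_min_version trivy 0.48.3 0.48 reports whether the installed release satisfies the matrix. How the installed version string is obtained differs per tool, so it is passed in as an argument rather than parsed here.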
Quick Checklist
Install scanning tools (Trivy/Grype) and verify.
Configure Dockerfile with multi‑stage build and non‑root user.
Integrate CI/CD pipeline to block high‑severity images.
Deploy image signing and verification (Cosign + OPA Gatekeeper).
Set Kubernetes SecurityContext (readOnlyRootFilesystem, runAsNonRoot).
Apply minimal NetworkPolicy.
Enable Pod Security Admission (Restricted).
Deploy runtime protection (Falco rules + alerts).
Configure Harbor image scanning and signature verification.
Establish regular scanning and remediation workflow.
Implementation Steps
Step 1: Install and configure Trivy
RHEL/CentOS installation commands:
# Install Trivy
sudo rpm -ivh https://github.com/aquasecurity/trivy/releases/download/v0.48.3/trivy_0.48.3_Linux-64bit.rpm
# Verify installation
trivy --version
# Update vulnerability database
trivy image --download-db-only
# Check DB path
ls -lh ~/.cache/trivy/db/
Ubuntu/Debian installation commands:
# Install Trivy
wget https://github.com/aquasecurity/trivy/releases/download/v0.48.3/trivy_0.48.3_Linux-64bit.deb
sudo dpkg -i trivy_0.48.3_Linux-64bit.deb
# Offline mode (optional)
trivy image --download-db-only --cache-dir /opt/trivy-db
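export TRIVY_CACHE_DIR=/opt/trivy-db
Key parameters: --severity CRITICAL,HIGH (report only these severity levels), --exit-code 1 (return a non-zero exit status when matching findings exist), --ignore-unfixed (skip vulnerabilities that have no fixed version yet).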
Example scan of official Nginx image and JSON output.
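The original example is not reproduced in this summary, so here is a hedged sketch of what such a scan could look like using the key parameters listed above. The function name scan_image, the tag nginx:1.25, and the report path are assumptions chosen for illustration.

```shell
#!/bin/bash
# Sketch: scan one image, keep a JSON report, and signal pass/fail.
# Assumes Trivy is installed as in Step 1; scan_image is a hypothetical helper.

scan_image() {
  # $1 = image reference, $2 = path for the JSON report
  local image="$1" report="$2"
  # --exit-code 1 turns matching findings into a non-zero exit status,
  # --ignore-unfixed drops findings that have no fixed version yet.
  if trivy image --severity CRITICAL,HIGH --ignore-unfixed \
       --exit-code 1 --format json --output "$report" "$image"; then
    echo "PASS $image"
  else
    echo "FAIL $image (see $report)"
    return 1
  fi
}
```

For example, scan_image nginx:1.25 /tmp/nginx-report.json writes the report and prints PASS or FAIL. A scan error (for instance, an unreachable registry) also yields a non-zero status here, so a FAIL line should be read together with the Trivy log.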
Step 2: Dockerfile security baseline
Multi‑stage build example with distroless runtime and non‑root user.
# Build stage
FROM golang:1.21-alpine AS builder
WORKDIR /app
COPY go.mod go.sum ./
RUN go mod download
COPY . .
RUN CGO_ENABLED=0 GOOS=linux go build -a -ldflags '-extldflags "-static"' -o myapp
# Runtime stage
FROM gcr.io/distroless/static-debian12:nonroot
USER nonroot:nonroot
WORKDIR /app
COPY --from=builder --chown=nonroot:nonroot /app/myapp .
EXPOSE 8080
ENTRYPOINT ["/app/myapp"]
Key practices: multi‑stage build, distroless image, non‑root user.
Step 3: CI/CD integration (GitLab CI example)
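stages:
  - build
  - scan
  - sign
  - deploy

variables:
  IMAGE_NAME: myapp
  IMAGE_TAG: $CI_COMMIT_SHORT_SHA
  REGISTRY: registry.example.com

build:
  stage: build
  image: docker:24
  services:
    - docker:24-dind  # the Docker daemon runs as a service; the job uses the client image
  script:
    - docker build -t $REGISTRY/$IMAGE_NAME:$IMAGE_TAG .
    - docker push $REGISTRY/$IMAGE_NAME:$IMAGE_TAG

security-scan:
  stage: scan
  image:
    name: aquasec/trivy:latest
    entrypoint: [""]  # the image's default entrypoint is trivy; GitLab needs a shell
  script:
    # Write the full report first so it is kept even when the gate below fails
    - trivy image --severity HIGH,CRITICAL --format json --output scan-report.json $REGISTRY/$IMAGE_NAME:$IMAGE_TAG
    # Gate: fail the job on any CRITICAL finding
    - trivy image --exit-code 1 --severity CRITICAL $REGISTRY/$IMAGE_NAME:$IMAGE_TAG
  artifacts:
    when: always
    paths:
      - scan-report.json  # Trivy's native JSON, which is not GitLab's container_scanning report schema
    expire_in: 30 days
  allow_failure: false

sign-image:
  stage: sign
  image: gcr.io/projectsigstore/cosign:v2.2
  script:
    # --yes skips the interactive confirmation prompt in Cosign 2.x
    - cosign sign --yes --key cosign.key $REGISTRY/$IMAGE_NAME:$IMAGE_TAG
  only:
    - main
Step 4: Image signing and verification (Cosign)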
# Generate key pair
cosign generate-key-pair
# Sign image
cosign sign --key cosign.key registry.example.com/myapp:v1.0
# Verify signature
cosign verify --key cosign.pub registry.example.com/myapp:v1.0
Step 5: Kubernetes SecurityContext configuration
apiVersion: v1
kind: Pod
metadata:
  name: secure-app
spec:
  securityContext:
    runAsNonRoot: true
    runAsUser: 10000
    fsGroup: 10000
    seccompProfile:
      type: RuntimeDefault
  containers:
    - name: app
      image: registry.example.com/myapp:v1.0
      securityContext:
        allowPrivilegeEscalation: false
        readOnlyRootFilesystem: true
        capabilities:
          drop: ["ALL"]
          add: ["NET_BIND_SERVICE"]
      volumeMounts:
        - name: tmp
          mountPath: /tmp
        - name: cache
          mountPath: /app/cache
  volumes:
    - name: tmp
      emptyDir: {}
    - name: cache
      emptyDir: {}
Step 6: NetworkPolicy minimal exposure
# Default deny all
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-all
  namespace: production
spec:
  podSelector: {}
  policyTypes:
    - Ingress
    - Egress
---
# Whitelist egress for app
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-app-egress
  namespace: production
spec:
  podSelector:
    matchLabels:
      app: myapp
  policyTypes:
    - Egress
  egress:
    # Database
    - to:
        - podSelector:
            matchLabels:
              app: mysql
      ports:
        - protocol: TCP
          port: 3306
    # DNS: kube-dns pods in kube-system (namespace matched by its automatic label)
    - to:
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: kube-system
          podSelector:
            matchLabels:
              k8s-app: kube-dns
      ports:
        - protocol: UDP
          port: 53
        - protocol: TCP
          port: 53
    # External HTTPS endpoint
    - to:
        - ipBlock:
            cidr: 203.0.113.0/24
      ports:
        - protocol: TCP
          port: 443
Step 7: Pod Security Admission (PSA)
# Verify current PSA
kubectl get ns production -o yaml | grep pod-security
# Enforce restricted policy
kubectl label namespace production \
pod-security.kubernetes.io/enforce=restricted \
pod-security.kubernetes.io/audit=restricted \
pod-security.kubernetes.io/warn=restricted
Step 8: Runtime protection with Falco
Install Falco via Helm in eBPF mode and configure custom rules.
# Add Helm repo
helm repo add falcosecurity https://falcosecurity.github.io/charts
helm repo update
# Install Falco
helm install falco falcosecurity/falco \
--namespace falco --create-namespace \
--set driver.kind=ebpf \
--set falcosidekick.enabled=true \
--set falcosidekick.webui.enabled=true
Example custom rule to detect unauthorized processes.
- rule: UnauthorizedProcessInContainer
  desc: Detect shell or package manager execution in production containers
  condition: >
    spawned_process and container and
    (proc.name in (sh, bash, ash, zsh, apt, apt-get, yum, dnf)) and
    container.image.repository != "debug-tools"
  output: Unauthorized process started (user=%user.name command=%proc.cmdline container=%container.name image=%container.image.repository)
  priority: WARNING
  tags: [process, mitre_execution]
Step 9: Harbor image registry security
Enable automatic scanning, configure vulnerability severity thresholds, and set up webhook for Cosign verification.
# HARBOR_USER and HARBOR_PASSWORD are placeholder environment variables;
# do not keep Harbor's default admin password (Harbor12345) in scripts or in production
# Enable auto scan
curl -u "$HARBOR_USER:$HARBOR_PASSWORD" -X PUT "https://harbor.example.com/api/v2.0/projects/myproject" \
  -H "Content-Type: application/json" \
  -d '{"metadata":{"auto_scan":"true","severity":"high","prevent_vul":"true"}}'
# Configure webhook for signature verification
curl -u "$HARBOR_USER:$HARBOR_PASSWORD" -X POST "https://harbor.example.com/api/v2.0/projects/myproject/webhook/policies" \
  -H "Content-Type: application/json" \
  -d '{
    "name":"Verify Cosign Signature",
    "targets":[{"type":"http","address":"http://cosign-verifier.default.svc/webhook","skip_cert_verify":false}],
    "event_types":["PUSH_ARTIFACT"],
    "enabled":true
  }'
Step 10: Periodic scanning and remediation
Example Bash script scheduled as a Kubernetes CronJob to scan running images with Trivy, report critical/high findings to Slack, and clean old reports.
#!/bin/bash
set -euo pipefail
NAMESPACE="production"
REPORT_DIR="/var/log/trivy"
SLACK_WEBHOOK="https://hooks.slack.com/services/XXX"
mkdir -p "$REPORT_DIR"
# Collect the unique images of all containers running in the namespace
IMAGES=$(kubectl get pods -n "$NAMESPACE" -o jsonpath='{.items[*].spec.containers[*].image}' | tr ' ' '\n' | sort -u)
for IMAGE in $IMAGES; do
  SAFE_NAME=$(echo "$IMAGE" | tr '/:' '_')
  REPORT="${REPORT_DIR}/${SAFE_NAME}_$(date +%Y%m%d).json"
  echo "Scanning $IMAGE..."
  trivy image --severity CRITICAL,HIGH --format json --output "$REPORT" "$IMAGE"
  # "[]?" tolerates results without a Vulnerabilities array (clean targets),
  # which would otherwise make jq fail and abort the script under set -e
  CRITICAL=$(jq '[.Results[]?.Vulnerabilities[]? | select(.Severity=="CRITICAL")] | length' "$REPORT")
  HIGH=$(jq '[.Results[]?.Vulnerabilities[]? | select(.Severity=="HIGH")] | length' "$REPORT")
  # Alert on any CRITICAL finding or on more than 5 HIGH findings
  if [[ $CRITICAL -gt 0 || $HIGH -gt 5 ]]; then
    MESSAGE="⚠️ Image $IMAGE has $CRITICAL CRITICAL and $HIGH HIGH vulnerabilities. Report: $REPORT"
    curl -X POST -H 'Content-type: application/json' --data "{\"text\":\"$MESSAGE\"}" "$SLACK_WEBHOOK"
  fi
done
# Cleanup reports older than 30 days
find "$REPORT_DIR" -name "*.json" -mtime +30 -delete
Monitoring & Alerting
Prometheus ServiceMonitor for Falco, alert rules for critical vulnerabilities, unsigned images, Falco threats, and privileged containers. Example Grafana panels for top vulnerable images and Falco events.
Performance & Capacity
Benchmark results for Trivy scan times, CPU/memory usage of Trivy, Falco, Harbor scanning, and Cosign verification. Recommendations for hardware sizing for small, medium, and large teams.
Security & Compliance
Best‑practice checklist covering CIS Docker, NIST 800‑190, PCI‑DSS, SOC 2, GDPR. Example commands for verifying non‑root user, read‑only filesystem, and image signing.
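The verification commands are not included in this summary. As an illustration of the non‑root check only, the sketch below reads an image's configured user with docker inspect and rejects root. The helper names are hypothetical, and the sketch assumes the image is already present in the local Docker image store.

```shell
#!/bin/bash
# Sketch: verify that an image is configured to run as a non-root user.
# is_nonroot_user and check_image_user are illustrative helper names.

is_nonroot_user() {
  # $1 = the image's USER value, e.g. "", "root", "0:0", "nonroot:nonroot", "10000"
  local user="${1%%:*}"   # strip an optional ":group" suffix
  case "$user" in
    ""|root|0) return 1 ;;  # an empty USER means the image defaults to root
    *) return 0 ;;
  esac
}

check_image_user() {
  # $1 = image reference available to the local Docker daemon
  local image="$1" user
  user=$(docker inspect --format '{{.Config.User}}' "$image") || return 2
  if is_nonroot_user "$user"; then
    echo "OK $image runs as '$user'"
  else
    echo "FAIL $image runs as root"
    return 1
  fi
}
```

The other two checks can reuse commands that already appear in this guide: the readOnlyRootFilesystem field from the Step 5 manifest can be read back with kubectl get pod secure-app -o jsonpath='{.spec.containers[*].securityContext.readOnlyRootFilesystem}', and signatures can be verified with the cosign verify command from Step 4.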
Common Issues & Troubleshooting
Trivy scan timeout – check network or use offline DB.
Cosign verification failure – ensure matching key or use digest.
Pod blocked by PSA – adjust policy or fix securityContext.
Falco high CPU – tune event_drops threshold or use eBPF.
NetworkPolicy blocking traffic – add egress rules.
Harbor scan queue – increase max_job_workers.
Image pull failure due to signature – sign all images or disable policy.
Trivy false positives – update DB or use .trivyignore.
Falco false alerts – refine rules or whitelist.
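On the Cosign item above: a tag can be moved to a different image after signing, whereas a digest always names the exact image that was signed. The following sketch rewrites a tagged reference into a digest reference. The helper name to_digest_ref is hypothetical, and the digest value itself has to come from the registry or the build output.

```shell
#!/bin/bash
# Sketch: turn "repo:tag" plus a known digest into "repo@sha256:...".
# to_digest_ref is an illustrative helper, not a Cosign feature.

to_digest_ref() {
  # $1 = image reference (tag optional), $2 = digest such as sha256:<hex>
  local ref="$1" digest="$2" name
  name="${ref%%@*}"            # drop any digest already present
  # Remove a trailing ":tag" only when the colon is in the last path
  # component, so a registry port such as localhost:5000 is preserved.
  case "${name##*/}" in
    *:*) name="${name%:*}" ;;
  esac
  echo "${name}@${digest}"
}
```

The result can be passed to the Step 4 commands, for example cosign verify --key cosign.pub "$(to_digest_ref registry.example.com/myapp:v1.0 "$DIGEST")", where DIGEST is a placeholder variable holding the image's sha256 digest.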
Change & Rollback Playbooks
Canary deployment script with Trivy scan, Cosign signing, gradual rollout, health checks, and automatic rollback on error.
#!/bin/bash
IMAGE="registry.example.com/myapp:v2.0"
# Scan new image; --exit-code 1 makes Trivy return non-zero when findings match,
# which is more reliable than grepping the human-readable table output
if trivy image --severity CRITICAL,HIGH --exit-code 1 "$IMAGE"; then
  echo "✓ No critical or high vulnerabilities"
else
  echo "✗ Vulnerabilities found, blocking deployment"
  exit 1
fi
# Sign the same image reference that was scanned
cosign sign --key cosign.key "$IMAGE"
# Canary rollout (target: 10% traffic; the actual share depends on how many pods are replaced before the pause)
kubectl set image deployment/myapp app="$IMAGE"
kubectl rollout pause deployment/myapp
# wait for canary pods...
# health check loop...
# if error, rollback
# else resume full rollout
Emergency rollback script records current image, tags it as backup, rolls back to previous revision, rescans, and sends Slack alert.
#!/bin/bash
set -euo pipefail
NAMESPACE="production"
DEPLOYMENT="myapp"
SLACK_WEBHOOK="https://hooks.slack.com/services/XXX"
BACKUP_TAG="registry.example.com/myapp:rollback-backup-$(date +%Y%m%d)"
# Record the image that is being rolled back and keep a copy of it for later analysis
CURRENT_IMAGE=$(kubectl get deployment "$DEPLOYMENT" -n "$NAMESPACE" -o jsonpath='{.spec.template.spec.containers[0].image}')
docker pull "$CURRENT_IMAGE"
docker tag "$CURRENT_IMAGE" "$BACKUP_TAG"
docker push "$BACKUP_TAG"
# Roll back to the previous revision (the default when --to-revision is omitted)
kubectl rollout undo deployment/"$DEPLOYMENT" -n "$NAMESPACE"
kubectl rollout status deployment/"$DEPLOYMENT" -n "$NAMESPACE" --timeout=300s
# Rescan the rolled-back image; without --exit-code, findings do not abort the script
trivy image --severity CRITICAL,HIGH --format json --output /tmp/vulnerable-image-report.json "$CURRENT_IMAGE"
curl -X POST -H 'Content-type: application/json' --data "{\"text\":\"⚠️ EMERGENCY ROLLBACK: $DEPLOYMENT rolled back from $CURRENT_IMAGE due to security issue.\"}" "$SLACK_WEBHOOK"
Best Practices
Use multi‑stage builds and distroless bases to reduce attack surface.
CI/CD blocks CRITICAL, alerts HIGH, ignores MEDIUM unless required.
Enforce image signing in production; use digests.
Apply Restricted PSA, read‑only root, non‑root user.
Namespace‑level default‑deny NetworkPolicy with fine‑grained egress.
Combine CI scans with daily full scans and real‑time Falco monitoring.
Follow SLA: fix CRITICAL within 24 h, HIGH within 7 d, assess MEDIUM within 30 d.
Manage false positives with .trivyignore and ticket tracking.
Grant minimal RBAC to image‑pull service accounts.
Enable Kubernetes audit logs and Harbor access logs for 90‑day retention.
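The severity policy and SLA in the list above can be written down as two small lookup functions so that pipelines and ticketing scripts share one definition. This is a sketch under the stated policy only; the function names gate_decision and sla_days are hypothetical, and the counts are expected to come from a scan report such as the one produced in Step 10.

```shell
#!/bin/bash
# Sketch: encode "block CRITICAL, alert HIGH" and the remediation SLA.
# gate_decision and sla_days are illustrative helper names.

gate_decision() {
  # $1 = number of CRITICAL findings, $2 = number of HIGH findings
  if [ "$1" -gt 0 ]; then
    echo block
  elif [ "$2" -gt 0 ]; then
    echo alert
  else
    echo pass
  fi
}

sla_days() {
  # $1 = severity; prints the deadline in days from detection
  case "$1" in
    CRITICAL) echo 1 ;;   # fix within 24 h
    HIGH)     echo 7 ;;   # fix within 7 d
    MEDIUM)   echo 30 ;;  # assess (not necessarily fix) within 30 d
    *)        return 1 ;; # no SLA defined in this guide
  esac
}
```

A pipeline step could then fail only when gate_decision prints block, and a ticketing script could derive the due date from sla_days.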
MaGe Linux Operations
