Master Production‑Grade Kubernetes YAML: 10+ Security & Performance Checks
This guide presents a production‑ready Kubernetes YAML checklist: ten core checks covering security, stability, observability, and scalability, plus eight advanced best‑practice recommendations, so that teams can build robust, maintainable, and automated configuration pipelines.
Why a YAML Standard Is the Foundation of Production Stability
Kubernetes YAML defines the security boundaries, resource allocation, observability, and runtime logic of a cluster. Without a consistent, constrained specification, configurations become error‑prone and difficult to operate at scale.
1. Security – Run Containers as Non‑Root
Problem: Containers that run as the root user can compromise the host if a vulnerability is exploited.
Best Practice
Declare a non‑root USER in the Dockerfile.
Set runAsNonRoot and a specific UID/GID in the Pod securityContext.
apiVersion: v1
kind: Pod
metadata:
  name: security-context-demo
spec:
  securityContext:
    runAsNonRoot: true
    runAsUser: 1000
    fsGroup: 2000
  containers:
    - name: myapp
      image: myapp:v1.2.3  # pinned tag instead of latest; see check 10
      securityContext:
        allowPrivilegeEscalation: false
        capabilities:
          drop: ["ALL"]
Validate with kube-score (checks securityContext) or kube-linter (ensures non‑root).
2. Resource Management – Define Requests & Limits
Omitting resource specifications can cause pod starvation, inefficient scheduling, and ineffective Horizontal/Vertical Pod Autoscaling.
resources:
  requests:
    memory: "128Mi"
    cpu: "250m"
  limits:
    memory: "256Mi"
    cpu: "500m"
Advice: Memory is incompressible, so a container that exceeds its memory limit is terminated (OOMKilled). CPU is compressible, so a container that reaches its CPU limit is throttled rather than killed. Run kube-score score my-deployment.yaml to verify.
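The link between requests and autoscaling can be made concrete with a HorizontalPodAutoscaler. The following is a minimal sketch, assuming a Deployment named myapp whose containers set the CPU request from this section; the object name, replica bounds, and 70% target are illustrative values rather than recommendations. A Utilization target is computed as a percentage of the CPU request, so a pod without a request gives the autoscaler nothing to measure against.

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: myapp-hpa              # hypothetical name
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: myapp                # assumed Deployment name
  minReplicas: 3               # illustrative lower bound
  maxReplicas: 10              # illustrative upper bound
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70   # percent of the container's CPU request
```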
3. Robustness – Liveness & Readiness Probes
livenessProbe:
  httpGet:
    path: /healthz
    port: 8080
  initialDelaySeconds: 15
  periodSeconds: 20
readinessProbe:
  httpGet:
    path: /ready
    port: 8080
  initialDelaySeconds: 5
  periodSeconds: 5
Liveness probes restart a container that has hung or deadlocked; tune the delay and period conservatively, because an overly aggressive liveness probe can itself cause restart loops. Readiness probes ensure a pod is only added to Service endpoints once it can serve traffic.
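For applications that start slowly, a fixed initialDelaySeconds is a guess. A startupProbe is one way to handle this: liveness and readiness checks are held back until the startup probe first succeeds. The fragment below is a sketch that assumes the same /healthz endpoint and port 8080 used above; the period and threshold are illustrative and should be sized to the application's real startup time.

```yaml
startupProbe:
  httpGet:
    path: /healthz           # assumed to be the same endpoint as the liveness probe
    port: 8080
  periodSeconds: 10
  failureThreshold: 30       # tolerates up to 30 x 10s = 300s of startup time
```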
4. Maintainability – Standardized Labels & Annotations
metadata:
  labels:
    app: myapp
    component: frontend
    version: "v1.2.3"
    environment: production
  annotations:
    gitCommit: "a1b2c3d4"
    description: "Frontend service for MyApp"
    prometheus.io/scrape: "true"
    prometheus.io/port: "9102"
Recommended mandatory labels: app, component, version. Optional but useful: environment, team.
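Kubernetes also documents a set of recommended labels under the app.kubernetes.io/ prefix, which many tools recognize. The sketch below maps the short labels from this section onto that scheme; the part-of and managed-by values are hypothetical and only show where such information would go.

```yaml
metadata:
  labels:
    app.kubernetes.io/name: myapp
    app.kubernetes.io/component: frontend
    app.kubernetes.io/version: "v1.2.3"
    app.kubernetes.io/part-of: myapp-suite     # hypothetical parent application
    app.kubernetes.io/managed-by: kustomize    # assumed tooling
    environment: production                    # team-specific label kept as-is
```

Whichever scheme a team picks, selectors in Services, NetworkPolicies, and PodDisruptionBudgets need to use the same keys.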
5. Network Security – Zero‑Trust with NetworkPolicy
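apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: backend-deny-all
spec:
  podSelector:
    matchLabels:
      app: mybackend
  policyTypes:
    - Ingress
    - Egress
Because this policy lists both policy types and no rules, it denies all ingress and egress for the selected pods, including DNS lookups. Define allowed communication explicitly: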
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: frontend-to-backend
spec:
  podSelector:
    matchLabels:
      app: mybackend
  ingress:
    - from:
        - podSelector:
            matchLabels:
              app: myfrontend
6. Data Security – Manage Secrets Properly
apiVersion: v1
kind: Secret
metadata:
  name: mysecret
type: Opaque
data:
  username: YWRtaW4=
  password: MWYyZDFlMmU2N2Rm
Values under data are base64‑encoded, not encrypted. Reference the secret in a pod:
env:
  - name: SECRET_USERNAME
    valueFrom:
      secretKeyRef:
        name: mysecret
        key: username
Advanced: integrate external secret stores (Vault, AWS Secrets Manager) via the Secrets Store CSI Driver.
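As a rough illustration of the CSI approach, the sketch below mounts a Vault secret as a file through the Secrets Store CSI Driver. It assumes the driver and the Vault provider are already installed in the cluster. The object names, Vault address, role, and secret path are all hypothetical, and the parameter names follow the Vault provider's conventions as commonly documented, so they should be checked against the provider version in use.

```yaml
apiVersion: secrets-store.csi.x-k8s.io/v1
kind: SecretProviderClass
metadata:
  name: myapp-vault-secrets                  # hypothetical name
spec:
  provider: vault
  parameters:
    vaultAddress: "https://vault.example.internal:8200"   # placeholder address
    roleName: "myapp"                                      # assumed Vault auth role
    objects: |
      - objectName: "db-password"
        secretPath: "secret/data/myapp/db"
        secretKey: "password"
---
apiVersion: v1
kind: Pod
metadata:
  name: myapp-with-csi-secrets               # hypothetical name
spec:
  containers:
    - name: myapp
      image: my-registry.com/myapp:v1.2.3
      volumeMounts:
        - name: secrets-store
          mountPath: /mnt/secrets-store      # the secret appears as a file here
          readOnly: true
  volumes:
    - name: secrets-store
      csi:
        driver: secrets-store.csi.k8s.io
        readOnly: true
        volumeAttributes:
          secretProviderClass: myapp-vault-secrets
```

With this layout the secret value never has to be written into a Secret manifest in the repository.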
7. Resilience – PodDisruptionBudget (PDB)
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: myapp-pdb
spec:
  minAvailable: 2
  selector:
    matchLabels:
      app: myapp
8. Availability – Replicas & Anti‑Affinity
# Fragment of a Deployment: replicas sits under spec, affinity under the pod template spec.
spec:
  replicas: 3
  template:
    spec:
      affinity:
        podAntiAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
            - weight: 100
              podAffinityTerm:
                labelSelector:
                  matchExpressions:
                    - key: app
                      operator: In
                      values: ["myapp"]
                topologyKey: kubernetes.io/hostname
9. Observability – Standard Logging & Metrics
Write logs to stdout / stderr (structured JSON is preferred).
Expose a Prometheus scrape endpoint via annotations.
annotations:
  prometheus.io/scrape: "true"
  prometheus.io/port: "9090"
  prometheus.io/path: "/metrics"
10. Image Management – Pin Versions & Pull Policies
Prohibited:
image: myapp:latest
Recommended:
image: my-registry.com/myapp:v1.2.3
imagePullPolicy: IfNotPresent
Best practice: use an immutable digest.
image: my-registry.com/myapp@sha256:0a5f21e33e53d...
Advanced Production‑Level Add‑Ons
11. Naming & File Organization
manifests/
├── base/
│   ├── deployment.yaml
│   └── service.yaml
├── overlays/
│   ├── dev/
│   └── prod/
└── README.md
Use Kustomize for environment‑specific overlays.
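To make the overlay idea concrete, here is a minimal sketch of the two kustomization.yaml files such a layout would typically need. The file locations, the Deployment name myapp, and the values set in the prod overlay are assumptions for illustration; the tree above does not list these files.

```yaml
# manifests/base/kustomization.yaml (assumed file)
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
  - deployment.yaml
  - service.yaml
```

```yaml
# manifests/overlays/prod/kustomization.yaml (assumed file)
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
  - ../../base
images:
  - name: my-registry.com/myapp
    newTag: v1.2.3             # pin the production image tag
replicas:
  - name: myapp                # assumed Deployment name in the base
    count: 3
labels:
  - pairs:
      environment: production
```

The rendered output can be inspected with kustomize build manifests/overlays/prod, or applied with kubectl apply -k manifests/overlays/prod.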
12. ConfigMap for Environment Variables
apiVersion: v1
kind: ConfigMap
metadata:
  name: app-config
immutable: true
data:
  DB_HOST: db.prod.svc.cluster.local
13. Lifecycle Hooks
lifecycle:
  preStop:
    exec:
      # Call this pod directly; a Service name could route the request to another replica.
      command: ["sh", "-c", "sleep 5 && curl -X POST http://localhost:8080/deregister"]
14. Namespace & RBAC Isolation
apiVersion: v1
kind: ServiceAccount
metadata:
  name: myapp-sa
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: myapp-role
rules:
  - apiGroups: [""]
    resources: ["pods"]
    verbs: ["get", "list"]
A Role grants nothing on its own; bind it to the ServiceAccount with a RoleBinding in the same namespace.
15. Rolling Update & Rollback Strategy
strategy:
  type: RollingUpdate
  rollingUpdate:
    maxUnavailable: 1
    maxSurge: 1
revisionHistoryLimit: 5
16. Image Registry & Private Authentication
imagePullSecrets:
  - name: registry-credentials
Works with Harbor, Alibaba Cloud ACR, Tencent TCR, etc.
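The fragment above only references a secret by name. As a sketch of the other half, the manifest below shows the shape of a registry credential Secret and where imagePullSecrets sits in a pod spec. The pod name is hypothetical and the .dockerconfigjson value is a placeholder, not real data.

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: registry-credentials
type: kubernetes.io/dockerconfigjson
data:
  # Placeholder: base64 of a Docker config.json containing the registry login.
  .dockerconfigjson: <base64-encoded-docker-config>
---
apiVersion: v1
kind: Pod
metadata:
  name: private-image-demo          # hypothetical name
spec:
  imagePullSecrets:                 # pod-level field, sibling of containers
    - name: registry-credentials
  containers:
    - name: myapp
      image: my-registry.com/myapp:v1.2.3
```

In practice such a secret is usually generated with kubectl create secret docker-registry rather than written by hand, which keeps credentials out of the repository.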
17. Metadata Tracking & Owner Annotations
metadata:
  annotations:
    owner: "team-devops"
    lastModified: "2025-10-12"
    createdBy: "ci-pipeline"
18. Team‑Level YAML Template Repository
Create a repository k8s-yaml-standards/ that contains:
Standard templates (Deployment, Service, Ingress).
Validation rules (kube‑score, kube‑linter, Trivy).
Automation scripts and CI pipelines.
Automation Toolchain & Integration Examples
kubeval: validates manifests against the Kubernetes API schema. Example: kubeval --strict -d manifests/
kube-score: static analysis of configuration risks; it takes files rather than a directory. Example: kube-score score manifests/*.yaml
kube-linter: detects YAML anti‑patterns. Example: kube-linter lint manifests/
trivy: scans container images for vulnerabilities. Example: trivy image myapp:v1.2.3
polaris: audits runtime cluster policies via CLI or dashboard.
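The checks a linter runs can be pinned in a config file so that every pipeline enforces the same rules. The sketch below is a possible kube-linter configuration covering several items from this guide; the file name and the selection of checks are assumptions, and the check names should be confirmed against the output of kube-linter checks list for the installed version.

```yaml
# .kube-linter.yaml (hypothetical file at the repository root)
checks:
  # Run only the checks listed below instead of the default set.
  doNotAutoAddDefaults: true
  include:
    - run-as-non-root
    - unset-cpu-requirements
    - unset-memory-requirements
    - no-liveness-probe
    - no-readiness-probe
    - latest-tag
    - no-anti-affinity
```

It would be used as kube-linter lint --config .kube-linter.yaml manifests/.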
GitHub Actions Example
- name: Validate Kubernetes YAML
  run: |
    kubeval --strict -d manifests/
    kube-score score manifests/*.yaml
    trivy image myapp:v1.2.3
Self‑Check Checklist
Non‑root user – verify runAsNonRoot (tool: kube‑linter, priority: High).
Resource limits – verify resources.* (tool: kube‑score, priority: High).
Liveness & readiness probes – verify livenessProbe / readinessProbe (tool: kube‑linter, priority: High).
Label standards – verify metadata.labels (tool: kubeval, priority: Medium).
NetworkPolicy – verify NetworkPolicy objects (tool: polaris, priority: High).
Secret management – verify Secret usage and avoid plain text (tool: kube‑linter, priority: High).
PodDisruptionBudget – verify PodDisruptionBudget (tool: polaris, priority: Medium).
Replica distribution – verify affinity rules (tool: kube‑score, priority: Medium).
Logging & metrics – manual check for stdout and Prometheus annotations (priority: Medium).
Image version lock – ensure fixed tag or digest (tool: polaris, priority: High).
Conclusion
Production‑grade YAML is not just a configuration file; it is the defensive layer of the system. It must be explicit, defensive, auditable, and designed for automation.
Ray's Galactic Tech