25 Common Kubernetes Pitfalls and How to Fix Them
This guide lists 25 frequent Kubernetes misconfigurations, from missing resource limits and mutable latest image tags to insecure pod security settings. Each pitfall is paired with a concrete fix and a YAML snippet you can adapt.
Kubernetes is powerful, but misconfigurations can easily lead to outages, security issues, and operational pain.
Basic configuration pitfalls
1. Missing resource requests and limits
Problem: Pods without defined resources can monopolize CPU or memory, starving other workloads.
Solution: Define sensible requests and limits for each container.
resources:
  requests:
    cpu: "250m"
    memory: "64Mi"
  limits:
    cpu: "500m"
    memory: "128Mi"

2. Using the latest image tag
Problem: The latest tag is mutable. The same tag can point to different images over time, so deployments are not reproducible and rollbacks are unreliable.
Solution: Pin images to a specific version.
containers:
  - name: my-app
    image: my-app:v1.2.3  # avoid latest

3. No liveness or readiness probes
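Problem: Kubernetes restarts a container whose process exits, but without probes it cannot tell when an application is hung or not yet ready, so traffic may reach pods that cannot serve it.
Solution: Add a livenessProbe and a readinessProbe to each container.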
livenessProbe:
  httpGet:
    path: /healthz
    port: 8080
  initialDelaySeconds: 10
  periodSeconds: 10
readinessProbe:
  httpGet:
    path: /ready
    port: 8080
  initialDelaySeconds: 5
  periodSeconds: 5

4. Storing plain-text passwords in manifests
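Problem: Credentials written into manifests leak through version control, logs, and anyone with read access to the files.
Solution: Store sensitive data in a Secret and reference it from the container. Base64 is an encoding, not encryption, so also restrict access to Secrets with RBAC.
apiVersion: v1
kind: Secret
metadata:
  name: db-secret
type: Opaque
data:
  password: cGFzc3dvcmQ=  # base64 of "password"; encoded, not encrypted
---
# Fragment of a container spec, not a standalone document
env:
  - name: DB_PASSWORD
    valueFrom:
      secretKeyRef:
        name: db-secret
        key: password

5. All pods landing on the same node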
Problem: If every replica runs on one node, losing that node takes the whole service down.
Solution: Configure pod anti-affinity so replicas are spread across nodes.
affinity:
  podAntiAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
      - labelSelector:
          matchExpressions:
            - key: app
              operator: In
              values: ["my-app"]
        topologyKey: "kubernetes.io/hostname"

6. Rolling update kills all pods
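Problem: Service becomes unavailable during upgrades.
Solution: Bound the rollout with the Deployment's rollingUpdate strategy. A PodDisruptionBudget does not govern Deployment rollouts, but add one so that voluntary disruptions such as node drains also keep a minimum number of replicas.
# Deployment spec fragment: limits how many pods a rollout takes down at once
strategy:
  type: RollingUpdate
  rollingUpdate:
    maxUnavailable: 0
    maxSurge: 1
---
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: my-app-pdb
spec:
  minAvailable: 2
  selector:
    matchLabels:
      app: my-app

7. Mixing environments in one namespace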
Problem: Poor isolation leads to accidental cross‑environment changes.
Solution: Separate namespaces per environment.
apiVersion: v1
kind: Namespace
metadata:
  name: production

8. No unified log collection
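Problem: Logs are scattered across nodes and disappear when pods are deleted, which makes debugging painful.
Solution: Deploy a log collector on every node with a DaemonSet. The manifest here is a skeleton: a working collector also needs the host log paths mounted and an output destination configured.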
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: fluentd
spec:
  selector:
    matchLabels:
      app: fluentd
  template:
    metadata:
      labels:
        app: fluentd
    spec:
      containers:
        - name: fluentd
          image: fluent/fluentd:v1.14

9. Over-privileged ServiceAccount
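Problem: A ServiceAccount with broad permissions lets a compromised pod read or modify resources it never needed.
Solution: Grant only the minimal RBAC permissions the workload requires.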
apiVersion: v1
kind: ServiceAccount
metadata:
  name: my-app-sa
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: my-app-role
rules:
  - apiGroups: [""]
    resources: ["pods"]
    verbs: ["get", "list"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: my-app-rb
subjects:
  - kind: ServiceAccount
    name: my-app-sa
    namespace: default  # required for ServiceAccount subjects; must match the ServiceAccount's namespace
roleRef:
  kind: Role
  name: my-app-role
  apiGroup: rbac.authorization.k8s.io

10. Exposing services with NodePort
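Problem: NodePort opens a high port on every node, offers no host-based routing or TLS termination, and becomes hard to manage as services multiply.
Solution: Route HTTP traffic through an Ingress resource. This requires an ingress controller running in the cluster.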
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: my-app-ingress
spec:
  rules:
    - host: myapp.example.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: my-app
                port:
                  number: 80

11. Manual YAML edits without version control
Problem: Configuration drift and unreproducible environments.
Solution: Adopt GitOps for declarative management, for example with an Argo CD Application that syncs the cluster from a Git repository.
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: my-app
spec:
  project: default  # required field; "default" is Argo CD's built-in project
  source:
    repoURL: https://github.com/org/repo
    targetRevision: main
    path: manifests/my-app
  destination:
    namespace: production
    server: https://kubernetes.default.svc
  syncPolicy:
    automated:
      prune: true
      selfHeal: true

12. Running containers as root
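Problem: A process running as root inside a container has a much easier path to the host if the container is compromised.
Solution: Specify a non-root user in the pod's security context and refuse to start as root.
# Pod-level securityContext
securityContext:
  runAsNonRoot: true  # reject containers that would start as UID 0
  runAsUser: 1000
  runAsGroup: 3000
  fsGroup: 2000

13. Copy-pasting raw YAML without templating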
Problem: Hard to maintain and upgrade.
Solution: Use Helm or Kustomize for templating.
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
  - deployment.yaml
  - service.yaml
  - ingress.yaml

14. Using temporary storage leads to data loss
Problem: Pods lose data after recreation.
Solution: Attach a PersistentVolumeClaim.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: my-pvc
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi

15. Job restart policy misconfiguration
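Problem: A Job's pod template only accepts restartPolicy OnFailure or Never, and a failing Job keeps retrying until its backoff limit is reached.
Solution: Set restartPolicy: OnFailure (or Never) and cap retries with backoffLimit.
apiVersion: batch/v1
kind: Job
metadata:
  name: my-job
spec:
  backoffLimit: 3  # give up after 3 retries (the default is 6)
  template:
    spec:
      restartPolicy: OnFailure
      containers:
        - name: job
          image: busybox
          command: ["echo", "hello"]

Advanced pitfalls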
16. Poor node scheduling
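Problem: Workloads land on nodes that lack the hardware they need, while specialized nodes sit underused.
Solution: Define nodeAffinity rules that match node labels. The node-type label in this example is a custom label that you would apply to your own nodes.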
affinity:
  nodeAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
      nodeSelectorTerms:
        - matchExpressions:
            - key: node-type
              operator: In
              values: ["gpu"]

17. Using Deployment for stateful services
Problem: Data loss on pod recreation.
Solution: Deploy a StatefulSet with a volume claim template.
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: mysql
spec:
  serviceName: mysql
  replicas: 3
  selector:
    matchLabels:
      app: mysql
  template:
    metadata:
      labels:
        app: mysql
    spec:
      containers:
        - name: mysql
          image: mysql:8
          volumeMounts:
            - name: data
              mountPath: /var/lib/mysql
  volumeClaimTemplates:
    - metadata:
        name: data
      spec:
        accessModes: ["ReadWriteOnce"]
        resources:
          requests:
            storage: 5Gi

18. Low-priority tasks pre-empt critical services
Solution: Create a PriorityClass with a high value for important workloads.
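A pod only receives the priority if its spec names the class. As a minimal sketch, a pod template fragment that references the high-priority class defined next could look like this (the container name and image are hypothetical placeholders):

```yaml
# Pod template fragment, e.g. under spec.template in a Deployment
spec:
  priorityClassName: high-priority  # must match the PriorityClass metadata.name
  containers:
    - name: my-app                  # hypothetical container
      image: my-app:v1.2.3          # hypothetical image
```

Pods that do not set priorityClassName fall back to the cluster's global default class if one exists, otherwise to priority 0.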
apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
  name: high-priority
value: 100000
globalDefault: false
description: "For critical workloads"

19. Dependent services not ready
Solution: Use an initContainer to wait for required services.
initContainers:
  - name: init-db
    image: busybox
    command: ['sh', '-c', 'until nc -z db 3306; do sleep 2; done;']

20. No etcd backups
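Solution: Schedule regular snapshots, for example with a CronJob. The manifest here is a skeleton: a working job also needs the etcd endpoints and client certificates passed to etcdctl, and a persistent volume mounted at /backup.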
apiVersion: batch/v1
kind: CronJob
metadata:
  name: etcd-backup
spec:
  schedule: "0 */6 * * *"
  jobTemplate:
    spec:
      template:
        spec:
          containers:
            - name: backup
              image: bitnami/etcd
              command: ["/bin/sh", "-c", "etcdctl snapshot save /backup/etcd-$(date +%F).db"]
          restartPolicy: OnFailure

21. Monolithic app forced into Kubernetes
Solution: Refactor into microservices with separate deployments.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: user-service
spec:
  replicas: 3
  selector:
    matchLabels:
      app: user
  template:
    metadata:
      labels:
        app: user
    spec:
      containers:
        - name: user
          image: user-service:v1.0

22. Ignoring Pod security policies
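Solution: Enforce the restricted Pod Security Standard by labeling each namespace. The built-in Pod Security admission controller reads these labels; it replaced PodSecurityPolicy, which was removed in Kubernetes 1.25.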
apiVersion: v1
kind: Namespace
metadata:
  name: secure-ns
  labels:
    pod-security.kubernetes.io/enforce: restricted

23. No automatic scaling
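Solution: Deploy a Horizontal Pod Autoscaler (HPA). Scaling on CPU utilization requires a metrics source such as metrics-server and CPU requests on the target pods.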
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: my-app-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-app
  minReplicas: 2
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70

24. All pods can communicate by default
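Solution: Define a default-deny NetworkPolicy in each namespace, then add explicit allow rules for the traffic you need. Policies are enforced only if the cluster's network plugin supports NetworkPolicy.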
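One consequence of denying all egress is that pods can no longer resolve DNS names. A sketch of a companion allow rule, to apply alongside the deny-all policy, is shown here. It assumes the cluster DNS pods run in kube-system and carry the label k8s-app: kube-dns, which is common but not universal; the policy name is hypothetical.

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-dns-egress            # hypothetical name
spec:
  podSelector: {}                   # applies to every pod in the namespace
  policyTypes:
    - Egress
  egress:
    - to:
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: kube-system
          podSelector:
            matchLabels:
              k8s-app: kube-dns     # assumption: verify the label in your cluster
      ports:
        - protocol: UDP
          port: 53
        - protocol: TCP
          port: 53
```

NetworkPolicies are additive, so this rule widens what the deny-all policy permits rather than conflicting with it.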
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: deny-all
spec:
  podSelector: {}
  policyTypes:
    - Ingress
    - Egress

25. No tenant isolation across environments
Solution: Create separate namespaces for each environment (prod, staging, dev) as the basis for multi-tenant isolation. Namespaces only scope names, so pair them with RBAC, NetworkPolicy, and resource quotas.
apiVersion: v1
kind: Namespace
metadata:
  name: prod
---
apiVersion: v1
kind: Namespace
metadata:
  name: staging
---
apiVersion: v1
kind: Namespace
metadata:
  name: dev
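Many of the snippets in this guide are fragments of a pod spec, so it may help to see several of them in place. The following is a sketch of one Deployment that combines the fixes for pitfalls 1, 2, 3, 5, and 12. The names, image, port, and probe paths are the illustrative values used earlier, not a tested configuration.

```yaml
# Sketch only: adapt names, image, ports, and paths to your application
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app
  namespace: prod
spec:
  replicas: 3
  selector:
    matchLabels:
      app: my-app
  template:
    metadata:
      labels:
        app: my-app
    spec:
      securityContext:              # pitfall 12: do not run as root
        runAsNonRoot: true
        runAsUser: 1000
      affinity:                     # pitfall 5: spread replicas across nodes
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            - labelSelector:
                matchLabels:
                  app: my-app
              topologyKey: kubernetes.io/hostname
      containers:
        - name: my-app
          image: my-app:v1.2.3      # pitfall 2: pinned tag, not latest
          ports:
            - containerPort: 8080
          resources:                # pitfall 1: requests and limits
            requests:
              cpu: "250m"
              memory: "64Mi"
            limits:
              cpu: "500m"
              memory: "128Mi"
          readinessProbe:           # pitfall 3: gate traffic on readiness
            httpGet:
              path: /ready
              port: 8080
          livenessProbe:            # pitfall 3: restart a hung process
            httpGet:
              path: /healthz
              port: 8080
            initialDelaySeconds: 10
```

Because the anti-affinity rule is required rather than preferred, three replicas need at least three schedulable nodes; on a smaller cluster the extra pods stay Pending.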
Ray's Galactic Tech