25 Common Kubernetes Pitfalls and How to Fix Them

This guide enumerates 25 frequent Kubernetes misconfigurations—from missing resource limits and using latest image tags to insecure pod security settings—and provides concrete remediation steps with ready‑to‑use YAML snippets, helping operators avoid common traps and improve cluster reliability.

Ray's Galactic Tech

Kubernetes is powerful, but misconfigurations can easily lead to outages, security issues, and operational pain. This article lists 25 high‑frequency pitfalls, each paired with a clear fix and a ready‑to‑copy YAML example.

Basic configuration pitfalls

1. Missing resource requests and limits

Problem: Pods without defined resources can monopolize CPU or memory, starving other workloads.

Solution: Define sensible requests and limits for each container.

resources:
  requests:
    cpu: "250m"
    memory: "64Mi"
  limits:
    cpu: "500m"
    memory: "128Mi"
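
Per-container settings are easy to forget, so a namespace-level default can act as a safety net. The following is a minimal sketch of a LimitRange that fills in requests and limits for containers that omit them. The object name and the production namespace are assumptions for illustration, and the values simply mirror the example above.

```yaml
apiVersion: v1
kind: LimitRange
metadata:
  name: default-resources   # hypothetical name
  namespace: production     # assumed namespace
spec:
  limits:
  - type: Container
    defaultRequest:          # applied when a container omits requests
      cpu: "250m"
      memory: "64Mi"
    default:                 # applied when a container omits limits
      cpu: "500m"
      memory: "128Mi"
```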

2. Using the latest image tag

Problem: The latest tag is mutable, so the same manifest can pull different images over time, which makes deployments unreproducible and rollbacks unreliable.

Solution: Pin images to a specific version.

containers:
- name: my-app
  image: my-app:v1.2.3  # avoid latest

3. No liveness or readiness probes

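Problem: Kubernetes restarts a container whose process exits, but without probes it cannot detect an application that hangs or deadlocks, and it may route traffic to pods that are not yet ready to serve.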

Solution: Add livenessProbe and readinessProbe.

livenessProbe:
  httpGet:
    path: /healthz
    port: 8080
  initialDelaySeconds: 10
  periodSeconds: 10

readinessProbe:
  httpGet:
    path: /ready
    port: 8080
  initialDelaySeconds: 5
  periodSeconds: 5
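
For applications that start slowly, a liveness probe with a short initial delay can restart the container before it ever finishes booting. One possible remedy is a startupProbe, which holds off the other probes until it succeeds. The sketch below reuses the /healthz endpoint and port from the example above; the thresholds are illustrative assumptions, not recommendations.

```yaml
startupProbe:
  httpGet:
    path: /healthz
    port: 8080
  periodSeconds: 10
  failureThreshold: 30   # allows up to 30 x 10s = 300s for startup
```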

4. Storing plain‑text passwords in manifests

Problem: High risk of credential leakage.

Solution: Use a Secret to store sensitive data.

apiVersion: v1
kind: Secret
metadata:
  name: db-secret
type: Opaque
data:
  password: cGFzc3dvcmQ=  # base64-encoded (an encoding, not encryption)
---
# Container spec fragment that consumes the Secret
env:
- name: DB_PASSWORD
  valueFrom:
    secretKeyRef:
      name: db-secret
      key: password
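
Environment variables are visible to anything that can inspect the process environment, so some teams prefer to mount the Secret as files instead. A minimal sketch of that variant follows, assuming the db-secret object above; the container name, image, and mount path are hypothetical.

```yaml
# Pod spec fragment
containers:
- name: my-app
  image: my-app:v1.2.3
  volumeMounts:
  - name: db-credentials
    mountPath: /etc/db-credentials   # hypothetical path
    readOnly: true
volumes:
- name: db-credentials
  secret:
    secretName: db-secret            # each key becomes a file, e.g. /etc/db-credentials/password
```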

5. All pods landing on the same node

Problem: If every replica runs on the same node, that node becomes a single point of failure.

Solution: Configure pod anti‑affinity.

affinity:
  podAntiAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
    - labelSelector:
        matchExpressions:
        - key: app
          operator: In
          values: ["my-app"]
      topologyKey: "kubernetes.io/hostname"
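
A required anti-affinity rule leaves pods unschedulable once there are more replicas than eligible nodes. Where that is a concern, a softer alternative is a topology spread constraint. The fragment below is a sketch that assumes the same app: my-app label; it belongs in the pod spec alongside, or instead of, the affinity block.

```yaml
topologySpreadConstraints:
- maxSkew: 1
  topologyKey: kubernetes.io/hostname
  whenUnsatisfiable: ScheduleAnyway   # prefer even spreading, but never block scheduling
  labelSelector:
    matchLabels:
      app: my-app
```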

6. Rolling update kills all pods

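Problem: Service becomes unavailable during upgrades or node maintenance.

Solution: Bound rollouts with the Deployment's rollingUpdate strategy (maxUnavailable and maxSurge). A PodDisruptionBudget does not govern Deployment rollouts; it keeps a minimum number of replicas running during voluntary disruptions such as node drains, so use it alongside the rollout strategy.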

apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: my-app-pdb
spec:
  minAvailable: 2
  selector:
    matchLabels:
      app: my-app
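
Deployment rollouts themselves are governed by the Deployment's update strategy rather than by a PodDisruptionBudget. A minimal sketch of a conservative rollout configuration is shown here; the replica count and the surge values are assumptions chosen for illustration.

```yaml
# Fragment of a Deployment spec
spec:
  replicas: 3
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 0   # never drop below the desired replica count
      maxSurge: 1         # add one new pod at a time before removing an old one
```

With maxUnavailable set to 0, an old pod is removed only after its replacement reports ready, which is one more reason to define a readinessProbe.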

7. Mixing environments in one namespace

Problem: Poor isolation leads to accidental cross‑environment changes.

Solution: Separate namespaces per environment.

apiVersion: v1
kind: Namespace
metadata:
  name: production

8. No unified log collection

Problem: Debugging becomes painful.

Solution: Deploy a log collector via a DaemonSet.

apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: fluentd
spec:
  selector:
    matchLabels:
      app: fluentd
  template:
    metadata:
      labels:
        app: fluentd
    spec:
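      containers:
      - name: fluentd
        image: fluent/fluentd:v1.14
        volumeMounts:
        - name: varlog
          mountPath: /var/log   # node logs, including container logs
      volumes:
      - name: varlog
        hostPath:
          path: /var/log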

9. Over‑privileged ServiceAccount

Problem: Unnecessary security exposure.

Solution: Grant the minimal RBAC permissions.

apiVersion: v1
kind: ServiceAccount
metadata:
  name: my-app-sa
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: my-app-role
rules:
- apiGroups: [""]
  resources: ["pods"]
  verbs: ["get", "list"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: my-app-rb
subjects:
- kind: ServiceAccount
  name: my-app-sa
  namespace: default  # required for ServiceAccount subjects; adjust to your namespace
roleRef:
  kind: Role
  name: my-app-role
  apiGroup: rbac.authorization.k8s.io
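
The Role only takes effect once a workload actually runs under the ServiceAccount. The fragment below sketches how a pod template might reference it; the container name and image are carried over from earlier examples as assumptions.

```yaml
# Pod template fragment
spec:
  serviceAccountName: my-app-sa
  automountServiceAccountToken: true  # set to false if the app never calls the Kubernetes API
  containers:
  - name: my-app
    image: my-app:v1.2.3
```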

10. Exposing services with NodePort

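Problem: NodePort opens a high port on every node, offers no host- or path-based routing or TLS termination, and becomes hard to manage as services multiply.

Solution: Keep the Service internal and expose it through an Ingress resource, which requires an ingress controller running in the cluster.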

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: my-app-ingress
spec:
  rules:
  - host: myapp.example.com
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: my-app
            port:
              number: 80

11. Manual YAML edits without version control

Problem: Configuration drift and unreproducible environments.

Solution: Adopt GitOps for declarative management.

apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: my-app
spec:
  project: default  # required field; "default" is Argo CD's built-in project
  source:
    repoURL: https://github.com/org/repo
    targetRevision: main
    path: manifests/my-app
  destination:
    namespace: production
    server: https://kubernetes.default.svc
  syncPolicy:
    automated:
      prune: true
      selfHeal: true

12. Running containers as root

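Problem: A compromised process running as root has a much easier path to the host and to other workloads.

Solution: Specify a non-root user in the security context.

securityContext:
  runAsNonRoot: true  # refuse to start containers that would run as UID 0
  runAsUser: 1000
  runAsGroup: 3000
  fsGroup: 2000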
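
Running as a non-root user is usually paired with container-level hardening. The following sketch shows commonly used settings; whether readOnlyRootFilesystem is workable depends on the application, and the container name and image are assumptions.

```yaml
# Container-level settings, complementing a pod-level securityContext
containers:
- name: my-app
  image: my-app:v1.2.3
  securityContext:
    allowPrivilegeEscalation: false
    readOnlyRootFilesystem: true     # the app must write only to mounted volumes
    capabilities:
      drop: ["ALL"]
    seccompProfile:
      type: RuntimeDefault
```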

13. Copy‑pasting raw YAML without templating

Problem: Hard to maintain and upgrade.

Solution: Use Helm for templating or Kustomize for overlay-based customization.

apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
- deployment.yaml
- service.yaml
- ingress.yaml
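
Kustomize pays off once several environments share one base. As a rough sketch, an overlay for production might look like the file below. The directory layout (a base directory holding the kustomization above, plus overlays/production) is an assumed convention, not something the source prescribes.

```yaml
# overlays/production/kustomization.yaml (hypothetical layout)
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
namespace: production
resources:
- ../../base
images:
- name: my-app
  newTag: v1.2.3   # override the image tag per environment
```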

14. Using temporary storage leads to data loss

Problem: Data written to the container filesystem or to an emptyDir volume is lost when the pod is recreated.

Solution: Attach a PersistentVolumeClaim.

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: my-pvc
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
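
A claim has no effect until a pod mounts it. The fragment below sketches how a pod spec could use my-pvc; the container name, image, and mount path are assumptions.

```yaml
# Pod spec fragment
containers:
- name: my-app
  image: my-app:v1.2.3
  volumeMounts:
  - name: data
    mountPath: /data          # hypothetical path
volumes:
- name: data
  persistentVolumeClaim:
    claimName: my-pvc
```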

15. Job restart policy misconfiguration

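Problem: A failing Job keeps retrying its pod until the backoff limit (6 retries by default) is exhausted, which wastes resources and delays the failure signal.

Solution: Set restartPolicy to OnFailure or Never (Jobs do not allow Always) and cap retries with backoffLimit.

apiVersion: batch/v1
kind: Job
metadata:
  name: my-job
spec:
  backoffLimit: 3  # stop retrying after 3 failures
  template:
    spec:
      restartPolicy: OnFailure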
      containers:
      - name: job
        image: busybox:1.36  # pinned tag, per pitfall 2
        command: ["echo", "hello"]

Advanced pitfalls

16. Poor node scheduling

Problem: Compute and storage resources are allocated inefficiently.

Solution: Define nodeAffinity rules so that workloads land on nodes with the hardware they need.

affinity:
  nodeAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
      nodeSelectorTerms:
      - matchExpressions:
        - key: node-type
          operator: In
          values: ["gpu"]

17. Using Deployment for stateful services

Problem: Data loss on pod recreation.

Solution: Deploy a StatefulSet with a volume claim template.

apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: mysql
spec:
  serviceName: mysql
  replicas: 3
  selector:
    matchLabels:
      app: mysql
  template:
    metadata:
      labels:
        app: mysql
    spec:
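      containers:
      - name: mysql
        image: mysql:8
        env:
        - name: MYSQL_ROOT_PASSWORD  # the mysql image will not initialize without a root password setting
          valueFrom:
            secretKeyRef:
              name: db-secret  # assumes the Secret from pitfall 4
              key: password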
        volumeMounts:
        - name: data
          mountPath: /var/lib/mysql
  volumeClaimTemplates:
  - metadata:
      name: data
    spec:
      accessModes: ["ReadWriteOnce"]
      resources:
        requests:
          storage: 5Gi

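18. Low-priority tasks crowd out critical services

Solution: Create a PriorityClass with a high value and reference it from important workloads through priorityClassName, so the scheduler places them first and can preempt lower-priority pods when resources run short.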

apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
  name: high-priority
value: 100000
globalDefault: false
description: "For critical workloads"
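
A PriorityClass does nothing until a pod references it. The fragment below is a sketch of a pod template that opts in; the container name and image are assumptions.

```yaml
# Pod template fragment
spec:
  priorityClassName: high-priority
  containers:
  - name: my-app
    image: my-app:v1.2.3
```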

19. Dependent services not ready

Solution: Use an initContainer to wait for required services.

initContainers:
- name: init-db
  image: busybox:1.36  # pinned tag, per pitfall 2
  command: ['sh', '-c', 'until nc -z db 3306; do sleep 2; done;']

20. No etcd backups

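Solution: Schedule regular snapshots with a CronJob. The sketch below assumes that etcdctl can reach etcd (endpoints and TLS client certificates configured) and that /backup is a mounted volume that outlives the pod; neither is shown here.

apiVersion: batch/v1
kind: CronJob
metadata:
  name: etcd-backup
spec:
  schedule: "0 */6 * * *"
  jobTemplate:
    spec:
      template:
        spec:
          containers:
          - name: backup
            image: bitnami/etcd  # pin a specific tag in practice (see pitfall 2)
            # The timestamp includes the time so the four daily runs do not reuse one filename
            command: ["/bin/sh", "-c", "etcdctl snapshot save /backup/etcd-$(date +%F-%H%M).db"]
          restartPolicy: OnFailure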

21. Monolithic app forced into Kubernetes

Solution: Refactor into microservices with separate deployments.

apiVersion: apps/v1
kind: Deployment
metadata:
  name: user-service
spec:
  replicas: 3
  selector:
    matchLabels:
      app: user
  template:
    metadata:
      labels:
        app: user
    spec:
      containers:
      - name: user
        image: user-service:v1.0

22. Ignoring Pod Security Standards

Solution: Use the built-in Pod Security admission controller (PodSecurityPolicy was removed in Kubernetes 1.25) and enforce the restricted standard through namespace labels.

apiVersion: v1
kind: Namespace
metadata:
  name: secure-ns
  labels:
    pod-security.kubernetes.io/enforce: restricted
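
Enforcing restricted on an existing namespace can reject workloads that were never written with it in mind. A gentler rollout, sketched below, enforces the baseline standard while surfacing restricted violations as warnings and audit events. The namespace name is hypothetical.

```yaml
apiVersion: v1
kind: Namespace
metadata:
  name: staging-ns   # hypothetical namespace
  labels:
    pod-security.kubernetes.io/enforce: baseline
    pod-security.kubernetes.io/warn: restricted    # shown to the user at apply time
    pod-security.kubernetes.io/audit: restricted   # recorded in the audit log
```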

23. No automatic scaling

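Solution: Deploy a Horizontal Pod Autoscaler (HPA). CPU utilization targets require CPU requests on the target pods and a metrics source such as metrics-server.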

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: my-app-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-app
  minReplicas: 2
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70

24. All pods can communicate by default

Solution: Define a default-deny NetworkPolicy. It is enforced only if the cluster's network plugin supports NetworkPolicy.

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: deny-all
spec:
  podSelector: {}
  policyTypes:
  - Ingress
  - Egress
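
Because the policy above also denies egress, pods in the namespace lose DNS resolution until it is explicitly allowed. The following sketch re-opens port 53 toward kube-system, on the assumption that the cluster DNS service runs there; the policy name is hypothetical.

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-dns-egress   # hypothetical name
spec:
  podSelector: {}
  policyTypes:
  - Egress
  egress:
  - to:
    - namespaceSelector:
        matchLabels:
          kubernetes.io/metadata.name: kube-system   # assumes cluster DNS lives here
    ports:
    - protocol: UDP
      port: 53
    - protocol: TCP
      port: 53
```

Further allow rules for application traffic are layered on top in the same way, since NetworkPolicies are additive.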

25. No tenant isolation across environments

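Solution: Create separate namespaces for each environment (prod, staging, dev) as the basis for isolation, then attach RBAC, resource quotas, and network policies to each namespace. A namespace on its own only scopes names.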

apiVersion: v1
kind: Namespace
metadata:
  name: prod
---
apiVersion: v1
kind: Namespace
metadata:
  name: staging
---
apiVersion: v1
kind: Namespace
metadata:
  name: dev
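
Namespaces become meaningful isolation boundaries once limits are attached to them. As one illustration, a ResourceQuota can stop the dev environment from consuming capacity that prod needs. The quota name and every number below are assumptions, not sizing advice.

```yaml
apiVersion: v1
kind: ResourceQuota
metadata:
  name: dev-quota      # hypothetical name
  namespace: dev
spec:
  hard:
    requests.cpu: "4"
    requests.memory: 8Gi
    limits.cpu: "8"
    limits.memory: 16Gi
    pods: "20"
```

Once a quota covers CPU or memory, pods in that namespace must declare the corresponding requests and limits, which ties back to pitfall 1.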
Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contact admin@besthub.dev and we will review it promptly.
