
What One RBAC Mistake Taught Me the Hard Way: Kubernetes Production Security Lessons

A late‑night production outage caused by a mis‑configured RBAC role sparked a deep dive into Kubernetes security, covering the principle of least privilege, proper ServiceAccount usage, network policies, audit scripts, and a practical checklist to harden clusters and avoid costly incidents.

Raymond Ops

Background

At 3 a.m. an alarm sounded: an unauthorized-access alert on a live Kubernetes cluster. An intern had been granted excessive RBAC permissions by mistake and came close to deleting core services. The incident drove home that Kubernetes security is not optional; it is a mandatory requirement.

1. RBAC – Did You Configure It Correctly?

1.1 The Art of Least‑Privilege

Many teams give developers the cluster-admin role for convenience, which is like handing a temporary worker the keys to every room in the house.

Practical configuration example:

# Development environment – developer role
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  namespace: dev
  name: developer-role
rules:
- apiGroups: [""]
  resources: ["pods", "pods/log", "pods/exec"]
  verbs: ["get", "list", "watch"]
- apiGroups: ["apps"]
  resources: ["deployments"]
  verbs: ["get", "list", "watch", "update", "patch"]
---
# Production environment – ops manager (tiered)
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: ops-manager
rules:
- apiGroups: [""]
  resources: ["nodes", "namespaces"]
  verbs: ["get", "list", "watch"]
- apiGroups: ["apps"]
  resources: ["deployments", "daemonsets", "statefulsets"]
  verbs: ["get", "list", "watch", "update", "patch"]
- apiGroups: [""]
  resources: ["pods/exec"]
  verbs: ["create"] # only for emergencies
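
A Role grants nothing until it is bound to a subject. A minimal RoleBinding for the developer role above might look like this (the dev-team group name is illustrative, not from the original incident):

# Bind developer-role to a (hypothetical) dev-team group
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: developer-binding
  namespace: dev
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: developer-role
subjects:
- kind: Group
  name: dev-team
  apiGroup: rbac.authorization.k8s.io

Binding to a group rather than individual users keeps the binding stable as team membership changes.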

1.2 Proper ServiceAccount Usage

Using the default ServiceAccount for every pod is dangerous. Create a dedicated ServiceAccount per application and bind only the required permissions.

# Create a dedicated ServiceAccount
apiVersion: v1
kind: ServiceAccount
metadata:
  name: app-reader
  namespace: production
---
# Bind minimal permissions
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: read-pods
  namespace: production
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: pod-reader
subjects:
- kind: ServiceAccount
  name: app-reader
  namespace: production
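
The RoleBinding above references a pod-reader Role that is not shown; a minimal version consistent with the binding might be:

# Read-only Role assumed by the RoleBinding above (a sketch)
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: pod-reader
  namespace: production
rules:
- apiGroups: [""]
  resources: ["pods", "pods/log"]
  verbs: ["get", "list", "watch"]

Pods then opt in with spec.serviceAccountName: app-reader; for pods that never call the API, also consider automountServiceAccountToken: false.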

1.3 Auditing Permissions

After configuring RBAC, verify that permissions are correct. The following commands are useful:

# Check if a user can delete pods
kubectl auth can-i delete pods --as=developer -n production
# List all permissions for a user
kubectl auth can-i --list --as=developer -n dev
A quick script to surface subjects bound to cluster-admin:

#!/bin/bash
echo "=== Checking high-privilege role bindings ==="
kubectl get clusterrolebindings -o json | \
  jq -r '.items[] | select(.roleRef.name=="cluster-admin") | .metadata.name + ": " + (.subjects[]? | .name)'
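
For automation, the same question kubectl auth can-i answers can be posed directly to the API as a SubjectAccessReview; a sketch (the developer user name is illustrative):

# Ask the API server whether "developer" may delete pods in production;
# `kubectl create -f sar.yaml -o yaml` returns the answer in status.allowed
apiVersion: authorization.k8s.io/v1
kind: SubjectAccessReview
spec:
  user: developer
  resourceAttributes:
    namespace: production
    verb: delete
    resource: pods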

2. Network Policies – Building a Zero‑Trust Network

2.1 Default‑Deny All Traffic

Start by denying all inbound and outbound traffic, then open only what is required.

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-all
  namespace: production
spec:
  podSelector: {}
  policyTypes:
  - Ingress
  - Egress

2.2 Fine‑Grained Traffic Control

Example: front‑end pods can only talk to back‑end API, back‑end can only talk to the database, and the database only accepts traffic from back‑end.

# Front‑end → Back‑end API
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: frontend-to-backend
  namespace: production
spec:
  podSelector:
    matchLabels:
      app: backend
  policyTypes:
  - Ingress
  ingress:
  - from:
    - podSelector:
        matchLabels:
          app: frontend
    ports:
    - protocol: TCP
      port: 8080
---
# Back‑end → Database
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: backend-to-database
  namespace: production
spec:
  podSelector:
    matchLabels:
      app: database
  policyTypes:
  - Ingress
  ingress:
  - from:
    - podSelector:
        matchLabels:
          app: backend
      namespaceSelector: # same peer as podSelector: backend pods in production namespaces only
        matchLabels:
          environment: production
    ports:
    - protocol: TCP
      port: 3306

2.3 Cross‑Namespace Communication Control

Production workloads should never be reachable from development namespaces.

# Allow only specific namespaces to access a service
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: cross-namespace-policy
  namespace: shared-services
spec:
  podSelector:
    matchLabels:
      app: logging-service
  ingress:
  - from:
    - namespaceSelector:
        matchLabels:
          environment: production
    - namespaceSelector:
        matchLabels:
          environment: staging
    ports:
    - protocol: TCP
      port: 9200

3. Real‑World Pitfalls

3.1 RBAC Misconfiguration Causing Outage

Scenario: A developer only needed read access to inspect logs, but ops granted cluster-admin for convenience. The developer later deleted a critical ConfigMap by accident. Lessons learned:

Use read‑only roles for log inspection.

Enforce all production changes through CI/CD pipelines, not direct kubectl commands.

Regularly audit and revoke temporary permissions.

3.2 NetworkPolicy Mistake Leading to Service Disruption

Scenario: A NetworkPolicy omitted DNS (port 53), causing pods to fail name resolution.

Correct configuration:

# Allow DNS resolution
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-dns
spec:
  podSelector: {}
  policyTypes:
  - Egress
  egress:
  - to:
    - namespaceSelector:
        matchLabels:
          kubernetes.io/metadata.name: kube-system # label set automatically since v1.21
    ports:
    - protocol: UDP
      port: 53
    - protocol: TCP # DNS falls back to TCP for large responses
      port: 53

3.3 Monitoring and Alerting

After hardening, set up continuous monitoring for suspicious activity.

# Watch for unauthorized API calls (apiserver pod names vary by node)
kubectl logs -n kube-system -l component=kube-apiserver | grep "Unauthorized"
# Deploy Falco for runtime security
helm repo add falcosecurity https://falcosecurity.github.io/charts
helm repo update
helm install falco falcosecurity/falco \
  --set falco.grpc.enabled=true \
  --set falco.grpcOutput.enabled=true
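
Falco's detection logic lives in YAML rules. It ships with defaults that already cover cases like shells spawned in containers; a custom rule follows the same shape. An illustrative sketch (the rule name and condition here are examples, not Falco defaults):

# Illustrative custom Falco rule: alert on interactive shells in containers
- rule: Shell Spawned in Container
  desc: Detect a shell process started inside any container
  condition: spawned_process and container and proc.name in (bash, sh)
  output: "Shell in container (user=%user.name container=%container.name cmd=%proc.cmdline)"
  priority: WARNING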

4. Security Hardening Checklist

RBAC Checklist

Remove unnecessary cluster-admin bindings.

Create a dedicated ServiceAccount for each application.

Apply the principle of least privilege.

Audit permission assignments regularly.

Disable anonymous access.

Enable audit logging.

Network Policy Checklist

Implement a default‑deny policy.

Restrict cross‑namespace communication.

Protect system components in kube-system.

Allow necessary DNS resolution.

Restrict egress to known services.

Test policy effectiveness regularly.

Additional Hardening Measures

Enable Pod Security Standards.

Use admission controllers such as OPA/Gatekeeper.

Keep Kubernetes versions up to date.

Scan container images for vulnerabilities.

Encrypt etcd data.

Use TLS/mTLS for network encryption.

5. Automated Security Compliance Checks

Below is a Bash script that automates common security audits.

#!/bin/bash
# K8s security audit script

echo "======= K8s Security Audit ======="

# Check for high‑privilege accounts
echo "[*] Checking cluster‑admin bindings..."
kubectl get clusterrolebindings -o json | \
  jq '.items[] | select(.roleRef.name=="cluster-admin") | .metadata.name'

# Check default ServiceAccount usage
echo "[*] Checking default SA usage..."
kubectl get pods --all-namespaces -o json | \
  jq -r '.items[] | select(.spec.serviceAccountName=="default") | .metadata.namespace + "/" + .metadata.name'

# Find namespaces without NetworkPolicy
echo "[*] Namespaces without NetworkPolicy..."
for ns in $(kubectl get ns -o name | cut -d/ -f2); do
  policies=$(kubectl get networkpolicy -n "$ns" --no-headers 2>/dev/null | wc -l)
  if [ "$policies" -eq 0 ]; then
    echo "  - $ns: No NetworkPolicy found!"
  fi
done

# Check for privileged containers
echo "[*] Checking privileged containers..."
kubectl get pods --all-namespaces -o json | \
  jq -r '.items[] | select(any(.spec.containers[]; .securityContext.privileged==true)) | .metadata.namespace + "/" + .metadata.name'
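
For environments without jq, the same audit logic is easy to express in Python over `kubectl ... -o json` output. A sketch with a small inline sample (the function names and sample binding names are mine, not from a real cluster):

```python
# Sample shaped like `kubectl get clusterrolebindings -o json` (names are illustrative)
SAMPLE_BINDINGS = {
    "items": [
        {"metadata": {"name": "ops-admins"},
         "roleRef": {"kind": "ClusterRole", "name": "cluster-admin"},
         "subjects": [{"kind": "Group", "name": "ops"}]},
        {"metadata": {"name": "viewers"},
         "roleRef": {"kind": "ClusterRole", "name": "view"},
         "subjects": [{"kind": "Group", "name": "devs"}]},
    ]
}

def cluster_admin_bindings(bindings: dict) -> list[str]:
    """Names of ClusterRoleBindings that grant cluster-admin."""
    return [
        b["metadata"]["name"]
        for b in bindings.get("items", [])
        if b.get("roleRef", {}).get("name") == "cluster-admin"
    ]

def default_sa_pods(pods: dict) -> list[str]:
    """namespace/name of pods running under the default ServiceAccount."""
    return [
        f'{p["metadata"]["namespace"]}/{p["metadata"]["name"]}'
        for p in pods.get("items", [])
        if p.get("spec", {}).get("serviceAccountName", "default") == "default"
    ]

if __name__ == "__main__":
    # Against a live cluster, load the data instead with:
    #   import json, subprocess
    #   json.loads(subprocess.check_output(
    #       ["kubectl", "get", "clusterrolebindings", "-o", "json"]))
    print(cluster_admin_bindings(SAMPLE_BINDINGS))  # ['ops-admins']
```

Because the logic is plain functions over parsed JSON, it can be unit-tested offline and reused in CI gates.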

6. Recommended Tools

Kubescape – scans clusters and manifests against hardening frameworks such as the NSA/CISA Kubernetes guidance.

Polaris – audits workloads against Kubernetes best-practice configuration.

Kube-bench – runs the CIS Kubernetes Benchmark checks against the cluster.

Conclusion – Security Is a Marathon

Hardening a Kubernetes cluster is an ongoing effort. Start with RBAC, then incrementally apply network policies, automate audits, and regularly rehearse incident response. The cost of a security breach far exceeds the investment in preventive measures.

Tags: Kubernetes, security, RBAC, Production, NetworkPolicy
Written by

Raymond Ops

Linux ops automation, cloud-native, Kubernetes, SRE, DevOps, Python, Golang and related tech discussions.
