Mastering Secure and Scalable Kubernetes Deployments: Essential Best Practices
This guide outlines practical Kubernetes best practices—including health checks, graceful shutdown, resource limits, security policies, network policies, RBAC, autoscaling, and logging—to help you build secure, resilient, and efficiently managed services on a cloud‑native platform.
Application Development
Health Checks
Configure a Readiness probe so that Kubernetes only routes Service traffic to a pod after the application signals it is ready; otherwise requests will fail during start-up. Configure a Liveness probe so that kubelet restarts containers that become unresponsive, but never use it to handle fatal errors. Fatal errors include:
Uncaught exceptions
Typos in dynamic-language code
Failure to load headers or dependencies
A fatal error should make the process exit immediately so that kubelet restarts the container; it should not be reported as a Liveness failure.
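A minimal sketch of the two probe endpoints, assuming an HTTP service written in Python. The paths /healthz and /ready, the port, and the AppState name are illustrative assumptions and are not prescribed by the guidance above.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer


class AppState:
    """Hypothetical holder for the application's start-up state."""
    ready = False  # flip to True once initialisation has finished


def probe_response(path: str, ready: bool) -> tuple[int, str]:
    """Return (status code, body) for a probe request."""
    if path == "/healthz":
        # Liveness: the process is up and able to answer at all.
        return 200, "ok"
    if path == "/ready":
        # Readiness: only report success once start-up has completed.
        return (200, "ready") if ready else (503, "starting")
    return 404, "not found"


class ProbeHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        status, body = probe_response(self.path, AppState.ready)
        payload = body.encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)


def serve(port: int = 8080) -> None:
    """Blocks forever; call from the application's entry point."""
    HTTPServer(("", port), ProbeHandler).serve_forever()
```

The liveness path deliberately does no deep checks: a fatal error is expected to end the process, not to turn this endpoint red.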
Application Independence
Readiness probes must be independent of external services (databases, APIs, etc.). Applications should keep retrying connections to dependent services until they succeed, allowing the cluster to start pods in any order.
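The retry behaviour can be sketched as follows. This is one possible approach (exponential backoff with a cap), not the only one; the function name, the delay values, and the choice of ConnectionError as the retryable error are assumptions made for illustration.

```python
import time


def connect_with_retry(connect, base_delay=0.5, max_delay=30.0, sleep=time.sleep):
    """Call connect() until it succeeds, backing off exponentially between attempts."""
    delay = base_delay
    while True:
        try:
            return connect()
        except ConnectionError:
            # The dependency is not reachable yet: wait, then try again.
            sleep(delay)
            delay = min(delay * 2, max_delay)
```

Because the loop never gives up, the pod can start before its database or upstream API is available and simply becomes useful once the dependency appears.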
Graceful Shutdown
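When Kubernetes terminates a pod, each container's main process receives SIGTERM. The application should then stop accepting new requests, finish processing in-flight requests, and close idle keep-alive sockets before exiting. Use a preStop handler if additional cleanup is required, and make sure everything completes within the pod's termination grace period (30 seconds by default), after which the process is killed with SIGKILL.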
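A minimal sketch of the drain logic, assuming a Python service. RequestTracker and install_sigterm_handler are hypothetical names; a real server would call try_begin and end around each request and call wait_until_idle before exiting.

```python
import signal
import threading
import time


class RequestTracker:
    """Tracks in-flight requests and refuses new ones once draining starts."""

    def __init__(self):
        self.draining = False  # set to True by the SIGTERM handler
        self._lock = threading.Lock()
        self._in_flight = 0

    def try_begin(self) -> bool:
        """Register a new request; returns False if the server is draining."""
        with self._lock:
            if self.draining:
                return False
            self._in_flight += 1
            return True

    def end(self) -> None:
        """Mark one request as finished."""
        with self._lock:
            self._in_flight -= 1

    def wait_until_idle(self, timeout: float, poll: float = 0.05) -> bool:
        """Wait for in-flight requests to finish; False if the timeout expires."""
        deadline = time.monotonic() + timeout
        while True:
            with self._lock:
                if self._in_flight == 0:
                    return True
            if time.monotonic() >= deadline:
                return False
            time.sleep(poll)


def install_sigterm_handler(tracker: RequestTracker) -> None:
    """Must be called from the main thread, as Python requires for signal.signal."""
    def handler(signum, frame):
        # Only flip a flag here; the main loop does the actual draining.
        tracker.draining = True
    signal.signal(signal.SIGTERM, handler)
```

The handler only sets a flag, which keeps it safe to run at any point; the timeout passed to wait_until_idle should stay below the pod's termination grace period.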
Failure Tolerance
Deploy applications as a Deployment, DaemonSet, ReplicaSet, or StatefulSet with multiple replicas, and spread the replicas across nodes using pod anti-affinity rules. Define a PodDisruptionBudget to limit how many pods can be taken down at once by voluntary disruptions such as node drains.
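As an illustration, the two pieces can be written as Python dictionaries and printed as JSON, which kubectl accepts alongside YAML. The label app=web, the resource name, and the minAvailable value are assumptions; the anti-affinity fragment belongs under spec.template.spec.affinity of the workload.

```python
import json

APP_LABELS = {"app": "web"}  # hypothetical label shared by all replicas

# Goes under spec.template.spec.affinity: at most one matching pod per node.
anti_affinity = {
    "podAntiAffinity": {
        "requiredDuringSchedulingIgnoredDuringExecution": [
            {
                "labelSelector": {"matchLabels": APP_LABELS},
                "topologyKey": "kubernetes.io/hostname",
            }
        ]
    }
}

# Keeps at least two matching pods running during voluntary disruptions.
pod_disruption_budget = {
    "apiVersion": "policy/v1",
    "kind": "PodDisruptionBudget",
    "metadata": {"name": "web-pdb"},
    "spec": {"minAvailable": 2, "selector": {"matchLabels": APP_LABELS}},
}

print(json.dumps(pod_disruption_budget, indent=2))
```

A required anti-affinity rule leaves pods unschedulable when there are more replicas than nodes; the preferredDuringSchedulingIgnoredDuringExecution form is the softer alternative.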
Resource Usage
Set memory requests and limits for every container; the scheduler uses the requests to place pods, and the limits are enforced at runtime. A container that exceeds its memory limit is terminated (OOM-killed); CPU is compressible, so a container that reaches its CPU limit is throttled rather than killed. For workloads without heavy CPU demand, request ≤ 1 CPU. Consider using a LimitRange in each namespace to enforce default limits, and set requests and limits so that each pod lands in the intended QoS class.
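How requests and limits map to a QoS class can be sketched in a few lines. This is a simplified model of the rules, assuming each container is given as a dictionary with optional requests and limits maps; it compares quantity strings literally, so "500m" and "0.5" would be treated as different values.

```python
def qos_class(containers: list[dict]) -> str:
    """Classify a pod as Guaranteed, Burstable, or BestEffort (simplified)."""
    any_set = False
    guaranteed = True
    for container in containers:
        requests = container.get("requests", {})
        limits = container.get("limits", {})
        if requests or limits:
            any_set = True
        for resource in ("cpu", "memory"):
            limit = limits.get(resource)
            # Kubernetes defaults a missing request to the limit.
            request = requests.get(resource, limit)
            if limit is None or request != limit:
                guaranteed = False
    if not any_set:
        return "BestEffort"
    return "Guaranteed" if guaranteed else "Burstable"
```

For example, a pod whose only container sets identical CPU and memory requests and limits is Guaranteed, while a pod that sets nothing at all is BestEffort and is the first candidate for eviction under node pressure.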
Labeling Resources
Attach a consistent set of labels to every resource, covering:
Technical labels
Business labels
Security labels
Log Configuration
Write application logs to stdout/stderr (passive logging) and let the platform collect them. Avoid sidecar log collectors unless the application cannot emit logs in the required format. Prefer a centralized log aggregation stack (e.g., EFK, Datadog, Sumo Logic, Sysdig, or a cloud-specific solution).
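A minimal sketch of passive logging in Python, using only the standard logging module. The JSON field names and the build_logger helper are assumptions; the point is that the application writes structured lines to stdout and leaves collection to the platform.

```python
import json
import logging
import sys


class JsonFormatter(logging.Formatter):
    """Render each log record as one JSON object per line."""

    def format(self, record: logging.LogRecord) -> str:
        return json.dumps({
            "time": self.formatTime(record),
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        })


def build_logger(name: str = "app") -> logging.Logger:
    """Return a logger that writes JSON lines to stdout only."""
    handler = logging.StreamHandler(sys.stdout)
    handler.setFormatter(JsonFormatter())
    logger = logging.getLogger(name)
    logger.handlers = [handler]
    logger.setLevel(logging.INFO)
    logger.propagate = False
    return logger
```

Calling build_logger().info("order %s shipped", "A1") then emits a single JSON line that a node-level collector can parse without a sidecar.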
Pod Autoscaling
Never store state on a container's local filesystem; use PersistentVolumes or external storage so that replicas can be added and removed freely. Enable the Horizontal Pod Autoscaler (HPA) for workloads with variable traffic, and define the metrics it scales on (CPU, memory, or custom metrics, for example from Prometheus). The Vertical Pod Autoscaler (VPA) was in beta when this guidance was written; check its current maturity before using it in production.
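The core of the HPA scaling decision is a simple ratio, sketched below. This is a simplified model based on the documented formula (desired = ceil(current replicas × current metric / target metric), with a default tolerance of 0.1); it ignores pod readiness, missing metrics, and stabilization windows, and the function name is hypothetical.

```python
import math


def desired_replicas(current_replicas, current_metric, target_metric,
                     min_replicas, max_replicas, tolerance=0.1):
    """Simplified HPA calculation for a single metric."""
    ratio = current_metric / target_metric
    if abs(ratio - 1.0) <= tolerance:
        # Close enough to the target: leave the replica count alone.
        desired = current_replicas
    else:
        desired = math.ceil(current_replicas * ratio)
    # Clamp to the configured bounds.
    return max(min_replicas, min(max_replicas, desired))
```

With 4 replicas averaging 90% CPU against a 60% target, the ratio is 1.5 and the result is 6 replicas; at 62% the ratio falls inside the tolerance and the count stays at 4.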
Cluster Management
Namespace Limits
Use LimitRange to set default request/limit values and ResourceQuota to cap total CPU, memory, and storage consumption per namespace.
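For illustration, both objects can be built as Python dictionaries and printed as a single JSON list for kubectl. The namespace team-a, the object names, and every quantity are assumptions to be replaced with values that fit the cluster.

```python
import json

NAMESPACE = "team-a"  # hypothetical namespace

# Defaults applied to containers that do not declare their own values.
limit_range = {
    "apiVersion": "v1",
    "kind": "LimitRange",
    "metadata": {"name": "container-defaults", "namespace": NAMESPACE},
    "spec": {
        "limits": [{
            "type": "Container",
            "defaultRequest": {"cpu": "100m", "memory": "128Mi"},
            "default": {"cpu": "500m", "memory": "256Mi"},
        }]
    },
}

# Upper bound on what all pods in the namespace may consume together.
resource_quota = {
    "apiVersion": "v1",
    "kind": "ResourceQuota",
    "metadata": {"name": "namespace-cap", "namespace": NAMESPACE},
    "spec": {
        "hard": {
            "requests.cpu": "10",
            "requests.memory": "20Gi",
            "limits.cpu": "20",
            "limits.memory": "40Gi",
            "requests.storage": "100Gi",
        }
    },
}

manifest_list = {"apiVersion": "v1", "kind": "List", "items": [limit_range, resource_quota]}
print(json.dumps(manifest_list, indent=2))
```

In a LimitRange, default sets the limit and defaultRequest sets the request for containers that omit them.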
Pod Security Policies
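Use pod security policies to:
Restrict host process/network namespace access
Disallow privileged containers
Enforce non-root user execution
Limit Linux capabilities to the minimum required
Prevent privilege escalation
Reference: Kubernetes Pod Security Policy documentation. Note that the PodSecurityPolicy API was deprecated in Kubernetes 1.21 and removed in 1.25; on newer clusters, enforce the same restrictions with Pod Security Admission or a policy engine such as OPA.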
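The five restrictions listed above correspond to fields in the pod spec and in each container's securityContext. The following sketch checks a pod spec (as a dictionary) against them. It is an illustrative linter, not an admission controller, and it makes one simplifying assumption: "minimum capabilities" is interpreted as requiring that ALL capabilities are dropped, with any needed ones added back explicitly.

```python
def security_violations(pod_spec: dict) -> list[str]:
    """Return human-readable violations of the five restrictions."""
    violations = []
    # Host namespaces are configured at pod level.
    for field in ("hostNetwork", "hostPID", "hostIPC"):
        if pod_spec.get(field, False):
            violations.append(f"{field} must not be enabled")
    pod_ctx = pod_spec.get("securityContext", {})
    for container in pod_spec.get("containers", []):
        name = container.get("name", "<unnamed>")
        ctx = container.get("securityContext", {})
        if ctx.get("privileged", False):
            violations.append(f"{name}: privileged containers are not allowed")
        # runAsNonRoot may be set per container or inherited from the pod.
        if not ctx.get("runAsNonRoot", pod_ctx.get("runAsNonRoot", False)):
            violations.append(f"{name}: runAsNonRoot must be true")
        # Treated as allowed unless explicitly set to false.
        if ctx.get("allowPrivilegeEscalation", True):
            violations.append(f"{name}: allowPrivilegeEscalation must be false")
        if "ALL" not in ctx.get("capabilities", {}).get("drop", []):
            violations.append(f"{name}: capabilities must drop ALL")
    return violations
```

A container with no securityContext at all fails three of the checks, which shows why these settings need to be stated explicitly and not left to defaults.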
Network Policies
Use NetworkPolicy resources to create firewall-like rules between groups of pods; they only take effect if the cluster's network plugin enforces them. Store example YAML files in a repository for common use cases.
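Two common policies are sketched below as Python dictionaries printed as JSON for kubectl: a default deny for all ingress in a namespace, and an exception that lets one pod group reach another. The labels app=frontend and app=api, the policy names, and port 8080 are assumptions.

```python
import json

# An empty podSelector selects every pod in the namespace; with no ingress
# rules listed, all inbound traffic to those pods is denied.
default_deny_ingress = {
    "apiVersion": "networking.k8s.io/v1",
    "kind": "NetworkPolicy",
    "metadata": {"name": "default-deny-ingress"},
    "spec": {"podSelector": {}, "policyTypes": ["Ingress"]},
}

# Allow pods labelled app=frontend to reach pods labelled app=api on TCP 8080.
allow_frontend_to_api = {
    "apiVersion": "networking.k8s.io/v1",
    "kind": "NetworkPolicy",
    "metadata": {"name": "allow-frontend-to-api"},
    "spec": {
        "podSelector": {"matchLabels": {"app": "api"}},
        "policyTypes": ["Ingress"],
        "ingress": [{
            "from": [{"podSelector": {"matchLabels": {"app": "frontend"}}}],
            "ports": [{"protocol": "TCP", "port": 8080}],
        }],
    },
}

for policy in (default_deny_ingress, allow_frontend_to_api):
    print(json.dumps(policy, indent=2))
```

Policies are additive, so the deny-all baseline and the narrow allow rule can live side by side in the same namespace.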
RBAC Policies
Disable automatic mounting of the default ServiceAccount token, grant the least privilege required, and define distinct roles such as ReadOnly, PowerUser, Operator, Controller, and Admin.
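A minimal sketch of a read-only role bound to a ServiceAccount that does not mount its token automatically, again as Python dictionaries printed as JSON. The names report-reader and read-only, the namespace team-a, and the list of resources are assumptions.

```python
import json

NS = "team-a"  # hypothetical namespace

service_account = {
    "apiVersion": "v1",
    "kind": "ServiceAccount",
    "metadata": {"name": "report-reader", "namespace": NS},
    # Pods using this account get no API token unless they opt in.
    "automountServiceAccountToken": False,
}

read_only_role = {
    "apiVersion": "rbac.authorization.k8s.io/v1",
    "kind": "Role",
    "metadata": {"name": "read-only", "namespace": NS},
    "rules": [{
        "apiGroups": [""],  # the core API group
        "resources": ["pods", "services", "configmaps"],
        "verbs": ["get", "list", "watch"],
    }],
}

role_binding = {
    "apiVersion": "rbac.authorization.k8s.io/v1",
    "kind": "RoleBinding",
    "metadata": {"name": "report-reader-read-only", "namespace": NS},
    "subjects": [{"kind": "ServiceAccount", "name": "report-reader", "namespace": NS}],
    "roleRef": {
        "apiGroup": "rbac.authorization.k8s.io",
        "kind": "Role",
        "name": "read-only",
    },
}

rbac_list = {"apiVersion": "v1", "kind": "List",
             "items": [service_account, read_only_role, role_binding]}
print(json.dumps(rbac_list, indent=2))
```

Secrets are left out of the resource list on purpose: read access to them is rarely needed by a read-only role and is usually granted separately.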
Custom Policies
Allow container images only from approved registries, enforced for example with Open Policy Agent (OPA).
Enforce unique Ingress hostnames and restrict them to approved domains.
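OPA policies are written in Rego; the Python below is not an OPA policy and only illustrates the decision logic such policies would encode. The function names, the registry heuristic (a first path segment containing a dot or colon, or equal to localhost, is treated as a registry host), and the example domains are assumptions.

```python
def image_registry(image: str) -> str:
    """Best-effort extraction of the registry host from an image reference."""
    first, sep, _rest = image.partition("/")
    if sep and ("." in first or ":" in first or first == "localhost"):
        return first
    # No explicit registry: container runtimes default to Docker Hub.
    return "docker.io"


def image_allowed(image: str, approved_registries: set[str]) -> bool:
    """True if the image comes from one of the approved registries."""
    return image_registry(image) in approved_registries


def ingress_host_allowed(host: str, existing_hosts: set[str],
                         approved_domains: set[str]) -> bool:
    """True if the host is not already taken and sits under an approved domain."""
    if host in existing_hosts:
        return False
    return any(host == d or host.endswith("." + d) for d in approved_domains)
```

For instance, with only registry.example.com approved, an unqualified image such as nginx:1.25 resolves to docker.io and is rejected.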
Cluster Configuration
Cluster Requirements
Run CIS Kubernetes benchmark checks with kube‑bench. Note that managed clusters (GKE, EKS, AKS) have control‑plane components managed by the cloud provider.
Metadata API
Block pod access to the cloud-provider metadata API on worker nodes so that pods cannot obtain the node's cloud credentials.
Alpha/Beta Feature Restrictions
Avoid enabling alpha or beta features in production unless their value outweighs the security risk; disable unused features.
Identity
OpenID Connect (OIDC)
Use OIDC tokens for single‑sign‑on authentication to the cluster, allowing multiple clusters to trust the same identity provider.
ServiceAccount Tokens
ServiceAccount tokens are intended for applications and controllers, not for end‑users.
Logging Setup
Retention and Aggregation
Retain logs for 30-45 days. Collect logs from nodes (kubelet, container runtime) and control plane components, as well as audit logs. Include metadata such as application name, instance, version, pod name, namespace, and node ID. Deploy a DaemonSet that runs a log collector on each node, then forward the logs to a centralized aggregation system.
dbaplus Community
