
Mastering Secure and Scalable Kubernetes Deployments: Essential Best Practices

This guide outlines practical Kubernetes best practices—including health checks, graceful shutdown, resource limits, security policies, network policies, RBAC, autoscaling, and logging—to help you build secure, resilient, and efficiently managed services on a cloud‑native platform.

dbaplus Community

Dec 18, 2023

Application Development

Health Checks

Configure a Readiness probe so that Kubernetes only routes Service traffic to a pod after the application signals it is ready; otherwise requests will fail during start-up. Configure a Liveness probe so that the kubelet restarts containers that become unresponsive, but never use it to handle fatal errors. Typical fatal errors include:

Uncaught exceptions

Typos in dynamic-language code

Failure to load headers or dependencies

Fatal errors should cause the process to exit immediately rather than signal a Liveness failure; the kubelet then restarts the crashed container.
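As a minimal sketch of the two probes, assume a hypothetical image `registry.example.com/web:1.0` that serves `/ready` and `/healthz` on port 8080. None of these names come from the original article, and the timing values are placeholders to tune per application.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: web
  labels:
    app: web
spec:
  containers:
    - name: web
      image: registry.example.com/web:1.0   # hypothetical image
      ports:
        - containerPort: 8080
      # Readiness: the pod is removed from Service endpoints while this fails.
      readinessProbe:
        httpGet:
          path: /ready        # assumed endpoint
          port: 8080
        periodSeconds: 5
        failureThreshold: 3
      # Liveness: the kubelet restarts the container after repeated failures.
      livenessProbe:
        httpGet:
          path: /healthz      # assumed endpoint
          port: 8080
        initialDelaySeconds: 10
        periodSeconds: 10
        failureThreshold: 3
```

Keeping the Liveness endpoint trivial (it only proves the process can still answer) avoids restart loops caused by slow dependencies.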

Application Independence

Readiness probes must be independent of external services (databases, APIs, etc.). Applications should keep retrying connections to dependent services until they succeed, allowing the cluster to start pods in any order.

Graceful Shutdown

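When the containers in a pod receive SIGTERM, the application should stop accepting new requests, finish processing in-flight requests, and close idle keep-alive sockets before exiting. Use a preStop handler if additional cleanup is required, and make sure terminationGracePeriodSeconds is long enough for the whole sequence, because the container is killed with SIGKILL once the grace period expires.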
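The shutdown sequence can be sketched in a pod spec as follows. It assumes a hypothetical image that contains a `sleep` binary, and the 10 and 45 second values are illustrative only.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: web
spec:
  # Total time allowed for preStop plus SIGTERM handling before SIGKILL.
  terminationGracePeriodSeconds: 45
  containers:
    - name: web
      image: registry.example.com/web:1.0   # hypothetical image
      lifecycle:
        preStop:
          exec:
            # Runs before SIGTERM is sent; the short pause gives endpoint
            # removal time to propagate so new requests stop arriving.
            command: ["sleep", "10"]
```

The preStop hook counts against the grace period, so the application has roughly 35 seconds left here to drain in-flight requests after it receives SIGTERM.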

Failure Tolerance

Deploy applications as part of a Deployment, DaemonSet, ReplicaSet, or StatefulSet with multiple replicas, and spread them across nodes using anti-affinity rules. Define a PodDisruptionBudget to limit how many pods can be taken down at once by voluntary disruptions such as node drains.
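A possible shape for this, assuming a hypothetical `web` application with three replicas (the names, image, and counts are illustrative, not from the original article):

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web
spec:
  replicas: 3
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web
    spec:
      affinity:
        podAntiAffinity:
          # Prefer, but do not require, one replica per node.
          preferredDuringSchedulingIgnoredDuringExecution:
            - weight: 100
              podAffinityTerm:
                labelSelector:
                  matchLabels:
                    app: web
                topologyKey: kubernetes.io/hostname
      containers:
        - name: web
          image: registry.example.com/web:1.0   # hypothetical image
---
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: web
spec:
  minAvailable: 2          # evictions are refused if fewer would remain
  selector:
    matchLabels:
      app: web
```

The preferred form of anti-affinity is used so that pods still schedule on a cluster with fewer nodes than replicas; the required form is stricter but can leave pods pending.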

Resource Usage

Set memory requests and limits for every container; the scheduler uses the requests to place pods, and the limits are enforced at runtime. If a container exceeds its memory limit it is terminated (OOM-killed); CPU is compressible, so a container that tries to exceed its CPU limit is throttled rather than killed. For workloads without heavy CPU demand, request at most 1 CPU. Consider using a LimitRange in a namespace to enforce default limits, and choose requests and limits that give each pod the intended QoS class (Guaranteed, Burstable, or BestEffort).
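For illustration, a container with explicit requests and limits might look like the sketch below. The values and the image name are assumptions and should come from measured usage.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: web
spec:
  containers:
    - name: web
      image: registry.example.com/web:1.0   # hypothetical image
      resources:
        requests:
          cpu: 250m        # used by the scheduler for placement
          memory: 256Mi
        limits:
          cpu: 500m        # usage above this is throttled
          memory: 256Mi    # usage above this gets the container OOM-killed
```

Because the CPU request and limit differ, this pod is classed as Burstable. Setting requests equal to limits for both CPU and memory in every container yields the Guaranteed class.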

Labeling Resources

Apply a consistent set of labels to every resource, covering:

Technical labels

Business labels

Security labels
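One way the three label categories could appear on a resource is sketched below. The `app.kubernetes.io/*` keys are the common labels recommended by Kubernetes; the `example.com/*` keys and all values are hypothetical placeholders for an organization's own conventions.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: web
  labels:
    # Technical labels
    app.kubernetes.io/name: web
    app.kubernetes.io/version: "1.0"
    app.kubernetes.io/component: frontend
    app.kubernetes.io/part-of: shop
    # Business labels (hypothetical keys)
    example.com/owner: team-payments
    example.com/cost-center: cc-1234
    # Security labels (hypothetical keys)
    example.com/confidentiality: internal
spec:
  containers:
    - name: web
      image: registry.example.com/web:1.0   # hypothetical image
```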

Log Configuration

Write application logs to stdout / stderr (passive logging) and let the platform collect them. Avoid sidecar log collectors unless the application cannot emit logs in the required format. Prefer a centralized log aggregation stack (e.g., EFK, Datadog, Sumo Logic, Sysdig, or a cloud-provider service such as GCP Stackdriver, Azure Monitor, or AWS CloudWatch).

Pod Autoscaling

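Never store state on a container's local filesystem; use PersistentVolumes or external storage so that replicas can be added and removed freely. Enable the Horizontal Pod Autoscaler (HPA) for workloads with variable traffic, and define the metrics it scales on (CPU, memory, or custom metrics exposed through a system such as Prometheus). The Vertical Pod Autoscaler (VPA) was still considered beta when this guidance was written, so evaluate its maturity carefully before using it in production.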
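A minimal HPA sketch using the `autoscaling/v2` API, assuming a Deployment named `web` exists and that a metrics source such as metrics-server is installed. The replica bounds and the 70% target are illustrative.

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: web
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: web              # assumed Deployment name
  minReplicas: 2
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70   # percent of the containers' CPU request
```

Utilization targets are computed against the CPU request, so the HPA only behaves sensibly when every container in the target pods declares one.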

Cluster Management

Namespace Limits

Use LimitRange to set default request/limit values and ResourceQuota to cap total CPU, memory, and storage consumption per namespace.
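Both objects can be sketched together for a hypothetical namespace `team-a`; every number below is a placeholder rather than a recommendation.

```yaml
apiVersion: v1
kind: LimitRange
metadata:
  name: container-defaults
  namespace: team-a
spec:
  limits:
    - type: Container
      defaultRequest:      # applied when a container sets no request
        cpu: 100m
        memory: 128Mi
      default:             # applied when a container sets no limit
        cpu: 500m
        memory: 256Mi
---
apiVersion: v1
kind: ResourceQuota
metadata:
  name: team-a-quota
  namespace: team-a
spec:
  hard:
    requests.cpu: "10"
    requests.memory: 20Gi
    limits.cpu: "20"
    limits.memory: 40Gi
    requests.storage: 100Gi
    pods: "50"
```

Once a quota on CPU or memory exists, pods without matching requests or limits are rejected, which is why pairing it with LimitRange defaults is convenient.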

Pod Security Policies

Restrict host process/network namespace access

Disallow privileged containers

Enforce non‑root user execution

Limit Linux capabilities to the minimum required

Prevent privilege escalation

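Reference: Kubernetes Pod Security Policy documentation. Note that the PodSecurityPolicy API was deprecated in Kubernetes 1.21 and removed in 1.25; on current clusters the same restrictions are enforced with Pod Security Admission and the Pod Security Standards, or with a policy engine.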
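A minimal sketch of these restrictions, expressed with a pod `securityContext` and a Pod Security Admission namespace label rather than a PodSecurityPolicy object. The namespace, image, and user ID are hypothetical.

```yaml
apiVersion: v1
kind: Namespace
metadata:
  name: team-a
  labels:
    # Reject pods that violate the "restricted" Pod Security Standard.
    pod-security.kubernetes.io/enforce: restricted
---
apiVersion: v1
kind: Pod
metadata:
  name: web
  namespace: team-a
spec:
  hostNetwork: false       # no host network namespace
  hostPID: false           # no host process namespace
  hostIPC: false
  securityContext:
    runAsNonRoot: true
    runAsUser: 10001       # assumed non-root UID present in the image
    seccompProfile:
      type: RuntimeDefault
  containers:
    - name: web
      image: registry.example.com/web:1.0   # hypothetical image
      securityContext:
        privileged: false
        allowPrivilegeEscalation: false
        capabilities:
          drop: ["ALL"]    # add back only what the workload needs
```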

Network Policies

Use NetworkPolicy to create firewall-like rules between pod groups; the cluster's network plugin must support NetworkPolicy for the rules to be enforced. Store example YAML files in a repository for common use cases.
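A common starting pair is sketched below: deny all ingress in a namespace, then allow one specific flow. The namespace, labels, and port are assumptions for illustration.

```yaml
# Deny all ingress to every pod in the namespace by default.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-ingress
  namespace: team-a
spec:
  podSelector: {}
  policyTypes:
    - Ingress
---
# Allow only frontend pods to reach backend pods on TCP 8080.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-frontend-to-backend
  namespace: team-a
spec:
  podSelector:
    matchLabels:
      app: backend
  policyTypes:
    - Ingress
  ingress:
    - from:
        - podSelector:
            matchLabels:
              app: frontend
      ports:
        - protocol: TCP
          port: 8080
```

Policies are additive, so the second object opens exactly one path through the default deny without weakening it for other pods.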

RBAC Policies

Disable automatic mounting of the default ServiceAccount token, grant the least privilege required, and define distinct roles such as ReadOnly, PowerUser, Operator, Controller, and Admin.
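As a sketch of these points, the manifests below disable token mounting on a ServiceAccount and define a read-only role. The namespace, the group name `team-a-viewers`, and the resource list are hypothetical.

```yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: web
  namespace: team-a
# Pods using this account get no API token unless they opt in.
automountServiceAccountToken: false
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: read-only
  namespace: team-a
rules:
  - apiGroups: ["", "apps"]
    resources: ["pods", "services", "configmaps", "deployments"]
    verbs: ["get", "list", "watch"]   # no create, update, or delete
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: read-only-viewers
  namespace: team-a
subjects:
  - kind: Group
    name: team-a-viewers              # assumed group from the identity provider
    apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: Role
  name: read-only
  apiGroup: rbac.authorization.k8s.io
```

Secrets are deliberately left out of the read-only rule, since read access to them is often equivalent to elevated privileges.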

Custom Policies

Allow container images only from approved registries, for example with an Open Policy Agent (OPA) admission policy.

Enforce unique Ingress hostnames and restrict them to approved domains.

Cluster Configuration

Cluster Requirements

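Run CIS Kubernetes benchmark checks with kube-bench. Note that managed clusters (GKE, EKS, AKS) have control-plane components managed by the cloud provider, so the control-plane checks generally cannot be run or remediated there; focus on the worker-node checks.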

Metadata API

Disable cloud‑provider metadata APIs on worker nodes to prevent pods from accessing cloud credentials.
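Where the metadata endpoint cannot be switched off on the node itself, one alternative is to block pod traffic to it. The sketch below uses a NetworkPolicy egress rule, which is a different mechanism from the node-level disablement described above. It assumes the provider serves metadata at the link-local address 169.254.169.254 and uses a hypothetical namespace.

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: block-metadata-api
  namespace: team-a
spec:
  podSelector: {}          # applies to every pod in the namespace
  policyTypes:
    - Egress
  egress:
    - to:
        - ipBlock:
            cidr: 0.0.0.0/0
            except:
              - 169.254.169.254/32   # cloud metadata endpoint
```

This only takes effect with a network plugin that enforces egress policies, and pods running with `hostNetwork: true` are typically not covered.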

Alpha/Beta Feature Restrictions

Avoid enabling alpha or beta features in production unless their value outweighs the security risk; disable unused features.

Identity

OpenID Connect (OIDC)

Use OIDC tokens for single‑sign‑on authentication to the cluster, allowing multiple clusters to trust the same identity provider.
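On a self-managed cluster, OIDC is usually wired in through kube-apiserver flags. The sketch below assumes a cluster built with kubeadm using the `v1beta3` configuration format; the issuer URL, client ID, and claim names are hypothetical, and managed services expose their own settings instead.

```yaml
apiVersion: kubeadm.k8s.io/v1beta3
kind: ClusterConfiguration
apiServer:
  extraArgs:
    # Each key maps to the kube-apiserver flag of the same name.
    oidc-issuer-url: https://idp.example.com   # hypothetical identity provider
    oidc-client-id: kubernetes                 # assumed client ID
    oidc-username-claim: email
    oidc-groups-claim: groups
```

Pointing several clusters at the same issuer and client ID is what lets one identity provider serve them all, with group claims then mapped to RBAC bindings per cluster.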

ServiceAccount Tokens

ServiceAccount tokens are intended for applications and controllers, not for end‑users.

Logging Setup

Retention and Aggregation

Retain logs for 30 to 45 days. Collect logs from nodes (kubelet, container runtime) and control-plane components, along with the audit logs. Include metadata such as application name, instance, version, pod name, namespace, and node ID. Deploy a DaemonSet so that a collector runs on each node to gather logs, then forward them to a centralized aggregation system.
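The node-level collector can be sketched as a DaemonSet like the one below. The image `registry.example.com/log-collector:1.0` and the `logging` namespace are hypothetical stand-ins for whichever agent (Fluentd, Fluent Bit, a vendor agent) is actually used, and the agent's own forwarding configuration is omitted.

```yaml
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: log-collector
  namespace: logging
spec:
  selector:
    matchLabels:
      app: log-collector
  template:
    metadata:
      labels:
        app: log-collector
    spec:
      tolerations:
        - operator: Exists           # also run on tainted nodes
      containers:
        - name: collector
          image: registry.example.com/log-collector:1.0   # hypothetical image
          env:
            - name: NODE_NAME        # lets the agent tag records with the node
              valueFrom:
                fieldRef:
                  fieldPath: spec.nodeName
          volumeMounts:
            - name: varlog
              mountPath: /var/log
              readOnly: true
      volumes:
        - name: varlog
          hostPath:
            path: /var/log           # container and node logs live under here
```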

