Essential Kubernetes Best Practices Every Engineer Should Follow

This article presents a comprehensive set of Kubernetes best practices covering namespace usage, health probes, autoscaling, resource requests, workload controllers, multi‑node clusters, RBAC, managed services, version upgrades, monitoring, GitOps, image optimization, labeling, network policies, and firewalls to help engineers design, operate, and maintain robust cloud‑native environments.

Ops Development Stories

This article translates Jack Roper's "Kubernetes Best Practice" and shares practical guidance for using Kubernetes (K8s) effectively.

Best Practice Index

Use namespaces

Use readiness and liveness probes (including startup probes)

Use autoscaling

Use resource requests and limits

Deploy Pods with Deployment, DaemonSet, ReplicaSet, or StatefulSet across nodes

Use multiple nodes

Use role‑based access control (RBAC)

Host clusters externally (cloud services)

Upgrade Kubernetes versions

Monitor cluster resources and audit logs

Use version control systems

Adopt Git‑based workflows (GitOps)

Reduce container image size

Organize objects with labels

Use network policies

Use firewalls

Use Namespaces

Namespaces are crucial for organizing objects, creating logical partitions, and enhancing security. By default, a cluster includes the default, kube-public, kube-system, and kube-node-lease namespaces.

RBAC can restrict access to specific namespaces, limiting the blast radius of potential errors. For example, a development team might only access the dev namespace and be barred from production. This isolation helps avoid conflicts and duplicate work.

A namespace can also carry a LimitRange that sets default and maximum container sizes, a ResourceQuota that caps total resource consumption, and network policies that control pod-to-pod traffic.
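
As a rough illustration of the LimitRange and ResourceQuota objects mentioned above, the sketch below builds both manifests as Python dictionaries and prints them as JSON, which kubectl accepts alongside YAML. The namespace name, object names, and all numeric values are assumptions chosen for the example, not recommendations from the original article.

```python
import json

NAMESPACE = "dev"  # hypothetical namespace name

# LimitRange: defaults applied to containers that omit requests or limits.
limit_range = {
    "apiVersion": "v1",
    "kind": "LimitRange",
    "metadata": {"name": "container-defaults", "namespace": NAMESPACE},
    "spec": {
        "limits": [
            {
                "type": "Container",
                "defaultRequest": {"cpu": "100m", "memory": "128Mi"},
                "default": {"cpu": "500m", "memory": "256Mi"},
            }
        ]
    },
}

# ResourceQuota: caps what the whole namespace may request or create.
resource_quota = {
    "apiVersion": "v1",
    "kind": "ResourceQuota",
    "metadata": {"name": "dev-quota", "namespace": NAMESPACE},
    "spec": {
        "hard": {
            "requests.cpu": "4",
            "requests.memory": "8Gi",
            "limits.cpu": "8",
            "limits.memory": "16Gi",
            "pods": "20",
        }
    },
}


def to_json_list(*objects):
    """Wrap several manifests in a v1 List so they fit in one document."""
    return json.dumps(
        {"apiVersion": "v1", "kind": "List", "items": list(objects)}, indent=2
    )


if __name__ == "__main__":
    print(to_json_list(limit_range, resource_quota))
```

The output could be piped to `kubectl apply -f -` against a test cluster; the values should be sized from measured workload usage rather than copied.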

Use Readiness and Liveness Probes

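Readiness and liveness probes are health-check mechanisms. A readiness probe ensures traffic is sent only to pods that are ready to serve, for example by holding back requests while the application is still starting. A liveness probe detects an unresponsive application and prompts the kubelet to restart the affected container.

A startup probe (enabled by default since Kubernetes 1.18 and stable since 1.20) can be used for containers with long initialization times. Liveness and readiness probes are held back until the startup probe succeeds; if it never succeeds within its failure threshold, the kubelet kills the container and the pod's restart policy applies.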

Define probes for all containers within a pod.
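
To make the three probe types concrete, here is a minimal sketch of a container spec with startup, readiness, and liveness probes, written as a Python dictionary that can be dumped to JSON and embedded in a pod template. The container name, image, port, and the `/healthz` and `/ready` paths are hypothetical; the helper only shows how the startup time budget follows from `failureThreshold` and `periodSeconds`.

```python
import json

container = {
    "name": "web",                   # hypothetical container name
    "image": "example.com/web:1.0",  # hypothetical image
    "ports": [{"containerPort": 8080}],
    # Startup probe: the other probes wait until this one succeeds.
    "startupProbe": {
        "httpGet": {"path": "/healthz", "port": 8080},
        "failureThreshold": 30,
        "periodSeconds": 10,
    },
    # Readiness probe: controls whether the pod receives Service traffic.
    "readinessProbe": {
        "httpGet": {"path": "/ready", "port": 8080},
        "periodSeconds": 5,
    },
    # Liveness probe: repeated failures make the kubelet restart the container.
    "livenessProbe": {
        "httpGet": {"path": "/healthz", "port": 8080},
        "periodSeconds": 10,
        "failureThreshold": 3,
    },
}


def startup_budget_seconds(probe):
    """Approximate time a container gets to start before it is killed."""
    # Kubernetes defaults: failureThreshold=3, periodSeconds=10.
    return probe.get("failureThreshold", 3) * probe.get("periodSeconds", 10)


if __name__ == "__main__":
    print(json.dumps(container, indent=2))
    print("startup budget (s):", startup_budget_seconds(container["startupProbe"]))
```

With the values above the application has roughly five minutes to finish starting; after that the liveness probe takes over with its much shorter threshold.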

Use Autoscaling

Autoscaling can dynamically adjust the number of pods (Horizontal Pod Autoscaler, HPA), pod resource requests (Vertical Pod Autoscaler, VPA), or cluster nodes (Cluster Autoscaler, CA) based on workload demand.
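
The following sketch shows what an HPA object might look like and how its replica count is derived. The manifest targets a hypothetical Deployment named `web` in a hypothetical `dev` namespace. The function follows the ratio-based formula described in the Kubernetes documentation, desired = ceil(current replicas × current metric / target metric), and deliberately ignores the controller's tolerance band and stabilization windows.

```python
import json
import math

hpa = {
    "apiVersion": "autoscaling/v2",
    "kind": "HorizontalPodAutoscaler",
    "metadata": {"name": "web", "namespace": "dev"},  # hypothetical names
    "spec": {
        "scaleTargetRef": {"apiVersion": "apps/v1", "kind": "Deployment", "name": "web"},
        "minReplicas": 2,
        "maxReplicas": 10,
        "metrics": [
            {
                "type": "Resource",
                "resource": {
                    "name": "cpu",
                    "target": {"type": "Utilization", "averageUtilization": 70},
                },
            }
        ],
    },
}


def desired_replicas(current_replicas, current_utilization, target_utilization,
                     min_replicas, max_replicas):
    """Simplified HPA calculation, clamped to the configured bounds."""
    raw = math.ceil(current_replicas * current_utilization / target_utilization)
    return max(min_replicas, min(max_replicas, raw))


if __name__ == "__main__":
    print(json.dumps(hpa, indent=2))
    spec = hpa["spec"]
    print(desired_replicas(4, 90, 70, spec["minReplicas"], spec["maxReplicas"]))
```

Note that utilization is measured relative to the pods' CPU requests, so an HPA of this kind only works when requests are set.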

Horizontal scaling may require using PersistentVolumes for stateful data, as local storage does not survive pod recreation.

Cluster autoscaling is valuable for highly variable workloads and can reduce costs by removing idle nodes.

Use Resource Requests and Limits

Set resource requests and limits on every container. The scheduler uses requests to place pods on nodes with enough CPU and memory, while limits cap what a container may consume at runtime so that a single pod cannot exhaust a node.

Without limits, pods may consume excess resources, causing other applications to suffer or nodes to crash.

If a container exceeds its memory limit, it is terminated (OOM-killed); if it tries to exceed its CPU limit, it is throttled rather than killed.
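
A small sketch of a container `resources` block, plus a sanity check that requests do not exceed limits (the API server rejects specs where they do). The quantities are illustrative assumptions, and the two parsers handle only the common suffixes (`m` for CPU; `Ki`, `Mi`, `Gi` for memory), not the full Kubernetes quantity grammar.

```python
import json

container = {
    "name": "web",                   # hypothetical container name
    "image": "example.com/web:1.0",  # hypothetical image
    "resources": {
        "requests": {"cpu": "250m", "memory": "256Mi"},
        "limits": {"cpu": "500m", "memory": "512Mi"},
    },
}

_MEMORY_UNITS = {"Ki": 2**10, "Mi": 2**20, "Gi": 2**30}


def cpu_millicores(quantity):
    """Convert a CPU quantity such as "250m" or "1.5" to millicores."""
    if quantity.endswith("m"):
        return int(quantity[:-1])
    return round(float(quantity) * 1000)


def memory_bytes(quantity):
    """Convert a memory quantity such as "256Mi" to bytes."""
    for suffix, factor in _MEMORY_UNITS.items():
        if quantity.endswith(suffix):
            return int(quantity[: -len(suffix)]) * factor
    return int(quantity)  # no suffix: plain bytes


def requests_within_limits(resources):
    """Return True when every request is at or below its limit."""
    req, lim = resources["requests"], resources["limits"]
    return (
        cpu_millicores(req["cpu"]) <= cpu_millicores(lim["cpu"])
        and memory_bytes(req["memory"]) <= memory_bytes(lim["memory"])
    )


if __name__ == "__main__":
    print(json.dumps(container["resources"], indent=2))
    print("valid:", requests_within_limits(container["resources"]))
```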

Deploy Pods with Controllers

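Avoid running bare pods: a pod created directly is not rescheduled if its node fails. Use a Deployment, DaemonSet, or StatefulSet to improve fault tolerance (a Deployment manages ReplicaSets for you, so creating a ReplicaSet directly is rarely necessary). Anti-affinity rules can spread pods across nodes to avoid a single point of failure.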
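
One way to combine a controller with anti-affinity is sketched below: a Deployment with three replicas and a preferred pod anti-affinity rule keyed on `kubernetes.io/hostname`, so the scheduler tries to place replicas on different nodes. The names, image, and label values are hypothetical, and the "preferred" form is an assumption made so that scheduling still succeeds on small clusters.

```python
import json

labels = {"app": "web"}  # hypothetical label used by selector and anti-affinity

deployment = {
    "apiVersion": "apps/v1",
    "kind": "Deployment",
    "metadata": {"name": "web", "namespace": "dev"},
    "spec": {
        "replicas": 3,
        "selector": {"matchLabels": labels},
        "template": {
            "metadata": {"labels": labels},
            "spec": {
                "affinity": {
                    "podAntiAffinity": {
                        # Soft rule: prefer nodes not already running this app.
                        "preferredDuringSchedulingIgnoredDuringExecution": [
                            {
                                "weight": 100,
                                "podAffinityTerm": {
                                    "labelSelector": {"matchLabels": labels},
                                    "topologyKey": "kubernetes.io/hostname",
                                },
                            }
                        ]
                    }
                },
                "containers": [{"name": "web", "image": "example.com/web:1.0"}],
            },
        },
    },
}

if __name__ == "__main__":
    print(json.dumps(deployment, indent=2))
```

Switching to `requiredDuringSchedulingIgnoredDuringExecution` would make the spread mandatory, at the cost of pods staying Pending when there are fewer eligible nodes than replicas.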

Use Multiple Nodes

Running a single‑node cluster reduces fault tolerance. Distribute workloads across multiple nodes to increase resilience.

Use Role‑Based Access Control (RBAC)

RBAC secures the cluster by assigning permissions to users, groups, or service accounts at the namespace (Role) or cluster (ClusterRole) level. Bind roles with RoleBinding or ClusterRoleBinding.

Apply the principle of least privilege: grant each user, group, or service account only the permissions it actually needs.
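
As a sketch of least privilege in practice, the example below defines a namespaced Role that can only read pods and their logs, and a RoleBinding that grants it to a group. The `dev` namespace, the `pod-reader` role name, and the `dev-team` group are hypothetical. The `allows` helper is a deliberately simplified check for illustration; real RBAC evaluation also considers API groups, wildcards, and resource names.

```python
import json

RBAC_GROUP = "rbac.authorization.k8s.io"

role = {
    "apiVersion": f"{RBAC_GROUP}/v1",
    "kind": "Role",
    "metadata": {"name": "pod-reader", "namespace": "dev"},
    "rules": [
        {
            "apiGroups": [""],  # "" is the core API group
            "resources": ["pods", "pods/log"],
            "verbs": ["get", "list", "watch"],
        }
    ],
}

role_binding = {
    "apiVersion": f"{RBAC_GROUP}/v1",
    "kind": "RoleBinding",
    "metadata": {"name": "dev-team-pod-reader", "namespace": "dev"},
    "subjects": [{"kind": "Group", "name": "dev-team", "apiGroup": RBAC_GROUP}],
    "roleRef": {"kind": "Role", "name": "pod-reader", "apiGroup": RBAC_GROUP},
}


def allows(role_manifest, verb, resource):
    """Simplified check: does any rule list both the verb and the resource?"""
    return any(
        verb in rule["verbs"] and resource in rule["resources"]
        for rule in role_manifest["rules"]
    )


if __name__ == "__main__":
    print(json.dumps([role, role_binding], indent=2))
    print("can delete pods:", allows(role, "delete", "pods"))
```

On a live cluster, `kubectl auth can-i` is the authoritative way to confirm what a subject may do.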

Host Clusters Externally (Cloud Services)

Managed Kubernetes services such as Azure Kubernetes Service (AKS) or Amazon Elastic Kubernetes Service (EKS) run the control plane and underlying infrastructure for you, simplifying node scaling and reducing operational overhead.

Upgrade Kubernetes Versions

New releases bring features, security patches, and bug fixes. Upgrading is essential, but verify compatibility of workloads and be aware of deprecated APIs.
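
A pre-upgrade check for removed APIs could look roughly like the sketch below. The lookup table is a small, hand-picked subset of entries from the Kubernetes deprecated API migration guide and is not exhaustive; the function and variable names are hypothetical. Manifests are assumed to be already loaded as dictionaries.

```python
# Partial table: (apiVersion, kind) -> first release that no longer serves it.
REMOVED_IN = {
    ("extensions/v1beta1", "Deployment"): "1.16",
    ("extensions/v1beta1", "Ingress"): "1.22",
    ("networking.k8s.io/v1beta1", "Ingress"): "1.22",
    ("batch/v1beta1", "CronJob"): "1.25",
    ("policy/v1beta1", "PodSecurityPolicy"): "1.25",
    ("autoscaling/v2beta2", "HorizontalPodAutoscaler"): "1.26",
}


def parse_version(text):
    """Turn "1.25" or "v1.25.3" into a comparable (major, minor) tuple."""
    return tuple(int(part) for part in text.lstrip("v").split(".")[:2])


def find_removed_apis(manifests, target_version):
    """List manifests whose apiVersion is gone at or before target_version."""
    target = parse_version(target_version)
    findings = []
    for manifest in manifests:
        key = (manifest.get("apiVersion"), manifest.get("kind"))
        removed = REMOVED_IN.get(key)
        if removed is not None and parse_version(removed) <= target:
            name = manifest.get("metadata", {}).get("name", "<unnamed>")
            findings.append((key[1], name, key[0], removed))
    return findings


if __name__ == "__main__":
    manifests = [
        {"apiVersion": "batch/v1beta1", "kind": "CronJob", "metadata": {"name": "nightly"}},
        {"apiVersion": "apps/v1", "kind": "Deployment", "metadata": {"name": "web"}},
    ]
    for kind, name, api_version, removed in find_removed_apis(manifests, "1.25"):
        print(f"{kind}/{name}: {api_version} is not served from {removed} onward")
```

A check like this only covers manifests you control; objects created by Helm charts or operators need the same review.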

Monitor Cluster Resources and Audit Logs

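Monitor both control-plane components (API server, etcd, controller-manager) and node-level components (kubelet, kube-proxy, and cluster DNS such as kube-dns or CoreDNS) using the Prometheus-format metrics they expose.

Enable audit logging in the API server to record requests. What gets recorded, and at what level of detail, is defined in an audit policy file (commonly named audit-policy.yaml) that is passed to the API server with the --audit-policy-file flag and can be customized.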
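
To illustrate how an audit policy is structured, the sketch below builds a small policy and evaluates it with first-match-wins semantics, which is how the API server applies audit rules. The rule set itself is an assumption for the example. The matcher is heavily simplified (it ignores namespaces, user groups, and non-resource URLs), and returning `None` when no rule matches is an assumption of this sketch. Because JSON is valid YAML, the printed output should be usable as a policy file, but verify that on a test cluster first.

```python
import json

audit_policy = {
    "apiVersion": "audit.k8s.io/v1",
    "kind": "Policy",
    "omitStages": ["RequestReceived"],
    "rules": [
        # Skip noisy, low-value watch traffic from kube-proxy.
        {
            "level": "None",
            "users": ["system:kube-proxy"],
            "verbs": ["watch"],
            "resources": [{"group": "", "resources": ["endpoints", "services"]}],
        },
        # Record who touched secrets, but never the secret payload itself.
        {
            "level": "Metadata",
            "resources": [{"group": "", "resources": ["secrets", "configmaps"]}],
        },
        # Catch-all: log metadata and request body for everything else.
        {"level": "Request"},
    ],
}


def rule_matches(rule, user, verb, resource):
    """A rule matches when every field it specifies contains the request value."""
    if "users" in rule and user not in rule["users"]:
        return False
    if "verbs" in rule and verb not in rule["verbs"]:
        return False
    if "resources" in rule:
        names = [n for group in rule["resources"] for n in group.get("resources", [])]
        if resource not in names:
            return False
    return True


def level_for(policy, user, verb, resource):
    """Return the audit level of the first matching rule."""
    for rule in policy["rules"]:
        if rule_matches(rule, user, verb, resource):
            return rule["level"]
    return "None"


if __name__ == "__main__":
    print(json.dumps(audit_policy, indent=2))
    print(level_for(audit_policy, "alice", "get", "secrets"))
```

Rule order matters: moving the catch-all rule to the top would log secret requests at the `Request` level.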

Use automated alerting and retain logs for 30‑45 days. Integrate with tools like Azure Monitor, AWS CloudWatch, Dynatrace, or Datadog.

Use Version Control Systems

Store Kubernetes manifests in a VCS to enable change audit trails, enforce review processes, and improve cluster stability.

Adopt Git‑Based Workflows (GitOps)

GitOps treats a Git repository as the single source of truth for cluster configuration and uses CI/CD pipelines to apply changes automatically, so every deployment is reviewable and auditable through the commit history.
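
The core idea, comparing the desired state in Git with the live state in the cluster, can be sketched as a simple diff. This is a conceptual illustration only, not how any particular GitOps tool is implemented; all function names are hypothetical. It assumes both sides are lists of manifest dictionaries and that live objects have been stripped of server-populated fields such as `status`, since a naive equality check would otherwise always report a difference.

```python
def object_key(manifest):
    """Identify an object by kind, namespace, and name."""
    meta = manifest["metadata"]
    return (manifest["kind"], meta.get("namespace", ""), meta["name"])


def plan(desired, live):
    """Compute which objects to create, update, or delete to match Git."""
    desired_by_key = {object_key(m): m for m in desired}
    live_by_key = {object_key(m): m for m in live}
    return {
        "create": sorted(k for k in desired_by_key if k not in live_by_key),
        "update": sorted(
            k for k in desired_by_key
            if k in live_by_key and desired_by_key[k] != live_by_key[k]
        ),
        "delete": sorted(k for k in live_by_key if k not in desired_by_key),
    }


if __name__ == "__main__":
    desired = [
        {"kind": "Deployment", "metadata": {"name": "web", "namespace": "dev"},
         "spec": {"replicas": 3}},
        {"kind": "Service", "metadata": {"name": "web", "namespace": "dev"}},
    ]
    live = [
        {"kind": "Deployment", "metadata": {"name": "web", "namespace": "dev"},
         "spec": {"replicas": 2}},
        {"kind": "ConfigMap", "metadata": {"name": "old", "namespace": "dev"}},
    ]
    print(plan(desired, live))
```

A controller that runs such a comparison continuously also reverts manual changes made outside Git, which is what keeps the repository authoritative.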

Reduce Container Image Size

Smaller images speed up builds and deployments and reduce resource consumption. Use minimal base images such as Alpine and remove unnecessary packages.

Smaller images also reduce the attack surface.

Organize Objects with Labels

Labels are key-value pairs that help organize and query resources. The recommended labels share the app.kubernetes.io/ prefix: name, instance, version, component, part-of, and managed-by.

Labels can also convey security requirements such as confidentiality and compliance.
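
The sketch below shows a set of recommended labels on an object and how an equality-based selector picks objects by label. The label values (a MySQL database belonging to a WordPress application) are illustrative assumptions, and the two helper functions are hypothetical; set-based selectors such as `in` and `notin` are not covered.

```python
PREFIX = "app.kubernetes.io/"
RECOMMENDED_KEYS = [
    PREFIX + suffix
    for suffix in ("name", "instance", "version", "component", "part-of", "managed-by")
]

labels = {
    "app.kubernetes.io/name": "mysql",
    "app.kubernetes.io/instance": "mysql-abcxyz",
    "app.kubernetes.io/version": "5.7.21",
    "app.kubernetes.io/component": "database",
    "app.kubernetes.io/part-of": "wordpress",
    "app.kubernetes.io/managed-by": "helm",
}


def matches(selector, object_labels):
    """Equality-based selection: every selector pair must be present."""
    return all(object_labels.get(key) == value for key, value in selector.items())


def missing_recommended(object_labels):
    """Return the recommended label keys an object does not carry."""
    return [key for key in RECOMMENDED_KEYS if key not in object_labels]


if __name__ == "__main__":
    selector = {"app.kubernetes.io/part-of": "wordpress"}
    print("selected:", matches(selector, labels))
    print("missing:", missing_recommended({"app.kubernetes.io/name": "mysql"}))
```

The equivalent query on a cluster would be `kubectl get pods -l app.kubernetes.io/part-of=wordpress`.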

Use Network Policies

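Network policies restrict traffic to and from pods at the IP and port level, similar to cloud security groups, and are enforced only when the cluster's network plugin supports them. Start from a default-deny policy for all traffic and then allow only the required flows.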
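
A default-deny baseline followed by a targeted allow rule might be sketched as follows. The `dev` namespace, the `app: web` and `app: db` labels, and port 5432 are assumptions for the example. The manifests are built as Python dictionaries and printed as JSON for kubectl.

```python
import json

NAMESPACE = "dev"  # hypothetical namespace

# An empty podSelector selects every pod in the namespace; listing both
# policy types with no rules denies all ingress and egress.
default_deny = {
    "apiVersion": "networking.k8s.io/v1",
    "kind": "NetworkPolicy",
    "metadata": {"name": "default-deny-all", "namespace": NAMESPACE},
    "spec": {"podSelector": {}, "policyTypes": ["Ingress", "Egress"]},
}

# Allow only web pods to reach the database pods on TCP 5432.
allow_web_to_db = {
    "apiVersion": "networking.k8s.io/v1",
    "kind": "NetworkPolicy",
    "metadata": {"name": "allow-web-to-db", "namespace": NAMESPACE},
    "spec": {
        "podSelector": {"matchLabels": {"app": "db"}},
        "policyTypes": ["Ingress"],
        "ingress": [
            {
                "from": [{"podSelector": {"matchLabels": {"app": "web"}}}],
                "ports": [{"protocol": "TCP", "port": 5432}],
            }
        ],
    },
}

if __name__ == "__main__":
    print(json.dumps(
        {"apiVersion": "v1", "kind": "List", "items": [default_deny, allow_web_to_db]},
        indent=2,
    ))
```

Because the baseline also denies egress, the web pods would additionally need an egress rule toward the database, and most workloads need an egress rule for DNS, before this setup is usable.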

Use Firewalls

Place a firewall in front of the API server to whitelist IPs and limit exposed ports, reducing external attack vectors.

Conclusion

Following the best practices outlined in this article will help you design, operate, and maintain robust Kubernetes clusters for your applications.
