
Master Kubernetes Troubleshooting: 100 Essential kubectl Commands

This guide compiles 100 practical kubectl commands that help you diagnose cluster information, pods, services, deployments, networking, storage, security, autoscaling, and many other Kubernetes components, providing a handy reference for effective cluster troubleshooting.

Liangxu Linux

This guide compiles 100 practical kubectl commands for diagnosing a Kubernetes cluster, covering everything from basic cluster information to advanced debugging techniques.

Cluster Information

kubectl version

– Show Kubernetes version.

kubectl cluster-info

– Display cluster information.

kubectl get nodes

– List all nodes.

kubectl describe node <node-name>

– Show details of a specific node.

kubectl get namespaces

– List all namespaces.

kubectl get pods --all-namespaces

– List all pods across namespaces.

Pod Diagnosis

kubectl get pods -n <namespace>

– List pods in a namespace.

kubectl describe pod <pod-name> -n <namespace>

– Show pod details.

kubectl logs <pod-name> -n <namespace>

– View pod logs.

kubectl logs -f <pod-name> -n <namespace>

– Follow pod logs.

kubectl exec -it <pod-name> -n <namespace> -- <command>

– Execute a command inside a pod.

Pod Health Checks

kubectl get pods <pod-name> -n <namespace> -o jsonpath='{.status.conditions[?(@.type=="Ready")].status}'

– Check readiness.

kubectl get events -n <namespace> --field-selector involvedObject.name=<pod-name>

– Inspect pod events.
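
To scan a whole namespace rather than one pod at a time, readiness can be approximated from the default table output. The following is a minimal sketch, not part of the original command list: the function names filter_unready_pods and unready_pods are hypothetical, and it assumes the default column order of kubectl get pods --no-headers (NAME, READY, STATUS, and so on).

```shell
# Reads `kubectl get pods --no-headers` rows on stdin and prints the name and
# status of every pod whose ready-container count is below its total.
# Completed pods are skipped because finished Job pods are never Ready.
filter_unready_pods() {
  awk '{
    split($2, ready, "/")                  # READY column looks like "1/2"
    if (ready[1] != ready[2] && $3 != "Completed") print $1, $3
  }'
}

# Hypothetical wrapper: unready_pods <namespace>
unready_pods() {
  kubectl get pods -n "$1" --no-headers | filter_unready_pods
}
```

Calling unready_pods <namespace> gives a short list of pods worth passing to kubectl describe pod. With --all-namespaces the columns shift by one, so the awk field numbers would need adjusting.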

Service Diagnosis

kubectl get svc -n <namespace>

– List services.

kubectl describe svc <service-name> -n <namespace>

– Show service details.

Deployment Diagnosis

kubectl get deployments -n <namespace>

– List deployments.

kubectl describe deployment <deployment-name> -n <namespace>

– Show deployment details.

kubectl rollout status deployment/<deployment-name> -n <namespace>

– Check rollout status.

kubectl rollout history deployment/<deployment-name> -n <namespace>

– View rollout history.
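
The two rollout commands are often used together: wait for a rollout, and look at the history only if it stalls. A minimal sketch of that flow follows; check_rollout is a hypothetical helper name, and the 60s default timeout is an arbitrary assumption.

```shell
# Hypothetical helper: check_rollout <deployment> <namespace> [timeout]
# Waits for the rollout to finish; if it does not complete within the
# timeout, prints the revision history to help pick a rollback target.
check_rollout() {
  if kubectl rollout status "deployment/$1" -n "$2" --timeout="${3:-60s}"; then
    echo "rollout complete"
  else
    echo "rollout did not complete; revision history:"
    kubectl rollout history "deployment/$1" -n "$2"
    return 1
  fi
}
```

The non-zero return value makes the helper usable as a gate in a deployment script.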

StatefulSet Diagnosis

kubectl get statefulsets -n <namespace>

– List StatefulSets.

kubectl describe statefulset <statefulset-name> -n <namespace>

– Show details.

ConfigMap and Secret Diagnosis

kubectl get configmaps -n <namespace>

– List ConfigMaps.

kubectl describe configmap <configmap-name> -n <namespace>

– Show ConfigMap details.

kubectl get secrets -n <namespace>

– List Secrets.

kubectl describe secret <secret-name> -n <namespace>

– Show Secret details.

Namespace Diagnosis

kubectl describe namespace <namespace-name>

– Show namespace details.

Resource Usage

kubectl top pod <pod-name> -n <namespace>

– Show pod CPU/memory.

kubectl top nodes

– Show node resource usage.

Network Diagnosis

kubectl get pods -n <namespace> -o custom-columns=POD:metadata.name,IP:status.podIP --no-headers

– List pod IPs.

kubectl get networkpolicies -n <namespace>

– List network policies.

kubectl describe networkpolicy <network-policy-name> -n <namespace>

– Show policy details.

Persistent Volume (PV) and Persistent Volume Claim (PVC) Diagnosis

kubectl get pv

– List PVs.

kubectl describe pv <pv-name>

– Show PV details.

kubectl get pvc -n <namespace>

– List PVCs.

kubectl describe pvc <pvc-name> -n <namespace>

– Show PVC details.

Resource Quotas and Limits

kubectl get resourcequotas -n <namespace>

– List quotas.

kubectl describe resourcequota <resource-quota-name> -n <namespace>

– Show quota details.

Custom Resource Definition (CRD) Diagnosis

kubectl get <custom-resource-name> -n <namespace>

– List custom resources.

kubectl describe <custom-resource-name> <custom-resource-instance-name> -n <namespace>

– Show details.

Autoscaling

kubectl scale deployment <deployment-name> --replicas=<replica-count> -n <namespace>

– Manually scale.

kubectl autoscale deployment <deployment-name> --min=<min-pods> --max=<max-pods> --cpu-percent=<cpu-percent> -n <namespace>

– Enable HPA.

kubectl get hpa -n <namespace>

– Check HPA status.

Job and CronJob Diagnosis

kubectl get jobs -n <namespace>

– List jobs.

kubectl describe job <job-name> -n <namespace>

– Show job details.

kubectl get cronjobs -n <namespace>

– List CronJobs.

kubectl describe cronjob <cronjob-name> -n <namespace>

– Show CronJob details.

Capacity Diagnosis

kubectl get pv --sort-by=.spec.capacity.storage

– List PVs by capacity.

kubectl get pv <pv-name> -o=jsonpath='{.spec.persistentVolumeReclaimPolicy}'

– Show reclaim policy.

kubectl get storageclasses

– List storage classes.

Ingress and Service Mesh Diagnosis

kubectl get ingress -n <namespace>

– List Ingresses.

kubectl describe ingress <ingress-name> -n <namespace>

– Show Ingress details.

kubectl get virtualservices -n <namespace>

– List Istio VirtualServices.

kubectl describe virtualservice <virtualservice-name> -n <namespace>

– Show VirtualService details.

Pod Network Troubleshooting

kubectl run -it --rm --restart=Never --image=busybox net-debug-pod -- /bin/sh

– Start a debug pod.

kubectl exec -it <pod-name> -n <namespace> -- curl <endpoint-url>

– Test connectivity.

kubectl exec -it <source-pod-name> -n <namespace> -- traceroute <destination-pod-ip>

– Trace network path.

kubectl exec -it <pod-name> -n <namespace> -- nslookup <domain-name>

– Check DNS.
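
When the DNS check targets a Service, the name to look up follows the pattern <service>.<namespace>.svc.<cluster-domain>, where the cluster domain defaults to cluster.local. The sketch below builds that name and resolves it from inside a pod. Both function names are hypothetical, and it assumes the pod's image ships nslookup.

```shell
# Build the fully qualified DNS name of a Service.
# The cluster domain defaults to cluster.local; pass a third argument if
# your cluster was configured with a different one.
svc_fqdn() {
  printf '%s.%s.svc.%s\n' "$1" "$2" "${3:-cluster.local}"
}

# Hypothetical helper:
#   resolve_service <pod> <pod-namespace> <service> <service-namespace>
# Resolves the Service name from inside an existing pod.
resolve_service() {
  kubectl exec "$1" -n "$2" -- nslookup "$(svc_fqdn "$3" "$4")"
}
```

Using the fully qualified name avoids depending on the search domains in the pod's resolv.conf, which helps separate "DNS is down" from "the short name resolves in a different namespace".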

Configuration and Resource Validation

kubectl apply --dry-run=client -f <yaml-file>

– Validate YAML without applying.

kubectl auth can-i list pods --as=system:serviceaccount:<namespace>:<serviceaccount-name>

– Verify service account permissions.
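
The same can-i check can be repeated over several verbs to get a quick permission matrix for a service account. This is a minimal sketch: sa_permissions is a hypothetical helper name, the verb list is an arbitrary selection, and the caller needs impersonation rights for --as to work.

```shell
# Hypothetical helper: sa_permissions <namespace> <serviceaccount> <resource>
# Prints one line per verb with the yes/no answer from `kubectl auth can-i`.
sa_permissions() {
  for verb in get list watch create update delete; do
    # can-i may exit non-zero when the answer is "no", so tolerate that here.
    answer=$(kubectl auth can-i "$verb" "$3" -n "$1" \
      --as="system:serviceaccount:$1:$2" 2>/dev/null || true)
    printf '%-7s %s\n' "$verb" "${answer:-unknown}"
  done
}
```

For example, sa_permissions <namespace> <serviceaccount-name> pods prints six lines, one per verb.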

RBAC and Security

kubectl get roles,rolebindings -n <namespace>

– List roles and bindings.

kubectl describe role <role-name> -n <namespace>

– Show role details.

Service Account Diagnosis

kubectl get serviceaccounts -n <namespace>

– List service accounts.

kubectl describe serviceaccount <serviceaccount-name> -n <namespace>

– Show account details.

Node Drain and Uncordon

kubectl drain <node-name> --ignore-daemonsets

– Drain node for maintenance.

kubectl uncordon <node-name>

– Uncordon node.

Resource Cleanup

kubectl delete pod <pod-name> -n <namespace> --grace-period=0 --force

– Force delete a pod (use with caution).

Pod Affinity and Anti‑Affinity

kubectl get pod <pod-name> -n <namespace> -o=jsonpath='{.spec.affinity}'

– Show affinity rules.

kubectl get pod <pod-name> -n <namespace> -o=jsonpath='{.spec.affinity.podAntiAffinity}'

– Show anti‑affinity rules.

Pod Security Policy (PSP)

kubectl get psp

– List PodSecurityPolicies. PSP was deprecated in Kubernetes 1.21 and removed in 1.25, so this works only on older clusters.

Events

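kubectl get events --all-namespaces --sort-by=.metadata.creationTimestamp

– List events across the cluster sorted by creation time, most recent last. Without --all-namespaces, only the current namespace is shown.

kubectl get events -n <namespace>

– Filter events by namespace.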

Node Troubleshooting

kubectl describe node <node-name> | grep Conditions -A5

– Check node conditions.

kubectl describe node <node-name> | grep -E "Capacity|Allocatable"

– View capacity and allocatable resources.
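
On larger clusters it helps to narrow down which nodes deserve a describe in the first place. The following sketch filters the default node table for anything that is not plainly Ready. The function names are hypothetical, and it assumes the default column order of kubectl get nodes --no-headers (NAME, STATUS, and so on).

```shell
# Reads `kubectl get nodes --no-headers` rows on stdin and prints every node
# whose STATUS is not exactly "Ready" (NotReady, or cordoned nodes such as
# "Ready,SchedulingDisabled").
filter_problem_nodes() {
  awk '$2 != "Ready" { print $1, $2 }'
}

# Hypothetical wrapper: print the Conditions block of each flagged node.
inspect_problem_nodes() {
  kubectl get nodes --no-headers | filter_problem_nodes |
    while read -r node status; do
      echo "== $node ($status) =="
      kubectl describe node "$node" | grep -A 8 'Conditions:'
    done
}
```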

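Ephemeral Containers (kubectl debug)

kubectl debug -it <pod-name> -n <namespace> --image=<debug-image> -- /bin/sh

– Attach an ephemeral debug container to a running pod. Ephemeral containers have been enabled by default since Kubernetes 1.23 and stable since 1.25; on 1.18 the command was still kubectl alpha debug.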

Resource Metrics (requires metrics‑server)

kubectl top pod -n <namespace>

– Show pod CPU/memory usage.

Kubelet Logs

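journalctl -u kubelet

– View kubelet logs. The kubelet normally runs as a systemd service on the node rather than as a pod, so run this on the node itself (for example over SSH) instead of through kubectl logs.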

Advanced Debugging with Telepresence

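telepresence --namespace <namespace> --swap-deployment <deployment-name>

– Swap a deployment for a local process using Telepresence. The argument is a deployment name, not a pod name. This is the legacy Telepresence 1 syntax; Telepresence 2 replaced it with the telepresence intercept command.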

Kubeconfig and Context

kubectl config get-contexts

– List available contexts.

kubectl config use-context <context-name>

– Switch context.

Pod Security Standards (PodSecurity Admission)

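kubectl label --dry-run=server --overwrite ns <namespace> pod-security.kubernetes.io/enforce=<level>

– Preview which existing pods would violate a Pod Security Standard level (privileged, baseline, or restricted) without changing the namespace. Pod Security Admission replaced PSP, so kubectl get psp does not report these violations.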

Pod Disruption Budget (PDB) Diagnosis

kubectl get pdb -n <namespace>

– List PDBs.

kubectl describe pdb <pdb-name> -n <namespace>

– Show PDB details.

Resource Lock Diagnosis (if used)

kubectl get leases -n <namespace>

– List Lease objects, which components use as leader election locks. Kubernetes has no built-in resourcelocks resource.

Service Endpoints and DNS

kubectl get endpoints <service-name> -n <namespace>

– List service endpoints.

kubectl exec -it <pod-name> -n <namespace> -- cat /etc/resolv.conf

– Inspect DNS configuration inside a pod.
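
An empty endpoint list is one of the most common reasons a Service resolves in DNS but refuses connections. The sketch below turns the endpoints lookup into a pass/fail check. service_has_endpoints is a hypothetical helper name, and it only considers ready addresses.

```shell
# Hypothetical helper: service_has_endpoints <service> <namespace>
# An empty address list usually means the Service selector matches no Ready pods.
service_has_endpoints() {
  ips=$(kubectl get endpoints "$1" -n "$2" \
    -o jsonpath='{.subsets[*].addresses[*].ip}')
  if [ -n "$ips" ]; then
    echo "ready endpoints: $ips"
  else
    echo "no ready endpoints for service $1"
    return 1
  fi
}
```

If the check fails, comparing the Service selector from kubectl describe svc with the labels on the intended pods is a reasonable next step.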

Custom Metrics (Prometheus, Grafana)

kubectl port-forward svc/<service-name> <local-port>:<service-port> -n <namespace>

– Forward a local port to the Prometheus or Grafana service so you can run custom metric queries.

Pod Priority and Preemption

kubectl get priorityclasses

– List priority classes.

Pod Overhead (Kubernetes 1.18+)

kubectl get pod <pod-name> -n <namespace> -o=jsonpath='{.spec.overhead}'

– Show pod overhead.

Volume Snapshot Diagnosis (if used)

kubectl get volumesnapshot -n <namespace>

– List volume snapshots.

kubectl describe volumesnapshot <snapshot-name> -n <namespace>

– Show snapshot details.

Resource Deserialization Diagnosis

kubectl get <resource-type> <resource-name> -n <namespace> -o=json

– Retrieve and print a resource as JSON.

Node Taints

kubectl describe node <node-name> | grep Taints

– List node taints.

Webhook Configuration

kubectl get mutatingwebhookconfigurations

– List mutating webhooks.

kubectl get validatingwebhookconfigurations

– List validating webhooks.

Pod Network Policies

kubectl get networkpolicies -n <namespace>

– List pod network policies.

Node Conditions (Kubernetes 1.17+)

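kubectl get nodes -o custom-columns='NODE:.metadata.name,READY:.status.conditions[?(@.type=="Ready")].status' -l 'node-role.kubernetes.io/worker='

– Custom query for the readiness of nodes labeled as workers. The custom-columns expression must be quoted so the shell does not interpret its brackets, parentheses, and double quotes.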

Audit Logs

Check the cluster’s audit log configuration and retrieve logs if audit logging is enabled.

Node Operating System Details

kubectl get node <node-name> -o jsonpath='{.status.nodeInfo.osImage}'

– Get OS image of a node.

Replace placeholders such as <namespace>, <pod-name>, <deployment-name>, etc., with actual values from your cluster.
