Master Kubernetes Troubleshooting: 100 Essential kubectl Commands
This guide compiles 100 practical kubectl commands for diagnosing a Kubernetes cluster, from basic cluster information through pods, services, deployments, networking, storage, security, and autoscaling to more advanced debugging techniques. Use it as a quick reference while troubleshooting.
Cluster Information
kubectl version – Show Kubernetes version.
kubectl cluster-info – Display cluster information.
kubectl get nodes – List all nodes.
kubectl describe node <node-name> – Show details of a specific node.
kubectl get namespaces – List all namespaces.
kubectl get pods --all-namespaces – List all pods across namespaces.
Pod Diagnosis
kubectl get pods -n <namespace> – List pods in a namespace.
kubectl describe pod <pod-name> -n <namespace> – Show pod details.
kubectl logs <pod-name> -n <namespace> – View pod logs.
kubectl logs -f <pod-name> -n <namespace> – Follow pod logs.
kubectl exec -it <pod-name> -n <namespace> -- <command> – Execute a command inside a pod.
Pod Health Checks
kubectl get pods <pod-name> -n <namespace> -o jsonpath='{.status.conditions[?(@.type=="Ready")].status}' – Check readiness.
kubectl get events -n <namespace> --field-selector involvedObject.name=<pod-name> – Inspect pod events.
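The readiness check and the event query can be chained into one helper. The following is a minimal sketch rather than a definitive tool: the function names pod_is_ready and pod_triage are hypothetical, and sorting events by .lastTimestamp is an added assumption about what is most useful to read first.

```shell
# Succeeds (exit 0) only when the pod's Ready condition is "True".
pod_is_ready() {
  ns="$1"; pod="$2"
  status=$(kubectl get pod "$pod" -n "$ns" \
    -o jsonpath='{.status.conditions[?(@.type=="Ready")].status}')
  [ "$status" = "True" ]
}

# Prints a one-line verdict, followed by the pod's events when it is not Ready.
pod_triage() {
  ns="$1"; pod="$2"
  if pod_is_ready "$ns" "$pod"; then
    echo "$pod is Ready"
  else
    echo "$pod is NOT Ready; recent events:"
    kubectl get events -n "$ns" \
      --field-selector "involvedObject.name=$pod" \
      --sort-by=.lastTimestamp
  fi
}
```

Usage: pod_triage <namespace> <pod-name>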
Service Diagnosis
kubectl get svc -n <namespace> – List services.
kubectl describe svc <service-name> -n <namespace> – Show service details.
Deployment Diagnosis
kubectl get deployments -n <namespace> – List deployments.
kubectl describe deployment <deployment-name> -n <namespace> – Show deployment details.
kubectl rollout status deployment/<deployment-name> -n <namespace> – Check rollout status.
kubectl rollout history deployment/<deployment-name> -n <namespace> – View rollout history.
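In scripts it is often useful to wait for a rollout and print the revision history only when it stalls. A sketch under stated assumptions: check_rollout is a hypothetical helper name, and the 60s default timeout is an arbitrary choice, not a recommendation from the original guide.

```shell
# Waits for a deployment rollout; on timeout or failure, prints the
# revision history and returns a non-zero status.
check_rollout() {
  ns="$1"; deploy="$2"; timeout="${3:-60s}"   # 60s default is an assumption
  if kubectl rollout status "deployment/$deploy" -n "$ns" --timeout="$timeout"; then
    echo "rollout of $deploy completed"
  else
    echo "rollout of $deploy did not complete within $timeout; revision history:"
    kubectl rollout history "deployment/$deploy" -n "$ns"
    return 1
  fi
}
```

Usage: check_rollout <namespace> <deployment-name> 120s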
StatefulSet Diagnosis
kubectl get statefulsets -n <namespace> – List StatefulSets.
kubectl describe statefulset <statefulset-name> -n <namespace> – Show details.
ConfigMap and Secret Diagnosis
kubectl get configmaps -n <namespace> – List ConfigMaps.
kubectl describe configmap <configmap-name> -n <namespace> – Show ConfigMap details.
kubectl get secrets -n <namespace> – List Secrets.
kubectl describe secret <secret-name> -n <namespace> – Show Secret details.
Namespace Diagnosis
kubectl describe namespace <namespace-name> – Show namespace details.
Resource Usage
kubectl top pod <pod-name> -n <namespace> – Show pod CPU/memory.
kubectl top nodes – Show node resource usage.
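When a namespace holds many pods, it helps to look at the heaviest memory consumers first. A minimal sketch, assuming metrics-server is installed; top_memory_pods is a hypothetical helper name and the default of five rows is arbitrary.

```shell
# Prints the first N rows of "kubectl top pod" sorted by memory usage.
# Requires metrics-server; N defaults to 5 (an arbitrary choice).
top_memory_pods() {
  ns="$1"; count="${2:-5}"
  kubectl top pod -n "$ns" --no-headers --sort-by=memory | head -n "$count"
}
```

Usage: top_memory_pods <namespace> 10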
Network Diagnosis
kubectl get pods -n <namespace> -o custom-columns=POD:metadata.name,IP:status.podIP --no-headers – List pod IPs.
kubectl get networkpolicies -n <namespace> – List network policies.
kubectl describe networkpolicy <network-policy-name> -n <namespace> – Show policy details.
Persistent Volume (PV) and Persistent Volume Claim (PVC) Diagnosis
kubectl get pv – List PVs.
kubectl describe pv <pv-name> – Show PV details.
kubectl get pvc -n <namespace> – List PVCs.
kubectl describe pvc <pvc-name> -n <namespace> – Show PVC details.
Resource Quotas and Limits
kubectl get resourcequotas -n <namespace> – List quotas.
kubectl describe resourcequota <resource-quota-name> -n <namespace> – Show quota details.
Custom Resource Definition (CRD) Diagnosis
kubectl get <custom-resource-name> -n <namespace> – List custom resources.
kubectl describe <custom-resource-name> <custom-resource-instance-name> -n <namespace> – Show details.
Autoscaling
kubectl scale deployment <deployment-name> --replicas=<replica-count> -n <namespace> – Manually scale.
kubectl autoscale deployment <deployment-name> --min=<min-pods> --max=<max-pods> --cpu-percent=<cpu-percent> -n <namespace> – Enable HPA.
kubectl get hpa -n <namespace> – Check HPA status.
Job and CronJob Diagnosis
kubectl get jobs -n <namespace> – List jobs.
kubectl describe job <job-name> -n <namespace> – Show job details.
kubectl get cronjobs -n <namespace> – List CronJobs.
kubectl describe cronjob <cronjob-name> -n <namespace> – Show CronJob details.
Capacity Diagnosis
kubectl get pv --sort-by=.spec.capacity.storage – List PVs by capacity.
kubectl get pv <pv-name> -o=jsonpath='{.spec.persistentVolumeReclaimPolicy}' – Show reclaim policy.
kubectl get storageclasses – List storage classes.
Ingress and Service Mesh Diagnosis
kubectl get ingress -n <namespace> – List Ingresses.
kubectl describe ingress <ingress-name> -n <namespace> – Show Ingress details.
kubectl get virtualservices -n <namespace> – List Istio VirtualServices.
kubectl describe virtualservice <virtualservice-name> -n <namespace> – Show VirtualService details.
Pod Network Troubleshooting
kubectl run -it --rm --restart=Never --image=busybox net-debug-pod -- /bin/sh – Start a debug pod.
kubectl exec -it <pod-name> -n <namespace> -- curl <endpoint-url> – Test connectivity (requires curl in the container image).
kubectl exec -it <source-pod-name> -n <namespace> -- traceroute <destination-pod-ip> – Trace network path (requires traceroute in the container image).
kubectl exec -it <pod-name> -n <namespace> -- nslookup <domain-name> – Check DNS (requires nslookup in the container image).
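The DNS and connectivity checks can be run back to back so that a DNS failure is reported before an HTTP test is attempted. This is a sketch, not a definitive implementation: net_check is a hypothetical helper name, the 5-second curl timeout is an arbitrary assumption, and the target pod's image must contain both nslookup and curl.

```shell
# Runs a DNS lookup and then an HTTP request from inside a pod.
# Assumes nslookup and curl exist in the container image.
net_check() {
  ns="$1"; pod="$2"; host="$3"; url="$4"
  if kubectl exec "$pod" -n "$ns" -- nslookup "$host" >/dev/null 2>&1; then
    echo "DNS ok: $host"
  else
    echo "DNS FAILED: $host"
    return 1
  fi
  # -m 5 caps the request at 5 seconds; -w prints only the HTTP status code.
  code=$(kubectl exec "$pod" -n "$ns" -- \
    curl -s -o /dev/null -m 5 -w '%{http_code}' "$url")
  echo "HTTP $code from $url"
}
```

Usage: net_check <namespace> <pod-name> <domain-name> <endpoint-url>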
Configuration and Resource Validation
kubectl apply --dry-run=client -f <yaml-file> – Validate YAML without applying.
kubectl auth can-i list pods --as=system:serviceaccount:<namespace>:<serviceaccount-name> – Verify service account permissions.
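The dry-run check extends naturally to a batch of manifests, for example before a commit. A minimal sketch: validate_manifests is a hypothetical helper name. Note that a client-side dry run does not exercise admission controllers or server-side defaulting, so a manifest that passes here can still be rejected by the cluster.

```shell
# Runs a client-side dry run on each file and reports which ones fail.
# Returns non-zero if any manifest is rejected.
validate_manifests() {
  failed=0
  for f in "$@"; do
    if kubectl apply --dry-run=client -f "$f" >/dev/null 2>&1; then
      echo "ok: $f"
    else
      echo "INVALID: $f"
      failed=1
    fi
  done
  return "$failed"
}
```

Usage: validate_manifests deployment.yaml service.yaml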
RBAC and Security
kubectl get roles,rolebindings -n <namespace> – List roles and bindings.
kubectl describe role <role-name> -n <namespace> – Show role details.
Service Account Diagnosis
kubectl get serviceaccounts -n <namespace> – List service accounts.
kubectl describe serviceaccount <serviceaccount-name> -n <namespace> – Show account details.
Node Drain and Uncordon
kubectl drain <node-name> --ignore-daemonsets – Drain node for maintenance.
kubectl uncordon <node-name> – Uncordon node.
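A node left cordoned after maintenance is easy to forget, so the drain and uncordon steps can be wrapped around the maintenance command itself. A sketch under stated assumptions: with_node_drained is a hypothetical helper, and uncordoning after a failed drain is a design choice that may not suit every workflow.

```shell
# Drains a node, runs the given command, then uncordons the node
# whether or not the command succeeded.
with_node_drained() {
  node="$1"; shift
  if ! kubectl drain "$node" --ignore-daemonsets; then
    # A failed drain still leaves the node cordoned, so undo that.
    kubectl uncordon "$node"
    return 1
  fi
  rc=0
  "$@" || rc=$?
  kubectl uncordon "$node"
  return "$rc"
}
```

Usage: with_node_drained <node-name> <maintenance-command> [args...]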
Resource Cleanup
kubectl delete pod <pod-name> -n <namespace> --grace-period=0 --force – Force delete a pod (use with caution).
Pod Affinity and Anti‑Affinity
kubectl get pod <pod-name> -n <namespace> -o=jsonpath='{.spec.affinity}' – Show affinity rules.
kubectl get pod <pod-name> -n <namespace> -o=jsonpath='{.spec.affinity.podAntiAffinity}' – Show anti‑affinity rules.
Pod Security Policy (PSP)
kubectl get psp – List PSPs (if enabled; PodSecurityPolicy was removed in Kubernetes 1.25).
Events
kubectl get events --sort-by=.metadata.creationTimestamp – List recent cluster events.
kubectl get events -n <namespace> – Filter events by namespace.
Node Troubleshooting
kubectl describe node <node-name> | grep Conditions -A5 – Check node conditions.
kubectl describe node <node-name> | grep -E "Capacity|Allocatable" – View capacity and allocatable resources.
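Before describing nodes one by one, it can help to list only those that need attention. A minimal sketch: unhealthy_nodes is a hypothetical helper that relies on the default column layout of kubectl get nodes (NAME first, STATUS second) and treats any status other than plain Ready as worth a look, which deliberately includes cordoned nodes.

```shell
# Prints name and status of every node whose STATUS column is not exactly
# "Ready" (so NotReady and Ready,SchedulingDisabled are both listed).
unhealthy_nodes() {
  kubectl get nodes --no-headers | awk '$2 != "Ready" { print $1, $2 }'
}
```

Each name it prints can then be passed to kubectl describe node for the condition and capacity checks.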
Ephemeral Containers (enabled by default since Kubernetes 1.23)
kubectl debug -it <pod-name> -n <namespace> --image=<debug-image> -- /bin/sh – Attach an ephemeral debug container to a running pod.
Resource Metrics (requires metrics‑server)
kubectl top pod -n <namespace> – Show pod CPU/memory usage.
Kubelet Logs
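journalctl -u kubelet – View kubelet logs; run this on the node itself. On most clusters the kubelet runs as a systemd service rather than as a pod, so kubectl logs cannot retrieve its logs.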
Advanced Debugging with Telepresence
telepresence --namespace <namespace> --swap-deployment <deployment-name> – Swap a deployment for a local process to debug it (Telepresence 1.x syntax).
Kubeconfig and Context
kubectl config get-contexts – List available contexts.
kubectl config use-context <context-name> – Switch context.
Pod Security Standards (PodSecurity Admission)
kubectl get namespaces -L pod-security.kubernetes.io/enforce – Show the Pod Security Standard level enforced in each namespace.
Pod Disruption Budget (PDB) Diagnosis
kubectl get pdb -n <namespace> – List PDBs.
kubectl describe pdb <pdb-name> -n <namespace> – Show PDB details.
Resource Lock Diagnosis (if used)
kubectl get leases -n <namespace> – List Lease objects, which back leader-election resource locks.
Service Endpoints and DNS
kubectl get endpoints <service-name> -n <namespace> – List service endpoints.
kubectl exec -it <pod-name> -n <namespace> -- cat /etc/resolv.conf – Inspect DNS configuration inside a pod.
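A Service with no ready endpoints is a common cause of connection failures, and the endpoint listing can be turned into a pass/fail check. A sketch, not a definitive implementation: service_backends is a hypothetical helper name, and the hint it prints reflects only the usual causes.

```shell
# Prints the ready endpoint IPs behind a Service, or fails with a hint
# when the Endpoints object has no ready addresses.
service_backends() {
  ns="$1"; svc="$2"
  ips=$(kubectl get endpoints "$svc" -n "$ns" \
    -o jsonpath='{.subsets[*].addresses[*].ip}')
  if [ -z "$ips" ]; then
    echo "$svc has no ready endpoints; check the Service selector and pod readiness"
    return 1
  fi
  echo "$svc endpoints: $ips"
}
```

Usage: service_backends <namespace> <service-name>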
Custom Metrics (Prometheus, Grafana)
kubectl port-forward svc/<service-name> <local-port>:<remote-port> -n <namespace> – Forward a local port to a Prometheus or Grafana service for custom metric queries.
Pod Priority and Preemption
kubectl get priorityclasses – List priority classes.
Pod Overhead (Kubernetes 1.18+)
kubectl get pod <pod-name> -n <namespace> -o=jsonpath='{.spec.overhead}' – Show pod overhead.
Volume Snapshot Diagnosis (if used)
kubectl get volumesnapshot -n <namespace> – List volume snapshots.
kubectl describe volumesnapshot <snapshot-name> -n <namespace> – Show snapshot details.
Resource Deserialization Diagnosis
kubectl get <resource-type> <resource-name> -n <namespace> -o=json – Retrieve and print a resource as JSON.
Node Taints
kubectl describe node <node-name> | grep Taints – List node taints.
Webhook Configuration
kubectl get mutatingwebhookconfigurations – List mutating webhooks.
kubectl get validatingwebhookconfigurations – List validating webhooks.
Pod Network Policies
kubectl get networkpolicies -n <namespace> – List pod network policies.
Node Conditions (Kubernetes 1.17+)
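kubectl get nodes -l 'node-role.kubernetes.io/worker=' -o custom-columns='NODE:.metadata.name,READY:.status.conditions[?(@.type=="Ready")].status' – Custom query for readiness of nodes labeled as workers. The column spec is quoted so the shell does not interpret the filter expression.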
Audit Logs
Check the cluster’s audit log configuration and retrieve logs if audit logging is enabled.
Node Operating System Details
kubectl get node <node-name> -o jsonpath='{.status.nodeInfo.osImage}' – Get OS image of a node.
Replace placeholders such as <namespace>, <pod-name>, <deployment-name>, etc., with actual values from your cluster.
Liangxu Linux
Liangxu, a self‑taught IT professional now working as a Linux development engineer at a Fortune 500 multinational, shares extensive Linux knowledge: fundamentals, applications, tools, plus Git, databases, Raspberry Pi, etc.