Mastering Kubernetes Component Troubleshooting with pprof and Log Analysis
Learn a systematic approach to diagnosing Kubernetes core component issues by identifying faulty nodes, analyzing logs via systemd or static pods, and leveraging Go's pprof tool for performance profiling, including step‑by‑step commands and UI visualizations for components like kube‑apiserver, scheduler, controller‑manager, and kubelet.
Kubernetes core components are to a cluster what a foundation is to a house. As a cluster maintainer, you will regularly run into component issues. This article outlines a concise troubleshooting workflow:
Identify faulty nodes or components via cluster status.
Analyze component logs.
Use pprof for performance analysis.
Define Scope
The core components are few and their deployment layout is simple, so the scope is quick to narrow down. For example, if kubectl get nodes shows a node as NotReady, the likely cause is either a kubelet problem or a network issue, which sets the initial direction for elimination.
We adopt a “hypothesize then verify” method, listing possible factors and checking them one by one until the issue is resolved.
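As a first scoping step, the NotReady check described above can be scripted. The following is a minimal sketch, not part of the original article; the function name not_ready_nodes is hypothetical, and it assumes the default column layout of kubectl get nodes (NAME, STATUS, ROLES, AGE, VERSION).

```shell
#!/bin/sh
# Print the names of nodes whose STATUS column does not start with "Ready".
# Input: the output of `kubectl get nodes --no-headers` on stdin.
# A cordoned but healthy node reports "Ready,SchedulingDisabled" and is
# therefore not listed.
not_ready_nodes() {
    awk '$2 !~ /^Ready/ { print $1 }'
}

# Typical use against a live cluster (assumes kubectl is already configured):
#   kubectl get nodes --no-headers | not_ready_nodes
```

Each node name printed is a candidate for the next step: checking the kubelet and the network on that host.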
Log Analysis
Log inspection is the most direct way to troubleshoot. Component logs can be viewed in two ways:
For services started by systemd:
journalctl -l -u <service>
For static pod services:
kubectl logs -n kube-system $PODNAME --tail 100
Additionally, monitor surrounding infrastructure metrics such as CPU, memory, and I/O.
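Component logs are usually dominated by informational lines, so it can help to filter them first. The sketch below assumes the components log in the default klog text format, where each line starts with a severity letter (I, W, E, or F) followed by the month and day. The function name klog_problems is hypothetical, and the filter will not match components configured for JSON logging.

```shell
#!/bin/sh
# Keep only warning (W), error (E), and fatal (F) lines from klog-formatted
# text on stdin. klog lines look like:
#   E0102 15:04:05.000000   12345 file.go:123] message
# `|| true` keeps the exit status zero when nothing matches.
klog_problems() {
    grep -E '^[WEF][0-9]{4} [0-9]{2}:[0-9]{2}:[0-9]{2}\.' || true
}

# Typical use (journalctl's `-o cat` prints the bare message, so the klog
# header is at the start of each line):
#   journalctl -u kubelet -o cat --since "1 hour ago" | klog_problems
#   kubectl logs -n kube-system "$PODNAME" --tail 1000 | klog_problems
```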
Performance Analysis
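Performance profiling is placed last because it takes time and requires familiarity with the metrics. Kubernetes releases frequently, and bugs or performance regressions may appear. Go's pprof tool can generate flame graphs for deeper insight. The older go‑torch tool did the same job but is now archived, and the pprof web UI has included a flame graph view since Go 1.11.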
All components expose pprof endpoints, e.g., host:port/debug/pprof/.
Common pprof Commands
Interactive
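The examples in this section assume kubectl proxy is listening on localhost:8001, which exposes the kube‑apiserver endpoints.
View the heap (memory) profile:
go tool pprof http://localhost:8001/debug/pprof/heap
Collect a 30‑second CPU profile:
go tool pprof "http://localhost:8001/debug/pprof/profile?seconds=30"
Show goroutine blocking:
go tool pprof http://localhost:8001/debug/pprof/block
Collect a 5‑second execution trace (a trace is not a pprof profile, so save it to a file and open it with go tool trace):
curl -s -o trace.out "http://localhost:8001/debug/pprof/trace?seconds=5"
go tool trace trace.out
Show stack traces of contended mutex holders:
go tool pprof http://localhost:8001/debug/pprof/mutex
UI Interface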
Export a profile to a file, then serve it with go tool pprof for graphical analysis.
Example for kube‑scheduler:
curl -s http://localhost:10251/debug/pprof/heap > heap.out
go tool pprof -http=0.0.0.0:8989 heap.out
The UI provides menus such as VIEW (Top, Graph, Flame Graph, Peek, Source, Disassemble), SAMPLE (alloc_objects, alloc_space, inuse_objects, inuse_space), and REFINE for filtering. The Graph view requires Graphviz to be installed.
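Note: The pprof endpoints are served only when profiling is enabled. Some clusters run with it turned off (hardening guides often recommend --profiling=false); enable it with profiling: true in the component's configuration. Go's block and mutex profiles also stay empty unless the process has turned that sampling on. The insecure ports 10251 (kube‑scheduler) and 10252 (kube‑controller‑manager) used in this article are no longer served by newer releases, where the secure ports 10259 and 10257 require authenticated HTTPS requests instead.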
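The endpoint paths used above all follow the same pattern, so capturing several profiles in one pass can be scripted. This is a minimal sketch under stated assumptions: pprof_url and collect_profiles are hypothetical helper names, the output file names are illustrative, and the component is assumed to be reachable over plain HTTP at the given base address (for example through kubectl proxy).

```shell
#!/bin/sh
# Build a pprof endpoint URL.
#   $1 = base address (a trailing slash is stripped)
#   $2 = profile name, e.g. heap, goroutine, profile
#   $3 = optional duration in seconds (meaningful for "profile" and "trace")
pprof_url() {
    if [ -n "${3:-}" ]; then
        printf '%s/debug/pprof/%s?seconds=%s\n' "${1%/}" "$2" "$3"
    else
        printf '%s/debug/pprof/%s\n' "${1%/}" "$2"
    fi
}

# Save heap, goroutine, and 30-second CPU profiles from one component.
#   $1 = base address, $2 = output directory (created if missing)
# `curl -f` makes an HTTP error status fail the command instead of saving
# the error page as if it were a profile.
collect_profiles() {
    base=$1
    dir=$2
    mkdir -p "$dir" || return 1
    curl -sf -o "$dir/heap.out" "$(pprof_url "$base" heap)" || return 1
    curl -sf -o "$dir/goroutine.out" "$(pprof_url "$base" goroutine)" || return 1
    curl -sf -o "$dir/cpu.out" "$(pprof_url "$base" profile 30)" || return 1
}

# Typical use, with kubectl proxy running on its default port:
#   collect_profiles http://localhost:8001 ./apiserver-profiles
#   go tool pprof -http=0.0.0.0:8989 ./apiserver-profiles/heap.out
```

Capturing the heap and goroutine profiles alongside the CPU profile costs little and avoids having to reproduce the problem a second time.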
Analyzing Specific Components
kube‑apiserver
Each CPU capture below blocks for the default 30 seconds while the profile is collected. kubectl proxy runs in the foreground, so start it in the background or in a separate terminal.
kubectl proxy &
curl -s http://localhost:8001/debug/pprof/profile > apiserver-cpu.out
go tool pprof -http=0.0.0.0:8989 apiserver-cpu.out
kube‑scheduler
curl -s http://localhost:10251/debug/pprof/profile > scheduler-cpu.out
go tool pprof -http=0.0.0.0:8989 scheduler-cpu.out
kube‑controller‑manager
curl -s http://localhost:10252/debug/pprof/profile > controller-cpu.out
go tool pprof -http=0.0.0.0:8989 controller-cpu.out
kubelet
kubectl proxy & (skip this if the proxy is already running)
curl -s http://127.0.0.1:8001/api/v1/nodes/k8s-node04-138/proxy/debug/pprof/profile > kubelet-cpu.out
go tool pprof -http=0.0.0.0:8989 kubelet-cpu.out
Capturing performance data is only the first step; the subsequent analysis is what pinpoints the root cause.
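The kubelet path shown for k8s-node04-138 generalizes to any node, because the API server proxies requests to each node's kubelet under /api/v1/nodes/<name>/proxy. The sketch below prints the CPU profile URL for every node name it reads. The function name kubelet_profile_urls is hypothetical, and the default proxy address assumes kubectl proxy is running on its default port.

```shell
#!/bin/sh
# Print the kubelet CPU-profile URL for each node name read from stdin,
# routed through a local `kubectl proxy`.
#   $1 = optional proxy address (defaults to http://127.0.0.1:8001)
# Blank input lines are skipped.
kubelet_profile_urls() {
    proxy=${1:-http://127.0.0.1:8001}
    while IFS= read -r node; do
        [ -n "$node" ] || continue
        printf '%s/api/v1/nodes/%s/proxy/debug/pprof/profile\n' "$proxy" "$node"
    done
}

# Typical use (`kubectl get nodes -o name` prints "node/<name>", so the
# prefix is removed first):
#   kubectl get nodes -o name | sed 's|^node/||' | kubelet_profile_urls
```

Feeding it only the nodes flagged as NotReady keeps the number of 30-second captures small.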
References
https://github.com/google/pprof
https://github.com/uber-archive/go-torch
http://www.graphviz.org/download/#linux
https://kubernetes.io/zh/docs/reference/command-line-tools-reference/kube-apiserver/
Ops Development Stories
Maintained by a like‑minded team, covering both operations and development. Topics span Linux ops, DevOps toolchain, Kubernetes containerization, monitoring, log collection, network security, and Python or Go development. Team members: Qiao Ke, wanger, Dong Ge, Su Xin, Hua Zai, Zheng Ge, Teacher Xia.
