Fix Stuck Kubernetes Resources, ETCD Errors, and ServiceAccount Issues
This guide walks through troubleshooting common Kubernetes issues such as deleting stuck RCs, Deployments, and Services, resetting etcd after failures, fixing apiserver start errors caused by missing ServiceAccount certificates, handling SELinux permission denials, configuring host trust, and force‑deleting problematic Pods or Namespaces.
How to delete resources in inconsistent state
When kubectl hangs and only part of a resource is removed, you can force delete the remaining RC, Deployment, or Service:
kubectl delete deployment kibana-logging -n kube-system --cascade=false
kubectl delete deployment kibana-logging -n kube-system --ignore-not-found
kubectl delete rc elasticsearch-logging-v1 -n kube-system --force --grace-period=0
Resetting etcd after deletion failures
If the resources still cannot be deleted, remove all data under /var/lib/etcd/ and reboot the master node, then recreate the network configuration key. This wipes all cluster state stored in etcd, so treat it as a last resort:
rm -rf /var/lib/etcd/*
etcdctl mk /atomic.io/network/config '{ "Network": "192.168.0.0/16" }'
Apiserver start failure due to missing ServiceAccount files
The error “start request repeated too quickly for kube-apiserver.service” often masks a missing ca.crt file when ServiceAccount is enabled. Check /var/run/kubernetes/ca.crt and ensure the certificate files are present.
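The check described above can be sketched as a small shell helper. This is a minimal sketch, not part of the original article: the function name check_cert_files is hypothetical, and the file names you pass depend on how your API server is configured.

```shell
# Report which expected certificate files are missing (or empty) in a directory.
# check_cert_files is a hypothetical helper; the file names are supplied by the caller.
check_cert_files() {
  local dir=$1; shift
  local missing=0 f
  for f in "$@"; do
    if [ -s "${dir}/${f}" ]; then
      echo "ok:      ${dir}/${f}"
    else
      echo "missing: ${dir}/${f}"
      missing=1
    fi
  done
  # Non-zero exit status when at least one file is absent.
  return "${missing}"
}

# Example, using the path mentioned in the text:
#   check_cert_files /var/run/kubernetes ca.crt
```

Because systemd only reports the restart loop, running journalctl -u kube-apiserver usually shows the underlying message about the missing file.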
Permission denied caused by SELinux
Fluentd may fail to write /var/log/fluentd.log if SELinux is enforcing. Disable it by editing /etc/selinux/config (set SELINUX=disabled) and reboot.
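The config edit can also be scripted. The sketch below is an assumption-laden helper rather than the article's own method: set_selinux_mode is a hypothetical function name, and it relies on GNU sed as shipped with CentOS. It accepts permissive as well, which logs denials without blocking them and can be a less drastic option than disabling SELinux outright.

```shell
# Rewrite the SELINUX= line of an SELinux config file.
# set_selinux_mode is a hypothetical helper; it defaults to /etc/selinux/config.
set_selinux_mode() {
  local mode=$1 config=${2:-/etc/selinux/config}
  case "${mode}" in
    enforcing|permissive|disabled) ;;
    *) echo "invalid mode: ${mode}" >&2; return 1 ;;
  esac
  # Only the SELINUX= line matches; SELINUXTYPE= is left untouched.
  sed -i "s/^SELINUX=.*/SELINUX=${mode}/" "${config}"
}

# Example (run as root):
#   getenforce                     # show the current mode
#   setenforce 0                   # permissive until the next reboot
#   set_selinux_mode disabled      # persist the change, then reboot
```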
Generating ServiceAccount certificates
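Create a CA and a server certificate, then start the API server manually with the generated files. The API server flags shown later reference /root/keys, so run these commands in that directory or adjust the paths: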
openssl genrsa -out ca.key 2048
openssl req -x509 -new -nodes -key ca.key -subj "/CN=k8s-master" -days 10000 -out ca.crt
openssl genrsa -out server.key 2048
echo subjectAltName=IP:10.254.0.1 > extfile.cnf
openssl req -new -key server.key -subj "/CN=k8s-master" -out server.csr
openssl x509 -req -in server.csr -CA ca.crt -CAkey ca.key -CAcreateserial -extfile extfile.cnf -out server.crt -days 10000
Start the API server with the appropriate flags, for example:
/usr/bin/kube-apiserver --logtostderr=true --v=0 \
  --etcd-servers=http://k8s-master:2379 \
  --address=0.0.0.0 --port=8080 \
  --service-cluster-ip-range=10.254.0.0/16 \
  --admission-control=ServiceAccount \
  --client-ca-file=/root/keys/ca.crt \
  --tls-cert-file=/root/keys/server.crt \
  --tls-private-key-file=/root/keys/server.key \
  --secure-port=443
etcd startup failures
If etcd fails with “raft save state and entries error: open …/wal/0.tmp: is a directory”, remove the leftover 0.tmp entry from the WAL directory and restart etcd.
For nodes that do not start after a power loss, back up the data directory, clear the member directory, stop the other etcd nodes, and then restart the nodes one at a time.
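The backup step can be sketched as follows. This is an illustrative helper under stated assumptions: backup_etcd_dir is a hypothetical name, the backup location is up to you, and etcd should be stopped first (for example with systemctl stop etcd, assuming that is the unit name on your host).

```shell
# Copy an etcd data directory to a timestamped backup location.
# backup_etcd_dir is a hypothetical helper; stop etcd before calling it.
backup_etcd_dir() {
  local data_dir=$1 backup_root=$2
  local dest="${backup_root}/etcd-backup-$(date +%Y%m%d-%H%M%S)"
  mkdir -p "${backup_root}"
  # -a preserves ownership, permissions, and timestamps.
  cp -a "${data_dir}" "${dest}"
  # Print the backup path so the caller can record it.
  echo "${dest}"
}

# Example:
#   backup_etcd_dir /var/lib/etcd /root/etcd-backups
```

Only clear the member directory once the printed backup path has been checked.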
Host trust configuration on CentOS
Generate SSH keys with ssh-keygen -t rsa and distribute the public key using ssh-copy-id to enable password‑less login between hosts.
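As a rough sketch of those two steps, the helper below creates a key pair only if one does not exist yet and then copies the public key to each host. The function name distribute_key, the 4096-bit key size, and the empty passphrase are assumptions made for illustration.

```shell
# Generate an RSA key pair if needed, then install the public key on each host.
# distribute_key is a hypothetical helper; hosts are given as user@host arguments.
distribute_key() {
  local key=$1; shift
  # -N "" sets an empty passphrase so later logins need no prompt.
  [ -f "${key}" ] || ssh-keygen -t rsa -b 4096 -N "" -f "${key}" -q
  local host
  for host in "$@"; do
    # Prompts once for the remote password, then appends the key to authorized_keys.
    ssh-copy-id -i "${key}.pub" "${host}"
  done
}

# Example:
#   distribute_key ~/.ssh/id_rsa root@k8s-node1 root@k8s-node2
```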
Changing hostname on CentOS
hostnamectl set-hostname k8s-master1
Enabling copy‑paste in VirtualBox guest
Update the system, install the kernel headers and build tools, then run the Guest Additions installer:
yum update
yum update kernel
yum install kernel-devel kernel-headers
yum install gcc make
sh VBoxLinuxAdditions.run
Force‑deleting Pods or Namespaces stuck in Terminating
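kubectl delete pod NAME --grace-period=0 --force
For a Namespace stuck in Terminating, the following script (delete-ns.sh) dumps the Namespace to JSON, lets you remove the entries under spec.finalizers by hand, and submits the result to the finalize endpoint. It assumes kubectl proxy is already listening on 127.0.0.1:8001:
#!/bin/bash
# delete-ns.sh: clear the finalizers of a Namespace stuck in Terminating.
set -e
usage(){ echo "usage: delete-ns.sh NAMESPACE"; }
if [ $# -lt 1 ]; then usage; exit 1; fi
NAMESPACE=$1
JSONFILE=${NAMESPACE}.json
# Dump the current Namespace object.
kubectl get ns "${NAMESPACE}" -o json > "${JSONFILE}"
# Delete the entries in spec.finalizers, then save and quit.
vi "${JSONFILE}"
# Submit the edited object through kubectl proxy.
curl -k -H "Content-Type: application/json" -X PUT --data-binary @"${JSONFILE}" "http://127.0.0.1:8001/api/v1/namespaces/${NAMESPACE}/finalize"
Impact of containers with only resource requests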
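Containers that define requests but no limits fall into the Burstable QoS class: they may consume more than they requested, and under node resource pressure the kubelet may evict them once their usage exceeds those requests. Use a LimitRange to apply default limits to every container in a namespace.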
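A LimitRange of this kind might look like the manifest written by the sketch below. The helper name write_limitrange, the object name default-limits, and all CPU and memory values are illustrative assumptions, not recommendations from the original article.

```shell
# Write an example LimitRange manifest to the given file.
# write_limitrange is a hypothetical helper; every value below is a placeholder.
write_limitrange() {
  local out=$1
  cat > "${out}" <<'EOF'
apiVersion: v1
kind: LimitRange
metadata:
  name: default-limits
spec:
  limits:
  - type: Container
    default:            # limit applied when a container sets none
      cpu: 500m
      memory: 256Mi
    defaultRequest:     # request applied when a container sets none
      cpu: 100m
      memory: 128Mi
EOF
}

# Example:
#   write_limitrange limitrange.yaml
#   kubectl apply -f limitrange.yaml -n NAMESPACE
```

The defaults are applied only to containers created after the LimitRange exists, so existing Pods need to be recreated to pick them up.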
MaGe Linux Operations
