How to Resolve Stuck Kubernetes Resources, Reset etcd, and Fix API Server Errors
This guide explains how to delete inconsistent Kubernetes rc, deployment, and service objects, reset etcd data, address apiserver start failures caused by missing ServiceAccount certificates, disable SELinux for fluentd logs, generate ServiceAccount keys, recover from etcd startup errors, configure host trust, change hostnames, enable VirtualBox copy‑paste, force‑delete pods and namespaces, and avoid resource‑request‑only containers causing contention.
Force‑deleting inconsistent Kubernetes objects
When kubectl hangs and kubectl get shows partially deleted resources, use the following commands to delete the objects without waiting for graceful termination:
kubectl delete deployment kibana-logging -n kube-system --cascade=false
kubectl delete deployment kibana-logging -n kube-system --ignore-not-found
kubectl delete rc elasticsearch-logging-v1 -n kube-system --force --grace-period=0
Resetting etcd after deletion failures
To wipe all etcd data and start with a clean state (this permanently deletes every object stored in the cluster):
rm -rf /var/lib/etcd/*
reboot
After the node reboots, recreate the network configuration used by the cluster:
etcdctl mk /atomic.io/network/config '{ "Network": "192.168.0.0/16" }'
Fixing kube‑apiserver startup failures
The error
start request repeated too quickly for kube-apiserver.service
is often caused by missing ServiceAccount CA files. Start the API server manually with explicit certificate paths:
/usr/bin/kube-apiserver \
--logtostderr=true --v=0 \
--etcd-servers=http://k8s-master:2379 \
--address=0.0.0.0 --port=8080 \
--service-cluster-ip-range=10.254.0.0/16 \
--admission-control=ServiceAccount \
--client-ca-file=/root/keys/ca.crt \
--tls-cert-file=/root/keys/server.crt \
--tls-private-key-file=/root/keys/server.key \
--basic-auth-file=/root/keys/basic_auth.csv \
--secure-port=443 >> /var/log/kubernetes/kube-apiserver.log 2>&1 &
Similarly, start the controller‑manager manually:
/usr/bin/kube-controller-manager \
--logtostderr=true --v=0 \
--master=http://k8s-master:8080 \
--root-ca-file=/root/keys/ca.crt \
--service-account-private-key-file=/root/keys/server.key >> /var/log/kubernetes/kube-controller-manager.log 2>&1 &
Resolving SELinux‑related permission errors for Fluentd
Fluentd may fail to create /var/log/fluentd.log when SELinux is enforcing. Disable SELinux and reboot:
sed -i 's/^SELINUX=enforcing/SELINUX=disabled/' /etc/selinux/config
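Before rebooting, it can help to confirm that the edit took effect, since the sed expression only matches a line that reads exactly SELINUX=enforcing. The following is a minimal sketch: the helper name selinux_mode_in_config is hypothetical, and the optional path argument is an assumption so the function can be tried on a copy of the file.

```shell
# Print the mode set on the SELINUX= line of an SELinux config file.
# Hypothetical helper; defaults to the standard /etc/selinux/config path.
selinux_mode_in_config() {
  cfg="${1:-/etc/selinux/config}"
  # Match only uncommented "SELINUX=" lines (not SELINUXTYPE=) and print the value.
  sed -n 's/^SELINUX=//p' "$cfg" | head -n 1
}
```

Calling selinux_mode_in_config with no argument should print disabled once the edit has been applied. After the reboot, getenforce reports the mode that is actually running.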
reboot
Generating ServiceAccount certificates
Create a CA and server certificate pair required for ServiceAccount authentication:
openssl genrsa -out ca.key 2048
openssl req -x509 -new -nodes -key ca.key -subj "/CN=k8s-master" -days 10000 -out ca.crt
openssl genrsa -out server.key 2048
echo subjectAltName=IP:10.254.0.1 > extfile.cnf
openssl req -new -key server.key -subj "/CN=k8s-master" -out server.csr
openssl x509 -req -in server.csr -CA ca.crt -CAkey ca.key -CAcreateserial \
-extfile extfile.cnf -out server.crt -days 10000
etcd startup failure – case 1 (wal directory)
If the log contains
raft save state and entries error: open /var/lib/etcd/default.etcd/member/wal/0.tmp: is a directory
remove the stray directory and restart etcd:
rm -rf /var/lib/etcd/default.etcd/member/wal/0.tmp
systemctl restart etcd
etcd startup failure – case 2 (cluster timeout after power loss)
Synchronize the system clock, stop etcd, back up the existing data, clear the data directory, and restart the nodes one at a time:
# Stop etcd, then back up existing data
systemctl stop etcd
mkdir -p /data/bak
cp -a /var/lib/etcd/default.etcd/member/* /data/bak/
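Because the next step deletes the member directory, it may be worth checking that the copy is complete first. The following is a minimal sketch, assuming diff is installed and that /data/bak held nothing else before the copy; the function name backup_matches is hypothetical.

```shell
# Succeed only if SRC and DEST contain the same files with identical content.
# Hypothetical helper; extra files in DEST also count as a mismatch.
backup_matches() {
  src="$1"
  dest="$2"
  # diff -r exits 0 when the two directory trees are identical.
  diff -r "$src" "$dest" >/dev/null 2>&1
}
```

For example, backup_matches /var/lib/etcd/default.etcd/member /data/bak && echo "backup verified" prints the message only when the two trees match. The comparison is only meaningful while etcd is not writing to the directory.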
# Remove corrupted data
rm -rf /var/lib/etcd/default.etcd/member/*
# Restart etcd on each node, one node at a time
systemctl restart etcd
Configuring host trust (SSH key exchange)
Generate an RSA key pair on each host and copy the public key to the other hosts:
ssh-keygen -t rsa
ssh-copy-id -i /root/.ssh/id_rsa.pub [-p PORT] root@HOST_IP
Changing the hostname on CentOS
hostnamectl set-hostname k8s-master1
Enabling copy‑paste in VirtualBox for CentOS
Install the kernel headers and build tools, then run the Guest Additions installer from the mounted Guest Additions CD image:
yum install -y kernel kernel-devel gcc make
sh VBoxLinuxAdditions.run
Force‑deleting pods and namespaces stuck in Terminating
Delete a pod immediately:
kubectl delete pod POD_NAME --grace-period=0 --force
Delete a namespace by removing its finalizer:
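#!/bin/bash
# delete-ns.sh: remove the finalizers from a namespace stuck in Terminating.
# Requires jq, and "kubectl proxy" running in another terminal (it listens on 127.0.0.1:8001 by default).
set -e
if [ $# -lt 1 ]; then echo "usage: $0 NAMESPACE"; exit 1; fi
NS=$1
# Dump the namespace and empty its spec.finalizers list
kubectl get ns "$NS" -o json | jq '.spec.finalizers = []' > "${NS}.json"
# Submit the edited object to the namespace's finalize endpoint
curl -k -H "Content-Type: application/json" -X PUT --data-binary @"${NS}.json" \
http://127.0.0.1:8001/api/v1/namespaces/"${NS}"/finalize
Impact of containers with only resource requests (no limits)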
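Containers that specify resources.requests but omit limits can consume CPU and memory well beyond what they requested, which starves other workloads on the same node. When the node comes under resource pressure, the kubelet tends to evict pods whose usage exceeds their requests first, potentially causing application failure. Apply a LimitRange in each namespace to set default limits for such containers.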
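A LimitRange of the kind mentioned above might look like the following. This is a sketch, not a recommended configuration: the object name default-limits, the CPU and memory values, and the function name limitrange_manifest are all illustrative assumptions.

```shell
# Print a LimitRange manifest that gives containers default requests and limits.
# The object name and all CPU/memory values are illustrative assumptions.
limitrange_manifest() {
  cat <<'EOF'
apiVersion: v1
kind: LimitRange
metadata:
  name: default-limits
spec:
  limits:
  - type: Container
    default:            # limits applied when a container sets none
      cpu: 500m
      memory: 512Mi
    defaultRequest:     # requests applied when a container sets none
      cpu: 250m
      memory: 256Mi
EOF
}
```

The manifest can be applied to a namespace with limitrange_manifest | kubectl apply -n NAMESPACE -f -. The defaults only affect containers created after the LimitRange exists, so pods that are already running need to be recreated to pick them up.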
Liangxu Linux
Liangxu, a self‑taught IT professional now working as a Linux development engineer at a Fortune 500 multinational, shares extensive Linux knowledge—fundamentals, applications, tools, plus Git, databases, Raspberry Pi, etc. (Reply “Linux” to receive essential resources.)
