Fix Inconsistent Kubernetes rc/deployment/service Deletions and Etcd Failures
This guide walks through troubleshooting Kubernetes issues such as partially deleted resources, resetting etcd, apiserver start failures due to missing ServiceAccount certificates, SELinux permission errors, ServiceAccount key generation, etcd startup errors, host trust configuration, and resource limit pitfalls, providing concrete commands and scripts for each problem.
How to Delete Inconsistent rc, Deployment, Service
Sometimes kubectl hangs during a delete, and a follow-up kubectl get shows that the resources were only partially deleted:
[root@k8s-master ~]# kubectl get -f fluentd-elasticsearch/
NAME DESIRED CURRENT READY AGE
rc/elasticsearch-logging-v1 0 2 2 15h
NAME DESIRED CURRENT UP-TO-DATE AVAILABLE AGE
deploy/kibana-logging 0 1 1 1 15h
Error from server (NotFound): services "elasticsearch-logging" not found
Error from server (NotFound): daemonsets.extensions "fluentd-es-v1.22" not found
Error from server (NotFound): services "kibana-logging" not found
Delete the problematic resources with:
kubectl delete deployment kibana-logging -n kube-system --cascade=false
kubectl delete deployment kibana-logging -n kube-system --ignore-not-found
kubectl delete rc elasticsearch-logging-v1 -n kube-system --force --grace-period=0
How to Reset Etcd When Deletion Fails
As a last resort, wipe the etcd data directory. This destroys all cluster state stored in etcd:
rm -rf /var/lib/etcd/*
Reboot the master node, then recreate the network configuration:
etcdctl mk /atomic.io/network/config '{ "Network": "192.168.0.0/16" }'
Apiserver Startup Failure
The service repeatedly fails with “start request repeated too quickly”. That systemd message only says the unit keeps crashing; the real cause, visible in the apiserver log, is a missing CA file after enabling ServiceAccount.
May 21 07:56:41 k8s-master kube-apiserver: Flag --port has been deprecated, see --insecure-port instead.
May 21 07:56:41 k8s-master kube-apiserver: Validate server run options failed: unable to load client CA file: open /var/run/kubernetes/ca.crt: no such file or directory
...
Ensure the CA file referenced by --client-ca-file exists, or start the API server manually with explicit certificate paths:
/usr/bin/kube-apiserver --logtostderr=true --v=0 \
  --etcd-servers=http://k8s-master:2379 \
  --address=0.0.0.0 --port=8080 --kubelet-port=10250 \
  --allow-privileged=true \
  --service-cluster-ip-range=10.254.0.0/16 \
  --admission-control=ServiceAccount \
  --insecure-bind-address=0.0.0.0 \
  --client-ca-file=/root/keys/ca.crt \
  --tls-cert-file=/root/keys/server.crt \
  --tls-private-key-file=/root/keys/server.key \
  --basic-auth-file=/root/keys/basic_auth.csv \
  --secure-port=443 >> /var/log/kubernetes/kube-apiserver.log 2>&1 &
Permission Denied Errors
Fluentd may fail to write logs because SELinux is enforcing.
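Before changing the configuration, it can help to confirm which mode SELinux is actually in. The following is a minimal sketch, assuming the standard getenforce utility is installed; selinux_config_mode is a hypothetical helper name introduced here for illustration, and it only reads the SELINUX= line from a config file.

```shell
# Hypothetical helper: print the SELINUX= value from a config file
# (defaults to /etc/selinux/config). Commented-out lines and the
# SELINUXTYPE= line are ignored.
selinux_config_mode() {
  cfg="${1:-/etc/selinux/config}"
  sed -n 's/^SELINUX=\([a-z]*\).*/\1/p' "$cfg" | tail -n 1
}

# Runtime mode, only if getenforce is present on this host.
if command -v getenforce >/dev/null 2>&1; then
  echo "runtime mode: $(getenforce)"
fi

# Configured mode (applied at the next boot), only if the file is readable.
if [ -r /etc/selinux/config ]; then
  echo "configured mode: $(selinux_config_mode)"
fi
```

If the runtime mode is Enforcing, running setenforce 0 switches to permissive mode until the next reboot, which is a quick way to test whether SELinux is really what blocks Fluentd.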
# Edit /etc/selinux/config and change SELINUX=enforcing to:
SELINUX=disabled
# Then reboot:
reboot
ServiceAccount-Based Configuration
Generate the required certificates and keys:
openssl genrsa -out ca.key 2048
openssl req -x509 -new -nodes -key ca.key -subj "/CN=k8s-master" -days 10000 -out ca.crt
openssl genrsa -out server.key 2048
echo subjectAltName=IP:10.254.0.1 > extfile.cnf
openssl req -new -key server.key -subj "/CN=k8s-master" -out server.csr
openssl x509 -req -in server.csr -CA ca.crt -CAkey ca.key -CAcreateserial -extfile extfile.cnf -out server.crt -days 10000
If the apiserver configuration points to CA files that the process cannot read, you will see:
Validate server run options failed: unable to load client CA file: open /root/keys/ca.crt: permission denied
Start the controller-manager manually as needed:
/usr/bin/kube-controller-manager --logtostderr=true --v=0 \
  --master=http://k8s-master:8080 \
  --root-ca-file=/root/keys/ca.crt \
  --service-account-private-key-file=/root/keys/server.key >> /var/log/kubernetes/kube-controller-manager.log 2>&1 &
Etcd Won’t Start – Issue (1)
Log shows the raft error:
raft save state and entries error: open /var/lib/etcd/default.etcd/member/wal/0.tmp: is a directory
Delete the stray 0.tmp entry (a directory, according to the error) from the WAL directory and restart etcd.
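A minimal sketch of that cleanup, assuming the default data directory shown in the log and that etcd runs as a systemd unit named etcd, as elsewhere in this guide. find_stray_wal_tmp is a hypothetical helper name, and the destructive commands are left commented out so the output can be reviewed first.

```shell
# Hypothetical helper: list *.tmp entries inside a WAL directory that are
# directories. The error above means etcd tried to open 0.tmp as a file
# and found a directory instead.
find_stray_wal_tmp() {
  wal_dir="${1:-/var/lib/etcd/default.etcd/member/wal}"
  # Nothing to report if the WAL directory does not exist on this host.
  [ -d "$wal_dir" ] || return 0
  find "$wal_dir" -mindepth 1 -maxdepth 1 -type d -name '*.tmp'
}

find_stray_wal_tmp

# After reviewing the output, remove the entry and restart etcd:
# rm -rf /var/lib/etcd/default.etcd/member/wal/0.tmp
# systemctl restart etcd
```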
Etcd Won’t Start – Timeout Issue (2)
After a power loss, one etcd node fails to start. The fix is:
Back up the data directory (cp needs -r because member contains the snap and wal subdirectories):
cd /var/lib/etcd/default.etcd/member && cp -r * /data/bak/
Remove all files in the member directory:
rm -rf /var/lib/etcd/default.etcd/member/*
Stop the other two etcd nodes, then restart all nodes:
# master node
systemctl stop etcd
systemctl restart etcd
# node1
systemctl stop etcd
systemctl restart etcd
# node2
systemctl stop etcd
systemctl restart etcd
Configure Host Trust on CentOS
Generate SSH keys and distribute them:
ssh-keygen -t rsa
ssh-copy-id -i /root/.ssh/id_rsa.pub [email protected]
# If sshd listens on a non-default port, add the port option, for example: -p 2222
Change CentOS Hostname
hostnamectl set-hostname k8s-master1
Enable Copy-Paste in VirtualBox Guest
yum update
yum update kernel
yum update kernel-devel
yum install kernel-headers
yum install gcc make
sh VBoxLinuxAdditions.run
Force-Delete a Stuck Pod
kubectl delete pod NAME --grace-period=0 --force
Force-Delete a Stuck Namespace
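#!/bin/bash
# delete-ns.sh: force-finalize a namespace that is stuck in Terminating.
# Assumes "kubectl proxy" is already running, so that the API is reachable
# on 127.0.0.1:8001 (the proxy's default port).
set -e
usage(){
  echo "usage:"
  echo "  delete-ns.sh NAMESPACE"
}
if [ $# -lt 1 ]; then
  usage
  exit 1
fi
NAMESPACE=$1
JSONFILE=${NAMESPACE}.json
# Dump the namespace, then remove the entries under spec.finalizers by hand.
kubectl get ns "${NAMESPACE}" -o json > "${JSONFILE}"
vi "${JSONFILE}"
# Submit the edited object to the namespace's finalize subresource.
curl -k -H "Content-Type: application/json" -X PUT --data-binary @"${JSONFILE}" \
  http://127.0.0.1:8001/api/v1/namespaces/"${NAMESPACE}"/finalize
What Happens When a Container Has Requests but No Limits?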
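Example container entry from a pod spec:
- name: busybox-cnt02
  image: busybox
  command: ["/bin/sh"]
  args: ["-c", "while true; do echo hello from cnt02; sleep 10;done"]
  resources:
    requests:
      memory: "100Mi"
      cpu: "100m"
Without a limits section, the container may consume node resources well beyond its requests. The pod falls into the Burstable QoS class, so when the node runs short of resources it is more likely to be OOM-killed or evicted than pods whose limits equal their requests (the Guaranteed class), potentially causing application failure. Use a LimitRange policy to enforce limits automatically.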
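As a sketch of what such a policy might look like, the snippet below writes a LimitRange manifest to a file and leaves the apply step commented out. The object name, file name, target namespace, and every default value are illustrative assumptions, not settings taken from the original cluster.

```shell
# Write an illustrative LimitRange manifest (all names and values are assumptions).
cat > limitrange-example.yaml <<'EOF'
apiVersion: v1
kind: LimitRange
metadata:
  name: default-limits
spec:
  limits:
  - type: Container
    default:            # limits applied when a container declares none
      cpu: "500m"
      memory: "256Mi"
    defaultRequest:     # requests applied when a container declares none
      cpu: "100m"
      memory: "100Mi"
EOF

# Apply it to the namespace that should receive the defaults, for example:
# kubectl apply -f limitrange-example.yaml -n default
```

With a policy like this in place, a container such as busybox-cnt02 that sets only requests would be admitted with the default limits filled in. The defaults apply only to pods created after the LimitRange exists, so already-running pods need to be recreated to pick them up.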
Open Source Linux
Focused on sharing Linux/Unix content, covering fundamentals, system development, network programming, automation/operations, cloud computing, and related professional knowledge.
