Master Kubernetes Cluster: Install, Upgrade, Backup, and Restore Step‑by‑Step
This comprehensive guide walks you through installing a Kubernetes cluster with kubeadm, configuring containerd, initializing master and worker nodes, deploying Calico networking and the Dashboard, performing upgrades, renewing certificates, adding or removing nodes, and backing up both etcd data and cluster manifests using scripts and Velero.
Install Kubernetes Cluster
Kubernetes is a container‑orchestration platform that runs as a cluster. As a cluster maintainer you often need to manage the whole lifecycle.
Prerequisites
Cluster nodes: 2
Master IP: 192.168.205.128
Node IP: 192.168.205.130
Kubernetes version: v1.24.2
Container runtime: containerd
OS: CentOS 7.9 (kernel 3.10.0‑1160)
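The addresses and version listed above recur in most of the later commands. One way to keep them consistent is to define them once as shell variables. The following is only a sketch: the variable and function names are illustrative and not part of any Kubernetes tooling, and the hostnames are the ones used for the host entries in the next section.

```shell
#!/usr/bin/env bash
# Illustrative variables mirroring the prerequisites; the names are made up for this sketch.
MASTER_IP="192.168.205.128"
NODE_IP="192.168.205.130"
K8S_VERSION="1.24.2"

# Print /etc/hosts entries for the two nodes (hostnames as used later in this guide).
hosts_entries() {
  printf '%s kk-master\n' "$MASTER_IP"
  printf '%s kk-node01\n' "$NODE_IP"
}

# Print the yum package specs for the pinned Kubernetes version.
k8s_packages() {
  printf 'kubelet-%s kubeadm-%s kubectl-%s\n' "$K8S_VERSION" "$K8S_VERSION" "$K8S_VERSION"
}

hosts_entries
k8s_packages
```

The output of hosts_entries could be appended to /etc/hosts, and the output of k8s_packages passed to yum install -y.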
Environment Preparation
(1) Add host entries on each node
cat >> /etc/hosts <<EOF
192.168.205.128 kk-master
192.168.205.130 kk-node01
EOF
(2) Disable firewall and SELinux
systemctl stop firewalld
systemctl disable firewalld
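setenforce 0
# Persist the change across reboots, then confirm it
sed -i 's/^SELINUX=.*/SELINUX=disabled/' /etc/selinux/config
grep '^SELINUX=' /etc/selinux/config
(3) Optimize kernel parameters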
cat > /etc/sysctl.d/k8s.conf <<EOF
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
net.ipv4.ip_forward = 1
vm.swappiness=0
EOF
modprobe br_netfilter
sysctl -p /etc/sysctl.d/k8s.conf
(4) Disable swap
swapoff -a
# comment swap line in /etc/fstab
sed -i '/ swap / s/^/#/' /etc/fstab
(5) Install IPVS modules
cat > /etc/sysconfig/modules/ipvs.modules <<EOF
#!/bin/bash
modprobe -- ip_vs
modprobe -- ip_vs_rr
modprobe -- ip_vs_wrr
modprobe -- ip_vs_sh
modprobe -- nf_conntrack_ipv4
EOF
chmod 755 /etc/sysconfig/modules/ipvs.modules && bash /etc/sysconfig/modules/ipvs.modules && lsmod | grep -e ip_vs -e nf_conntrack_ipv4
yum install -y ipset ipvsadm
(6) Sync server time
yum install -y chrony
systemctl enable chronyd
systemctl start chronyd
chronyc sources
(7) Install containerd
yum install -y yum-utils \
device-mapper-persistent-data \
lvm2
yum-config-manager --add-repo https://download.docker.com/linux/centos/docker-ce.repo
yum list | grep containerd
# The docker-ce repository ships containerd as the containerd.io package
yum install -y containerd.io
mkdir -p /etc/containerd
containerd config default > /etc/containerd/config.toml
sed -i "s#k8s.gcr.io#registry.cn-hangzhou.aliyuncs.com/google_containers#g" /etc/containerd/config.toml
sed -i 's#SystemdCgroup = false#SystemdCgroup = true#g' /etc/containerd/config.toml
sed -i "s#https://registry-1.docker.io#https://registry.cn-hangzhou.aliyuncs.com#g" /etc/containerd/config.toml
systemctl daemon-reload
systemctl enable containerd
systemctl restart containerd
(8) Install Kubernetes components
cat <<EOF > /etc/yum.repos.d/kubernetes.repo
[kubernetes]
name=Kubernetes
baseurl=http://mirrors.aliyun.com/kubernetes/yum/repos/kubernetes-el7-x86_64
enabled=1
gpgcheck=0
repo_gpgcheck=0
gpgkey=http://mirrors.aliyun.com/kubernetes/yum/doc/yum-key.gpg http://mirrors.aliyun.com/kubernetes/yum/doc/rpm-package-key.gpg
EOF
yum install -y kubelet-1.24.2 kubeadm-1.24.2 kubectl-1.24.2
crictl config runtime-endpoint unix:///run/containerd/containerd.sock
systemctl daemon-reload
systemctl enable kubelet && systemctl start kubelet
Initialize the Cluster
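Before initializing, it can be worth confirming that the preparation steps took effect on the node. The sketch below is not a replacement for kubeadm's own preflight checks. It only inspects the sysctl file written in step (3) and the kernel's swap table, and the function names are illustrative.

```shell
#!/usr/bin/env bash
# Sketch of a preparation check; the function names are illustrative.

# Succeed when KEY is set to WANT in a sysctl-style FILE (spaces around "=" are ignored).
sysctl_file_has() {
  awk -v k="$2" -v w="$3" '
    /^[ \t]*#/ { next }
    { line = $0; gsub(/[ \t]/, "", line); if (line == k "=" w) found = 1 }
    END { exit(found ? 0 : 1) }
  ' "$1"
}

# Succeed when the swap table (default /proc/swaps) lists no active device.
swap_is_off() {
  [ "$(awk 'NR > 1' "${1:-/proc/swaps}" | wc -l)" -eq 0 ]
}

# Print a short report for the settings this guide configures.
preflight() {
  local conf="${1:-/etc/sysctl.d/k8s.conf}" key
  for key in net.bridge.bridge-nf-call-iptables \
             net.bridge.bridge-nf-call-ip6tables \
             net.ipv4.ip_forward; do
    if sysctl_file_has "$conf" "$key" 1; then echo "ok   $key"; else echo "FAIL $key"; fi
  done
  if swap_is_off; then echo "ok   swap disabled"; else echo "FAIL swap still active"; fi
}
```

Running preflight on a prepared node prints one line per check. Any FAIL line points back to the corresponding preparation step.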
Export the default kubeadm config and edit it (set imageRepository, kube-proxy mode to ipvs, and cgroupDriver to systemd).
kubeadm config print init-defaults > kubeadm.yaml
# Edit kubeadm.yaml as needed; the relevant parts look like this:
apiVersion: kubeadm.k8s.io/v1beta3
kind: InitConfiguration
localAPIEndpoint:
  advertiseAddress: 192.168.205.128
  bindPort: 6443
nodeRegistration:
  criSocket: unix:///var/run/containerd/containerd.sock
  name: kk-master
  taints: null
---
apiVersion: kubeadm.k8s.io/v1beta3
kind: ClusterConfiguration
apiServer:
  timeoutForControlPlane: 4m0s
certificatesDir: /etc/kubernetes/pki
clusterName: kubernetes
imageRepository: registry.cn-hangzhou.aliyuncs.com/google_containers
kubernetesVersion: 1.24.2
networking:
  dnsDomain: cluster.local
  serviceSubnet: 10.96.0.0/12
---
apiVersion: kubeproxy.config.k8s.io/v1alpha1
kind: KubeProxyConfiguration
mode: ipvs
---
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
cgroupDriver: systemd
Run the initialization:
kubeadm init --config=kubeadm.yaml
# After success, run as a regular user:
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
# Or as root:
export KUBECONFIG=/etc/kubernetes/admin.conf
Join Worker Node
On each worker execute the join command printed by the init output, e.g.:
kubeadm join 192.168.205.128:6443 --token abcdef.0123456789abcdef \
  --discovery-token-ca-cert-hash sha256:51b5e566d3f95aaf3170916d67958bc16cb1b44934885a857b07ee58f041334a
Verify nodes:
kubectl get nodes
Install Network Plugin (Calico)
wget https://raw.githubusercontent.com/projectcalico/calico/master/manifests/calico.yaml
kubectl apply -f calico.yaml
kubectl get po -n kube-system | grep calico
Install Kubernetes Dashboard
kubectl apply -f https://raw.githubusercontent.com/kubernetes/dashboard/v2.5.0/aio/deploy/recommended.yaml
kubectl get po -n kubernetes-dashboard
kubectl -n kubernetes-dashboard edit svc kubernetes-dashboard # change type to NodePort
# Access via https://<master_ip>:<nodePort>
Generate an admin token (Kubernetes 1.24 no longer auto‑creates ServiceAccount tokens):
cat <<EOF > admin-token.yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: dashboard-admin
  namespace: kube-system
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: dashboard-admin
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: admin
subjects:
- kind: ServiceAccount
  name: dashboard-admin
  namespace: kube-system
EOF
kubectl apply -f admin-token.yaml
# On 1.24 the ServiceAccount has no auto-generated Secret, so request a token directly
kubectl -n kube-system create token dashboard-admin
Update Cluster
Upgrade Kubernetes Version
Check current version and target version (e.g., upgrade from v1.24.0 to v1.24.2).
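kubeadm supports upgrading within a minor release or to the next minor release, but not skipping a minor release, so the comparison mentioned above is worth scripting. The following is a minimal sketch with illustrative function names (minor_of, upgrade_path_ok). It assumes plain version strings such as v1.24.0 without pre-release suffixes and does not compare patch levels. The running version itself could be read with kubeadm version -o short or kubectl get nodes.

```shell
#!/usr/bin/env bash
# Print the minor number of a version string, e.g. v1.24.2 -> 24.
minor_of() {
  local v="${1#v}"          # drop an optional leading "v": 1.24.2
  v="${v#*.}"               # drop the major part:          24.2
  printf '%s\n' "${v%%.*}"  # keep the minor part:          24
}

# Succeed when TARGET is in the same minor release as CURRENT or exactly one ahead.
upgrade_path_ok() {
  local cur tgt
  cur="$(minor_of "$1")"
  tgt="$(minor_of "$2")"
  [ "$tgt" -ge "$cur" ] && [ $((tgt - cur)) -le 1 ]
}

if upgrade_path_ok v1.24.0 v1.24.2; then
  echo "v1.24.0 -> v1.24.2: supported upgrade path"
fi
```

A jump such as v1.22.x to v1.24.x fails the check and would have to go through v1.23 first.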
# Back up first (see the backup section)
# Upgrade kubeadm on the control plane
yum install -y kubeadm-1.24.2-0 --disableexcludes=kubernetes
kubeadm upgrade plan # shows the available target versions
kubeadm upgrade apply v1.24.2 --config kubeadm.yaml
# Drain the master node before upgrading kubelet
kubectl cordon kk-master
kubectl drain kk-master --ignore-daemonsets=true
# Upgrade kubelet and kubectl
yum install -y kubelet-1.24.2-0 kubectl-1.24.2-0 --disableexcludes=kubernetes
systemctl daemon-reload
systemctl restart kubelet
# After the upgrade, make the node schedulable again
kubectl uncordon kk-master
Upgrade Worker Nodes
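# On the worker: upgrade kubeadm
yum install -y kubeadm-1.24.2-0 --disableexcludes=kubernetes
# From a machine with admin kubectl access (e.g., kk-master): drain the worker
kubectl cordon kk-node01
kubectl drain kk-node01 --ignore-daemonsets=true
# On the worker: upgrade the node configuration, then kubelet
kubeadm upgrade node
yum install -y kubelet-1.24.2-0 --disableexcludes=kubernetes
systemctl daemon-reload
systemctl restart kubelet
# From the machine with kubectl access: make the worker schedulable again
kubectl uncordon kk-node01
Renew Certificates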
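Certificates issued by kubeadm are valid for one year by default, so renewal is usually planned some weeks ahead of expiry. As a rough sketch for a monitoring script, the remaining days can be computed from a certificate file such as /etc/kubernetes/pki/apiserver.crt. The function names are illustrative, and cert_days_left assumes that openssl and GNU date are installed.

```shell
#!/usr/bin/env bash
# Whole days between two Unix timestamps: expiry first, then the reference time.
days_left() {
  echo $(( ($1 - $2) / 86400 ))
}

# Days until the certificate in FILE expires.
# Assumes openssl and GNU date; defined here but not invoked by this sketch.
cert_days_left() {
  local end
  end="$(openssl x509 -enddate -noout -in "$1" | cut -d= -f2)"
  days_left "$(date -d "$end" +%s)" "$(date +%s)"
}

# Example with fixed timestamps that are exactly 30 days apart.
days_left 1702592000 1700000000   # prints 30
```

A cron job could call cert_days_left for each certificate and raise an alert once the result drops below a chosen threshold.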
Check expiration:
kubeadm certs check-expiration
Back up certificates and etcd, then renew:
mkdir -p /etc/kubernetes.bak.$(date +%Y%m%d)
cp -r /etc/kubernetes/* /etc/kubernetes.bak.$(date +%Y%m%d)
# Renew all certs using the same kubeadm config (the "alpha" subcommand no longer exists in 1.24)
kubeadm certs renew all --config=kubeadm.yaml
# Regenerate kubeconfigs
kubeadm init phase kubeconfig all --config=kubeadm.yaml
mv $HOME/.kube/config $HOME/.kube/config.old
cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
chown $(id -u):$(id -g) $HOME/.kube/config
# Restart control‑plane static pods: move the manifests out, give kubelet time to stop the pods, then move them back
cd /etc/kubernetes/manifests
mv *.yaml ../
sleep 30
mv ../*.yaml .
Add or Remove Nodes
Add node – repeat the environment‑preparation steps on the new host and run the join command obtained from the existing cluster (use kubeadm token create if needed).
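Since a mistyped token or CA hash only surfaces as a failed join, the command can be assembled and checked beforehand. The helper below is a hedged sketch with illustrative function names. It validates the bootstrap token format (six characters, a dot, sixteen characters, all lowercase letters or digits) and the sha256 hash format, then prints the kubeadm join command without running it.

```shell
#!/usr/bin/env bash
# Succeed when the argument looks like a bootstrap token, e.g. abcdef.0123456789abcdef.
valid_token() {
  printf '%s\n' "$1" | LC_ALL=C grep -Eq '^[a-z0-9]{6}\.[a-z0-9]{16}$'
}

# Succeed when the argument is "sha256:" followed by 64 lowercase hex digits.
valid_ca_hash() {
  printf '%s\n' "$1" | LC_ALL=C grep -Eq '^sha256:[0-9a-f]{64}$'
}

# Print (but do not run) the join command for ENDPOINT TOKEN HASH NODE_NAME.
build_join_cmd() {
  local endpoint="$1" token="$2" hash="$3" node="$4"
  valid_token "$token"  || { echo "malformed token" >&2; return 1; }
  valid_ca_hash "$hash" || { echo "malformed CA hash" >&2; return 1; }
  printf 'kubeadm join %s --token %s --discovery-token-ca-cert-hash %s --node-name %s\n' \
    "$endpoint" "$token" "$hash" "$node"
}
```

For example, build_join_cmd 192.168.205.128:6443 with a token, a hash, and kk-node02 prints the command to run on the new node, or returns a non-zero status if either value is malformed.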
# On new node
cat >> /etc/hosts <<EOF
192.168.205.128 kk-master
192.168.205.130 kk-node01
192.168.205.133 kk-node02
EOF
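# On the master: create a token and print the complete join command
kubeadm token create --print-join-command
# On the new node: run the printed command, adding the node name
kubeadm join 192.168.205.128:6443 --token <token> \
  --discovery-token-ca-cert-hash sha256:<hash> --node-name kk-node02
Remove node: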
kubectl cordon kk-node02
kubectl drain kk-node02 --ignore-daemonsets=true --delete-emptydir-data=true
kubectl delete node kk-node02
Backup Cluster
Backup etcd Database
Install etcdctl and take a snapshot:
export ETCDCTL_API=3
etcdctl --endpoints=localhost:2379 \
--cacert=/etc/kubernetes/pki/etcd/ca.crt \
--cert=/etc/kubernetes/pki/etcd/server.crt \
--key=/etc/kubernetes/pki/etcd/server.key \
  snapshot save /var/backups/etcd/snapshot.db
Automate this with a shell script and add it to cron (e.g., every 30 minutes).
#!/bin/bash
ETCDCTL_PATH=/usr/local/bin/etcdctl
ENDPOINTS='192.168.205.128:2379'
BACKUP_DIR="/var/backups/kube_etcd/etcd-$(date +%Y-%m-%d_%H:%M:%S)"
ETCDCTL_CERT="/etc/kubernetes/pki/etcd/server.crt"
ETCDCTL_KEY="/etc/kubernetes/pki/etcd/server.key"
ETCDCTL_CA="/etc/kubernetes/pki/etcd/ca.crt"
mkdir -p "$BACKUP_DIR"
export ETCDCTL_API=3
$ETCDCTL_PATH --endpoints="$ENDPOINTS" \
--cacert="$ETCDCTL_CA" \
--cert="$ETCDCTL_CERT" \
--key="$ETCDCTL_KEY" snapshot save "$BACKUP_DIR/snapshot.db"
# Keep only the latest 5 backups
cd "$(dirname "$BACKUP_DIR")" || exit 1
ls -1t | tail -n +6 | xargs -r rm -rf
Restore etcd
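Before restoring, the snapshot to use has to be chosen. A minimal sketch, assuming the timestamped etcd-YYYY-MM-DD_HH:MM:SS directory layout produced by the backup script above (the function name latest_snapshot is illustrative), picks the newest non-empty snapshot:

```shell
#!/usr/bin/env bash
# Print the path of the newest non-empty snapshot under ROOT.
# Relies on the sortable etcd-YYYY-MM-DD_HH:MM:SS directory names.
latest_snapshot() {
  local root="$1" dir newest=""
  for dir in "$root"/etcd-*/; do
    # Globs expand in sorted order, so the last non-empty snapshot wins.
    if [ -s "${dir}snapshot.db" ]; then
      newest="${dir}snapshot.db"
    fi
  done
  [ -n "$newest" ] && printf '%s\n' "$newest"
}
```

For instance, snap="$(latest_snapshot /var/backups/kube_etcd)" yields a path that can be inspected with etcdctl snapshot status "$snap" -w table before it is passed to the restore command. Depending on the etcd version, etcdutl may be the preferred tool for that inspection.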
# Stop static pods
cd /etc/kubernetes/manifests
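mv *.yaml ../
# Wait for kubelet to stop the control‑plane containers before touching the data directory
sleep 30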
# Move old data directory aside
mv /var/lib/etcd /var/lib/etcd.bak
# Restore snapshot
ETCDCTL_API=3 etcdctl snapshot restore /var/backups/etcd/snapshot.db \
--name kk-master \
--initial-cluster "kk-master=https://192.168.205.128:2380" \
--initial-cluster-token etcd-cluster \
--initial-advertise-peer-urls https://192.168.205.128:2380 \
--data-dir=/var/lib/etcd
# Restart static pods
mv ../*.yaml .
Backup Cluster Manifests with Velero
Install MinIO (object storage) via Helm:
helm repo add minio https://helm.min.io/
helm install minio \
--namespace velero --create-namespace \
--set accessKey=minio,secretKey=minio123 \
--set mode=standalone \
--set service.type=NodePort \
  --set persistence.enabled=false minio/minio
Create a bucket named velero in the MinIO UI (http:// :32000).
Create a credentials file (credentials-velero):
[default]
aws_access_key_id=minio
aws_secret_access_key=minio123
Install Velero pointing to MinIO:
velero install \
--provider aws \
--bucket velero \
--image velero/velero:v1.6.3 \
--plugins velero/velero-plugin-for-aws:v1.2.1 \
--namespace velero \
--secret-file ./credentials-velero \
--use-volume-snapshots=false \
--use-restic \
  --backup-location-config region=minio,s3ForcePathStyle="true",s3Url=http://minio.velero.svc:9000
Verify components are running:
kubectl get po -n velero
Back up the default namespace:
velero backup create default-backup-$(date +%Y%m%d) --include-namespaces default --default-volumes-to-restic
Delete a resource (e.g., the nginx deployment) and restore it:
kubectl delete deployment nginx
# Use the exact backup name created above; $(date +%Y%m%d) only matches if the restore runs on the same day
velero restore create --from-backup default-backup-$(date +%Y%m%d)
Summary
Kubernetes forms the foundation for cloud‑native applications, so reliable backup and upgrade procedures are essential for platform stability. The steps above (installing the cluster, configuring networking, managing upgrades, renewing certificates, adding or removing nodes, and backing up both etcd and manifests) give you a resilient and maintainable Kubernetes environment.
Ops Development Stories
Maintained by a like‑minded team, covering both operations and development. Topics span Linux ops, DevOps toolchain, Kubernetes containerization, monitoring, log collection, network security, and Python or Go development. Team members: Qiao Ke, wanger, Dong Ge, Su Xin, Hua Zai, Zheng Ge, Teacher Xia.
