Master Kubernetes Cluster: Install, Upgrade, Backup, and Restore Step‑by‑Step
This comprehensive guide walks you through installing a Kubernetes cluster with kubeadm, configuring containerd, initializing master and worker nodes, deploying Calico networking and the Dashboard, performing upgrades, renewing certificates, adding or removing nodes, and backing up both etcd data and cluster manifests using scripts and Velero.
Install Kubernetes Cluster
Kubernetes is a container‑orchestration platform that runs as a cluster. As a cluster maintainer you often need to manage the whole lifecycle.
Prerequisites
Cluster nodes: 2
Master IP: 192.168.205.128
Node IP: 192.168.205.130
Kubernetes version: v1.24.2
Container runtime: containerd
OS: CentOS 7.9 (kernel 3.10.0‑1160)
Environment Preparation
(1) Add host entries on each node
<code>cat >> /etc/hosts <<EOF
192.168.205.128 kk-master
192.168.205.130 kk-node01
EOF
</code>(2) Disable firewall and SELinux
<code>systemctl stop firewalld
systemctl disable firewalld
setenforce 0
# persist the change across reboots
sed -i 's/^SELINUX=.*/SELINUX=disabled/' /etc/selinux/config
</code>(3) Optimize kernel parameters
<code>cat > /etc/sysctl.d/k8s.conf <<EOF
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
net.ipv4.ip_forward = 1
vm.swappiness=0
EOF
modprobe br_netfilter
sysctl -p /etc/sysctl.d/k8s.conf
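# Sanity check: the settings can be read back from /proc
# (the bridge-nf values appear only after br_netfilter is loaded)
cat /proc/sys/net/ipv4/ip_forward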
</code>(4) Disable swap
<code>swapoff -a
# comment swap line in /etc/fstab
sed -i '/ swap / s/^/#/' /etc/fstab
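# Confirm swap is actually off (SwapTotal should read 0 kB)
grep SwapTotal /proc/meminfo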
</code>(5) Install IPVS modules
<code>cat > /etc/sysconfig/modules/ipvs.modules <<EOF
#!/bin/bash
modprobe -- ip_vs
modprobe -- ip_vs_rr
modprobe -- ip_vs_wrr
modprobe -- ip_vs_sh
modprobe -- nf_conntrack_ipv4   # on kernels >= 4.19 use nf_conntrack instead
EOF
chmod 755 /etc/sysconfig/modules/ipvs.modules && bash /etc/sysconfig/modules/ipvs.modules && lsmod | grep -e ip_vs -e nf_conntrack_ipv4
yum install -y ipset ipvsadm
</code>(6) Sync server time
<code>yum install -y chrony
systemctl enable chronyd
systemctl start chronyd
chronyc sources
</code>(7) Install containerd
<code>yum install -y yum-utils \
device-mapper-persistent-data \
lvm2
yum-config-manager --add-repo https://download.docker.com/linux/centos/docker-ce.repo
yum list | grep containerd
yum install -y containerd
mkdir -p /etc/containerd
containerd config default > /etc/containerd/config.toml
sed -i "s#k8s.gcr.io#registry.cn-hangzhou.aliyuncs.com/google_containers#g" /etc/containerd/config.toml
sed -i 's#SystemdCgroup = false#SystemdCgroup = true#g' /etc/containerd/config.toml
sed -i "s#https://registry-1.docker.io#https://registry.cn-hangzhou.aliyuncs.com#g" /etc/containerd/config.toml
systemctl daemon-reload
systemctl enable containerd
systemctl restart containerd
</code>(8) Install Kubernetes components
<code>cat <<EOF > /etc/yum.repos.d/kubernetes.repo
[kubernetes]
name=Kubernetes
baseurl=http://mirrors.aliyun.com/kubernetes/yum/repos/kubernetes-el7-x86_64
enabled=1
gpgcheck=0
repo_gpgcheck=0
gpgkey=http://mirrors.aliyun.com/kubernetes/yum/doc/yum-key.gpg http://mirrors.aliyun.com/kubernetes/yum/doc/rpm-package-key.gpg
EOF
yum install -y kubelet-1.24.2 kubeadm-1.24.2 kubectl-1.24.2
crictl config runtime-endpoint /run/containerd/containerd.sock
systemctl daemon-reload
systemctl enable kubelet && systemctl start kubelet
</code>Initialize the Cluster
Export the default kubeadm config and edit it (set imageRepository, kube-proxy mode to ipvs, and cgroupDriver to systemd).
<code>kubeadm config print init-defaults > kubeadm.yaml
# edit kubeadm.yaml as needed (example snippet shown below)
apiVersion: kubeadm.k8s.io/v1beta3
kind: InitConfiguration
localAPIEndpoint:
  advertiseAddress: 192.168.205.128
  bindPort: 6443
nodeRegistration:
  criSocket: unix:///var/run/containerd/containerd.sock
  name: master
  taints: null
---
apiVersion: kubeadm.k8s.io/v1beta3
kind: ClusterConfiguration
apiServer:
  timeoutForControlPlane: 4m0s
certificatesDir: /etc/kubernetes/pki
clusterName: kubernetes
imageRepository: registry.cn-hangzhou.aliyuncs.com/google_containers
kubernetesVersion: 1.24.2
networking:
  dnsDomain: cluster.local
  serviceSubnet: 10.96.0.0/12
---
apiVersion: kubeproxy.config.k8s.io/v1alpha1
kind: KubeProxyConfiguration
mode: ipvs
---
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
cgroupDriver: systemd
</code>Run the initialization:
<code>kubeadm init --config=kubeadm.yaml
# After success, run as a regular user:
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
# Or as root:
export KUBECONFIG=/etc/kubernetes/admin.conf
</code>Join Worker Node
On each worker execute the join command printed by the init output, e.g.:
<code>kubeadm join 192.168.205.128:6443 --token abcdef.0123456789abcdef \
--discovery-token-ca-cert-hash sha256:51b5e566d3f95aaf3170916d67958bc16cb1b44934885a857b07ee58f041334a
</code>Verify nodes:
<code>kubectl get nodes
</code>Install Network Plugin (Calico)
<code>wget https://raw.githubusercontent.com/projectcalico/calico/master/manifests/calico.yaml
# for production, pin a released Calico tag instead of master
kubectl apply -f calico.yaml
kubectl get po -n kube-system | grep calico
</code>Install Kubernetes Dashboard
<code>kubectl apply -f https://raw.githubusercontent.com/kubernetes/dashboard/v2.5.0/aio/deploy/recommended.yaml
kubectl get po -n kubernetes-dashboard
kubectl -n kubernetes-dashboard edit svc kubernetes-dashboard # change type to NodePort
# Access via https://<master_ip>:<nodePort>
</code>Generate an admin token (Kubernetes 1.24 no longer auto‑creates ServiceAccount tokens):
<code>cat <<EOF > admin-token.yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: dashboard-admin
  namespace: kube-system
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: dashboard-admin
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: admin
subjects:
- kind: ServiceAccount
  name: dashboard-admin
  namespace: kube-system
EOF
kubectl apply -f admin-token.yaml
# Kubernetes 1.24+ no longer auto-creates ServiceAccount token secrets; request a token directly:
kubectl -n kube-system create token dashboard-admin
</code>Update Cluster
Upgrade Kubernetes Version
Check current version and target version (e.g., upgrade from v1.24.0 to v1.24.2).
<code># Backup first (see backup section)
# Upgrade kubeadm on the control plane
yum install -y kubeadm-1.24.2-0 --disableexcludes=kubernetes
kubeadm upgrade plan   # verify the target version
# Drain the master before applying the upgrade (drain implies cordon)
kubectl drain kk-master --ignore-daemonsets=true
kubeadm upgrade apply v1.24.2
# Upgrade kubelet and kubectl
yum install -y kubelet-1.24.2-0 kubectl-1.24.2-0 --disableexcludes=kubernetes
systemctl daemon-reload
systemctl restart kubelet
# Return the master to service
kubectl uncordon kk-master
</code>Upgrade Worker Nodes
<code># On the worker: upgrade kubeadm first
yum install -y kubeadm-1.24.2-0 --disableexcludes=kubernetes
# From a machine with kubectl access (e.g. the master): drain the worker
kubectl drain kk-node01 --ignore-daemonsets=true
# Back on the worker: upgrade the node config, then kubelet
kubeadm upgrade node
yum install -y kubelet-1.24.2-0 --disableexcludes=kubernetes
systemctl daemon-reload
systemctl restart kubelet
# From the master: return the node to service
kubectl uncordon kk-node01
</code>Renew Certificates
Check expiration:
<code>kubeadm certs check-expiration
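# Individual certificates can also be inspected with openssl;
# the path below is the kubeadm default
openssl x509 -enddate -noout -in /etc/kubernetes/pki/apiserver.crt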
</code>Backup certificates and etcd, then renew:
<code>mkdir -p /etc/kubernetes.bak.$(date +%Y%m%d)
cp -r /etc/kubernetes/* /etc/kubernetes.bak.$(date +%Y%m%d)
# Renew all certs using the same kubeadm config
# (the "kubeadm alpha certs" subcommand was promoted to "kubeadm certs" in 1.20+)
kubeadm certs renew all --config=kubeadm.yaml
# Regenerate kubeconfigs
kubeadm init phase kubeconfig all --config=kubeadm.yaml
mv $HOME/.kube/config $HOME/.kube/config.old
cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
chown $(id -u):$(id -g) $HOME/.kube/config
# Restart control‑plane static pods
cd /etc/kubernetes/manifests
mv *.yaml ../
# give kubelet time to tear the pods down before restoring the manifests
sleep 20
mv ../*.yaml .
</code>Add or Remove Nodes
Add node – repeat the environment‑preparation steps on the new host and run the join command obtained from the existing cluster (use kubeadm token create --print-join-command if needed).
<code># On new node
cat >> /etc/hosts <<EOF
192.168.205.128 kk-master
192.168.205.130 kk-node01
192.168.205.133 kk-node02
EOF
kubeadm token create --print-join-command
kubeadm join 192.168.205.128:6443 --token <token> \
--discovery-token-ca-cert-hash sha256:<hash> --node-name kk-node02
</code>Remove node:
<code>kubectl cordon kk-node02
kubectl drain kk-node02 --ignore-daemonsets=true --delete-emptydir-data=true
kubectl delete node kk-node02
</code>Backup Cluster
Backup etcd Database
Install etcdctl and take a snapshot:
<code>export ETCDCTL_API=3
mkdir -p /var/backups/etcd
etcdctl --endpoints=https://127.0.0.1:2379 \
--cacert=/etc/kubernetes/pki/etcd/ca.crt \
--cert=/etc/kubernetes/pki/etcd/server.crt \
--key=/etc/kubernetes/pki/etcd/server.key \
snapshot save /var/backups/etcd/snapshot.db
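# Tip: verify the snapshot afterwards with
#   etcdctl --write-out=table snapshot status /var/backups/etcd/snapshot.db
# (prints hash, revision, total keys, and size)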
</code>Automate with a shell script (attributes stripped) and add to cron (e.g., every 30 minutes).
<code>#!/bin/bash
ETCDCTL_PATH=/usr/local/bin/etcdctl
ENDPOINTS='https://192.168.205.128:2379'
BACKUP_DIR="/var/backups/kube_etcd/etcd-$(date +%Y-%m-%d_%H:%M:%S)"
ETCDCTL_CERT="/etc/kubernetes/pki/etcd/server.crt"
ETCDCTL_KEY="/etc/kubernetes/pki/etcd/server.key"
ETCDCTL_CA="/etc/kubernetes/pki/etcd/ca.crt"
mkdir -p "$BACKUP_DIR"
export ETCDCTL_API=3
$ETCDCTL_PATH --endpoints="$ENDPOINTS" \
--cacert="$ETCDCTL_CA" \
--cert="$ETCDCTL_CERT" \
--key="$ETCDCTL_KEY" snapshot save "$BACKUP_DIR/snapshot.db"
# Keep only the latest 5 backups
cd $(dirname "$BACKUP_DIR")
ls -1t | awk 'NR>5{print "rm -rf " $0}' | sh
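# Example cron entry, assuming the script is saved as /usr/local/bin/etcd-backup.sh
# (path is illustrative; adjust to wherever you install it):
# */30 * * * * /usr/local/bin/etcd-backup.sh >> /var/log/etcd-backup.log 2>&1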
</code>Restore etcd
<code># Stop static pods
cd /etc/kubernetes/manifests
mv *.yaml ../
# Move old data directory aside
mv /var/lib/etcd /var/lib/etcd.bak
# Restore snapshot
ETCDCTL_API=3 etcdctl snapshot restore /var/backups/etcd/snapshot.db \
--name kk-master \
--initial-cluster "kk-master=https://192.168.205.128:2380" \
--initial-cluster-token etcd-cluster \
--initial-advertise-peer-urls https://192.168.205.128:2380 \
--data-dir=/var/lib/etcd
# Restart static pods
mv ../*.yaml .
</code>Backup Cluster Manifests with Velero
Install MinIO (object storage) via Helm:
<code>helm repo add minio https://helm.min.io/
helm install minio \
--namespace velero --create-namespace \
--set accessKey=minio,secretKey=minio123 \
--set mode=standalone \
--set service.type=NodePort \
--set persistence.enabled=false minio/minio
</code>Create a bucket named
veleroin the MinIO UI (http:// :32000).
Create a credentials file (credentials-velero):
<code>[default]
aws_access_key_id=minio
aws_secret_access_key=minio123
</code>Install Velero pointing to MinIO:
<code>velero install \
--provider aws \
--bucket velero \
--image velero/velero:v1.6.3 \
--plugins velero/velero-plugin-for-aws:v1.2.1 \
--namespace velero \
--secret-file ./credentials-velero \
--use-volume-snapshots=false \
--use-restic \
--backup-location-config region=minio,s3ForcePathStyle="true",s3Url=http://minio.velero.svc:9000
</code>Verify components are running:
<code>kubectl get po -n velero
</code>Backup the default namespace:
<code>velero backup create default-backup-$(date +%Y%m%d) --include-namespaces default --default-volumes-to-restic
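# Tip: the same backup can run on a recurring schedule, e.g. daily at 02:00
# (schedule name "default-daily" is illustrative):
#   velero schedule create default-daily --schedule="0 2 * * *" \
#     --include-namespaces default --default-volumes-to-restic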
</code>Delete a resource (e.g., the nginx deployment) and restore it:
<code>kubectl delete deployment nginx
velero restore create --from-backup default-backup-$(date +%Y%m%d)
</code>Summary
Kubernetes forms the foundation for cloud‑native applications; reliable backup and upgrade procedures are essential to maintain platform stability. By following the steps above—installing the cluster, configuring networking, managing upgrades, renewing certificates, adding/removing nodes, and backing up both etcd and manifests—you can ensure a resilient and maintainable Kubernetes environment.
Ops Development Stories
Maintained by a like‑minded team, covering both operations and development. Topics span Linux ops, DevOps toolchain, Kubernetes containerization, monitoring, log collection, network security, and Python or Go development. Team members: Qiao Ke, wanger, Dong Ge, Su Xin, Hua Zai, Zheng Ge, Teacher Xia.