How to Build a High-Performance, Highly-Available Production Kubernetes Cluster
This guide walks through planning, configuring, and deploying a production-grade Kubernetes cluster with high performance and availability. It covers host planning, HA load balancing for the API server with keepalived and HAProxy, node initialization, kubeadm-based control-plane bootstrap, and the essential system tweaks along the way, all illustrated with ready-to-run code snippets.
In today's highly digital world, Kubernetes (K8s) has become the dominant container orchestration platform. Both startups and enterprises seek to leverage Kubernetes for scalability, availability, and security, but deploying it in production requires careful planning and testing.
Deployment Planning
Host specifications for the cluster are as follows:
k8s-master01 – 172.139.20.121 – 4c8g – master – VIP: /
k8s-master02 – 172.139.20.176 – 4c8g – master – VIP: /
k8s-master03 – 172.139.20.151 – 4c8g – master – VIP: /
k8s-node01 – 172.139.20.175 – 2c4g – node – VIP: /
k8s-node02 – 172.139.20.75 – 2c4g – node – VIP: /
lb-haproxy01 – 172.139.20.3 – 1c2g – k8s-lb – VIP: 172.139.20.100
lb-haproxy02 – 172.139.20.92 – 1c2g – k8s-lb – VIP: 172.139.20.100
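With the host plan fixed, it helps to give every machine the same name resolution before touching any other configuration. A minimal sketch, assuming the hostnames from the table above; the `apiserver-vip` alias for the VIP is this guide's illustrative convention, not anything Kubernetes requires:

```shell
# Run on every host: map the planned hostnames to their IPs.
# The "apiserver-vip" alias for 172.139.20.100 is an illustrative choice.
cat <<'EOF' | sudo tee -a /etc/hosts > /dev/null
172.139.20.121 k8s-master01
172.139.20.176 k8s-master02
172.139.20.151 k8s-master03
172.139.20.175 k8s-node01
172.139.20.75  k8s-node02
172.139.20.3   lb-haproxy01
172.139.20.92  lb-haproxy02
172.139.20.100 apiserver-vip
EOF
```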
Apiserver HA Load Balancing
In production, the API server must be highly available and able to handle high concurrency, which calls for two additional hosts dedicated to load balancing. keepalived provides failover of the virtual IP (VIP), HAProxy balances traffic across the masters, and both are started via Docker Compose.
Deploy HAProxy Service
<code>$ sudo mkdir -p /etc/haproxy</code> <code>$ cat <<'EOF' | sudo tee /etc/haproxy/haproxy.cfg > /dev/null
global
    daemon
    maxconn 256
    user haproxy

defaults
    mode tcp
    timeout connect 5000ms
    timeout client 50000ms
    timeout server 50000ms

listen stats
    bind *:1080
    mode http
    stats enable
    stats uri /admin?stats
    stats realm HAProxy\ Statistics
    stats auth admin1:AdMiN123

listen apiserver-tcp
    bind *:6443
    server apiserver01 172.139.20.121:6443 maxconn 32 check
    server apiserver02 172.139.20.176:6443 maxconn 32 check
    server apiserver03 172.139.20.151:6443 maxconn 32 check
EOF</code> <code>$ cat <<'EOF' | sudo tee /etc/haproxy/docker-compose.yml > /dev/null
name: haproxy
services:
  haproxy:
    container_name: haproxy
    image: haproxy:2.9-alpine
    volumes:
      - ./haproxy.cfg:/usr/local/etc/haproxy/haproxy.cfg:ro
    ports:
      - 1080:1080
      - 6443:6443
EOF</code> <code>$ sudo docker-compose -f /etc/haproxy/docker-compose.yml up -d</code>
Deploy keepalived Service
<code>$ sudo mkdir -p /etc/keepalived/logs</code> <code>$ cat <<'EOF' | sudo tee /etc/keepalived/keepalived.conf > /dev/null
! Configuration File for keepalived
global_defs {
    max_auto_priority -1
    enable_script_security
    vrrp_skip_check_adv_addr
    vrrp_garp_master_repeat 3
    vrrp_garp_master_delay 10
    vrrp_garp_master_refresh 60
}
include /etc/keepalived/apiserver.conf
EOF
$ cat <<'EOF' | sudo tee /etc/keepalived/apiserver.conf > /dev/null
vrrp_script apiserver {
    script "/etc/keepalived/chk_apiserver.sh"
    user root
    interval 1
    fall 5
    rise 3
    weight -10
}
! On lb-haproxy02, swap unicast_src_ip and unicast_peer accordingly
vrrp_instance apiserver {
    state BACKUP
    interface eth0
    virtual_router_id 100
    priority 100
    authentication {
        auth_type PASS
        auth_pass pwd100
    }
    unicast_src_ip 172.139.20.3
    unicast_peer {
        172.139.20.92
    }
    virtual_ipaddress {
        172.139.20.100
    }
    track_script {
        apiserver
    }
}
EOF</code> <code>$ cat <<'EOF' | sudo tee /etc/keepalived/chk_apiserver.sh > /dev/null
#!/bin/sh
# Health check: probe the local HAProxy frontend for the apiserver
result=$(curl -sk https://localhost:6443/healthz)
if [ "$result" = "ok" ]; then
    exit 0
else
    exit 255
fi
EOF
$ sudo chmod +x /etc/keepalived/chk_apiserver.sh</code> <code>$ cat <<'EOF' | sudo tee /etc/keepalived/docker-compose.yml > /dev/null
name: keepalived
services:
  keepalived:
    container_name: keepalived
    image: registry.cn-guangzhou.aliyuncs.com/jiaxzeng6918/keepalived:2.2.8-alpine3.18
    volumes:
      - "/usr/share/zoneinfo/Asia/Shanghai:/etc/localtime:ro"
      - ".:/etc/keepalived"
    cap_add:
      - NET_ADMIN
    network_mode: "host"
    restart: always
EOF
$ sudo docker-compose -f /etc/keepalived/docker-compose.yml up -d</code>
Deploy Kubernetes Cluster
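Before initializing the cluster, it is worth confirming the load-balancer layer is healthy. A quick sanity check, using the VIP, ports, and stats credentials configured above (run the `ip` check on both lb hosts):

```shell
# The VIP should be held by exactly one lb host at a time
ip addr show eth0 | grep -w 172.139.20.100

# The HAProxy stats endpoint should answer with HTTP 200
curl -s -o /dev/null -w '%{http_code}\n' -u admin1:AdMiN123 \
  'http://172.139.20.100:1080/admin?stats'

# Port 6443 on the VIP should accept TCP connections
# (the backends refuse until kubeadm init runs, which is expected here)
nc -vz 172.139.20.100 6443
```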
Cluster deployment consists of the following steps: initialize nodes, install runtime on all hosts, deploy master role, deploy node role, and install the Calico CNI plugin.
Initialize Nodes
<code>$ sudo systemctl disable firewalld --now
$ sudo setenforce 0
$ sudo sed -ri 's/^(SELINUX)=.*$/\1=disabled/' /etc/selinux/config
$ sudo swapoff -a
$ sudo sed -i '/ swap / s/^\(.*\)$/#\1/g' /etc/fstab
$ cat <<-EOF | sudo tee /etc/sysconfig/modules/ipvs.modules > /dev/null
#!/bin/bash
modprobe -- ip_vs
modprobe -- ip_vs_rr
modprobe -- ip_vs_wrr
modprobe -- ip_vs_sh
modprobe -- nf_conntrack
modprobe -- br_netfilter
modprobe -- ipip
EOF
$ sudo chmod 755 /etc/sysconfig/modules/ipvs.modules
$ sudo bash /etc/sysconfig/modules/ipvs.modules
$ cat <<-EOF | sudo tee /etc/sysctl.d/kubernetes.conf > /dev/null
net.bridge.bridge-nf-call-iptables = 1
net.bridge.bridge-nf-call-ip6tables = 1
net.ipv4.ip_forward = 1
net.ipv4.tcp_tw_reuse = 1
# net.ipv4.tcp_tw_recycle was removed in Linux 4.12; leave it commented on modern kernels
# net.ipv4.tcp_tw_recycle = 0
net.ipv4.ip_local_port_range = 32768 65535
net.ipv4.tcp_max_tw_buckets = 65535
net.ipv4.conf.all.rp_filter = 0
net.ipv6.conf.all.forwarding = 1
net.ipv4.conf.all.forwarding = 1
net.ipv4.tcp_fin_timeout = 15
EOF
$ sudo sysctl -p /etc/sysctl.d/kubernetes.conf</code>
Install Runtime
Install Docker or containerd on all hosts as described in the referenced runtime articles.
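If you take the containerd route, a minimal setup sketch looks like this (assuming containerd and runc are already installed from your distribution's packages; the `SystemdCgroup` switch matches what kubeadm expects on systemd hosts, and the sandbox-image path is an illustrative guess at the private registry used later in this guide):

```shell
# Generate the default containerd config, then enable the systemd cgroup driver
sudo mkdir -p /etc/containerd
containerd config default | sudo tee /etc/containerd/config.toml > /dev/null
sudo sed -i 's/SystemdCgroup = false/SystemdCgroup = true/' /etc/containerd/config.toml

# If you mirror images through a private registry, point the sandbox image
# at it as well (illustrative path; match your own registry layout)
sudo sed -i 's#sandbox_image = .*#sandbox_image = "172.139.20.170:5000/library/pause:3.9"#' /etc/containerd/config.toml

sudo systemctl enable containerd --now
```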
Deploy Master Nodes
<code>cat <<'EOF' | tee initCluster.yml > /dev/null
apiVersion: kubeadm.k8s.io/v1beta3
kind: ClusterConfiguration
clusterName: kubernetes
kubernetesVersion: 1.27.16
controlPlaneEndpoint: 172.139.20.100
apiServer:
certSANs: ["localhost", "172.139.20.121", "172.139.20.176", "172.139.20.151"]
imageRepository: 172.139.20.170:5000/library
networking:
dnsDomain: cluster.local
podSubnet: "10.244.0.0/24"
serviceSubnet: 10.96.0.0/12
etcd:
local:
dataDir: /data/etcd
---
apiVersion: kubeadm.k8s.io/v1beta3
kind: InitConfiguration
nodeRegistration:
criSocket: unix:///var/run/containerd/containerd.sock
imagePullPolicy: IfNotPresent
taints: null
---
apiVersion: kubeproxy.config.k8s.io/v1alpha1
kind: KubeProxyConfiguration
mode: ipvs
EOF
$ sudo kubeadm config images pull --config initCluster.yml
$ sudo kubeadm init --config initCluster.yml
</code>Generate a join token on the first master and use it to add the remaining masters and nodes:
<code># On one master
$ kubeadm token create --print-join-command
# Example output:
# kubeadm join 172.139.20.100:6443 --token <token> --discovery-token-ca-cert-hash sha256:<hash>
# Upload control‑plane certificates
$ sudo kubeadm init phase upload-certs --upload-certs 2>/dev/null | tail -n1
# On other masters
$ sudo kubeadm join 172.139.20.100:6443 \
--token <token> \
--discovery-token-ca-cert-hash sha256:<hash> \
--certificate-key <cert-key> \
--control-plane
</code> <code>$ mkdir -p $HOME/.kube
$ sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
$ sudo chown $(id -u):$(id -g) $HOME/.kube/config</code>
Conclusion
The steps above stand up the highly available control plane of a production-grade Kubernetes cluster; adding worker nodes and installing the Calico CNI plugin will be covered in a future article.
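Once the remaining masters have joined, a few quick checks confirm the control plane is healthy (nodes report Ready only after a CNI such as Calico is installed):

```shell
# All three control-plane nodes should be listed
kubectl get nodes -o wide

# Core components (etcd, apiserver, controller-manager, scheduler) per master
kubectl get pods -n kube-system -o wide

# The kubernetes Service endpoints should list all three apiserver IPs
kubectl get endpoints kubernetes -n default
```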
Linux Ops Smart Journey
The operations journey never stops—pursuing excellence endlessly.