How to Scale a Kubernetes Cluster: Node Quotas, Kernel Tweaks, and Component Settings
This guide explains how to prepare a large‑scale Kubernetes cluster by increasing cloud resource quotas, adjusting kernel parameters, configuring master node sizes, optimizing etcd storage, tuning Docker and Kubelet image pull settings, and applying best‑practice pod and scheduler configurations for thousands of nodes.
1. Node Quotas and Kernel Parameters
When a public‑cloud Kubernetes cluster grows, you may hit resource quota limits. Increase the following quotas on the cloud platform before scaling:
Number of virtual machines
vCPU count
Private IP address count
Public IP address count
Security group rules
Route table entries
Persistent storage size
Recommended master node specifications for Google Compute Engine (GCE) based on node count:
1‑5 nodes: n1-standard-1
6‑10 nodes: n1-standard-2
11‑100 nodes: n1-standard-4
101‑250 nodes: n1-standard-8
251‑500 nodes: n1-standard-16
More than 500 nodes: n1-standard-32
Alibaba Cloud equivalents (node count → master spec):
1‑5 nodes → 4C8G
6‑20 nodes → 4C16G
21‑100 nodes → 8C32G
100‑200 nodes → 16C64G
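The GCE tiers above can be expressed as a small lookup, which is handy when a provisioning script has to pick a master size from a planned node count. This is a minimal sketch: the function name gce_master_type is hypothetical, and the tier boundaries are copied from the list above.

```python
from bisect import bisect_left

# Upper bound of each node-count tier and the GCE machine type listed for it.
_GCE_TIERS = [
    (5, "n1-standard-1"),
    (10, "n1-standard-2"),
    (100, "n1-standard-4"),
    (250, "n1-standard-8"),
    (500, "n1-standard-16"),
]
_GCE_LARGEST = "n1-standard-32"  # anything above 500 nodes


def gce_master_type(node_count: int) -> str:
    """Return the recommended GCE master machine type for a node count."""
    if node_count < 1:
        raise ValueError("node_count must be at least 1")
    bounds = [upper for upper, _ in _GCE_TIERS]
    # bisect_left finds the first tier whose upper bound is >= node_count.
    index = bisect_left(bounds, node_count)
    if index < len(_GCE_TIERS):
        return _GCE_TIERS[index][1]
    return _GCE_LARGEST
```

For example, gce_master_type(120) falls into the 101‑250 tier and returns n1-standard-8.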
Key kernel parameters to add to /etc/sysctl.conf:
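# Raise the system-wide open file limit to avoid "Too many open files" errors.
fs.file-max=1000000
# ARP (neighbor) cache thresholds; raise them when the ARP table becomes large.
net.ipv4.neigh.default.gc_thresh1=1024
net.ipv4.neigh.default.gc_thresh2=4096
net.ipv4.neigh.default.gc_thresh3=8192
# Connection tracking: allow more tracked connections, expire established TCP entries after 300 seconds, and enlarge the hash table.
net.netfilter.nf_conntrack_max=10485760
net.netfilter.nf_conntrack_tcp_timeout_established=300
net.netfilter.nf_conntrack_buckets=655360
# Length of the per-CPU queue for incoming packets that the kernel has not yet processed.
net.core.netdev_max_backlog=10000
# inotify limits per user: number of instances and number of watched files.
fs.inotify.max_user_instances=524288
fs.inotify.max_user_watches=524288
2. Etcd Database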
Deploy a highly available etcd cluster with the etcd operator. The operator automates creation, scaling, backup, and upgrade of etcd instances.
create/destroy: automatic provisioning and removal of etcd clusters.
resize: dynamic scaling of cluster size.
backup: supports data backup and cluster restoration.
upgrade: enables version upgrades without service interruption.
Additional recommendations:
Use SSDs for etcd storage.
Set --quota-backend-bytes to increase the storage limit (default 2 GB).
Store kube‑apiserver events in a dedicated etcd cluster.
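Two of these recommendations come down to command-line flags, and a small helper makes the arithmetic explicit. This is a sketch under assumptions: the function names and the endpoint URL are hypothetical, --quota-backend-bytes is an etcd flag that takes a byte count, and --etcd-servers-overrides is the kube-apiserver flag that routes a resource (here /events) to a separate set of etcd servers. The 8 GiB ceiling reflects the maximum that the etcd documentation suggests for normal environments.

```python
from typing import Sequence

GIB = 1024 ** 3
ETCD_SUGGESTED_MAX_GIB = 8  # suggested ceiling per the etcd documentation


def etcd_quota_flag(gib: int) -> str:
    """Build the etcd flag that raises the backend quota to `gib` GiB."""
    if not 1 <= gib <= ETCD_SUGGESTED_MAX_GIB:
        raise ValueError("quota should be between 1 and 8 GiB")
    return f"--quota-backend-bytes={gib * GIB}"


def events_override_flag(endpoints: Sequence[str]) -> str:
    """Build the kube-apiserver flag that stores events in a dedicated etcd.

    Multiple servers for the same resource are separated by semicolons.
    """
    if not endpoints:
        raise ValueError("at least one etcd endpoint is required")
    return "--etcd-servers-overrides=/events#" + ";".join(endpoints)
```

For example, etcd_quota_flag(8) yields --quota-backend-bytes=8589934592, and events_override_flag(["https://etcd-events.example:2379"]) yields the override flag for a hypothetical events cluster.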
3. Image Pull Configuration
Docker daemon settings:
Set max-concurrent-downloads=10 (default 3) to speed up parallel image pulls.
Use SSD storage for the Docker image cache.
Pre‑load the pause image (e.g., docker image save -o /opt/preloaded_docker_images.tar and docker image load -i /opt/preloaded_docker_images.tar) to avoid pulling it on every pod start.
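The Docker daemon settings above normally live in /etc/docker/daemon.json. The sketch below renders that file from Python. The function name and the /mnt/ssd/docker path are hypothetical; max-concurrent-downloads and data-root are daemon.json keys, with data-root used here as one way to place the image cache on an SSD.

```python
import json
from typing import Optional


def docker_daemon_config(max_concurrent_downloads: int = 10,
                         data_root: Optional[str] = None) -> str:
    """Render a daemon.json body with the image pull settings discussed above."""
    if max_concurrent_downloads < 1:
        raise ValueError("max_concurrent_downloads must be positive")
    config = {"max-concurrent-downloads": max_concurrent_downloads}
    if data_root is not None:
        # Assumption: the SSD is mounted at this path on every node.
        config["data-root"] = data_root
    return json.dumps(config, indent=2, sort_keys=True)


if __name__ == "__main__":
    # Print the file body; writing it and restarting Docker is left to the operator.
    print(docker_daemon_config(data_root="/mnt/ssd/docker"))
```

The daemon must be restarted after the file changes for the new values to take effect.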
Kubelet settings:
Disable serialized image pulls: --serialize-image-pulls=false (default true). Note: Docker <1.9 with AUFS cannot use this flag.
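Increase the image pull timeout: --image-pull-progress-deadline=30m (default 1m0s). The flag takes a duration, and large images may need more than one minute of pull time without progress before the kubelet cancels the pull.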
Adjust the maximum number of pods per node: --max-pods=110 (default 110, can be changed as needed).
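On clusters where the kubelet reads a configuration file instead of flags, two of the settings above map to fields of the KubeletConfiguration object (serializeImagePulls and maxPods). The sketch below builds such a file as JSON, which the kubelet accepts alongside YAML. The function name is hypothetical, and the pull-progress deadline is left out because it is a Docker-specific command-line flag rather than a field of this object.

```python
import json


def kubelet_config(max_pods: int = 110, serialize_image_pulls: bool = False) -> dict:
    """Build a KubeletConfiguration with the image pull and pod limits above."""
    if max_pods < 1:
        raise ValueError("max_pods must be positive")
    return {
        "apiVersion": "kubelet.config.k8s.io/v1beta1",
        "kind": "KubeletConfiguration",
        # False lets the kubelet pull several images in parallel.
        "serializeImagePulls": serialize_image_pulls,
        "maxPods": max_pods,
    }


if __name__ == "__main__":
    print(json.dumps(kubelet_config(), indent=2))
```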
4. API Server, Scheduler, and Controller Manager Settings
kube-apiserver settings:
For more than 3000 nodes, set --max-requests-inflight=3000 and --max-mutating-requests-inflight=1000.
For 1000‑3000 nodes, use --max-requests-inflight=1500 and --max-mutating-requests-inflight=500.
Memory target (in MB), sized at roughly 60 MB per node: --target-ram-mb=node_nums * 60.
kube-scheduler and kube-controller-manager settings:
Increase the QPS each component may send to the API server: --kube-api-qps=100 (default 50 for the scheduler, 20 for the controller‑manager).
Increase burst capacity: --kube-api-burst=100 (default 30 for the controller‑manager; the scheduler already defaults to 100).
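The node-count thresholds above can be turned into a small helper that returns the API server flags for a given cluster size. This is a sketch: the function name is hypothetical, and leaving the in-flight flags unset below 1000 nodes (so the API server keeps its defaults) is an assumption, since the guidance above only covers larger clusters.

```python
from typing import Dict


def apiserver_flags(node_count: int) -> Dict[str, int]:
    """Return kube-apiserver flag values for a cluster of `node_count` nodes."""
    if node_count < 1:
        raise ValueError("node_count must be at least 1")
    # About 60 MB of API server memory target per node.
    flags = {"--target-ram-mb": node_count * 60}
    if node_count > 3000:
        flags["--max-requests-inflight"] = 3000
        flags["--max-mutating-requests-inflight"] = 1000
    elif node_count >= 1000:
        flags["--max-requests-inflight"] = 1500
        flags["--max-mutating-requests-inflight"] = 500
    # Assumption: below 1000 nodes the in-flight defaults are left untouched.
    return flags


if __name__ == "__main__":
    for name, value in apiserver_flags(2000).items():
        print(f"{name}={value}")
```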
5. Pod Configuration Best Practices
Define resource requests and limits for containers, especially core system services:
spec.containers[].resources.limits.cpu
spec.containers[].resources.limits.memory
spec.containers[].resources.requests.cpu
spec.containers[].resources.requests.memory
spec.containers[].resources.limits.ephemeral-storage
spec.containers[].resources.requests.ephemeral-storage
Kubernetes classifies pods into QoS classes based on these settings: Guaranteed, Burstable, and BestEffort. When node resources are scarce, the kubelet evicts pods in the order BestEffort > Burstable > Guaranteed.
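How the CPU and memory settings map to a QoS class can be sketched in a few lines. This is a simplified illustration, not the kubelet's implementation: the function name and the dict layout are hypothetical, and quantities are compared as plain strings, so "1" and "1000m" would be treated as different values here.

```python
from typing import Dict, List

Resources = Dict[str, Dict[str, str]]


def qos_class(containers: List[Resources]) -> str:
    """Classify a pod from its containers' cpu/memory requests and limits.

    Each container is a dict such as
    {"requests": {"cpu": "500m"}, "limits": {"cpu": "1", "memory": "1Gi"}}.
    """
    any_set = False
    guaranteed = True
    for container in containers:
        requests = container.get("requests", {})
        limits = container.get("limits", {})
        for resource in ("cpu", "memory"):
            request = requests.get(resource)
            limit = limits.get(resource)
            if request is not None or limit is not None:
                any_set = True
            # Guaranteed needs a limit on every resource; a missing request
            # defaults to the limit, and an explicit one must equal it.
            if limit is None or (request is not None and request != limit):
                guaranteed = False
    if not any_set:
        return "BestEffort"
    return "Guaranteed" if guaranteed else "Burstable"
```

A pod whose only container sets limits of cpu "1" and memory "1Gi" is classified as Guaranteed, while a pod that sets only a CPU request is Burstable.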
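Use affinity rules to protect critical workloads, for example a required pod anti‑affinity rule that keeps kube-dns replicas on separate nodes (the weight field applies only to preferredDuringSchedulingIgnoredDuringExecution terms, so it is omitted here):
affinity:
  podAntiAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
    - labelSelector:
        matchExpressions:
        - key: k8s-app
          operator: In
          values:
          - kube-dns
      topologyKey: kubernetes.io/hostname
Prefer managing workloads with controllers such as Deployment, StatefulSet, DaemonSet, or Job.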