How to Scale Kubernetes to 5,000 Nodes: Master, API Server, and Component Tuning
This guide explains how to push a Kubernetes cluster toward its theoretical limit of 5,000 nodes by detailing official limits, master node sizing for GCE and AWS, kube‑apiserver high‑availability and connection‑count tuning, scheduler and controller‑manager leader election settings, kubelet optimizations, and DNS anti‑affinity configuration.
Kubernetes has officially claimed support for up to 5,000 nodes per cluster since v1.6, but reaching that number in practice requires careful tuning.
Official Limits
Maximum 5,000 nodes
Maximum 150,000 pods
Maximum 300,000 containers
Maximum 100 pods per node
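As a rough illustration, the limits above can be turned into a quick sanity check for a capacity plan. This is a minimal sketch: the `ClusterPlan` type and the `limit_violations` function are hypothetical helpers, not part of any Kubernetes tool, and the per-node check uses average density only.

```python
from dataclasses import dataclass
from typing import List

# Limits as listed above; the constant names are illustrative only.
MAX_NODES = 5_000
MAX_PODS = 150_000
MAX_CONTAINERS = 300_000
MAX_PODS_PER_NODE = 100


@dataclass
class ClusterPlan:
    nodes: int
    pods: int
    containers: int


def limit_violations(plan: ClusterPlan) -> List[str]:
    """Return a description of every listed limit that the plan exceeds."""
    problems = []
    if plan.nodes > MAX_NODES:
        problems.append(f"nodes: {plan.nodes} > {MAX_NODES}")
    if plan.pods > MAX_PODS:
        problems.append(f"pods: {plan.pods} > {MAX_PODS}")
    if plan.containers > MAX_CONTAINERS:
        problems.append(f"containers: {plan.containers} > {MAX_CONTAINERS}")
    # Average density only; a real check would inspect each node separately.
    if plan.nodes > 0 and plan.pods / plan.nodes > MAX_PODS_PER_NODE:
        problems.append(
            f"pods per node: {plan.pods / plan.nodes:.1f} > {MAX_PODS_PER_NODE}"
        )
    return problems
```

An empty list means the plan stays within all four limits; a plan can pass the cluster-wide totals and still fail on density.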
Master Node Configuration Optimization
GCE recommended instance types:
1‑5 nodes: n1-standard-1
6‑10 nodes: n1-standard-2
11‑100 nodes: n1-standard-4
101‑250 nodes: n1-standard-8
251‑500 nodes: n1-standard-16
More than 500 nodes: n1-standard-32
AWS recommended instance types:
1‑5 nodes: m3.medium
6‑10 nodes: m3.large
11‑100 nodes: m3.xlarge
101‑250 nodes: m3.2xlarge
251‑500 nodes: c4.4xlarge
More than 500 nodes: c4.8xlarge
Corresponding CPU and memory (these figures match the GCE instance types above):
1‑5 nodes: 1 vCPU / 3.75 GiB
6‑10 nodes: 2 vCPU / 7.5 GiB
11‑100 nodes: 4 vCPU / 15 GiB
101‑250 nodes: 8 vCPU / 30 GiB
251‑500 nodes: 16 vCPU / 60 GiB
More than 500 nodes: 32 vCPU / 120 GiB
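The tiers above amount to a lookup keyed by node count. The following is a minimal sketch of that lookup; the function name `recommended_master_type` and the `"gce"`/`"aws"` keys are assumptions made for illustration.

```python
import bisect

# Inclusive upper bound of each node-count tier from the lists above.
TIER_UPPER_BOUNDS = [5, 10, 100, 250, 500]

MASTER_TYPES = {
    "gce": ["n1-standard-1", "n1-standard-2", "n1-standard-4",
            "n1-standard-8", "n1-standard-16", "n1-standard-32"],
    "aws": ["m3.medium", "m3.large", "m3.xlarge",
            "m3.2xlarge", "c4.4xlarge", "c4.8xlarge"],
}


def recommended_master_type(node_count: int, cloud: str = "gce") -> str:
    """Map a worker node count to the master instance type listed above."""
    if node_count < 1:
        raise ValueError("node_count must be at least 1")
    # bisect_left finds the first tier whose upper bound is >= node_count;
    # counts above 500 fall through to the last entry.
    tier = bisect.bisect_left(TIER_UPPER_BOUNDS, node_count)
    return MASTER_TYPES[cloud][tier]
```

For example, `recommended_master_type(300, "aws")` falls in the 251-500 tier and yields `c4.4xlarge`.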
kube‑apiserver Optimization
High Availability
Run multiple kube‑apiserver instances behind an external load balancer.
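Set --apiserver-count and --endpoint-reconciler-type so that every instance is registered in the endpoints of the kubernetes Service, which is what makes the API server highly available to in-cluster clients.
Because clients reuse long-lived TLS connections, a load balancer spreads connections rather than individual requests, so load across instances can stay uneven. A server-side rate limiter can be added to signal clients to back off once a threshold is reached.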
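To make the idea of a server-side rate limiter concrete, here is a minimal token-bucket sketch. It is not how kube-apiserver is implemented; the `TokenBucket` class, the `handle` function, and the choice of HTTP 429 as the back-off signal are assumptions made for illustration.

```python
import time


class TokenBucket:
    """Admit about `rate` requests per second, with bursts of up to `burst`."""

    def __init__(self, rate: float, burst: int, clock=time.monotonic):
        self.rate = rate
        self.burst = burst
        self.clock = clock          # injectable so the limiter can be tested
        self.tokens = float(burst)  # start with a full bucket
        self.last = clock()

    def allow(self) -> bool:
        now = self.clock()
        # Refill in proportion to elapsed time, capped at the burst size.
        self.tokens = min(self.burst, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False


def handle(bucket: TokenBucket) -> int:
    """Return 200 if the request is admitted, 429 to ask the client to back off."""
    return 200 if bucket.allow() else 429
```

A client that receives 429 is expected to wait and retry, which relieves an overloaded instance even when the load balancer cannot move the connection elsewhere.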
Control Connection Count
The following flags control the number of in‑flight requests:
--max-mutating-requests-inflight int The maximum number of mutating requests in flight at a given time. When the server exceeds this, it rejects requests. Zero for no limit. (default 200)
--max-requests-inflight int The maximum number of non‑mutating requests in flight at a given time. When the server exceeds this, it rejects requests. Zero for no limit. (default 400)
Recommended settings:
For clusters with 1,000‑3,000 nodes:
--max-requests-inflight=1500
--max-mutating-requests-inflight=500
For clusters with more than 3,000 nodes:
--max-requests-inflight=3000
--max-mutating-requests-inflight=1000
kube‑scheduler and kube‑controller‑manager Optimization
High Availability
Both components achieve HA via leader election. Add the following flags:
--leader-elect=true
--leader-elect-lease-duration=15s
--leader-elect-renew-deadline=10s
--leader-elect-resource-lock=endpoints
--leader-elect-retry-period=2s
Control QPS
Recommended QPS limit for communication with the API server:
--kube-api-qps=100
Kubelet Optimization
Set --image-pull-progress-deadline=30m
Set --serialize-image-pulls=false (requires Docker overlay2)
Maximum pods per node: --max-pods=110 (default 110, adjust as needed)
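When adjusting --max-pods, it helps to see how the per-node cap translates into a minimum node count for a target workload. A minimal sketch, with `min_nodes_for` as a hypothetical helper that ignores resource requests, taints, and system pods:

```python
def min_nodes_for(pod_count: int, max_pods_per_node: int = 110) -> int:
    """Smallest number of nodes that can hold pod_count pods at the given cap."""
    if pod_count < 0 or max_pods_per_node < 1:
        raise ValueError("pod_count must be >= 0 and max_pods_per_node >= 1")
    # Ceiling division using integers only, to avoid float rounding.
    return -(-pod_count // max_pods_per_node)
```

With the default cap of 110, hosting 150,000 pods needs at least 1,364 nodes; CPU and memory requests will usually push the real number higher.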
Cluster DNS High Availability
Configure anti‑affinity so that kube‑dns or CoreDNS pods are spread across different nodes, avoiding a single point of failure:
affinity:
  podAntiAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
    - labelSelector:
        matchExpressions:
        - key: k8s-app
          operator: In
          values:
          - kube-dns
      topologyKey: kubernetes.io/hostname
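Once the rule is in place, the resulting placement can be checked by looking for nodes that host more than one DNS pod. A minimal sketch, assuming the pod-to-node mapping has already been collected (for example from the NODE column of `kubectl get pods -o wide`); `colocated_nodes` is a hypothetical helper:

```python
from collections import Counter
from typing import Dict, List


def colocated_nodes(placements: Dict[str, str]) -> List[str]:
    """Given pod name -> node name, return the nodes hosting more than one pod."""
    per_node = Counter(placements.values())
    return sorted(node for node, count in per_node.items() if count > 1)
```

An empty result means every DNS replica runs on its own node, which is what a hostname-level anti-affinity rule is meant to guarantee.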
