How to Fix etcd “NOSPACE” Errors in Kubernetes Clusters
When a Kubernetes cluster’s etcd reaches its default 2 GB quota, it raises a “NOSPACE” alarm that blocks all write operations and causes critical services to fail. This guide explains the root cause, shows how to diagnose the issue with etcdctl, and walks through remediation step by step: compaction, defragmentation, and quota expansion.
In production Kubernetes clusters, the stability of the core etcd component is critical. When the etcd backend storage reaches its size limit, the error
rpc error: code = ResourceExhausted desc = etcdserver: mvcc: database space exceeded
is emitted. This raises the alarm:NOSPACE alarm, and all write operations (including those from the Kubernetes API server) are rejected.
Root Cause
etcd stores key‑value data in bbolt, a maintained fork of BoltDB. By default the backend size quota is 2 GB (2147483648 bytes); it can be changed with the --quota-backend-bytes flag. When the quota is exhausted, etcd raises the mvcc: database space exceeded error and sets the alarm:NOSPACE alarm.
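The byte values used by --quota-backend-bytes are plain powers of two. As a quick sanity check, a tiny helper (a sketch; the function name is ours, not part of etcd) converts GiB to the byte value the flag expects:

```shell
# Hypothetical helper (not part of etcd): convert a quota in GiB to the
# byte value expected by --quota-backend-bytes.
gib_to_bytes() {
  echo $(( $1 * 1024 * 1024 * 1024 ))
}

gib_to_bytes 2   # default quota: 2147483648
gib_to_bytes 8   # proposed larger quota: 8589934592
```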
Investigation
Check the current status of the etcd cluster:
ETCDCTL_API=3 etcdctl --endpoints=192.168.x.x:2381 endpoint status --write-out=table
The output includes fields such as DB SIZE and ERRORS. An example result when the quota is exceeded:
DB SIZE : 2.0 GB
ERRORS : alarm:NOSPACE
You can also query the exact size in bytes:
ETCDCTL_API=3 etcdctl --endpoints=127.0.0.1:2379 endpoint status --write-out=json | grep -o '"dbSize":[0-9]*'
Solution
1️⃣ Compact and Defragment
Compaction removes old revisions and frees space. First obtain the latest revision number:
ETCDCTL_API=3 etcdctl --endpoints=127.0.0.1:2379 endpoint status --write-out=json | jq '.[0].Status.header.revision'
Assuming the latest revision is 100000, run:
ETCDCTL_API=3 etcdctl --endpoints=127.0.0.1:2379 compact 100000
After compaction, reclaim the fragmented space. Defragment each member one at a time, since defrag blocks the member while it runs:
ETCDCTL_API=3 etcdctl --endpoints=127.0.0.1:2379 defrag
Finally, clear the NOSPACE alarm so that writes are accepted again:
ETCDCTL_API=3 etcdctl --endpoints=127.0.0.1:2379 alarm disarm
2️⃣ Temporarily Increase the Backend Quota
If space is still insufficient, increase the quota (a temporary measure). Edit the etcd systemd unit file or the static pod manifest and add a larger value, for example 8 GB:
--quota-backend-bytes=8589934592
This raises the backend limit to 8 GB (8 × 1024³ bytes). Remember to revert to a sensible quota after the issue is resolved.
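The remediation commands above can be wrapped into a single maintenance routine. This is a sketch, not an official etcd tool: the function name etcd_reclaim_space is ours, ENDPOINTS is assumed to point at your cluster, and in a multi-member cluster you would run the defrag step against each member individually.

```shell
# Sketch of a space-reclamation routine (assumes etcdctl v3 on PATH).
# ENDPOINTS is an assumption; set it to your etcd client endpoints.
ENDPOINTS="${ENDPOINTS:-127.0.0.1:2379}"

etcd_reclaim_space() {
  # 1. Read the latest revision from the endpoint status (JSON output).
  rev=$(ETCDCTL_API=3 etcdctl --endpoints="$ENDPOINTS" endpoint status \
          --write-out=json | sed -n 's/.*"revision":\([0-9]*\).*/\1/p' | head -n1)
  # 2. Compact all history up to that revision.
  ETCDCTL_API=3 etcdctl --endpoints="$ENDPOINTS" compact "$rev"
  # 3. Defragment to return the freed pages to the filesystem.
  ETCDCTL_API=3 etcdctl --endpoints="$ENDPOINTS" defrag
  # 4. Clear the NOSPACE alarm so writes are accepted again.
  ETCDCTL_API=3 etcdctl --endpoints="$ENDPOINTS" alarm disarm
}
```

Calling etcd_reclaim_space performs the compact, defrag, and disarm steps in order against the configured endpoints.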
Preventive Recommendations
Schedule regular compact and defrag operations (e.g., via a cron job).
Enable automatic compaction in the etcd configuration:
--auto-compaction-retention=1h --auto-compaction-mode=periodic
With these settings etcd automatically compacts historical revisions every hour, preventing the database from running out of space again.
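For kubeadm-based clusters these flags live in the etcd static pod manifest. A sketch of the relevant portion (the path /etc/kubernetes/manifests/etcd.yaml is the kubeadm default; adjust for your setup, and note that only the flags discussed above are shown):

```yaml
# Excerpt of a kubeadm static-pod manifest for etcd (sketch, not a
# complete manifest). Values mirror the settings discussed above.
spec:
  containers:
  - name: etcd
    command:
    - etcd
    - --auto-compaction-mode=periodic
    - --auto-compaction-retention=1h
    - --quota-backend-bytes=8589934592
```

The kubelet picks up changes to static pod manifests automatically and restarts the etcd pod with the new flags.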