How to Diagnose and Fix Ceph‑Related cgroup Leaks on node2 in a Kubernetes Cluster
This article walks through a real‑world Kubernetes incident where a node ran out of space due to Ceph storage inconsistencies and cgroup leaks, detailing step‑by‑step diagnostics, Ceph repair commands, pod eviction, node reboot, and post‑mortem recommendations for cluster operations.
Background
We received an alert from the test-environment cluster and logged into the Kubernetes cluster to investigate.
Fault Diagnosis
2.1 Check Pods
Observed an abnormal Calico pod in the kube‑system namespace on node2.
Closer inspection revealed that node2 had run out of space ("no space left on device") and was suffering a cgroup leak.
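The abnormal pod can be located with standard kubectl commands; a minimal sketch, where the node name and the Calico pod name are assumptions for illustration:

```shell
# List kube-system pods scheduled on node2 (node name assumed; adjust to your cluster)
kubectl get pods -n kube-system -o wide --field-selector spec.nodeName=node02

# Describe the abnormal Calico pod to see its events
# (the exact pod name below is hypothetical)
kubectl describe pod -n kube-system calico-node-xxxxx
```

The Events section of `kubectl describe` is typically where "no space left on device" and cgroup-related errors surface.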
2.2 Check Storage
Logged into node2 to check the server's storage; disk space appeared sufficient.
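A quick sketch of the on-node check: both filesystem space and inodes are worth ruling out, since inode exhaustion also reports as "no space left on device".

```shell
# On node2: check filesystem space usage
df -h

# Check inode usage as well; inode exhaustion produces the same error message
df -i
```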
The cluster uses Ceph distributed storage, so the Ceph cluster status was examined.
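The Ceph cluster state can be inspected with the standard status commands:

```shell
# Overall cluster state: health flag, mon/OSD quorum, PG summary
ceph -s

# Detailed health output; inconsistent PGs and scrub errors are listed here
ceph health detail
```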
Operations
3.1 Ceph Repair
Detected Ceph cluster anomalies that could cause node2 cgroup leaks and performed a manual Ceph repair.
Data inconsistency (incorrect object size or missing objects after recovery) can lead to scrub errors.
During scrubbing, Ceph may find object size metadata that does not match across replicas, causing the scrub to fail.
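On this cluster the problematic PG could be read out of the health output; a sketch (the PG id 1.7c is the one from this incident):

```shell
# Inconsistent PGs appear in the health detail, e.g.
# "pg 1.7c is active+clean+inconsistent, acting [...]"
ceph health detail | grep inconsistent

# Optionally list which objects inside the PG are inconsistent
rados list-inconsistent-obj 1.7c --format=json-pretty
```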
Identified the problematic PG 1.7c and repaired it:
<code>ceph pg repair 1.7c</code>
After the repair, the Ceph cluster recovered.
3.2 Pod Repair
Deleted the abnormal pod; the controller automatically recreated the latest pod.
The recreated pod showed the same anomaly, most likely because the Ceph issues had triggered the cgroup leak on node2. Further research indicated the kernel was too old (Linux 3.10.0‑862.el7.x86_64) and that disabling kmem accounting could help.
3.3 Further Fault Diagnosis
During container startup, runc enables kmem accounting by default, which can cause leaks on kernel 3.10.
Rebooting a server that reports "no space left on device" clears the leaked cgroups and resolves the issue; the leak appears to be triggered by large numbers of pod creations and deletions.
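The leak can be confirmed on the node itself. On the affected 3.10 kernels, memory cgroup ids are limited to 65535, and a leaking node's live cgroup count creeps toward that limit; a sketch, where the cgroup path in the last command is an assumption (it varies by container runtime and cgroup driver):

```shell
# Kernel version; 3.10.x is affected by the kmem-accounting leak
uname -r

# num_cgroups for the memory controller; on a leaking 3.10 node
# this climbs toward the 65535 memory-cgroup id limit
grep memory /proc/cgroups

# Check whether kmem accounting is active for the pods cgroup
# (path is an assumption; adjust for your runtime/driver)
cat /sys/fs/cgroup/memory/kubepods/memory.kmem.usage_in_bytes
```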
3.4 Node2 Maintenance
3.4.1 Mark node2 as unschedulable
<code>kubectl cordon node02</code>
3.4.2 Drain pods from node2
<code>kubectl drain node02 --delete-local-data --ignore-daemonsets --force</code>
Options explained:
--delete-local-data: delete the pods' local data, including emptyDir volumes.
--ignore-daemonsets: skip DaemonSet-managed pods, since the DaemonSet controller would immediately recreate them on the node anyway.
--force: also delete bare pods that are not managed by a ReplicationController, ReplicaSet, DaemonSet, StatefulSet, or Job.
All pods on node2 were successfully evicted.
During the migration, pods are terminated and then recreated on other nodes, so the service interruption equals the rebuild time plus startup time plus readiness-probe time; a workload is considered healthy again only once it reaches 1/1 Running.
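The eviction and rescheduling can be verified from the control plane; a sketch, assuming node02 as the node name:

```shell
# Confirm that only DaemonSet pods remain on node02 after the drain
kubectl get pods --all-namespaces -o wide --field-selector spec.nodeName=node02

# Watch the rescheduled pods until they reach 1/1 Running
kubectl get pods --all-namespaces -o wide -w
```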
3.4.3 Reboot node02
After reboot, node02 was restored and ready for scheduling.
<code>kubectl uncordon node02</code>
Reflection
Future work includes upgrading the kernel of the Kubernetes cluster.
Pod anomalies may stem from underlying storage issues; precise diagnosis and targeted fixes are essential.
Efficient Ops
This public account is maintained by Xiaotianguo and friends, regularly publishing widely-read original technical articles. We focus on operations transformation and accompany you throughout your operations career, growing together happily.