
How to Detect and Resolve Kernel Memory & CPU Latency in Kubernetes Clusters

In cloud‑native Kubernetes environments, resource over‑commit and mixed deployment can cause kernel‑level memory‑reclaim and CPU‑scheduling delays that surface as application jitter. This article explains how to visualize, diagnose, and remediate those delays using the SysOM exporter and related metrics.

Alibaba Cloud Infrastructure

Background

In cloud‑native scenarios, many clusters adopt resource over‑commit and mixed deployment to maximize utilization. While this improves efficiency, it also increases contention between the host and containerized applications, leading to kernel‑level delays such as CPU scheduling latency and memory reclaim latency that propagate to the application layer, causing response‑time jitter or even service disruption.

Memory Reclaim Latency

When a process requests memory and free memory in the system or container falls below the low watermark, the kernel wakes kswapd for asynchronous reclamation. If free memory drops below the min watermark, allocations enter the direct reclaim and direct compaction paths, which can block the requesting process for a noticeable period.

Direct reclaim: the process blocks while the kernel synchronously reclaims memory under severe memory pressure.

Direct compaction: the process blocks while the kernel compacts fragmented memory into contiguous regions.

Both actions increase CPU usage and can cause long‑lasting latency spikes, leading to jitter in latency‑sensitive workloads.
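Both paths leave traces in the kernel's standard /proc/vmstat counters, so you can confirm them without any agent. A minimal sketch (the counter names are standard Linux vmstat fields; any alerting threshold you attach to them is up to you):

```shell
# report_reclaim FILE: summarize direct-reclaim and compaction activity
# from a vmstat-style file (on a live node, pass /proc/vmstat).
#   pgscan_direct - pages scanned synchronously by allocating processes
#   compact_stall - times a process blocked waiting for compaction
report_reclaim() {
  awk '
    $1 == "pgscan_direct" { scan = $2 }
    $1 == "compact_stall" { stall = $2 }
    END { printf "pgscan_direct=%d compact_stall=%d\n", scan + 0, stall + 0 }
  ' "$1"
}

# On a live node:
#   report_reclaim /proc/vmstat
```

A non‑zero, growing pgscan_direct means some allocation paths have already paid the synchronous reclaim penalty.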

Typical Delay Scenarios

CASE 1: Container memory limit reached → direct reclaim/compaction blocks the container process.

CASE 2: Host memory shortage → node memory below min watermark triggers direct reclaim for containers.

CASE 3: Long run‑queue wait time → processes stay in the ready queue too long before being scheduled.

CASE 4: Prolonged interrupt handling → heavy interrupt storms keep the CPU occupied, preventing timely scheduling.

CASE 5: Kernel path holding spin locks → long‑running kernel paths block soft‑IRQ processing, causing network jitter.
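On kernels with PSI support (pressure stall information, Linux 4.20+), /proc/pressure/cpu and /proc/pressure/memory give a quick first‑pass signal for several of the cases above before you reach for per‑pod dashboards. A small sketch that extracts the 10‑second "some" stall average (the percentage of the last 10 s in which at least one task stalled on the resource):

```shell
# psi_avg10 FILE: print the avg10 value from the "some" line of a PSI file
# such as /proc/pressure/cpu or /proc/pressure/memory.
psi_avg10() {
  awk '$1 == "some" {
    for (i = 2; i <= NF; i++)
      if ($i ~ /^avg10=/) { sub(/^avg10=/, "", $i); print $i }
  }' "$1"
}

# On a live node:
#   psi_avg10 /proc/pressure/cpu
#   psi_avg10 /proc/pressure/memory
```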

Identifying System Delays with SysOM

The ACK team collaborated with the OS team to launch SysOM (System Observer Monitoring), a kernel‑level container monitoring feature available on Alibaba Cloud. The SysOM dashboards provide visibility into both node‑level and pod‑level metrics.

In the Pod Memory Monitor view, watch Memory Global Direct Reclaim Latency, Memory Direct Reclaim Latency, and Memory Compact Latency to see how long pods are blocked by direct reclaim or compaction.

In the System Memory node view, the Memory Others chart shows the page‑scan count (pgscan_direct) during direct reclaim; a non‑zero value indicates direct‑reclaim activity.

Linux memory watermarks

Metric Details

Memory Direct Reclaim Latency reports the incremental count of reclaim events grouped by latency ranges (e.g., memDrcm_lat_1to10ms, memDrcm_glb_lat_10to100ms) triggered when container memory usage hits its limit or node free memory drops below the min watermark.

Memory Compact Latency reflects the incremental count of compaction events caused by excessive node memory fragmentation.

Resolving Memory‑Related Delays

Use the Node/Pod Memory Panorama feature to break down memory consumption (Pod Cache, InactiveFile, InactiveAnon, Dirty Memory) and locate memory “black holes”.

Enable Koordinator QoS fine‑grained scheduling to adjust memory watermarks and trigger earlier asynchronous reclamation, reducing the impact of direct reclaim.
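A sketch of the kind of tuning this implies, assuming a cgroup‑v2 node. The function only prints the commands rather than executing them; the values and the pod cgroup path are placeholders, and Koordinator applies equivalent per‑pod settings automatically:

```shell
# print_tuning SCALE: emit (not execute) commands that make reclaim start
# earlier. vm.watermark_scale_factor is in units of 0.1% of node memory
# (kernel default 10); setting memory.high below memory.max makes the
# kernel throttle and reclaim a pod asynchronously before it hits its limit.
print_tuning() {
  scale="$1"
  echo "sysctl -w vm.watermark_scale_factor=$scale"
  echo "echo 6G > /sys/fs/cgroup/kubepods.slice/<pod-cgroup>/memory.high"
}
```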

Memory panorama

CPU Scheduling Delay

CPU delay is the interval from a task becoming runnable to being selected by the scheduler. Prolonged CPU delay can cause network‑level latency (e.g., delayed packet processing).
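Per‑task run‑queue wait is also exposed directly in /proc/&lt;pid&gt;/schedstat, whose second field is the cumulative time the task has waited on a run queue, in nanoseconds. A minimal sketch converting it to milliseconds:

```shell
# run_delay_ms FILE: convert the 2nd field of a /proc/<pid>/schedstat-style
# file (cumulative run-queue wait, nanoseconds) to whole milliseconds.
run_delay_ms() {
  awk '{ printf "%d\n", $2 / 1000000 }' "$1"
}

# On a live node, for a suspect process:
#   run_delay_ms /proc/<pid>/schedstat
```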

Monitor the System CPU and Schedule dashboard for metrics such as WaitOnRunq Delay (average time processes spend in the run‑queue) and Sched Delay Count (distribution of intervals with no scheduling activity). Spikes above 50 ms indicate serious scheduling jitter.

CPU scheduling dashboard

Case Study: CPU Delay Causing Network Jitter

A financial‑industry customer observed frequent Redis connection failures on two ACK nodes. Investigation revealed kernel packet‑receive latency > 500 ms, leading to Redis client disconnects.

Examining the Sched Delay Count chart showed many spikes above 1 ms, indicating prolonged intervals in which ksoftirqd could not be scheduled.

OS console diagnostics displayed both scheduling jitter and cgroup leak anomalies.

Further analysis linked the issue to a memory cgroup leak caused by a cronjob that read logs, leaving page cache in a zombie cgroup.

Resolution steps:

Temporary: drop caches to free page cache and allow the zombie cgroup to be cleaned.

Permanent: enable Alinux’s zombie‑cgroup reclamation feature (see reference [5]).
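The temporary mitigation can be scripted. A sketch assuming cgroup v1 with the memory controller mounted at /sys/fs/cgroup/memory; the drop_caches write needs root and evicts hot page cache, so treat it strictly as a stopgap:

```shell
# count_cgroup_dirs ROOT: number of cgroup directories under ROOT; compare
# with the number of running pods to spot leaked (zombie) memory cgroups.
count_cgroup_dirs() {
  find "$1" -type d | wc -l | tr -d ' '
}

# Stopgap (root): flush dirty pages, then drop page cache / dentries / inodes
# so the kernel can free zombie memory cgroups that only pin cache:
#   sync && echo 3 > /proc/sys/vm/drop_caches
#
# Usage on a node:
#   count_cgroup_dirs /sys/fs/cgroup/memory/kubepods.slice
```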

Sched Delay Count chart

Further Diagnosis Tools

For deeper root‑cause analysis of scheduling jitter, use the Scheduling Jitter Diagnosis feature in the Alibaba Cloud OS console.

References

[1] SysOM kernel‑level container monitoring: https://help.aliyun.com/zh/ack/ack-managed-and-ack-dedicated/user-guide/sysom-kernel-level-container-monitoring

[2] Memory Panorama analysis: https://help.aliyun.com/zh/alinux/user-guide/memory-panorama-analysis-function-instructions

[3] Container memory QoS: https://help.aliyun.com/zh/ack/ack-managed-and-ack-dedicated/user-guide/memory-qos-for-containers

[4] Scheduling jitter diagnosis: https://help.aliyun.com/zh/alinux/user-guide/scheduling-jitter-diagnosis

[5] Alinux resource isolation guide: https://openanolis.cn/sig/Cloud-Kernel/doc/659601505054416682

Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contact admin@besthub.dev and we will review it promptly.

Tags: Kubernetes, CPU scheduling, SysOM, Memory reclaim, Kernel latency