Why Does Containerd’s PLEG Relisting Stall at Node Startup and How to Fix It
When replacing dockershim with containerd, we observed pods taking over a minute to start: the GenericPLEG Relisting operation stalled for more than 30 seconds during node boot. The root cause was containerd's UpdateContainerResources holding a bbolt lock while intensive image pulls generated heavy I/O. This article explains the root cause and presents a fix using the overlay volatile mount option.
Technical Background
In recent internal tests of replacing dockershim with containerd, we noticed that business containers take a long time to become runnable after the pod starts. The init container finishes within a second, but the main containers sometimes need more than a minute before they start executing.
Examining kubelet logs revealed that, when a node first boots, the PLEG (Pod Lifecycle Event Generator) Relisting method, normally executed once per second, takes over 30 seconds to complete. After a few minutes the issue disappears and Relisting runs at the expected one-second interval.
dockershim and CRI
Kubernetes 1.24 removed the dockershim component from kubelet; users now choose a CRI implementation such as containerd or CRI-O, and kubelet talks to it directly over the CRI gRPC API. Containerd's architecture evolved accordingly.
PLEG
PLEG (Pod Lifecycle Event Generator) runs on each node to keep the actual state of pods and containers in sync with the desired
spec. It reduces unnecessary work during idle periods and lowers the number of concurrent requests to the container runtime.
Figure: PLEG reconciles the pod spec state (desired) with the container runtime state (actual).
ImagePull Process
The steps performed by ctr image pull are:
Resolve the image to be downloaded.
Pull the image from the registry, storing layers and config in the content service and metadata in the images service.
Unpack the layers into the snapshot service.
Note: the content and images services are gRPC services provided by containerd; during layer unpacking containerd temporarily mounts and unmounts all parent snapshots.
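The pull phases above can be observed from the command line. A minimal sketch, assuming containerd and its ctr client are installed on the node; the image name is illustrative, and the demonstration is skipped when ctr is unavailable:

```shell
IMAGE="docker.io/library/alpine:latest"   # illustrative image name
if command -v ctr >/dev/null 2>&1; then
    # Resolve the reference, fetch layers/config into the content store,
    # record metadata in the image store, and unpack into the snapshotter.
    ctr image pull "$IMAGE"
    ctr content ls | head -n 3    # layer blobs and config live in the content service
    ctr snapshots ls | head -n 3  # unpacked layers live in the snapshot service
    status=ok
else
    echo "ctr not found; skipping demonstration"
    status=skipped
fi
```

Listing the content and snapshot stores after the pull makes the split between the two services visible: blobs are content-addressed, while snapshots are the mountable layer chain.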
Problem Diagnosis
Based on the background, the GenericPLEG Relisting call queries containerd's CRI to obtain the list of running containers. Containerd logs show errors such as:
<code>containerd[2206]: {"error":"failed to stop container: failed to delete task: context deadline exceeded: unknown","level":"error","msg":"failed to handle container TaskExit event &TaskExit{ContainerID:...}"}</code>

Goroutine dumps reveal a goroutine waiting on a Delete call, and another stuck in an umount system call.
<code>goroutine 1654 [select]:
github.com/containerd/ttrpc.(*Client).dispatch(...)
... (stack trace omitted for brevity)</code>

Further investigation of containerd.log shows that UpdateContainerResources requests are blocked waiting for a bbolt lock:
<code>goroutine 1723 [semacquire]:
sync.runtime_SemacquireMutex(...)
... (stack trace omitted)</code>

The relevant source code resides in containerd/pkg/cri/server/container_update_resources.go; UpdateContainerResources holds the container status lock while updating resources. The ListContainers operation also needs this lock, so while an update is stalled on bbolt, PLEG's relisting stalls with it.
Because the lock is held while the bbolt database syncs data to storage, I/O pressure on the host can exacerbate the delay. Monitoring tools such as PSI or iostat can surface the pressure.
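A quick way to check for the I/O pressure described above; a minimal sketch assuming a Linux node (PSI requires kernel 4.20+ with CONFIG_PSI, iostat requires the sysstat package), with graceful fallbacks when either is missing:

```shell
# PSI reports the share of time tasks were stalled on I/O
# ("some" = at least one task, "full" = all non-idle tasks).
if [ -r /proc/pressure/io ]; then
    cat /proc/pressure/io
    psi_checked=yes
else
    echo "PSI not available on this kernel"
    psi_checked=no
fi
# iostat -x shows per-device utilization (%util) and await latency.
command -v iostat >/dev/null 2>&1 && iostat -x 1 1 || true
```

Sustained non-zero avg10 values in the "full" line, or devices pinned near 100% utilization during node boot, would corroborate that bbolt's fsync is competing with image-pull I/O.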
Problem Fix
The community provided a fix in PR #8676: add the volatile mount option to the overlay filesystem. This option skips the sync call during umount, preventing the long pause.
Note: the volatile mount option allows overlayfs to avoid forced disk sync on unmount, reducing latency.
Applying the overlay volatile option mitigates the startup delay even when many images are pulled.
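What the option changes can be seen with a plain overlay mount. A minimal sketch, assuming a kernel with overlayfs volatile support (5.10+); the paths are illustrative, and the mount is only attempted when running as root:

```shell
base=$(mktemp -d)
mkdir -p "$base/lower" "$base/upper" "$base/work" "$base/merged"
if [ "$(id -u)" = "0" ]; then
    # volatile tells overlayfs to skip sync/fsync for this mount;
    # a crash can corrupt the upperdir, which is acceptable for
    # throwaway snapshot mounts like those used during unpacking.
    mount -t overlay overlay \
        -o "lowerdir=$base/lower,upperdir=$base/upper,workdir=$base/work,volatile" \
        "$base/merged" && echo "mounted with volatile"
    umount "$base/merged"   # with volatile, no syncfs is forced here
else
    echo "not root; mount not attempted, command shown for illustration"
fi
rm -rf "$base"
```

The trade-off is explicit in the kernel documentation: volatile sacrifices crash consistency of the upper layer for speed, which is safe here because the snapshots being unmounted during unpacking are transient.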
Disclaimer: the author’s time and perspective are limited; readers are encouraged to provide feedback and corrections.
Efficient Ops
This public account is maintained by Xiaotianguo and friends, regularly publishing widely-read original technical articles. We focus on operations transformation and accompany you throughout your operations career, growing together happily.