How NRI Enhances Koordinator’s Container QoS Management at KubeCon China 2023

The article explains how Koordinator and containerd integrate the Node Resource Interface (NRI) to improve container QoS, avoid noisy‑neighbor issues, and provide a flexible, standardized resource‑management model for cloud‑native workloads, as presented at KubeCon China 2023.

Alibaba Cloud Native

Overview

This summary explains how the open‑source Koordinator project integrates with the Node Resource Interface (NRI), a project hosted under the containerd organization, to provide fine‑grained QoS control for mixed‑workload clusters without modifying the kubelet. The integration works with any CRI‑compatible runtime that supports NRI, such as containerd or CRI‑O. It lets plug‑in logic adjust container specifications and perform actions outside the scope of OCI at defined hook points in the container lifecycle.
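To make the idea of "adjusting a container specification at a hook point" concrete, here is a minimal sketch in Go. It does not use the NRI API: ContainerSpec, Adjustment, and adjustOnCreate are hypothetical stand-ins, and the policy (giving best-effort containers the lowest CPU weight) is only an illustration, not Koordinator's actual rule. The koordinator.sh/qosClass label and the BE and LS class names are the ones Koordinator uses for its QoS classes.

```go
package main

import "fmt"

// ContainerSpec is a hypothetical, trimmed-down stand-in for the container
// spec a runtime passes to a plugin. It is not the NRI API type.
type ContainerSpec struct {
	Name      string
	PodLabels map[string]string
	CPUShares uint64
}

// Adjustment carries the changes a plugin asks the runtime to apply.
// A nil field means "leave this setting alone".
type Adjustment struct {
	CPUShares *uint64
}

// qosLabel is the pod label Koordinator uses for its QoS class.
const qosLabel = "koordinator.sh/qosClass"

// adjustOnCreate is a hypothetical create-time hook. The policy (pin
// best-effort containers to the lowest CPU weight) is only an illustration.
func adjustOnCreate(spec ContainerSpec) Adjustment {
	if spec.PodLabels[qosLabel] != "BE" {
		return Adjustment{} // nothing to change for other classes
	}
	minShares := uint64(2) // lowest cpu.shares value cgroup v1 accepts
	return Adjustment{CPUShares: &minShares}
}

func main() {
	containers := []ContainerSpec{
		{Name: "batch-job", PodLabels: map[string]string{qosLabel: "BE"}, CPUShares: 1024},
		{Name: "web", PodLabels: map[string]string{qosLabel: "LS"}, CPUShares: 1024},
	}
	for _, c := range containers {
		adj := adjustOnCreate(c)
		if adj.CPUShares == nil {
			fmt.Printf("%s: unchanged\n", c.Name)
			continue
		}
		fmt.Printf("%s: cpu shares %d -> %d\n", c.Name, c.CPUShares, *adj.CPUShares)
	}
}
```

The hook returns a description of the change rather than mutating the spec, which mirrors the general shape of the model described here: the plugin proposes, and the runtime applies the result before the container is created.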

Koordinator background

Koordinator originated from Alibaba’s mixed‑workload (co‑location) research, which began in 2011 and was re‑architected in 2016. It was open‑sourced in April 2022 and now serves as a cloud‑native scheduling system that manages tens of millions of CPU cores across Alibaba’s data centers. Reported average CPU utilization exceeds 50 % at that scale, and the system improves the reliability of both latency‑sensitive services and batch jobs by putting otherwise idle resources to use.

Node Resource Interface (NRI)

NRI is a public, runtime‑agnostic interface that extends CRI‑compatible container runtimes. It defines a protobuf‑based protocol that is carried over ttrpc on a Unix‑domain socket. An NRI plugin runs as a daemon‑like process, subscribes to the pod and container lifecycle events it needs (for example create, start, update, stop, and remove), and receives those events from the runtime. Because the interface is independent of the underlying runtime, the same plugin can be used with both containerd and CRI‑O.
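The delivery of lifecycle events to plugins can be pictured with a small Go sketch. NRI plugins declare which events they want to receive, and the sketch models that with a bitmask. Everything here (Event, Plugin, dispatch) is a hypothetical model for illustration and not the NRI protocol or its Go types.

```go
package main

import "fmt"

// Event is a bit flag for one lifecycle event. The constants are
// hypothetical and only mirror the events named in the text.
type Event uint

const (
	Create Event = 1 << iota
	Start
	Update
	Remove
)

// Plugin is a hypothetical plugin record: a name, the events it subscribed
// to, and a log of the container IDs it has been notified about.
type Plugin struct {
	Name     string
	Mask     Event
	Received []string
}

// dispatch delivers an event to every plugin whose mask includes it and
// returns how many plugins were notified.
func dispatch(plugins []*Plugin, ev Event, containerID string) int {
	notified := 0
	for _, p := range plugins {
		if p.Mask&ev == 0 {
			continue // this plugin did not subscribe to the event
		}
		p.Received = append(p.Received, containerID)
		notified++
	}
	return notified
}

func main() {
	qos := &Plugin{Name: "qos", Mask: Create | Update}
	audit := &Plugin{Name: "audit", Mask: Create | Start | Update | Remove}
	plugins := []*Plugin{qos, audit}

	fmt.Println(dispatch(plugins, Create, "c1")) // both plugins subscribed
	fmt.Println(dispatch(plugins, Remove, "c1")) // only audit subscribed
	fmt.Println(len(qos.Received), len(audit.Received))
}
```

Filtering by subscription keeps a plugin that only cares about creation and updates out of the path for every other event.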

The architecture consists of:

Runtime adapter library – compiled into the CRI runtime to translate CRI calls into NRI events.

NRI daemon – loads one or more plugins, maintains state, and communicates with the runtime via the Unix socket.

Plugins – implement custom business logic (e.g., device allocation, QoS policy enforcement) and may modify the container Spec or perform side‑effects.

NRI architecture diagram

Integrating NRI with containerd

To enable NRI in a containerd installation, follow these steps:

Clone and build the NRI repository:

git clone https://github.com/containerd/nri.git
cd nri
make

Install the NRI plugin binary (e.g., nri-plugin-example) to a directory accessible by the containerd service.

Edit /etc/containerd/config.toml to enable NRI. Example snippet:

[plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc]
  runtime_type = "io.containerd.runc.v2"
  [plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc.options]
    SystemdCgroup = true

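# NRI section as documented for containerd 1.7; check the keys against your version.
[plugins."io.containerd.nri.v1.nri"]
  disable = false                          # NRI support is switched on here
  socket_path = "/var/run/nri/nri.sock"    # socket the runtime creates for plugins
  plugin_path = "/opt/nri/plugins"         # directory of plugins launched at startup
  plugin_config_path = "/etc/nri/conf.d"   # directory of plugin-specific configuration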

Restart containerd so it enables the NRI adaptation and creates the Unix socket defined above.

Start the NRI plugin as a background service (systemd unit or manual nohup) and make sure it connects to that socket. The runtime, not the plugin, creates the socket, and it begins sending events once the plugin has registered.

For CRI‑O, the process is analogous: add the NRI configuration to the [crio.nri] table in /etc/crio/crio.conf and point it to the plugin location and socket.

Benefits of the NRI plug‑in model

Non‑intrusive: No changes to the kubelet code base are required.

Runtime‑agnostic: The same plugin can be reused across different CRI runtimes.

Stateful processing: Plugins can maintain internal state across container lifecycle events, so a decision about one container can take into account what is already running on the node.

Low overhead: Communication via ttrpc and protobuf keeps the per‑event latency low.
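The stateful-processing point can be sketched as a small tracker that a plugin keeps across events. The tracker type and its methods are hypothetical and do not come from NRI or Koordinator. The sketch only shows how remembering earlier create and remove events lets a later decision depend on the current mix of containers.

```go
package main

import (
	"fmt"
	"sync"
)

// tracker is hypothetical plugin state: it remembers the QoS class of every
// live container so later events can be judged against the current mix.
type tracker struct {
	mu      sync.Mutex
	classOf map[string]string // container ID -> QoS class
}

func newTracker() *tracker {
	return &tracker{classOf: make(map[string]string)}
}

// onCreate records a container when the runtime reports its creation.
func (t *tracker) onCreate(id, class string) {
	t.mu.Lock()
	defer t.mu.Unlock()
	t.classOf[id] = class
}

// onRemove forgets a container; removing an unknown ID is a no-op.
func (t *tracker) onRemove(id string) {
	t.mu.Lock()
	defer t.mu.Unlock()
	delete(t.classOf, id)
}

// count returns how many live containers belong to the given class.
func (t *tracker) count(class string) int {
	t.mu.Lock()
	defer t.mu.Unlock()
	n := 0
	for _, c := range t.classOf {
		if c == class {
			n++
		}
	}
	return n
}

func main() {
	t := newTracker()
	t.onCreate("c1", "LS")
	t.onCreate("c2", "BE")
	t.onCreate("c3", "BE")
	t.onRemove("c2")
	fmt.Println("LS:", t.count("LS"), "BE:", t.count("BE"))
}
```

The mutex is there because events for different containers may arrive concurrently. A real plugin would also need to rebuild this state when it restarts, which the sketch leaves out.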

References

Official NRI source code and documentation: https://github.com/containerd/nri
