How NRI Enhances Koordinator’s Container QoS Management at KubeCon China 2023
The article explains how Koordinator and containerd integrate the Node Resource Interface (NRI) to improve container QoS, avoid noisy‑neighbor issues, and provide a flexible, standardized resource‑management model for cloud‑native workloads, as presented at KubeCon China 2023.
Overview
This summary explains how the open‑source Koordinator project integrates with the Node Resource Interface (NRI), a containerd subproject, to provide fine‑grained QoS control for mixed‑workload clusters without modifying the kubelet. The integration works with CRI runtimes that support NRI, such as containerd and CRI‑O, and lets plug‑in logic adjust container specifications and perform actions outside the scope of OCI at defined hook points.
Koordinator background
Koordinator grew out of Alibaba’s mixed‑workload colocation research, which began in 2011 and was re‑architected in 2016. It was open‑sourced in April 2022 and now underpins a cloud‑native scheduler that manages tens of millions of CPU cores across Alibaba’s data centers. Reported average CPU utilization exceeds 50% at that scale, and the system aims to keep latency‑sensitive services and batch jobs reliable while putting otherwise idle resources to use.
Node Resource Interface (NRI)
NRI is a public, runtime‑agnostic interface that extends CRI‑compatible container runtimes. It defines a protobuf‑based protocol, carried over ttrpc on a Unix‑domain socket. An NRI plugin runs as a daemon‑like process and receives container lifecycle events (create, start, update, delete) from the runtime. Because the interface is independent of the underlying runtime, the same plugin can be used with both containerd and CRI‑O.
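The event flow described above can be pictured with a small, self-contained sketch. It is not the NRI wire format: real NRI uses protobuf messages over ttrpc, while this toy uses length-prefixed JSON purely to show a runtime and a plugin exchanging lifecycle events over a Unix-domain socket. The helper names and the container name `demo-ctr` are hypothetical.

```python
import json
import socket
import struct

# The four lifecycle events named in the text.
EVENTS = ("create", "start", "update", "delete")


def send_frame(sock: socket.socket, message: dict) -> None:
    """Send one message as a 4-byte big-endian length followed by a JSON body."""
    body = json.dumps(message).encode("utf-8")
    sock.sendall(struct.pack(">I", len(body)) + body)


def recv_exact(sock: socket.socket, size: int) -> bytes:
    """Read exactly `size` bytes, failing if the peer closes early."""
    chunks = []
    while size > 0:
        chunk = sock.recv(size)
        if not chunk:
            raise ConnectionError("peer closed the socket mid-frame")
        chunks.append(chunk)
        size -= len(chunk)
    return b"".join(chunks)


def recv_frame(sock: socket.socket) -> dict:
    """Receive one length-prefixed JSON message."""
    (length,) = struct.unpack(">I", recv_exact(sock, 4))
    return json.loads(recv_exact(sock, length).decode("utf-8"))


def demo() -> list:
    """Play runtime and plugin on the two ends of a Unix-domain socket pair."""
    runtime_end, plugin_end = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
    seen = []
    try:
        for event in EVENTS:
            # Runtime side: announce a lifecycle event for a hypothetical container.
            send_frame(runtime_end, {"event": event, "container": "demo-ctr"})
            # Plugin side: read the event and acknowledge it.
            request = recv_frame(plugin_end)
            seen.append(request["event"])
            send_frame(plugin_end, {"ack": request["event"]})
            # Runtime side: wait for the acknowledgement before moving on.
            reply = recv_frame(runtime_end)
            if reply != {"ack": event}:
                raise RuntimeError(f"unexpected reply: {reply}")
    finally:
        runtime_end.close()
        plugin_end.close()
    return seen
```

Calling `demo()` returns the events in the order the plugin side received them. A real plugin would rely on the NRI libraries rather than hand-rolled framing.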
The architecture consists of:
Runtime adapter library – compiled into the CRI runtime to translate CRI calls into NRI events.
NRI daemon – loads one or more plugins, maintains state, and communicates with the runtime via the Unix socket.
Plugins – implement custom business logic (e.g., device allocation, QoS policy enforcement) and may modify the container Spec or perform side‑effects.
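As a rough illustration of the plugin role listed above, the following sketch shows QoS logic that rewrites part of a container spec at creation time. It assumes pods carry Koordinator's `koordinator.sh/qosClass` label and that the spec follows the OCI layout `linux.resources.cpu.shares`; the share values and the function name are invented for the example and are not Koordinator's actual policy.

```python
from copy import deepcopy

# Assumption: the pod's QoS class is read from this Koordinator label.
QOS_LABEL = "koordinator.sh/qosClass"
# Hypothetical mapping from QoS class to CPU shares, for illustration only.
CPU_SHARES_BY_QOS = {"LS": 1024, "BE": 2}


def adjust_container_spec(pod_labels: dict, spec: dict) -> dict:
    """Return a copy of the spec with CPU shares set according to the pod's QoS class."""
    adjusted = deepcopy(spec)
    shares = CPU_SHARES_BY_QOS.get(pod_labels.get(QOS_LABEL))
    if shares is not None:
        # Create the nested sections if the incoming spec does not have them yet.
        cpu = adjusted.setdefault("linux", {}).setdefault("resources", {}).setdefault("cpu", {})
        cpu["shares"] = shares
    return adjusted
```

Returning a modified copy rather than mutating the input mirrors the idea that a plugin proposes adjustments which the runtime then applies; pods without a recognized QoS label pass through unchanged.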
Integrating NRI with containerd
To enable NRI in a containerd installation, follow these steps:
Clone and build the NRI repository:
git clone https://github.com/containerd/nri.git
cd nri
make
Install the NRI plugin binary (e.g., nri-plugin-example) to a directory accessible by the containerd service.
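Edit /etc/containerd/config.toml to enable NRI. Example snippet (the NRI keys follow the containerd 1.7 documentation; the paths are examples):
[plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc]
  runtime_type = "io.containerd.runc.v2"
  [plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc.options]
    SystemdCgroup = true
[plugins."io.containerd.nri.v1.nri"]
  # NRI is disabled by default in containerd 1.7
  disable = false
  # allow externally launched plugins to connect
  disable_connections = false
  # socket that containerd creates for plugins to connect to
  socket_path = "/var/run/nri/nri.sock"
  # directory of plugins that containerd launches itself at startup
  plugin_path = "/opt/nri/plugins"
Restart containerd so it loads the NRI adapter and creates the Unix socket defined above.
Start the NRI plugin as a background service (systemd unit or manual nohup), ensuring it can connect to that socket; containerd then begins sending events to the plugin.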
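After working through the steps above, it can help to confirm that the NRI socket is actually present. The helper below is a minimal sketch of such a check; the function name is hypothetical, and the path to pass in is whatever socket path your configuration declares.

```python
import os
import stat


def is_unix_socket(path: str) -> bool:
    """Return True if `path` exists and is a Unix-domain socket."""
    try:
        mode = os.stat(path).st_mode
    except OSError:
        # Missing path, permission problem, or similar: treat as "not ready".
        return False
    return stat.S_ISSOCK(mode)
```

For example, `is_unix_socket("/run/nri.sock")` returns `False` until a process has created a socket at that path. The check only proves the socket file exists, not that a plugin has registered successfully.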
For CRI‑O, the process is analogous: enable NRI in the [crio.nri] section of /etc/crio/crio.conf and point it at the plugin location and socket.
Benefits of the NRI plug‑in model
Non‑intrusive: No changes to the kubelet code base are required.
Runtime‑agnostic: The same plugin can be reused across different CRI runtimes.
Stateful processing: Plugins can maintain internal state across container lifecycle events, enabling sophisticated scheduling decisions.
Low overhead: Communication via ttrpc and protobuf minimizes latency.
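The stateful-processing point in the list above can be made concrete with a toy plugin that remembers containers between events. This is only a sketch under simplifying assumptions: the class, its method names, and the state strings are hypothetical, and a real plugin would receive richer event payloads from the runtime.

```python
from dataclasses import dataclass, field


@dataclass
class StatefulPlugin:
    """Toy plugin that remembers containers between lifecycle events."""

    # Maps container id to its last known state.
    containers: dict = field(default_factory=dict)

    def on_event(self, event: str, container_id: str) -> None:
        if event == "create":
            self.containers[container_id] = "created"
        elif event == "start":
            self.containers[container_id] = "running"
        elif event == "update":
            # Keep the current state; a real plugin might re-evaluate its policy here.
            self.containers.setdefault(container_id, "unknown")
        elif event == "delete":
            self.containers.pop(container_id, None)
        else:
            raise ValueError(f"unexpected event: {event}")

    def running(self) -> list:
        """Return the ids of containers currently tracked as running, sorted."""
        return sorted(cid for cid, state in self.containers.items() if state == "running")
```

Because the plugin process outlives individual containers, a view like `running()` is available when the next create event arrives, which is what allows decisions that depend on what else is already on the node.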
References
Official NRI source code and documentation: https://github.com/containerd/nri
Alibaba Cloud Native
