Advanced DaemonSet in OpenKruise: Granular Rolling Updates for Large Clusters
OpenKruise’s Advanced DaemonSet extends the native Kubernetes DaemonSet with finer rollout controls: node‑level canary selectors, partition‑based quantity pacing, surging updates, and pause. This article walks through the API definitions and a YAML example for each control, with a focus on safe, fine‑grained rollouts across large, heterogeneous clusters.
Overview
OpenKruise extends the native Kubernetes DaemonSet with an Advanced DaemonSet controller that provides production‑grade rollout capabilities for large, heterogeneous clusters.
Key API Additions
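// Excerpt from the API types. intstr and metav1 refer to
// k8s.io/apimachinery/pkg/util/intstr and k8s.io/apimachinery/pkg/apis/meta/v1.
type RollingUpdateType string

const (
    StandardRollingUpdateType RollingUpdateType = "Standard"
    // Surging creates new Pods before deleting old ones.
    SurgingRollingUpdateType RollingUpdateType = "Surging"
)

type RollingUpdateDaemonSet struct {
    // Type selects Standard or Surging; note the JSON name is rollingUpdateType.
    Type RollingUpdateType `json:"rollingUpdateType,omitempty"`
    // MaxUnavailable caps the number of concurrent updates.
    MaxUnavailable *intstr.IntOrString `json:"maxUnavailable,omitempty"`
    // Selector restricts the update to nodes with matching labels.
    Selector *metav1.LabelSelector `json:"selector,omitempty"`
    // Partition is the number of Pods kept on the old version.
    Partition *int32 `json:"partition,omitempty"`
    // Paused stops the controller from processing further updates.
    Paused *bool `json:"paused,omitempty"`
    // MaxSurge limits the extra Pods that a Surging update may run.
    MaxSurge *intstr.IntOrString `json:"maxSurge,omitempty"`
}

type DaemonSetSpec struct {
    // ... other fields ...
    BurstReplicas *intstr.IntOrString `json:"burstReplicas,omitempty"`
}
Rollout Strategies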
Node‑level Canary (selector)
Use a label selector to target a subset of nodes. Only nodes matching selector.matchLabels receive the new Pod version.
apiVersion: apps.kruise.io/v1alpha1
kind: DaemonSet
spec:
  updateStrategy:
    type: RollingUpdate
    rollingUpdate:
      selector:
        matchLabels:
          nodeType: canary
Quantity‑based Partition
Set partition to keep a fixed number of Pods on the old version. The controller updates status.DesiredNumberScheduled - partition Pods in total, not per batch.
apiVersion: apps.kruise.io/v1alpha1
kind: DaemonSet
spec:
  updateStrategy:
    type: RollingUpdate
    rollingUpdate:
      partition: 100
For a DaemonSet on 120 nodes, partition: 100 upgrades only 20 Pods initially.
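The partition arithmetic can be made concrete with a small Go sketch. The function name pendingUpdates is hypothetical, and clamping the result at zero when partition exceeds the desired Pod count is an assumption of this sketch rather than confirmed controller behavior.

```go
package main

import "fmt"

// pendingUpdates returns how many Pods a partition-based rollout may move to
// the new version: desired minus partition, clamped at zero.
// The clamp is an assumption of this sketch, not taken from the controller.
func pendingUpdates(desiredNumberScheduled, partition int32) int32 {
	if partition >= desiredNumberScheduled {
		return 0
	}
	return desiredNumberScheduled - partition
}

func main() {
	// Lowering partition step by step widens the rollout on a 120-node cluster.
	for _, p := range []int32{120, 100, 50, 0} {
		fmt.Printf("partition=%d -> %d Pods updated\n", p, pendingUpdates(120, p))
	}
}
```

Under these assumptions, a staged rollout amounts to lowering partition in steps until it reaches 0, at which point every Pod runs the new version.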
Multi‑dimensional Gray‑scale
Combine selector, partition, and optionally maxUnavailable. The controller first filters nodes by the selector, then applies the partition count within that subset, and finally caps concurrent updates with maxUnavailable.
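The order of these three steps can be sketched in Go. This is a simplified model, not the controller's code: the node type and the matches and nextBatch functions are hypothetical, and maxUnavailable is treated as a per-round batch size, which is a simplification of how availability is actually tracked.

```go
package main

import "fmt"

// node is a hypothetical, simplified view of a cluster node for this sketch.
type node struct {
	name    string
	labels  map[string]string
	updated bool // true if the Pod on this node already runs the new version
}

// matches reports whether every key/value pair in matchLabels is set on the node.
func matches(n node, matchLabels map[string]string) bool {
	for k, want := range matchLabels {
		if got, ok := n.labels[k]; !ok || got != want {
			return false
		}
	}
	return true
}

// nextBatch returns the nodes to update in the next round, applying the
// selector filter, then the partition, then the maxUnavailable cap.
func nextBatch(nodes []node, matchLabels map[string]string, partition, maxUnavailable int) []string {
	// Step 1: keep only nodes matched by the selector.
	var candidates []node
	for _, n := range nodes {
		if matches(n, matchLabels) {
			candidates = append(candidates, n)
		}
	}
	// Step 2: partition Pods within the subset stay on the old version.
	target := len(candidates) - partition
	done := 0
	for _, n := range candidates {
		if n.updated {
			done++
		}
	}
	remaining := target - done
	// Step 3: cap the size of this round (assumed batch-size semantics).
	if remaining > maxUnavailable {
		remaining = maxUnavailable
	}
	var batch []string
	for _, n := range candidates {
		if len(batch) >= remaining {
			break
		}
		if !n.updated {
			batch = append(batch, n.name)
		}
	}
	return batch
}

func main() {
	canary := map[string]string{"nodeType": "canary"}
	nodes := []node{
		{name: "n1", labels: canary},
		{name: "n2", labels: canary},
		{name: "n3", labels: map[string]string{"nodeType": "stable"}},
		{name: "n4", labels: canary},
		{name: "n5", labels: canary},
	}
	// Four canary nodes, partition 1, cap 2: prints [n1 n2]
	fmt.Println(nextBatch(nodes, canary, 1, 2))
}
```

In this model the non-canary node n3 is never touched, and one canary node always stays on the old version because of partition: 1.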
apiVersion: apps.kruise.io/v1alpha1
kind: DaemonSet
spec:
  updateStrategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 5
      partition: 100
      selector:
        matchLabels:
          nodeType: canary
Hot Upgrade (Surging)
Set rollingUpdateType: Surging (the JSON name of the Type field in RollingUpdateDaemonSet) to create new Pods before deleting old ones. maxSurge, given as an absolute number or a percentage, limits the number of extra Pods that can run concurrently, which allows upgrades without downtime.
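To show what a percentage maxSurge means in Pod counts, here is a minimal Go sketch. The function resolveSurge is hypothetical and takes the value as a plain string; rounding percentages up is an assumption of this sketch, so check the controller source before relying on the exact number.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// resolveSurge converts a maxSurge value such as "30%" or "5" into an
// absolute number of extra Pods for the given desired Pod count.
// Percentages are rounded up; that rounding rule is an assumption.
func resolveSurge(maxSurge string, desired int) (int, error) {
	if strings.HasSuffix(maxSurge, "%") {
		pct, err := strconv.Atoi(strings.TrimSuffix(maxSurge, "%"))
		if err != nil || pct < 0 {
			return 0, fmt.Errorf("invalid percentage %q", maxSurge)
		}
		// Integer ceiling of desired*pct/100.
		return (desired*pct + 99) / 100, nil
	}
	n, err := strconv.Atoi(maxSurge)
	if err != nil || n < 0 {
		return 0, fmt.Errorf("invalid count %q", maxSurge)
	}
	return n, nil
}

func main() {
	extra, err := resolveSurge("30%", 120)
	if err != nil {
		panic(err)
	}
	fmt.Printf("maxSurge 30%% of 120 Pods allows %d extra Pods\n", extra)
}
```

With 120 scheduled Pods and maxSurge: 30%, this sketch allows 36 extra Pods to run during the update.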
apiVersion: apps.kruise.io/v1alpha1
kind: DaemonSet
spec:
  updateStrategy:
    rollingUpdate:
      rollingUpdateType: Surging
      maxSurge: 30%
Rollout Pause
Set paused: true to stop the controller from processing further updates.
apiVersion: apps.kruise.io/v1alpha1
kind: DaemonSet
spec:
  updateStrategy:
    rollingUpdate:
      paused: true
Usage Example
A typical Advanced DaemonSet manifest combining selector, partition, and maxUnavailable:
apiVersion: apps.kruise.io/v1alpha1
kind: DaemonSet
metadata:
  name: my-daemon
spec:
  selector:
    matchLabels:
      app: my-daemon
  template:
    metadata:
      labels:
        app: my-daemon
    spec:
      containers:
      - name: daemon
        image: my-image:v2
  updateStrategy:
    type: RollingUpdate
    rollingUpdate:
      selector:
        matchLabels:
          nodeType: canary
      partition: 80
      maxUnavailable: 10
Repository
Source code and releases are hosted at https://github.com/openkruise/kruise.
Alibaba Cloud Native
