Master Progressive Delivery with OpenKruise Rollout: A Step‑by‑Step Guide

This article explains how OpenKruise Rollout enables non‑intrusive, extensible progressive delivery on Kubernetes, covering its architecture, design goals, traffic scheduling, metric‑driven automation, and a complete canary release walkthrough with code examples.

Alibaba Cloud Native

Introduction

OpenKruise is an open‑source Cloud Native application automation suite from Alibaba Cloud, now a CNCF Sandbox project. It extends Kubernetes with components such as Kruise Rollout to support progressive delivery.

What is Progressive Delivery?

The term originates from large‑scale industrial projects and means breaking a complex project into phased, small‑loop iterations to reduce cost and time. In the cloud‑native era, it is realized through techniques like A/B testing, canary or gray releases.

Why Kruise Rollout?

Kubernetes provides only building blocks such as Deployment, Service, and Ingress; it has no out‑of‑the‑box progressive‑delivery controller. Existing solutions such as Argo Rollouts and Flagger have limitations: they support only Deployment, they are intrusive to the workload, and they may duplicate resources. Kruise Rollout aims to be non‑intrusive, extensible, and easy to use.

Design Goals

Non‑intrusive: does not modify native workload resources.

Extensible: supports multiple workload types (Deployment, CloneSet, DaemonSet) and custom traffic routers (Nginx, Istio, ALB, etc.).

Usable: works out of the box with GitOps or a self‑hosted PaaS.

Kruise Rollout Model

Kruise Rollout defines a Rollout CRD that can drive canary, blue‑green, and A/B testing releases, automatically pausing or advancing based on Prometheus metrics. It works with multiple workload types and traffic routers.

Kruise Rollout architecture

Traffic Scheduling and Batch Release

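workloadRef selects the target workload (Deployment, CloneSet, DaemonSet).

canary.steps defines the batches: four explicit steps followed by the final full rollout, five in total. The first batch updates a single replica, routes 5% of traffic to it, and waits for manual approval.

The second batch routes 40% of traffic and pauses for 10 minutes (duration: 600, in seconds) before the next batch.

trafficRoutings selects the traffic router (Nginx in this example) and can be extended to support Istio, ALB, etc.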

apiVersion: rollouts.kruise.io/v1alpha1
kind: Rollout
spec:
  objectRef:
    workloadRef:
      apiVersion: apps/v1
      kind: Deployment
      name: echoserver
  strategy:
    canary:
      steps:
      - weight: 5
        pause: {}
        replicas: 1
      - weight: 40
        pause: {duration: 600}
      - weight: 60
        pause: {duration: 600}
      - weight: 80
        pause: {duration: 600}
      trafficRoutings:
      - service: echoserver
        type: nginx
        ingress:
          name: echoserver
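
To make the batch semantics concrete, the following minimal Python sketch expands the canary steps from the manifest above into a readable plan. It is an illustration only, not the controller's implementation: the `plan_batches` helper and its output format are hypothetical, and the sketch assumes that an empty `pause` means waiting for manual approval, that `duration` is given in seconds, and that a final full rollout follows the last listed step.

```python
from typing import Dict, List


def plan_batches(steps: List[Dict]) -> List[str]:
    """Describe each canary batch, then append the implicit full rollout."""
    plan = []
    for index, step in enumerate(steps, start=1):
        weight = step["weight"]
        pause = step.get("pause") or {}
        duration = pause.get("duration")
        if duration is None:
            # Assumption: `pause: {}` blocks until someone approves the rollout.
            wait = "wait for manual approval"
        else:
            # Assumption: `duration` is expressed in seconds.
            wait = f"pause {duration // 60} min"
        replicas = step.get("replicas")
        scope = f", {replicas} canary replica(s)" if replicas is not None else ""
        plan.append(f"batch {index}: {weight}% traffic{scope}, then {wait}")
    # Assumption: after the last step the remaining pods are all updated.
    plan.append(f"batch {len(steps) + 1}: 100% traffic, rollout complete")
    return plan


# Steps copied from the Rollout manifest above.
steps = [
    {"weight": 5, "pause": {}, "replicas": 1},
    {"weight": 40, "pause": {"duration": 600}},
    {"weight": 60, "pause": {"duration": 600}},
    {"weight": 80, "pause": {"duration": 600}},
]

for line in plan_batches(steps):
    print(line)
```

Under these assumptions the four listed steps plus the final full rollout give the five batches described above.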

Metrics‑Driven Automatic Pausing

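During the rollout, Prometheus metrics are evaluated after each batch. In the example below, the success rate is the share of requests that do not return a 5xx status code; if it falls below the threshold set in successCondition (0.95, i.e. 95%, in this example), the rollout pauses.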

apiVersion: rollouts.kruise.io/v1alpha1
kind: AnalysisTemplate
metadata:
  name: success-rate
spec:
  args:
  - name: service-name
  metrics:
  - name: success-rate
    interval: 5m
    successCondition: result[0] >= 0.95
    failureLimit: 3
    provider:
      prometheus:
        address: http://prometheus.example.com:9090
        query: |
          sum(irate(istio_requests_total{reporter="source",destination_service=~"{{args.service-name}}",response_code!~"5.*"}[5m])) /
          sum(irate(istio_requests_total{reporter="source",destination_service=~"{{args.service-name}}"}[5m]))
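
One possible reading of the template above, sketched in Python. This is not the controller's code: `success_rate` and `should_pause` are hypothetical helpers, the sample numbers are made up, and the sketch assumes that each measurement is compared against `successCondition` and that the rollout pauses once the number of failed measurements exceeds `failureLimit`.

```python
from typing import Iterable


def success_rate(non_5xx: float, total: float) -> float:
    """Ratio computed by the query above: non-5xx requests / all requests."""
    if total == 0:
        return 1.0  # assumption: no traffic is treated as healthy
    return non_5xx / total


def should_pause(rates: Iterable[float], threshold: float = 0.95,
                 failure_limit: int = 3) -> bool:
    """Return True once more than `failure_limit` measurements miss the threshold."""
    failures = 0
    for rate in rates:
        if rate < threshold:  # mirrors successCondition: result[0] >= 0.95
            failures += 1
        if failures > failure_limit:
            return True
    return False


# Illustrative measurements, one per evaluation interval.
healthy = [success_rate(998, 1000), success_rate(990, 1000)]
degraded = [success_rate(900, 1000)] * 4

print(should_pause(healthy))
print(should_pause(degraded))
```

With these sample values the healthy series never misses the threshold, while the degraded series fails four times and so exceeds the limit of three.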

Canary Release Walkthrough

1. Deploy an echo‑server service with an Nginx ingress.

2. Create a Rollout that releases 5 % of traffic to a new pod version.

apiVersion: rollouts.kruise.io/v1alpha1
kind: Rollout
metadata:
  name: rollouts-demo
spec:
  objectRef: ...
  strategy:
    canary:
      steps:
      - weight: 5
        pause: {}
        replicas: 1
      trafficRoutings: ...

3. Update the image version; the controller automatically creates a canary Deployment, Service and Ingress, routing 5 % traffic.

Canary rollout created
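
With the Nginx ingress controller, weighted canary routing is typically expressed through the `nginx.ingress.kubernetes.io/canary` and `nginx.ingress.kubernetes.io/canary-weight` annotations on a second Ingress. The Python sketch below builds those annotations for a given step weight. It assumes this is roughly what the generated canary Ingress carries; `canary_annotations` is a hypothetical helper, not part of Kruise Rollout.

```python
from typing import Dict

NGINX_PREFIX = "nginx.ingress.kubernetes.io"


def canary_annotations(weight: int) -> Dict[str, str]:
    """Build ingress-nginx canary annotations for a traffic weight in percent."""
    if not 0 <= weight <= 100:
        raise ValueError(f"weight must be between 0 and 100, got {weight}")
    return {
        f"{NGINX_PREFIX}/canary": "true",
        # Annotation values are strings, so the integer weight is converted.
        f"{NGINX_PREFIX}/canary-weight": str(weight),
    }


# Annotations for the first step of the walkthrough (5% of traffic).
print(canary_annotations(5))
```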

4. After verification, approve the rollout with:

kubectl-kruise rollout approve rollout/rollouts-demo -n default

The controller then rolls out the remaining pods and cleans up canary resources.

Full rollout completed

5. If a problem is detected, revert the image version; the controller rolls back by removing canary resources.

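apiVersion: apps/v1
kind: Deployment
metadata:
  name: echoserver
spec:
  # Fragment: only the field changed for the rollback is shown.
  template:
    spec:
      containers:
      - name: echoserver
        # Revert to the previous image version to trigger the rollback.
        image: cilium/echoserver:1.10.2

Rollback example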

Conclusion

As Kubernetes workloads grow, balancing rapid iteration with stability becomes critical. Kruise Rollout provides a native, non‑intrusive progressive‑delivery solution that supports traffic scheduling, batch releases, and metric‑driven automation. Version 0.1.0 has been released and is integrated with the OAM KubeVela project.

Resources: GitHub repository https://github.com/openkruise/rollouts, official site https://openkruise.io/.

