Master Cloud‑Native Chaos Testing with Alibaba’s ChaosBlade: A Hands‑On Guide

This article introduces Alibaba's open‑source ChaosBlade tool, explains its experiment model and supported scenarios, shows how to install the ChaosBlade Operator on Kubernetes, and provides step‑by‑step instructions for creating, modifying, and cleaning up cloud‑native chaos experiments using both YAML resources and the blade CLI.

Alibaba Cloud Native

ChaosBlade is an open‑source chaos‑engineering tool from Alibaba that follows a declarative experiment model, making it easy to inject failures into cloud‑native systems and improve their fault tolerance.

Supported Experiment Scenarios

Basic resource scenarios: CPU load, memory consumption, disk I/O load, disk usage, network latency, packet loss, network blackout, DNS unreachability, shell script tampering, process kill, process hang, node reboot, etc.

Application service scenarios: rich Java components (Dubbo, RocketMQ, HttpClient, Servlet, Druid, etc.) and C++ components, with support for custom Java or Groovy scripts.

Container service scenarios: Kubernetes and Docker resources, covering node, pod, and container experiments such as pod network latency and packet loss.

Chaos Experiment Model

The model consists of four layers:

Target: the component under test (e.g., a container, a Dubbo service, Redis).

Scope: the machines or clusters where the experiment is applied.

Matcher: rules that match the target, such as specific RPC methods or Redis commands.

Action: the concrete fault injection (e.g., disk full, high I/O, latency, exception, return specific error code).

Example for a Dubbo service delay:

blade create dubbo delay --time 3000 --service com.example.HelloService --version 1.0.0
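The way the model's parts line up with a command like the one above can be sketched in Python. The `Experiment` class and its `to_command` method are hypothetical helpers written for illustration, not part of ChaosBlade. The sketch assumes the `blade create <target> <action> --flag value` pattern visible in the Dubbo example, and it leaves the scope implicit on the assumption that the command runs on the machine under test.

```python
from dataclasses import dataclass, field
import shlex


@dataclass
class Experiment:
    """Hypothetical container for the model's parts (not a ChaosBlade API)."""
    target: str                                 # component under test, e.g. "dubbo"
    action: str                                 # fault to inject, e.g. "delay"
    flags: dict = field(default_factory=dict)   # matchers and action parameters

    def to_command(self) -> str:
        """Render the experiment as a `blade create` command line."""
        parts = ["blade", "create", self.target, self.action]
        for name, value in self.flags.items():
            parts += [f"--{name}", str(value)]
        # Quote each part so values with spaces survive a shell.
        return " ".join(shlex.quote(p) for p in parts)


dubbo_delay = Experiment(
    target="dubbo",
    action="delay",
    flags={"time": 3000, "service": "com.example.HelloService", "version": "1.0.0"},
)
print(dubbo_delay.to_command())
```

Here `service` and `version` play the matcher role, while `time` parameterizes the delay action.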

Implementation for Cloud‑Native Environments

Experiments are expressed as Kubernetes custom resources managed by the ChaosBlade Operator. The operator deploys a DaemonSet so that a chaosblade-tool pod runs on each node, and it copies the ChaosBlade binary into target containers when needed.

Installation

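Install the operator with Helm. The command below uses Helm 2 syntax; Helm 3 has no --name flag and takes the release name as a positional argument instead: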

helm install --namespace kube-system --name chaosblade-operator chaosblade-operator-0.0.1.tgz

Verify the pods:

kubectl get pod -n kube-system -o wide | grep chaosblade

Running Experiments

Two execution methods are supported: applying a YAML resource with kubectl or using the blade CLI directly.

YAML Example (CPU load 80% on a node)

apiVersion: chaosblade.io/v1alpha1
kind: ChaosBlade
metadata:
  name: cpu-load
spec:
  experiments:
  - scope: node
    target: cpu
    action: fullload
    desc: "increase node cpu load by names"
    matchers:
    - name: names
      value:
      - "cn-hangzhou.192.168.0.205"
    - name: cpu-percent
      value:
      - "80"

Apply the configuration:

kubectl apply -f chaosblade_cpu_load.yaml

Check the experiment status:

kubectl get blade cpu-load -o json
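When the same experiment has to be repeated across several nodes or load levels, the resource can be generated instead of edited by hand. The following is a minimal sketch under stated assumptions: `cpu_load_manifest` is a hypothetical helper, and the field layout simply mirrors the YAML example above. It emits JSON rather than YAML to stay within the Python standard library.

```python
import json


def cpu_load_manifest(name: str, node_names: list, cpu_percent: int) -> dict:
    """Build a ChaosBlade resource mirroring the YAML example (hypothetical helper)."""
    if not 0 < cpu_percent <= 100:
        raise ValueError("cpu_percent must be in the range 1-100")
    return {
        "apiVersion": "chaosblade.io/v1alpha1",
        "kind": "ChaosBlade",
        "metadata": {"name": name},
        "spec": {
            "experiments": [
                {
                    "scope": "node",
                    "target": "cpu",
                    "action": "fullload",
                    "desc": "increase node cpu load by names",
                    "matchers": [
                        {"name": "names", "value": list(node_names)},
                        # Matcher values are strings in the YAML example.
                        {"name": "cpu-percent", "value": [str(cpu_percent)]},
                    ],
                }
            ]
        },
    }


manifest = cpu_load_manifest("cpu-load", ["cn-hangzhou.192.168.0.205"], 80)
print(json.dumps(manifest, indent=2))
```

Because kubectl accepts JSON as well as YAML, the printed document can be saved to a file and passed to kubectl apply -f, or piped to kubectl apply -f - directly.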

Blade CLI Example

blade create k8s node-cpu fullload --names cn-hangzhou.192.168.0.205 --cpu-percent 80 --kubeconfig ~/.kube/config
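The YAML resource and the CLI command describe the same experiment: the scope and target combine into the node-cpu subcommand, and each matcher becomes a flag. A rough Python sketch of that correspondence follows. The `k8s_blade_command` function is a hypothetical helper, and joining multiple matcher values with commas is an assumption, since this article only shows single values.

```python
def k8s_blade_command(experiment: dict, kubeconfig: str = "~/.kube/config") -> str:
    """Render one spec.experiments entry as a `blade create k8s` command (illustrative)."""
    parts = [
        "blade", "create", "k8s",
        # Scope and target are joined with a hyphen, e.g. node + cpu -> node-cpu.
        f"{experiment['scope']}-{experiment['target']}",
        experiment["action"],
    ]
    for matcher in experiment.get("matchers", []):
        # Assumption: several values for one matcher are comma-separated.
        parts += [f"--{matcher['name']}", ",".join(matcher["value"])]
    parts += ["--kubeconfig", kubeconfig]
    return " ".join(parts)


experiment = {
    "scope": "node",
    "target": "cpu",
    "action": "fullload",
    "matchers": [
        {"name": "names", "value": ["cn-hangzhou.192.168.0.205"]},
        {"name": "cpu-percent", "value": ["80"]},
    ],
}
print(k8s_blade_command(experiment))
```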

Modifying Experiments

Update the YAML (e.g., change cpu-percent from 80 to 60) and re‑apply:

apiVersion: chaosblade.io/v1alpha1
kind: ChaosBlade
metadata:
  name: cpu-load
spec:
  experiments:
  - scope: node
    target: cpu
    action: fullload
    desc: "increase node cpu load by names"
    matchers:
    - name: names
      value:
      - "cn-hangzhou.192.168.0.205"
    - name: cpu-percent
      value:
      - "60"

kubectl apply -f chaosblade_cpu_load.yaml
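If the resource is held as data rather than as a hand-edited file, the same modification becomes a small pure function. This sketch assumes the matchers layout from the first YAML example; `set_matcher` is a hypothetical helper, not a ChaosBlade API.

```python
import copy


def set_matcher(manifest: dict, name: str, value: str) -> dict:
    """Return a copy of the manifest with one matcher value replaced (hypothetical helper)."""
    updated = copy.deepcopy(manifest)  # leave the caller's manifest untouched
    for experiment in updated["spec"]["experiments"]:
        for matcher in experiment.get("matchers", []):
            if matcher["name"] == name:
                matcher["value"] = [value]
                return updated
    raise KeyError(f"no matcher named {name!r}")


original = {
    "apiVersion": "chaosblade.io/v1alpha1",
    "kind": "ChaosBlade",
    "metadata": {"name": "cpu-load"},
    "spec": {"experiments": [{
        "scope": "node", "target": "cpu", "action": "fullload",
        "matchers": [
            {"name": "names", "value": ["cn-hangzhou.192.168.0.205"]},
            {"name": "cpu-percent", "value": ["80"]},
        ],
    }]},
}
updated = set_matcher(original, "cpu-percent", "60")
print(updated["spec"]["experiments"][0]["matchers"][1])
```

Keeping metadata.name unchanged is what lets kubectl apply treat the result as an update to the existing cpu-load resource rather than a new experiment.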

Stopping Experiments

Delete by resource name:

kubectl delete chaosblade cpu-load

Delete via YAML file:

kubectl delete -f chaosblade_cpu_load.yaml

Destroy with the blade CLI (requires the UID returned by blade create):

blade destroy <UID>
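In a script, the UID has to be captured from the output of blade create before it can be passed to blade destroy. The sketch below assumes that blade create prints a JSON object with a success field and a result field holding the UID. That format is not shown in this article, so check it against the output of your ChaosBlade version. The `destroy_command` function and the sample UID are placeholders.

```python
import json


def destroy_command(create_output: str) -> str:
    """Build the `blade destroy` command from `blade create` output (assumed JSON format)."""
    response = json.loads(create_output)
    if not response.get("success"):
        raise RuntimeError(f"experiment was not created: {create_output}")
    # Assumption: the "result" field carries the experiment UID.
    return f"blade destroy {response['result']}"


# Sample output in the assumed format; the UID is a made-up placeholder.
sample = '{"code": 200, "success": true, "result": "0123456789abcdef"}'
print(destroy_command(sample))
```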

Uninstalling the Operator

helm del --purge chaosblade-operator

Summary

ChaosBlade provides a model‑driven, Kubernetes‑native chaos‑engineering solution that is easy to extend, deploy, and operate, allowing teams to proactively test system resilience.

Community Projects

ChaosBlade CLI (entry point)

Experiment model definition

Basic resource executors

Docker executor

Kubernetes executor

Java application executor

C++ application executor
