Mastering Kubernetes CRDs and Operators: From Basics to Real-World Practices
This article explores Kubernetes Custom Resource Definitions (CRDs) and Operators, explaining their origins, how they enable custom resources, the operator concept, practical examples like a smart‑home light controller, and guidance on building operators with frameworks such as KubeBuilder and Operator‑SDK.
This article is part of the "Kubernetes Resource Orchestration Series" and focuses on Custom Resource Definitions (CRDs) and Operators, two core concepts that extend the Kubernetes API and automate application lifecycle management.
1. What is a CRD?
When the built‑in Kubernetes resource types are insufficient for a specific business need, developers create Custom Resources (CR). A Custom Resource Definition (CRD) acts as a specification that tells the Kubernetes API server how to recognize and store these new resources.
The idea originated from Google’s "Third Party Resource" (ThirdPartyResource) concept, which aimed to extend the API object model via plugins. CRDs were introduced in Kubernetes 1.7 as the replacement for Third Party Resources.
A typical CRD YAML uses an OpenAPI v3 schema to describe the fields of the custom resource, similar to a strongly‑typed declaration in a programming language.
apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
  name: lights.light.sreworks.io
spec:
  group: light.sreworks.io
  names:
    kind: Light
    plural: lights
  scope: Namespaced
  versions:
    - name: v1
      served: true
      storage: true
      schema:
        openAPIV3Schema:
          type: object
          properties:
            spec:
              type: object
              properties:
                company:
                  type: string
                ...
With a CRD in place, users can create resources that sit alongside native Kubernetes objects, enabling scenarios such as managing jobs, routes, accounts, and more directly in the cluster.
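To make the "strongly typed declaration" analogy concrete, here is a minimal Python sketch of the kind of type check the schema enables. It is an illustration only, not the API server's validation code: it handles just `type` and `properties`, and the `validate` helper, the `LIGHT_SCHEMA` dict, and the sample `company` values are all assumptions made for this example.

```python
# Simplified stand-in for the openAPIV3Schema of the Light CRD above.
# Only "type" and "properties" are handled; real validation supports far more.
LIGHT_SCHEMA = {
    "type": "object",
    "properties": {
        "spec": {
            "type": "object",
            "properties": {
                "company": {"type": "string"},
            },
        },
    },
}

# Mapping from OpenAPI type names to the Python types that satisfy them.
_PY_TYPES = {"object": dict, "string": str, "integer": int, "boolean": bool}


def validate(value, schema, path="$"):
    """Return a list of human-readable type errors (an empty list means valid)."""
    expected = _PY_TYPES[schema["type"]]
    # bool is a subclass of int in Python, so reject it explicitly for integers.
    if not isinstance(value, expected) or (expected is int and isinstance(value, bool)):
        return [f"{path}: expected {schema['type']}, got {type(value).__name__}"]
    errors = []
    if schema["type"] == "object":
        # Recurse only into declared properties that are actually present.
        for name, sub_schema in schema.get("properties", {}).items():
            if name in value:
                errors.extend(validate(value[name], sub_schema, f"{path}.{name}"))
    return errors


good = {"spec": {"company": "sreworks"}}  # hypothetical sample values
bad = {"spec": {"company": 42}}
print(validate(good, LIGHT_SCHEMA))
print(validate(bad, LIGHT_SCHEMA))
```

The second resource is rejected because `company` is declared as a string, which is the same class of mistake a typed schema lets the cluster catch before the object is stored.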
2. What is an Operator?
In everyday language, an operator is a person or a piece of software that performs operations. In Kubernetes, an Operator encodes human operational knowledge into software, allowing it to manage an application’s full lifecycle (install, upgrade, scale, delete, etc.).
What is an Operator, after all? An Operator represents human operational knowledge in software, to reliably manage an application. Operators are a method of packaging, deploying, and managing a Kubernetes application.
Effectively, an Operator is a controller that watches custom resources (CRs) and reconciles their desired state with the actual state of the cluster.
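The reconcile idea can be sketched in a few lines of Python. This is a simplified in-memory model, not a real controller: the `reconcile` and `apply` helpers and the dictionaries standing in for cluster state are assumptions made for illustration, and no Kubernetes client is involved.

```python
def reconcile(desired, actual):
    """Compare desired and actual state and return the actions that close the gap.

    Both arguments map a resource name to its spec (a plain dict).
    """
    actions = []
    for name, spec in desired.items():
        if name not in actual:
            actions.append(("create", name))
        elif actual[name] != spec:
            actions.append(("update", name))
    for name in actual:
        if name not in desired:
            actions.append(("delete", name))
    return actions


def apply(actions, desired, actual):
    """Apply the actions to the in-memory 'cluster' (a stand-in for real API calls)."""
    for verb, name in actions:
        if verb == "delete":
            del actual[name]
        else:
            actual[name] = dict(desired[name])


desired = {"bedroom": {"power": "on"}, "kitchen": {"power": "off"}}
actual = {"bedroom": {"power": "off"}, "garage": {"power": "on"}}

actions = reconcile(desired, actual)
print(actions)
apply(actions, desired, actual)
print(reconcile(desired, actual))  # a second pass finds nothing left to do
```

Because the function always compares full states and does not replay individual events, running it again after convergence is harmless, which is what makes this style of control loop tolerant of missed or repeated notifications.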
Example: a light‑control CRD and an accompanying Operator can turn a physical lamp on or off based on the power field in a YAML manifest.
apiVersion: light.sreworks.io/v1
kind: Light
metadata:
  name: bedroom
spec:
  power: on
  brightness: 70
  colorTemperature: 5000k
From the user’s perspective, only the final state (e.g., power: on) matters; the Operator handles retries, back‑off, and other operational details behind the scenes.
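As a rough sketch of the retry and back-off behavior hidden from the user, the following Python example drives a simulated lamp toward the desired power state. The `FlakyLamp` class and `reconcile_with_backoff` function are hypothetical names invented for this illustration; a real Operator would talk to the actual device or API instead.

```python
import time


class FlakyLamp:
    """Hypothetical device that rejects the first few commands, then succeeds."""

    def __init__(self, failures_before_success):
        self.remaining_failures = failures_before_success
        self.power = "off"

    def set_power(self, value):
        if self.remaining_failures > 0:
            self.remaining_failures -= 1
            raise ConnectionError("lamp unreachable")
        self.power = value


def reconcile_with_backoff(lamp, desired_power, max_attempts=5, base_delay=1.0, sleep=time.sleep):
    """Drive the lamp to the desired power state, retrying with exponential back-off.

    Returns the list of delays that were waited between attempts.
    """
    delays = []
    for attempt in range(max_attempts):
        if lamp.power == desired_power:
            return delays  # already converged, nothing to do
        try:
            lamp.set_power(desired_power)
            return delays
        except ConnectionError:
            if attempt == max_attempts - 1:
                raise  # give up after the last attempt
            delay = base_delay * (2 ** attempt)  # 1x, 2x, 4x, ... the base delay
            delays.append(delay)
            sleep(delay)
    return delays


lamp = FlakyLamp(failures_before_success=2)
# A no-op sleep keeps the example fast; the computed delays are still recorded.
delays = reconcile_with_backoff(lamp, "on", sleep=lambda _: None)
print(lamp.power, delays)
```

The user's manifest never mentions any of this: it states `power: on`, and the loop keeps trying until the observed state matches or the attempt budget runs out.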
3. Implementing a Kubernetes Operator
Operators differ from plain YAML/Helm/Kustomize because they run a continuous control loop against the API server (typically as a workload inside the cluster), listening to CR changes and executing custom code to maintain state. Developing an Operator involves:
Designing and installing the CRD.
Writing the controller logic (often in Go) using frameworks such as KubeBuilder or Operator‑SDK.
Packaging, deploying, and integrating the Operator with CI/CD pipelines.
Typical architecture: the Operator contains a controller that watches CR events, enqueues them, and processes them sequentially. The following diagram illustrates this flow.
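The watch, enqueue, and process flow can be approximated with a small single-threaded Python sketch. The `Controller` class and its method names are assumptions for illustration and do not correspond to KubeBuilder or Operator‑SDK APIs; events are reduced to `namespace/name` keys so that a burst of changes to one object results in a single reconcile.

```python
import queue


class Controller:
    """Toy controller: watch events are reduced to keys, queued, and handled one at a time."""

    def __init__(self, reconcile):
        self._queue = queue.Queue()
        self._pending = set()  # keys currently waiting in the queue
        self._reconcile = reconcile

    def on_event(self, namespace, name):
        """Called for every add/update/delete event; duplicates collapse into one item."""
        key = f"{namespace}/{name}"
        if key not in self._pending:
            self._pending.add(key)
            self._queue.put(key)

    def run_until_empty(self):
        """Process queued keys sequentially, in arrival order."""
        processed = []
        while True:
            try:
                key = self._queue.get_nowait()
            except queue.Empty:
                return processed
            self._pending.discard(key)
            self._reconcile(key)
            processed.append(key)


seen = []
controller = Controller(reconcile=seen.append)
controller.on_event("default", "bedroom")
controller.on_event("default", "bedroom")  # burst of updates for the same object
controller.on_event("default", "kitchen")
print(controller.run_until_empty())
```

Queueing keys instead of full event payloads means the reconcile function always reads the latest state when it runs, which pairs naturally with comparing desired and actual state.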
4. Real‑World Example: Spark Operator
The Spark Operator demonstrates how a big‑data workload can be managed via CRDs. Instead of invoking spark-submit directly, users submit a SparkApplication manifest; the Operator watches the CR and launches the Spark job on the cluster.
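To illustrate the translation from a declarative resource to an imperative submission, here is a hedged Python sketch that turns a Spark job resource into a spark-submit argument list. It is not the Spark Operator's code: the `build_spark_submit_args` helper is invented for this example, the resource is a plain dict covering only a handful of fields, and the image name and jar path are placeholders.

```python
def build_spark_submit_args(app):
    """Translate a Spark job resource (as a dict) into a spark-submit argument list."""
    spec = app["spec"]
    args = [
        "spark-submit",
        "--master", "k8s://https://kubernetes.default.svc",
        "--deploy-mode", spec.get("mode", "cluster"),
        "--name", app["metadata"]["name"],
    ]
    if "mainClass" in spec:
        args += ["--class", spec["mainClass"]]
    if "image" in spec:
        args += ["--conf", f"spark.kubernetes.container.image={spec['image']}"]
    instances = spec.get("executor", {}).get("instances")
    if instances is not None:
        args += ["--conf", f"spark.executor.instances={instances}"]
    # The application file is the final positional argument.
    args.append(spec["mainApplicationFile"])
    return args


app = {
    "metadata": {"name": "spark-pi"},
    "spec": {
        "mode": "cluster",
        "image": "example.com/spark:latest",  # placeholder image
        "mainClass": "org.apache.spark.examples.SparkPi",
        "mainApplicationFile": "local:///opt/spark/examples/jars/spark-examples.jar",  # placeholder path
        "executor": {"instances": 2},
    },
}
args = build_spark_submit_args(app)
print(" ".join(args))
```

The point of the sketch is the division of labor: the user edits a resource, and the software watching it decides which command-line details follow from that resource.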
5. A Generic Big‑Data Operator Design
To address common patterns across distributed applications, a generic big‑data Operator was built. Its design follows three stages: perception, decision, and execution. It also introduces a VirtualResource concept inspired by React’s Virtual DOM, allowing controllers to query resources with SQL‑like syntax.
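The source does not show the VirtualResource API, so the following Python sketch is purely illustrative. It models the three stages as functions and uses an in-memory SQLite table as a stand-in for querying resources with SQL; the table layout, the `perceive`, `decide`, and `execute` names, the sample resources, and the `restart` action are all assumptions.

```python
import sqlite3


def perceive(resources):
    """Perception: snapshot observed resources into an in-memory table."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE resources (kind TEXT, name TEXT, ready_replicas INTEGER, replicas INTEGER)"
    )
    db.executemany("INSERT INTO resources VALUES (?, ?, ?, ?)", resources)
    return db


def decide(db):
    """Decision: a SQL query selects resources whose ready count lags the desired count."""
    rows = db.execute(
        "SELECT kind, name FROM resources "
        "WHERE ready_replicas < replicas ORDER BY kind, name"
    ).fetchall()
    return [("restart", kind, name) for kind, name in rows]


def execute(actions, log):
    """Execution: carry out each action; here it is only recorded in a log."""
    for action in actions:
        log.append(action)


observed = [
    ("Deployment", "web", 3, 3),
    ("StatefulSet", "zk", 2, 3),
    ("Deployment", "api", 0, 2),
]
log = []
execute(decide(perceive(observed)), log)
print(log)
```

Keeping the decision step as a query over a snapshot separates "what is wrong" from "how to fix it", which is one plausible reading of why a declarative, query-driven design was attractive here.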
The implementation prefers YAML‑based declarative logic over Go code. Below is a sample operator configuration that uses Helm to render resources and watches Service changes to update an Ingress.
default:
  def: crd.yaml
  deploy:
    - cmd: helm
      chart: vvp/vvp
      values: vvp/values.yaml
  maintain:
    - watch:
        category: ResourceDidChange
        kind: Service
        apiVersion: v1
      action:
        - cmd: kube-patch
          file: ingressUpdate.yaml
6. Summary
CRD + Operator is the most powerful, albeit complex, mechanism for extending Kubernetes when Helm or Kustomize fall short. Operators complement these tools, handling lifecycle management while Helm/Kustomize manage templating and packaging. As cloud‑native ecosystems mature, Operators will continue to play a pivotal role in bridging traditional applications to Kubernetes.
Alibaba Cloud Big Data AI Platform