
Mastering Kubernetes CRDs and Operators: From Basics to Real-World Practices

This article explores Kubernetes Custom Resource Definitions (CRDs) and Operators, explaining their origins, how they enable custom resources, the operator concept, practical examples like a smart‑home light controller, and guidance on building operators with frameworks such as KubeBuilder and Operator‑SDK.

Alibaba Cloud Big Data AI Platform

This article is part of the "Kubernetes Resource Orchestration Series" and focuses on Custom Resource Definitions (CRDs) and Operators, two core concepts that extend the Kubernetes API and automate application lifecycle management.

1. What is a CRD?

When the built‑in Kubernetes resource types are insufficient for a specific business need, developers can create Custom Resources (CRs). A Custom Resource Definition (CRD) is the specification that tells the Kubernetes API server how to recognize, validate, and store these new resources.

The idea originated from Google’s "Third Party Resource" (TPR) concept, which aimed to extend the API object model via plugins. Kubernetes 1.7 introduced CRDs as the replacement for Third Party Resources.

A typical CRD YAML uses an OpenAPI v3 schema to describe the fields of the custom resource, similar to a strongly‑typed declaration in a programming language.

apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
  name: lights.light.sreworks.io
spec:
  group: light.sreworks.io
  names:
    kind: Light
    plural: lights
  scope: Namespaced
  versions:
    - name: v1
      served: true
      storage: true
      schema:
        openAPIV3Schema:
          type: object
          properties:
            spec:
              type: object
              properties:
                company:
                  type: string
                ...

With a CRD in place, users can create resources that sit alongside native Kubernetes objects, enabling scenarios such as managing jobs, routes, accounts, and more directly in the cluster.
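
To make the "strongly‑typed declaration" analogy concrete, the sketch below checks a resource against a tiny subset of an OpenAPI v3 schema (only `type` and `properties`). It is an illustration in Python, not the API server's actual validation code: the `validate` helper and `TYPE_MAP` are hypothetical, the sample value for `company` is made up, and the schema mirrors only the fields visible in the CRD above.

```python
# Toy validator for a small subset of OpenAPI v3 (type + properties only).
# Hypothetical helper for illustration; the real API server supports far more.

TYPE_MAP = {"object": dict, "string": str, "integer": int, "boolean": bool}

# Mirrors the visible part of the Light CRD schema.
LIGHT_SCHEMA = {
    "type": "object",
    "properties": {
        "spec": {
            "type": "object",
            "properties": {"company": {"type": "string"}},
        }
    },
}


def validate(value, schema, path="$"):
    """Return a list of human-readable type errors (empty list means valid)."""
    errors = []
    expected = TYPE_MAP[schema["type"]]
    # bool is a subclass of int in Python, so reject it explicitly for integers.
    if not isinstance(value, expected) or (expected is int and isinstance(value, bool)):
        return [f"{path}: expected {schema['type']}, got {type(value).__name__}"]
    for name, sub_schema in schema.get("properties", {}).items():
        if name in value:
            errors.extend(validate(value[name], sub_schema, f"{path}.{name}"))
    return errors


good = {"spec": {"company": "sreworks"}}
bad = {"spec": {"company": 42}}

print(validate(good, LIGHT_SCHEMA))  # []
print(validate(bad, LIGHT_SCHEMA))   # ['$.spec.company: expected string, got int']
```

A resource that fails this kind of check is rejected before it is stored, which is what lets a controller assume well‑typed input.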

2. What is an Operator?

In everyday language, an operator is a person or a piece of software that performs operations. In Kubernetes, an Operator encodes human operational knowledge into software, allowing it to manage an application’s full lifecycle (install, upgrade, scale, delete, etc.).

What is an Operator after all? An Operator represents human operational knowledge in software, to reliably manage an application. Operators are a method of packaging, deploying, and managing a Kubernetes application.

Effectively, an Operator is a controller that watches custom resources (CRs) and reconciles their desired state with the actual state of the cluster.

Example: a light‑control CRD and an accompanying Operator can turn a physical lamp on or off based on the power field in a YAML manifest.

apiVersion: light.sreworks.io/v1
kind: Light
metadata:
  name: bedroom
spec:
  power: on
  brightness: 70
  colorTemperature: 5000k

From the user’s perspective, only the final state (e.g., power: on) matters; the Operator handles retries, back‑off, and other operational details behind the scenes.
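
The retry and back‑off behaviour described above can be sketched as a small reconcile loop. This is a minimal illustration in Python under stated assumptions: `FakeLamp` is a hypothetical stand‑in for a real device API (the article does not describe one), and the delay values are arbitrary.

```python
# Level-triggered reconcile with exponential back-off (illustrative sketch).
# FakeLamp stands in for a real device API, which the article does not describe.

class FakeLamp:
    """Hypothetical lamp whose first `failures` commands fail (flaky network)."""

    def __init__(self, failures=0):
        self.state = {"power": "off", "brightness": 0}
        self._failures = failures

    def apply(self, changes):
        if self._failures > 0:
            self._failures -= 1
            raise ConnectionError("lamp unreachable")
        self.state.update(changes)


def reconcile(desired, lamp):
    """Push only the fields that differ; return True if anything changed."""
    diff = {k: v for k, v in desired.items() if lamp.state.get(k) != v}
    if diff:
        lamp.apply(diff)
    return bool(diff)


def reconcile_with_backoff(desired, lamp, max_attempts=5, base_delay=0.5,
                           sleep=lambda seconds: None):
    """Retry reconcile on failure, doubling the delay each time.

    `sleep` is injectable so the example runs instantly; pass time.sleep
    for real waits. Returns the list of delays that were requested.
    """
    delays = []
    for attempt in range(max_attempts):
        try:
            reconcile(desired, lamp)
            return delays
        except ConnectionError:
            delay = base_delay * (2 ** attempt)
            delays.append(delay)
            sleep(delay)
    raise RuntimeError("lamp still unreachable after retries")


lamp = FakeLamp(failures=2)
spec = {"power": "on", "brightness": 70}
delays = reconcile_with_backoff(spec, lamp)
print(lamp.state)  # {'power': 'on', 'brightness': 70}
print(delays)      # [0.5, 1.0]
```

Because `reconcile` compares the full desired state with the observed state on every call, running it again after success is a no‑op, which is what makes retries safe.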

3. Implementing a Kubernetes Operator

Operators differ from plain YAML, Helm, or Kustomize because they run as long‑lived controllers inside the cluster: they watch the API server for changes to custom resources and execute custom code to keep the actual state in line with the declared state. Developing an Operator involves:

Designing and installing the CRD.

Writing the controller logic (often in Go) using frameworks such as KubeBuilder or Operator‑SDK.

Packaging, deploying, and integrating the Operator with CI/CD pipelines.

Typical architecture: the Operator contains a controller that watches CR events, enqueues them, and processes them sequentially. The following diagram illustrates this flow.

Operator architecture diagram
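
The watch, enqueue, and process flow can be approximated in a few lines. The sketch below uses only the Python standard library; `WorkQueue` and `run_controller` are hypothetical names, and real frameworks add rate limiting, caching, and multiple workers that are omitted here.

```python
# Sketch of the watch -> enqueue -> process flow using only the standard library.
# Real frameworks add rate limiting, caching, and multiple workers; this does not.
from collections import deque


class WorkQueue:
    """FIFO queue of resource keys that drops duplicates while a key is pending."""

    def __init__(self):
        self._items = deque()
        self._pending = set()

    def add(self, key):
        if key not in self._pending:
            self._pending.add(key)
            self._items.append(key)

    def get(self):
        key = self._items.popleft()
        self._pending.discard(key)
        return key

    def __len__(self):
        return len(self._items)


def run_controller(events, store, reconcile):
    """Enqueue the key of every event, then process keys one at a time."""
    queue = WorkQueue()
    for event_type, key, obj in events:
        # The store keeps only the latest version of each object.
        if event_type == "DELETED":
            store.pop(key, None)
        else:
            store[key] = obj
        queue.add(key)
    processed = []
    while len(queue):
        key = queue.get()
        reconcile(key, store.get(key))  # None means the object is gone
        processed.append(key)
    return processed


events = [
    ("ADDED", "default/bedroom", {"power": "on"}),
    ("MODIFIED", "default/bedroom", {"power": "off"}),
    ("ADDED", "default/kitchen", {"power": "on"}),
]
seen = []
order = run_controller(events, {}, lambda key, obj: seen.append((key, obj)))
print(order)  # ['default/bedroom', 'default/kitchen']
print(seen)   # [('default/bedroom', {'power': 'off'}), ('default/kitchen', {'power': 'on'})]
```

Queuing keys rather than event payloads means two quick changes to the same object collapse into one reconcile against the latest state.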

4. Real‑World Example: Spark Operator

The Spark Operator demonstrates how a big‑data workload can be managed via CRDs. Instead of invoking spark-submit by hand, users submit a SparkApplication manifest; the Operator watches the CR and launches the Spark job on the cluster.

Spark Operator diagram
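
One way to picture what such an Operator does is as a translation from a declarative spec into the command a user would otherwise type. The sketch below is a simplification in Python, not the Spark Operator's actual code: the `build_spark_submit` function, the spec field names (`mode`, `mainClass`, `mainApplicationFile`, `sparkConf`, `executor.instances`, `arguments`), the master URL, and the jar path are all assumptions made for illustration.

```python
# Sketch: turning a Spark custom resource spec into spark-submit arguments.
# Field names, the master URL, and the jar path are assumptions for illustration.

def build_spark_submit(app, master="k8s://https://kubernetes.default.svc"):
    """Build an argv list for spark-submit from a dict-shaped custom resource."""
    spec = app["spec"]
    args = [
        "spark-submit",
        "--master", master,
        "--deploy-mode", spec.get("mode", "cluster"),
        "--name", app["metadata"]["name"],
    ]
    if "mainClass" in spec:
        args += ["--class", spec["mainClass"]]
    conf = dict(spec.get("sparkConf", {}))
    if "instances" in spec.get("executor", {}):
        conf["spark.executor.instances"] = str(spec["executor"]["instances"])
    for key in sorted(conf):  # sorted for deterministic output
        args += ["--conf", f"{key}={conf[key]}"]
    args.append(spec["mainApplicationFile"])
    args += spec.get("arguments", [])
    return args


app = {
    "metadata": {"name": "spark-pi"},
    "spec": {
        "mode": "cluster",
        "mainClass": "org.apache.spark.examples.SparkPi",
        "mainApplicationFile": "local:///opt/spark/examples/jars/spark-examples.jar",
        "executor": {"instances": 2},
        "arguments": ["100"],
    },
}
cmd = build_spark_submit(app)
print(" ".join(cmd))
```

The user edits only the manifest; the translation, submission, and any resubmission after failure stay inside the Operator.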

5. A Generic Big‑Data Operator Design

To address common patterns across distributed applications, a generic big‑data Operator was built. Its design follows three stages: perception, decision, and execution. It also introduces a VirtualResource concept inspired by React’s Virtual DOM, allowing controllers to query resources with SQL‑like syntax.
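
As a rough illustration of the three stages and of querying resources with SQL, the sketch below loads an observed snapshot into an in‑memory SQLite table and derives a plan from a query. It is not the article's VirtualResource implementation: the table layout, the `perceive`, `decide`, and `execute` functions, and the "restart" action are all hypothetical.

```python
# Sketch of perception -> decision -> execution with a SQL-queryable snapshot.
# The table layout and function names are hypothetical, not the article's design.
import sqlite3


def perceive(resources):
    """Perception: load observed resources into an in-memory SQL table."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE resources (kind TEXT, name TEXT, ready INTEGER)")
    db.executemany("INSERT INTO resources VALUES (?, ?, ?)", resources)
    return db


def decide(db):
    """Decision: pick the pods that are not ready and plan a restart for each."""
    rows = db.execute(
        "SELECT name FROM resources WHERE kind = 'Pod' AND ready = 0 ORDER BY name"
    ).fetchall()
    return [("restart", name) for (name,) in rows]


def execute(plan, runner):
    """Execution: hand every planned action to a runner callable."""
    for action, target in plan:
        runner(action, target)


observed = [("Pod", "worker-0", 1), ("Pod", "worker-1", 0), ("Service", "master", 1)]
log = []
execute(decide(perceive(observed)), lambda action, target: log.append(f"{action} {target}"))
print(log)  # ['restart worker-1']
```

Keeping the decision step as a query over a snapshot separates "what is wrong" from "how to fix it", so the same execution code can serve many rules.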

The implementation prefers YAML‑based declarative logic over Go code. Below is a sample operator configuration that uses Helm to render resources and watches Service changes to update an Ingress.

default:
  def: crd.yaml
  deploy:
    - cmd: helm
      chart: vvp/vvp
      values: vvp/values.yaml
  maintain:
    - watch:
        category: ResourceDidChange
        kind: Service
        apiVersion: v1
        action:
          - cmd: kube-patch
            file: ingressUpdate.yaml
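
To show how a runtime might interpret the `maintain` rules shown above, here is a minimal dispatcher sketch in Python. The matching semantics (every key under `watch` other than `action` must equal the same field of the event) are an assumption, since the article does not spell them out, and `actions_for` is a hypothetical helper. The config is written as a dict literal to keep the example free of third‑party YAML parsers.

```python
# Sketch of a dispatcher for the `maintain` rules shown above.
# The matching semantics are an assumption; the article does not spell them out.

CONFIG = {
    "maintain": [
        {
            "watch": {
                "category": "ResourceDidChange",
                "kind": "Service",
                "apiVersion": "v1",
                "action": [{"cmd": "kube-patch", "file": "ingressUpdate.yaml"}],
            }
        }
    ]
}


def actions_for(event, config):
    """Return the actions of every watch rule whose selectors all match the event."""
    matched = []
    for rule in config.get("maintain", []):
        watch = rule["watch"]
        selectors = {k: v for k, v in watch.items() if k != "action"}
        if all(event.get(k) == v for k, v in selectors.items()):
            matched.extend(watch["action"])
    return matched


service_event = {"category": "ResourceDidChange", "kind": "Service", "apiVersion": "v1"}
pod_event = {"category": "ResourceDidChange", "kind": "Pod", "apiVersion": "v1"}

print(actions_for(service_event, CONFIG))  # [{'cmd': 'kube-patch', 'file': 'ingressUpdate.yaml'}]
print(actions_for(pod_event, CONFIG))      # []
```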

6. Summary

CRD + Operator is the most powerful, albeit complex, mechanism for extending Kubernetes when Helm or Kustomize fall short. Operators complement these tools, handling lifecycle management while Helm/Kustomize manage templating and packaging. As cloud‑native ecosystems mature, Operators will continue to play a pivotal role in bridging traditional applications to Kubernetes.
