
After Years Using Kubernetes, I Finally Grasped CRDs – Build One from Scratch

The article reveals why most Kubernetes engineers use Custom Resource Definitions without truly understanding them, explains how CRDs act as the language that extends the Kubernetes API, and provides a step‑by‑step walkthrough to create a production‑ready DatabaseCluster CRD, interact with it via kubectl and the Python client, and avoid common pitfalls.

DevOps Coach

CRDs as the foundation of many Kubernetes tools

Argo CD's Application, KEDA's ScaledObject, Crossplane's resources, and cert‑manager's Certificate are all custom resources, each defined by a Custom Resource Definition (CRD). Recognizing this reveals that extending Kubernetes is fundamentally about defining new API objects.

Kubernetes as a language

CRDs add new vocabulary to the Kubernetes API. After a CRD is registered, resources such as DatabaseCluster, MLModel or TenantConfig become first‑class objects stored in etcd, queryable with kubectl and protected by RBAC.
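
To make "first‑class object" concrete: once a CRD is registered, the API server serves the new kind under a REST path built from its group, version, and plural name. The helper below is a small illustrative sketch (the name resource_path is hypothetical, not part of any library) that builds that path for the DatabaseCluster example used in this article.

```python
from typing import Optional


def resource_path(group: str, version: str, plural: str,
                  namespace: Optional[str] = None,
                  name: Optional[str] = None) -> str:
    """Build the API server path for a custom resource.

    Namespaced:     /apis/<group>/<version>/namespaces/<ns>/<plural>[/<name>]
    Cluster-scoped: /apis/<group>/<version>/<plural>[/<name>]
    """
    parts = ["/apis", group, version]
    if namespace is not None:
        parts += ["namespaces", namespace]
    parts.append(plural)
    if name is not None:
        parts.append(name)
    return "/".join(parts)


collection = resource_path("infra.example.com", "v1alpha1",
                           "databaseclusters", namespace="databases")
single = resource_path("infra.example.com", "v1alpha1", "databaseclusters",
                       namespace="databases", name="production-postgres")
print(collection)
print(single)
```

The same path can be fetched directly with kubectl get --raw followed by the path, which is a quick way to confirm that the new kind really is just another API endpoint.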

Creating a CRD from scratch – DatabaseCluster

The minimal definition centers on three sections (apiextensions.k8s.io/v1 additionally requires scope and a schema for every version):

group: the API group, e.g. infra.example.com

names: plural, singular, kind and optional shortNames

versions: each version declares served (accepts requests) and storage (persisted in etcd)

apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
  name: databaseclusters.infra.example.com
spec:
  group: infra.example.com
  scope: Namespaced
  names:
    plural: databaseclusters
    singular: databasecluster
    kind: DatabaseCluster
    shortNames:
    - dbc
  versions:
  - name: v1alpha1
    served: true
    storage: true
    schema:
      openAPIV3Schema:
        type: object
        properties:
          spec:
            type: object
            properties: {}
          status:
            type: object
            properties: {}

served makes the version reachable via the API; storage designates the version that is written to etcd, enabling safe multi‑version upgrades.
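
One rule the manifest relies on implicitly: metadata.name must be exactly the plural name, a dot, and the group, otherwise the API server rejects the CRD. The sketch below builds the same minimal definition as a Python dict and checks that rule. The helper names build_minimal_crd and check_crd_name are illustrative, not library functions.

```python
import json


def build_minimal_crd(group: str, kind: str, plural: str, singular: str,
                      version: str = "v1alpha1") -> dict:
    """Return a minimal CRD manifest shaped like the YAML in this section."""
    return {
        "apiVersion": "apiextensions.k8s.io/v1",
        "kind": "CustomResourceDefinition",
        "metadata": {"name": f"{plural}.{group}"},
        "spec": {
            "group": group,
            "scope": "Namespaced",
            "names": {"plural": plural, "singular": singular, "kind": kind},
            "versions": [{
                "name": version,
                "served": True,
                "storage": True,
                "schema": {"openAPIV3Schema": {
                    "type": "object",
                    "properties": {
                        "spec": {"type": "object", "properties": {}},
                        "status": {"type": "object", "properties": {}},
                    },
                }},
            }],
        },
    }


def check_crd_name(crd: dict) -> bool:
    """metadata.name has to equal '<plural>.<group>'."""
    spec = crd["spec"]
    expected = f"{spec['names']['plural']}.{spec['group']}"
    return crd["metadata"]["name"] == expected


crd = build_minimal_crd("infra.example.com", "DatabaseCluster",
                        "databaseclusters", "databasecluster")
manifest_json = json.dumps(crd, indent=2)  # kubectl apply also accepts JSON
print(crd["metadata"]["name"])
print(check_crd_name(crd))
```

Generating the manifest from code is optional; the point is that the name, group, and plural are not independent choices.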

Scope, served vs storage, and status subresource

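CRDs can be Namespaced (each resource lives in a namespace) or Cluster (cluster‑wide). The served flag controls whether a version is exposed through the API, while storage marks the version in which objects are persisted; exactly one version must have storage: true. Enabling the status subresource splits writes in two: updates to the main resource ignore changes to status, and updates to the /status endpoint ignore everything except status. Combined with RBAC that grants /status only to the controller, this prevents users from overwriting system state.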
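
The served and storage rules are easy to check mechanically before a CRD is applied. The following is a minimal sketch, assuming the versions list has been loaded into plain dicts shaped like the manifest. check_versions is a hypothetical helper, and the "at least one served version" check is a sanity check of this sketch rather than a documented API rule.

```python
def check_versions(versions: list) -> list:
    """Return a list of problems found in a CRD's spec.versions entries."""
    problems = []
    names = [v["name"] for v in versions]
    if len(names) != len(set(names)):
        problems.append("version names must be unique")
    storage = [v["name"] for v in versions if v.get("storage")]
    if len(storage) != 1:
        problems.append(
            f"exactly one version must have storage: true, found {len(storage)}")
    if not any(v.get("served") for v in versions):
        problems.append("no version is served, so the resource is unreachable")
    return problems


# A typical mid-migration state: both versions served, only v1 stored.
migrating = [
    {"name": "v1alpha1", "served": True, "storage": False},
    {"name": "v1", "served": True, "storage": True},
]
# A mistake: two versions both claim to be the storage version.
broken = [
    {"name": "v1alpha1", "served": True, "storage": True},
    {"name": "v1", "served": True, "storage": True},
]
print(check_versions(migrating))
print(check_versions(broken))
```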

CRD = vocabulary, Operator = behavior.

Registering and using the CRD

# Register the CRD
kubectl apply -f database-cluster-crd.yaml
# Verify registration
kubectl get crd databaseclusters.infra.example.com
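
Registration is not instantaneous: the CRD object can exist slightly before the API server starts serving the new endpoint, and the CRD reports readiness through an Established condition in its status. The sketch below shows one way to check it from Python. is_established works on plain dicts; crd_is_established is a hypothetical wrapper that assumes the kubernetes package is installed and a kubeconfig is available.

```python
def is_established(conditions) -> bool:
    """True if the CRD's status.conditions include Established=True."""
    return any(c.get("type") == "Established" and c.get("status") == "True"
               for c in conditions or [])


def crd_is_established(name: str) -> bool:
    """Read the CRD from the current kubeconfig context and check it.

    Needs `pip install kubernetes` and access to a cluster.
    """
    from kubernetes import client, config  # imported lazily on purpose

    config.load_kube_config()
    crd = client.ApiextensionsV1Api().read_custom_resource_definition(name)
    raw = crd.status.conditions if crd.status and crd.status.conditions else []
    return is_established([{"type": c.type, "status": c.status} for c in raw])


# Offline example with the shape the API server reports.
sample = [
    {"type": "NamesAccepted", "status": "True"},
    {"type": "Established", "status": "True"},
]
print(is_established(sample))
```

From the shell, kubectl wait --for condition=established crd/databaseclusters.infra.example.com blocks until the same condition is true, which is handy in CI before creating instances.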

After registration, create an instance like any native resource:

apiVersion: infra.example.com/v1alpha1
kind: DatabaseCluster
metadata:
  name: production-postgres
  namespace: databases
spec:
  engine: postgres
  replicas: 3
  region: ap-south-1
  storageGB: 100
  version: "16.2"

Custom columns defined in the CRD allow kubectl get dbc -n databases to display concise information.

Interacting with the CRD via the Python client

# pip install kubernetes
from kubernetes import client, config

# Load credentials from the local kubeconfig
# (use config.load_incluster_config() when running inside a pod)
config.load_kube_config()
custom_api = client.CustomObjectsApi()

GROUP = "infra.example.com"
VERSION = "v1alpha1"
PLURAL = "databaseclusters"
NAMESPACE = "databases"

# List resources
clusters = custom_api.list_namespaced_custom_object(
    group=GROUP, version=VERSION, namespace=NAMESPACE, plural=PLURAL
)
for c in clusters["items"]:
    phase = c.get("status", {}).get("phase", "Unknown")
    print(f"{c['metadata']['name']}: phase={phase}, replicas={c['spec']['replicas']}")

# Create a new resource
new_cluster = {
    "apiVersion": f"{GROUP}/{VERSION}",
    "kind": "DatabaseCluster",
    "metadata": {"name": "staging-mysql", "namespace": NAMESPACE},
    "spec": {"engine": "mysql", "replicas": 1, "region": "ap-south-1", "storageGB": 20, "version": "8.0"},
}
custom_api.create_namespaced_custom_object(
    group=GROUP, version=VERSION, namespace=NAMESPACE, plural=PLURAL, body=new_cluster
)

# Patch status (normally done by the operator)
status_patch = {"status": {"phase": "Provisioning"}}
custom_api.patch_namespaced_custom_object_status(
    group=GROUP, version=VERSION, namespace=NAMESPACE, plural=PLURAL,
    name="staging-mysql", body=status_patch,
)

This low‑level API gives full control before a higher‑level operator framework (e.g., kopf) abstracts the reconciliation loop.
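
Stripped of cluster access, the reconciliation loop is a comparison between spec and what actually exists, followed by a decision. The sketch below is deliberately simplified. observed_replicas stands in for whatever an operator would discover about the real database, and the phase names Scaling and Ready are assumptions for illustration (only Provisioning appears in the example above).

```python
def reconcile(spec: dict, observed_replicas: int) -> dict:
    """Compare desired and observed state; return the action and next status."""
    desired = spec["replicas"]
    if observed_replicas < desired:
        return {"action": f"add {desired - observed_replicas} replica(s)",
                "status": {"phase": "Provisioning"}}
    if observed_replicas > desired:
        return {"action": f"remove {observed_replicas - desired} replica(s)",
                "status": {"phase": "Scaling"}}
    # Nothing to do: reconciling an already converged object is a no-op.
    return {"action": "none", "status": {"phase": "Ready"}}


spec = {"engine": "mysql", "replicas": 3}
print(reconcile(spec, observed_replicas=1))
print(reconcile(spec, observed_replicas=3))
```

The property worth noticing is idempotence: running reconcile again on a converged object changes nothing, which is what lets an operator safely re-run it on every event.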

Common pitfalls

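A missing or overly permissive schema lets the API accept malformed objects (apiextensions.k8s.io/v1 requires a schema, but a bare type: object that preserves unknown fields validates nothing).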

Missing status subresource allows users to overwrite system state.

Destructive schema changes without proper migration break existing resources.

A CRD without an operator performs no action.

Production‑grade DatabaseCluster CRD

Key enhancements:

Required fields: engine, replicas, region

Validation rules: enums for engine, minimum/maximum for replicas, defaults for storageGB

Status subresource with phase, endpoint, and a conditions array

Additional printer columns for Replicas, Region, Phase, Age

# database-cluster-crd.yaml
apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
  name: databaseclusters.infra.example.com
  annotations:
    controller-gen.kubebuilder.io/version: v0.14.0
spec:
  group: infra.example.com
  scope: Namespaced
  names:
    plural: databaseclusters
    singular: databasecluster
    kind: DatabaseCluster
    shortNames:
    - dbc
  versions:
  - name: v1alpha1
    served: true
    storage: true
    subresources:
      status: {}
    additionalPrinterColumns:
    - name: Replicas
      type: integer
      jsonPath: .spec.replicas
    - name: Region
      type: string
      jsonPath: .spec.region
    - name: Phase
      type: string
      jsonPath: .status.phase
    - name: Age
      type: date
      jsonPath: .metadata.creationTimestamp
    schema:
      openAPIV3Schema:
        type: object
        properties:
          spec:
            type: object
            required: ["engine", "replicas", "region"]
            properties:
              engine:
                type: string
                enum: ["postgres", "mysql", "mariadb"]
                description: "Database engine to use"
              replicas:
                type: integer
                minimum: 1
                maximum: 9
                description: "Number of database replicas"
              region:
                type: string
                description: "AWS region or datacenter location"
              storageGB:
                type: integer
                minimum: 10
                default: 20
                description: "Storage size in GB"
              version:
                type: string
                description: "Database engine version"
          status:
            type: object
            properties:
              phase:
                type: string
                description: "Current lifecycle phase"
              endpoint:
                type: string
                description: "Connection endpoint when Ready"
              conditions:
                type: array
                items:
                  type: object
                  properties:
                    type:
                      type: string
                    status:
                      type: string
                    lastTransitionTime:
                      type: string
                      format: date-time
                    reason:
                      type: string
                    message:
                      type: string

Why schema matters

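In apiextensions.k8s.io/v1 every version must carry a structural schema, but a schema that is too loose (a bare type: object with x-kubernetes-preserve-unknown-fields: true, for instance) still accepts any payload, turning the CRD into a bypass for validation. With a precise schema, the API server validates objects before they are stored, acting as the first line of defense.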
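
To make the schema's effect tangible, the sketch below mirrors in plain Python the rules the DatabaseCluster spec schema declares: required fields, the engine enum, the replicas bounds, and the storageGB minimum and default. It illustrates the rules only. It is not how the API server implements validation, and the error strings are invented for this example.

```python
ENGINES = ("postgres", "mysql", "mariadb")


def _is_int(value) -> bool:
    # bool is a subclass of int in Python, so exclude it explicitly
    return isinstance(value, int) and not isinstance(value, bool)


def apply_defaults(spec: dict) -> dict:
    """Fill in storageGB: 20 when the user left it out."""
    defaulted = dict(spec)
    defaulted.setdefault("storageGB", 20)
    return defaulted


def validate_spec(spec: dict) -> list:
    """Return a list of violations; an empty list means the spec is valid."""
    errors = []
    for field in ("engine", "replicas", "region"):
        if field not in spec:
            errors.append(f"spec.{field}: required")
    if "engine" in spec and spec["engine"] not in ENGINES:
        errors.append(f"spec.engine: must be one of {list(ENGINES)}")
    if "replicas" in spec:
        if not _is_int(spec["replicas"]):
            errors.append("spec.replicas: must be an integer")
        elif not 1 <= spec["replicas"] <= 9:
            errors.append("spec.replicas: must be between 1 and 9")
    if "storageGB" in spec:
        if not _is_int(spec["storageGB"]):
            errors.append("spec.storageGB: must be an integer")
        elif spec["storageGB"] < 10:
            errors.append("spec.storageGB: must be >= 10")
    return errors


good = apply_defaults({"engine": "postgres", "replicas": 3, "region": "ap-south-1"})
bad = {"engine": "oracle", "replicas": 0}
print(validate_spec(good))
print(validate_spec(bad))
```

Every check in this function is something the team would otherwise have to re-implement in the operator, or discover at 2 AM when a malformed object reaches it.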

Status subresource importance

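The status subresource enforces a separation of concerns: users declare the desired state in spec, while the operator writes the actual state to status through the dedicated /status endpoint. Writes to the main resource ignore status, which prevents accidental overwrites of system state.

With RBAC granting databaseclusters/status only to the operator, the operator can modify status and users cannot.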
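
The write semantics of the status subresource can be simulated in a few lines. This is a toy model, not API server code: update_main and update_status are hypothetical names that imitate what happens to an object written through the main endpoint versus the /status endpoint when the subresource is enabled.

```python
import copy


def update_main(stored: dict, incoming: dict) -> dict:
    """Write to the main resource endpoint: changes to status are ignored."""
    result = copy.deepcopy(incoming)
    if "status" in stored:
        result["status"] = copy.deepcopy(stored["status"])
    else:
        result.pop("status", None)
    return result


def update_status(stored: dict, incoming: dict) -> dict:
    """Write to the /status endpoint: only the status stanza is taken."""
    result = copy.deepcopy(stored)
    if "status" in incoming:
        result["status"] = copy.deepcopy(incoming["status"])
    else:
        result.pop("status", None)
    return result


stored = {"spec": {"replicas": 3}, "status": {"phase": "Ready"}}

# A user edits spec and also tries to rewrite status in the same request.
user_write = {"spec": {"replicas": 5}, "status": {"phase": "Hacked"}}
after_user = update_main(stored, user_write)

# The operator reports progress; its spec change is dropped.
operator_write = {"spec": {"replicas": 99}, "status": {"phase": "Scaling"}}
after_operator = update_status(after_user, operator_write)

print(after_user)
print(after_operator)
```

Who is allowed to call each endpoint is a separate question answered by RBAC, since the /status endpoint is authorized as its own resource (databaseclusters/status).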

Dashboard‑style output with additional printer columns

Adding additionalPrinterColumns turns kubectl get dbc into a lightweight dashboard, showing columns such as REPLICAS, REGION, PHASE and AGE without a separate UI.

kubectl now displays concise, operator‑enriched information.
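
Each printer column is just a name paired with a path into the object. The sketch below imitates that lookup for the Replicas, Region, and Phase columns. It handles only simple dotted paths such as .spec.replicas (real JSONPath supports more), returns an empty string for missing fields as a choice of this sketch, and leaves out Age because that column needs the current time.

```python
COLUMNS = [
    ("REPLICAS", ".spec.replicas"),
    ("REGION", ".spec.region"),
    ("PHASE", ".status.phase"),
]


def lookup(obj: dict, json_path: str) -> str:
    """Resolve a simple dotted path; return '' when any segment is missing."""
    current = obj
    for key in json_path.strip(".").split("."):
        if not isinstance(current, dict) or key not in current:
            return ""
        current = current[key]
    return str(current)


def render_row(obj: dict) -> list:
    """Return the NAME cell followed by one cell per printer column."""
    return [obj["metadata"]["name"]] + [lookup(obj, path) for _, path in COLUMNS]


cluster = {
    "metadata": {"name": "production-postgres"},
    "spec": {"engine": "postgres", "replicas": 3, "region": "ap-south-1"},
    # no status yet: the operator has not reported anything
}
header = ["NAME"] + [name for name, _ in COLUMNS]
row = render_row(cluster)
print(header)
print(row)
```

The empty PHASE cell for an object without status is a useful reminder that the last column only becomes meaningful once an operator starts writing status.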

Common errors that can break a system

No meaningful schema – invalid objects are accepted.

No status subresource – users can overwrite operational state.

Destructive schema changes – existing resources become invalid.

Missing operator logic – the CRD does nothing on its own.

These are design issues in the API, not bugs in Kubernetes itself.
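
Destructive schema changes are the pitfall most amenable to an automated check. The sketch below compares two versions of an object schema, one level deep, and reports changes that could invalidate or silently alter existing resources: newly required fields, removed fields, changed types, narrowed enums, and tightened bounds. breaking_changes is a hypothetical helper and the list of rules is a starting point, not an exhaustive compatibility checker.

```python
def breaking_changes(old: dict, new: dict) -> list:
    """List changes from old to new that existing objects may not satisfy."""
    issues = []
    added_required = set(new.get("required", [])) - set(old.get("required", []))
    for name in sorted(added_required):
        issues.append(f"{name}: newly required")
    new_props = new.get("properties", {})
    for name, old_p in old.get("properties", {}).items():
        if name not in new_props:
            issues.append(f"{name}: removed from schema")
            continue
        new_p = new_props[name]
        if old_p.get("type") != new_p.get("type"):
            issues.append(f"{name}: type changed")
        if "enum" in new_p:
            dropped = sorted(set(old_p.get("enum", [])) - set(new_p["enum"]))
            if dropped:
                issues.append(f"{name}: enum values dropped: {dropped}")
            elif "enum" not in old_p:
                issues.append(f"{name}: enum restriction added")
        if new_p.get("maximum", float("inf")) < old_p.get("maximum", float("inf")):
            issues.append(f"{name}: maximum lowered")
        if new_p.get("minimum", float("-inf")) > old_p.get("minimum", float("-inf")):
            issues.append(f"{name}: minimum raised")
    return issues


old_spec = {
    "required": ["engine", "replicas", "region"],
    "properties": {
        "engine": {"type": "string", "enum": ["postgres", "mysql", "mariadb"]},
        "replicas": {"type": "integer", "minimum": 1, "maximum": 9},
    },
}
new_spec = {
    "required": ["engine", "replicas", "region", "version"],
    "properties": {
        "engine": {"type": "string", "enum": ["postgres", "mysql"]},
        "replicas": {"type": "integer", "minimum": 1, "maximum": 5},
    },
}
issues = breaking_changes(old_spec, new_spec)
for issue in issues:
    print(issue)
```

When a check like this reports anything, the usual alternative is to introduce the change in a new served version and migrate objects, rather than editing the existing version in place.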

Design exercise – DatabaseBackup CRD

Example user manifest:

apiVersion: backup.example.com/v1
kind: DatabaseBackup
metadata:
  name: my-postgres-daily
spec:
  database: "postgres-prod"
  schedule: "0 2 * * *"   # Daily at 2 AM
  retention: 7               # Keep 7 backups

Key design considerations derived from the article:

Desired state : the user wants a backup to run on the specified schedule and retain the defined number of snapshots.

Actual state : the operator must track the number of existing backups, the timestamp of the last backup, its success/failure status, and any other relevant metadata.

Optional fields that may be added: storage location (S3, GCS, local), backup type (full or incremental), notification settings for success/failure, and any additional metadata required by the operator.
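
One possible starting point for the exercise is to write down the status the operator would report and the retention decision it would make. The sketch below is an assumption‑laden illustration: the status field names (backupCount, lastBackupTime, lastResult) are invented for this example, and backups are represented only by their timestamps.

```python
from datetime import datetime, timezone


def backups_to_prune(backup_times: list, retention: int) -> list:
    """Return the timestamps beyond the retention count, oldest first."""
    if retention < 0:
        raise ValueError("retention must not be negative")
    ordered = sorted(backup_times)
    excess = len(ordered) - retention
    return ordered[:excess] if excess > 0 else []


def build_status(backup_times: list, last_succeeded: bool) -> dict:
    """Summarize actual state in a status stanza (hypothetical field names)."""
    latest = max(backup_times) if backup_times else None
    return {
        "backupCount": len(backup_times),
        "lastBackupTime": latest.isoformat() if latest else None,
        "lastResult": "Succeeded" if last_succeeded else "Failed",
    }


# Nine daily backups at 02:00 UTC, with retention: 7 as in the manifest.
times = [datetime(2024, 1, day, 2, 0, tzinfo=timezone.utc) for day in range(1, 10)]
to_delete = backups_to_prune(times, retention=7)
status = build_status(times, last_succeeded=True)
print([t.isoformat() for t in to_delete])
print(status)
```

Writing the status shape first tends to expose design questions early, such as whether failed backups count toward retention.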
