
Mastering Multi‑Cluster Management with Open Cluster Management (OCM)

Open Cluster Management (OCM) is an open‑source, hub‑agent solution that simplifies lifecycle, configuration, and policy management for multiple Kubernetes clusters across hybrid and multi‑cloud environments. This article covers its architecture, core primitives, advantages, and real‑world deployments at Ant Group and Alibaba Cloud.

Alibaba Cloud Native

Introduction

Open Cluster Management (OCM) is an open‑source project that provides a unified lifecycle‑management platform for resources, applications, configurations, and policies across many Kubernetes clusters. It is a CNCF Sandbox project.

History of Multi‑Cluster Management

Early federation efforts such as KubeFed v1 (Red Hat, Google) and KubeFed v2 (Red Hat, IBM) highlighted common requirements:

Geographic distribution across heterogeneous infrastructures.

Scalability limits of a single cluster (etcd latency, node count).

Disaster‑recovery and isolation, including multi‑tenant isolation via API Priority and Fairness.

OCM Core Functions and Architecture

OCM abstracts five essential capabilities for any multi‑cluster manager:

Definition of a managed cluster.

Selection of one or more clusters via placement policies.

Distribution of configurations or workloads to the selected clusters.

Governance of user access to clusters.

Deployment of management add‑ons to clusters.

OCM follows a hub‑agent model built on the following primitive APIs:

ManagedCluster API: defines managed clusters; OCM installs a Klusterlet agent in each cluster for registration and lifecycle handling.

Placement API: decides where configurations or workloads should be scheduled; results are stored in the PlacementDecision API.

ManifestWork API: describes the resources to be applied to a target cluster.

ManagedClusterSet API: groups clusters and defines access boundaries.

ManagedClusterAddOn API: governs how add‑ons are deployed and how they communicate securely with the hub.
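To make two of these primitives concrete, here is a minimal sketch that builds a ManifestWork and a Placement as plain Python dictionaries and prints them as JSON. The resource names (demo-work, demo-placement, cluster1), the ConfigMap payload, and the two helper functions are hypothetical. The API group versions shown are the ones commonly seen in OCM examples and may differ between releases, so treat them as assumptions to check against your installation.

```python
import json


def wrap_in_manifest_work(name, cluster_name, manifests):
    """Wrap Kubernetes manifests in a ManifestWork for one managed cluster.

    On the hub, a ManifestWork lives in the namespace named after the target
    managed cluster; the agent in that cluster applies the manifests.
    """
    return {
        "apiVersion": "work.open-cluster-management.io/v1",  # assumed version
        "kind": "ManifestWork",
        "metadata": {"name": name, "namespace": cluster_name},
        "spec": {"workload": {"manifests": list(manifests)}},
    }


def label_placement(name, namespace, match_labels, number_of_clusters):
    """Build a Placement that asks for N clusters carrying the given labels."""
    selector = {"labelSelector": {"matchLabels": dict(match_labels)}}
    return {
        "apiVersion": "cluster.open-cluster-management.io/v1beta1",  # assumed version
        "kind": "Placement",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "numberOfClusters": number_of_clusters,
            "predicates": [{"requiredClusterSelector": selector}],
        },
    }


# Hypothetical payload and names, for illustration only.
config_map = {
    "apiVersion": "v1",
    "kind": "ConfigMap",
    "metadata": {"name": "demo-config", "namespace": "default"},
    "data": {"greeting": "hello"},
}
work = wrap_in_manifest_work("demo-work", "cluster1", [config_map])
placement = label_placement("demo-placement", "default", {"env": "prod"}, 2)

if __name__ == "__main__":
    print(json.dumps(work, indent=2))
    print(json.dumps(placement, indent=2))
```

The printed JSON could be saved to a file and applied against a hub cluster with kubectl, assuming the OCM CRDs are installed there.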

The architecture consists of three main components:

Registration: cluster registration, lifecycle, and add‑on management.

Placement: workload scheduling across clusters.

Work: resource distribution.

Developers and SREs can use these primitives to build custom multi‑cluster tools.
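As a rough illustration of how a custom tool might chain the Placement and Work steps, the following toy model selects clusters by label and then produces one ManifestWork‑shaped dictionary per selected cluster. It runs entirely in memory and does not call OCM. The Cluster class, select_clusters, fan_out, and the inventory are all hypothetical, and the alphabetical ordering is a simplification rather than how OCM actually ranks clusters.

```python
from dataclasses import dataclass, field


@dataclass
class Cluster:
    name: str
    labels: dict = field(default_factory=dict)


def select_clusters(clusters, match_labels, limit=None):
    """Toy stand-in for Placement: names of clusters whose labels all match."""
    matched = sorted(
        c.name
        for c in clusters
        if all(c.labels.get(k) == v for k, v in match_labels.items())
    )
    return matched if limit is None else matched[:limit]


def fan_out(work_name, manifests, decisions):
    """Toy stand-in for Work: one ManifestWork-shaped dict per selected cluster."""
    return [
        {
            "kind": "ManifestWork",
            "metadata": {"name": work_name, "namespace": cluster_name},
            "spec": {"workload": {"manifests": list(manifests)}},
        }
        for cluster_name in decisions
    ]


# Hypothetical inventory, for illustration only.
inventory = [
    Cluster("cluster-a", {"env": "prod", "region": "eu"}),
    Cluster("cluster-b", {"env": "dev", "region": "eu"}),
    Cluster("cluster-c", {"env": "prod", "region": "us"}),
]
decisions = select_clusters(inventory, {"env": "prod"})
works = fan_out("demo-work", [{"kind": "ConfigMap"}], decisions)

print(decisions)  # ['cluster-a', 'cluster-c']
print([w["metadata"]["namespace"] for w in works])  # ['cluster-a', 'cluster-c']
```

Keeping selection and distribution as separate functions mirrors the split between the Placement and Work components: the selection logic can change without touching how resources are delivered.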

OCM architecture diagram

Key Advantages

Modular Design

OCM’s core services provide cluster‑metadata abstraction, while optional modules (e.g., cluster grouping, resource distribution) can be added or removed, enabling incremental adoption.

Broad Compatibility

OCM integrates with third‑party tooling: it supports Helm chart deployment, and its add‑on framework allows extensions such as Submariner to provide multi‑cluster networking.

Upcoming Features

ManagedClusterAction: atomic commands issued to individual clusters.

ManagedClusterView: projection of resources from managed clusters into the hub for dynamic decision‑making.

Real‑World Deployments

Ant Group

Ant Group uses OCM to manage dozens of on‑premise and cloud clusters. Notable capabilities include:

Certificate‑free registration: a pull‑based model eliminates the need to store cluster certificates.

Automated cluster registration through OCM’s approval workflow.

Automated resource install/uninstall pipelines for both cluster‑scoped and namespace‑scoped resources.
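The registration and approval steps in this list can be pictured with a small in‑memory model. In the pull‑based flow the agent in the managed cluster initiates the join request and the hub decides whether to accept it, so the hub never needs to hold the managed cluster's credentials. The sketch below is an assumption‑laden toy: the Hub class, the name‑prefix approval policy, and the cluster names are hypothetical, and the hub_accepts_client field only loosely mirrors the hubAcceptsClient field on a ManagedCluster. It is not Ant Group's actual workflow.

```python
from dataclasses import dataclass


@dataclass
class JoinRequest:
    cluster_name: str


@dataclass
class HubRecord:
    cluster_name: str
    hub_accepts_client: bool = False  # loosely mirrors spec.hubAcceptsClient


class Hub:
    def __init__(self, allowed_prefixes):
        self.allowed_prefixes = tuple(allowed_prefixes)
        self.clusters = {}

    def receive(self, request):
        """Record a join request initiated by the agent; nothing is accepted yet."""
        record = HubRecord(request.cluster_name)
        self.clusters[request.cluster_name] = record
        return record

    def auto_approve(self, cluster_name):
        """Hypothetical policy: accept clusters whose name has an allowed prefix."""
        record = self.clusters[cluster_name]
        record.hub_accepts_client = cluster_name.startswith(self.allowed_prefixes)
        return record.hub_accepts_client


# Hypothetical cluster names, for illustration only.
hub = Hub(["prod-", "staging-"])
hub.receive(JoinRequest("prod-eu-1"))
hub.receive(JoinRequest("unknown-1"))
approved = [name for name in sorted(hub.clusters) if hub.auto_approve(name)]

print(approved)  # ['prod-eu-1']
```

A real automation would replace the prefix check with whatever trust signal the organization uses, and would act on the hub's actual approval mechanism rather than an in‑memory flag.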

During large‑scale events such as the Double 11 shopping festival, OCM automatically provisions and tears down clusters.

Alibaba Cloud

Alibaba Cloud uses OCM as a core dependency of KubeVela, a CNCF‑hosted cloud‑native application platform. KubeVela provides end‑to‑end application delivery, canary (gray) releases, autoscaling, and observability across multiple clusters. OCM handles cluster registration, governance, and resource distribution, which supports hybrid‑cloud deployments.

Key use cases include:

One‑click hybrid environment provisioning (public ACK cluster + on‑premise cluster).

Multi‑cluster micro‑service application delivery using OCM’s distribution policies.

Community and Technical References

Source code and issue tracking are hosted at: https://github.com/open-cluster-management-io

Relevant Kubernetes enhancement proposals:

KEP‑2149 Cluster ID:

https://github.com/Kubernetes/enhancements/tree/master/keps/sig-multicluster/2149-clusterid

KEP‑1645 Multi‑Cluster Services API (clusterset):

https://github.com/Kubernetes/enhancements/tree/master/keps/sig-multicluster/1645-multi-cluster-services-api

Work API:

https://github.com/Kubernetes-sigs/work-api
OCM community diagram