Mastering Multi‑Cluster Management with Open Cluster Management (OCM)
Open Cluster Management (OCM) is an open‑source, hub‑agent solution that simplifies lifecycle, configuration, and policy management for multiple Kubernetes clusters across hybrid and multi‑cloud environments. This article covers its architecture, core primitives, advantages, and real‑world deployments at Ant Group and Alibaba Cloud.
Introduction
Open Cluster Management (OCM) is an open‑source project that provides a unified lifecycle‑management platform for resources, applications, configurations, and policies across many Kubernetes clusters. It is a CNCF Sandbox project.
History of Multi‑Cluster Management
Early federation efforts such as KubeFed v1 (Red Hat, Google) and KubeFed v2 (Red Hat, IBM) highlighted common requirements:
Geographic distribution across heterogeneous infrastructures.
Scalability limits of a single cluster (etcd latency, node count).
Disaster‑recovery and isolation, including multi‑tenant isolation via API Priority and Fairness.
OCM Core Functions and Architecture
OCM abstracts five essential capabilities for any multi‑cluster manager:
Definition of a managed cluster.
Selection of one or more clusters via placement policies.
Distribution of configurations or workloads to the selected clusters.
Governance of user access to clusters.
Deployment of management add‑ons to clusters.
OCM follows a hub‑agent model built on the following primitive APIs:
ManagedCluster API : defines managed clusters; OCM installs a Klusterlet agent in each cluster for registration and lifecycle handling.
Placement API : decides where configurations or workloads should be scheduled; results are stored in the PlacementDecision API.
ManifestWork API : describes the resources to be applied to a target cluster.
ManagedClusterSet API : groups clusters and defines access boundaries.
ManagedClusterAddOn API : governs how add‑ons are deployed and how they communicate securely with the hub.
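To make the ManifestWork primitive more concrete, here is a minimal sketch that builds a ManifestWork‑shaped object as a plain Python dictionary and prints it as JSON. The API group and version (`work.open-cluster-management.io/v1`), the `spec.workload.manifests` layout, and the convention that the object lives in a hub namespace named after the target cluster are assumptions not stated in this article, so check them against the OCM API reference. The names `cluster1`, `demo-work`, and `demo-config` are hypothetical.

```python
import json


def build_manifest_work(cluster_name, work_name, manifests):
    """Return a ManifestWork-shaped dict that targets one managed cluster.

    Assumption: the hub namespace carries the same name as the managed cluster.
    """
    return {
        "apiVersion": "work.open-cluster-management.io/v1",  # assumed group/version
        "kind": "ManifestWork",
        "metadata": {"name": work_name, "namespace": cluster_name},
        "spec": {"workload": {"manifests": list(manifests)}},
    }


# Hypothetical payload: a ConfigMap to be applied on the managed cluster.
config_map = {
    "apiVersion": "v1",
    "kind": "ConfigMap",
    "metadata": {"name": "demo-config", "namespace": "default"},
    "data": {"greeting": "hello"},
}

work = build_manifest_work("cluster1", "demo-work", [config_map])
print(json.dumps(work, indent=2))
```

The printed JSON could be saved to a file and applied on the hub with standard Kubernetes tooling, provided the assumed fields match the installed OCM version.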
The architecture consists of three main components:
Registration : cluster registration, lifecycle, and add‑on management.
Placement : workload scheduling across clusters.
Work : resource distribution.
Developers and SREs can use these primitives to build custom multi‑cluster tools.
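The way registration, placement, and work fit together can be sketched with a small in‑memory model. This is a toy illustration, not OCM's scheduler: the `Cluster` and `Placement` classes, the label‑matching rule, and the sort‑then‑truncate selection are all hypothetical simplifications chosen to show the data flow from accepted clusters, to a list of decisions, to one work item per selected cluster.

```python
from dataclasses import dataclass, field


@dataclass
class Cluster:
    name: str
    cluster_set: str
    labels: dict = field(default_factory=dict)
    accepted: bool = False  # set once the hub has accepted the registration


@dataclass
class Placement:
    cluster_sets: list
    match_labels: dict
    number_of_clusters: int


def decide(placement, clusters):
    """Toy placement step: filter eligible clusters, sort by name, truncate."""
    eligible = [
        c.name
        for c in clusters
        if c.accepted
        and c.cluster_set in placement.cluster_sets
        and all(c.labels.get(k) == v for k, v in placement.match_labels.items())
    ]
    return sorted(eligible)[: placement.number_of_clusters]


def distribute(decisions, manifest):
    """Toy work step: produce one work item per selected cluster."""
    return {name: manifest for name in decisions}


clusters = [
    Cluster("cluster-a", "prod", {"region": "eu"}, accepted=True),
    Cluster("cluster-b", "prod", {"region": "us"}, accepted=True),
    Cluster("cluster-c", "prod", {"region": "eu"}, accepted=False),
    Cluster("cluster-d", "dev", {"region": "eu"}, accepted=True),
]
placement = Placement(cluster_sets=["prod"], match_labels={"region": "eu"}, number_of_clusters=2)

decisions = decide(placement, clusters)
works = distribute(decisions, {"kind": "ConfigMap", "metadata": {"name": "demo"}})
print(decisions)  # ['cluster-a']
```

Only `cluster-a` is selected here: `cluster-b` fails the label match, `cluster-c` has not been accepted, and `cluster-d` belongs to a different cluster set.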
Key Advantages
Modular Design
OCM’s core services provide cluster‑metadata abstraction, while optional modules (e.g., cluster grouping, resource distribution) can be added or removed, enabling incremental adoption.
Broad Compatibility
OCM integrates with third‑party tooling such as Helm chart deployment, and its add‑on framework allows extensions like Submariner for multi‑cluster networking.
Upcoming Features
ManagedClusterAction : atomic commands issued to individual clusters.
ManagedClusterView : projection of resources from managed clusters into the hub for dynamic decision‑making.
Real‑World Deployments
Ant Group
Ant Group uses OCM to manage dozens of on‑premise and cloud clusters. Notable capabilities include:
Certificate‑free registration : a pull‑based model eliminates the need to store cluster certificates.
Automated cluster registration through OCM’s approval workflow.
Automated resource install/uninstall pipelines for both cluster‑scoped and namespace‑scoped resources.
During large‑scale events such as Double‑Eleven, OCM automatically provisions and tears down clusters.
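The registration approval workflow mentioned above can be sketched as a small state model. This is a hypothetical illustration, not Ant Group's pipeline or OCM's controller code: it assumes that a cluster counts as joined once the hub has both approved the agent's certificate request and accepted the cluster, and the prefix‑based auto‑approval rule is invented for the example.

```python
from dataclasses import dataclass


@dataclass
class JoinRequest:
    cluster_name: str
    csr_approved: bool = False        # assumed step: hub approves the agent's certificate request
    hub_accepts_client: bool = False  # assumed step: hub accepts the cluster itself

    @property
    def joined(self):
        return self.csr_approved and self.hub_accepts_client


class Hub:
    """Toy hub that records join requests initiated by agents (pull model)."""

    def __init__(self, auto_approve_prefixes=()):
        self.requests = {}
        # Hypothetical policy: auto-approve clusters whose name has a trusted prefix.
        self.auto_approve_prefixes = tuple(auto_approve_prefixes)

    def receive(self, cluster_name):
        request = JoinRequest(cluster_name)
        self.requests[cluster_name] = request
        # str.startswith accepts a tuple; an empty tuple never matches.
        if cluster_name.startswith(self.auto_approve_prefixes):
            self.approve(cluster_name)
        return request

    def approve(self, cluster_name):
        request = self.requests[cluster_name]
        request.csr_approved = True
        request.hub_accepts_client = True


hub = Hub(auto_approve_prefixes=["prod-"])
first = hub.receive("prod-east-1")
second = hub.receive("sandbox-1")
print(first.joined, second.joined)  # True False

hub.approve("sandbox-1")  # manual approval by an administrator
print(second.joined)  # True
```

Because the agent initiates the request, the toy hub never needs to hold credentials for the managed cluster, which mirrors the certificate‑free property described in the list above.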
Alibaba Cloud
Alibaba Cloud leverages OCM as a core dependency of KubeVela, a CNCF‑hosted cloud‑native application platform. KubeVela provides end‑to‑end application delivery, gray‑release, autoscaling, and observability across multiple clusters. OCM handles cluster registration, governance, and resource distribution, enabling seamless hybrid‑cloud deployments.
Key use cases include:
One‑click hybrid environment provisioning (public ACK cluster + on‑premise cluster).
Multi‑cluster micro‑service application delivery using OCM’s distribution policies.
Community and Technical References
Source code and issue tracking are hosted at: https://github.com/open-cluster-management-io
Relevant Kubernetes enhancement proposals:
KEP‑2149 Cluster ID:
https://github.com/Kubernetes/enhancements/tree/master/keps/sig-multicluster/2149-clusterid
KEP‑1645 Multi‑Cluster Services API (clusterset):
https://github.com/Kubernetes/enhancements/tree/master/keps/sig-multicluster/1645-multi-cluster-services-api
Work API:
https://github.com/Kubernetes-sigs/work-api
Alibaba Cloud Native