Understanding Kubernetes Federation: kubefed and Karmada Multi‑Cluster Management
This article explains why Kubernetes single‑cluster scalability is limited to about 5,000 nodes, introduces the concept of multi‑cluster federation, compares the legacy kubefed project with the actively maintained Karmada solution, and shows how policies and replica‑scheduling enable flexible cross‑AZ deployments and failover.
Kubernetes has long claimed that a single cluster can support up to 5,000 nodes, and there are no immediate plans to increase this limit; for clusters larger than 5,000 nodes the recommended approach is to use multi‑cluster federation.
In cloud environments, an Availability Zone (AZ) may contain tens of thousands of nodes, making it impractical to manage such scale with a single control plane, so administrators must operate multiple clusters and aggregate them.
Federation can be viewed as a higher‑level abstraction that "packs" several independent clusters into a single logical cluster, allowing users to interact with a unified API without worrying about the underlying cluster boundaries.
kubefed is the original Kubernetes federation project, hosted under the official Kubernetes organization for almost four years. Its architecture introduces three key concepts: Template (the base resource definition), Placement (the clusters a resource should be deployed to), and Overrides (cluster-specific modifications). Users create federated resource types such as FederatedDeployment, and the control plane renders the corresponding native resource (a Deployment) into each target cluster.
Example FederatedDeployment with overrides:
```yaml
kind: FederatedDeployment
...
spec:
  ...
  overrides:
    # Apply overrides to cluster1
    - clusterName: cluster1
      clusterOverrides:
        # Set the replicas field to 5
        - path: "/spec/replicas"
          value: 5
        # Set the image of the first container
        - path: "/spec/template/spec/containers/0/image"
          value: "nginx:1.17.0-alpine"
        # Ensure the annotation "foo: bar" exists
        - path: "/metadata/annotations"
          op: "add"
          value:
            foo: bar
        # Remove an annotation
        - path: "/metadata/annotations/foo"
          op: "remove"
        # Add an argument "-q" at index 0 of the args list
        - path: "/spec/template/spec/containers/0/args/0"
          op: "add"
          value: "-q"
```

kubefed also provides a ReplicaSchedulingPreference resource to distribute replicas across clusters based on weight, enabling workload migration when a cluster becomes resource-constrained.
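A weighted distribution can be expressed like this; a minimal sketch based on the kubefed user guide, where the resource name, namespace, and cluster names are illustrative:

```yaml
apiVersion: scheduling.kubefed.io/v1alpha1
kind: ReplicaSchedulingPreference
metadata:
  # Must match the name/namespace of the target FederatedDeployment
  name: test-deployment
  namespace: test-ns
spec:
  targetKind: FederatedDeployment
  totalReplicas: 9
  clusters:
    cluster1:
      weight: 1   # receives ~1/3 of the replicas
    cluster2:
      weight: 2   # receives ~2/3 of the replicas
```

The controller rewrites the per-cluster replica counts to honor the weights, so if cluster1 cannot schedule its share, replicas can be rebalanced toward cluster2.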
Recent versions of kubefed have removed the federated Ingress feature; external DNS‑based solutions are now preferred for cross‑cluster service discovery.
Karmada is the actively maintained successor to kubefed. It retains all native Kubernetes resources and adds two new policy resources: PropagationPolicy and OverridePolicy. Its control plane consists of an API Server, Controller Manager, and Scheduler, mirroring a standard Kubernetes control plane.
Resource flow in Karmada:
1. Templates such as Deployment, Service, and ConfigMap are matched by a PropagationPolicy, which creates a ResourceBinding for each target cluster.
2. An OverridePolicy modifies the bound resources (e.g., adding annotations) and stores the final definitions in a Work object.
3. The Work object is synced to the member cluster, where the native controller manager creates the actual workloads.
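To make the flow concrete, this is a simplified sketch of what a Work object looks like in the control plane; the resource names, the `foo: bar` annotation, and the replica count are illustrative:

```yaml
apiVersion: work.karmada.io/v1alpha1
kind: Work
metadata:
  name: nginx-deployment
  # One execution namespace exists per member cluster
  namespace: karmada-es-member1
spec:
  workload:
    # The fully rendered native resources to apply in the member cluster
    manifests:
      - apiVersion: apps/v1
        kind: Deployment
        metadata:
          name: nginx
          namespace: default
          annotations:
            foo: bar   # injected by an OverridePolicy
        spec:
          replicas: 2
          selector:
            matchLabels:
              app: nginx
          template:
            metadata:
              labels:
                app: nginx
            spec:
              containers:
                - name: nginx
                  image: nginx:1.17.0-alpine
```

An agent (or the control plane itself, in push mode) watches these execution namespaces and applies each manifest to the corresponding member cluster.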
Example policies:
```yaml
# propagationpolicy.yaml
apiVersion: policy.karmada.io/v1alpha1
kind: PropagationPolicy
metadata:
  name: example-policy
spec:
  resourceSelectors:
    - apiVersion: apps/v1
      kind: Deployment
      name: nginx
  placement:
    clusterAffinity:
      clusterNames:
        - member1
```

```yaml
# overridepolicy.yaml
apiVersion: policy.karmada.io/v1alpha1
kind: OverridePolicy
metadata:
  name: example-override
  namespace: default
spec:
  resourceSelectors:
    - apiVersion: apps/v1
      kind: Deployment
      name: nginx
  overrideRules:
    - targetCluster:
        clusterNames:
          - member1
      overriders:
        plaintext:
          - path: "/metadata/annotations"
            operator: add
            value:
              foo: bar
```

Compared with kubefed, Karmada does not create separate federated resource types; it works directly with native resources while applying policies, which simplifies the adoption of new Custom Resource Definitions.
Karmada also extends PropagationPolicy with a replicaScheduling field to control replica distribution and failover. Example configuration:
```yaml
# duplicated.yaml — every target cluster gets the full replica count
replicaScheduling:
  replicaSchedulingType: Duplicated
```

```yaml
# divided.yaml — replicas are split across clusters by static weight
replicaScheduling:
  replicaDivisionPreference: Weighted
  replicaSchedulingType: Divided
  weightPreference:
    staticWeightList:
      - targetCluster:
          clusterNames:
            - member1
        weight: 1
      - targetCluster:
          clusterNames:
            - member2
        weight: 2
```

The scheduler in Karmada decides *where* to place workloads and *how many* replicas to assign, offering coarse-grained cross-cluster scheduling that satisfies most practical needs without aiming for a globally optimal solution.
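For context, these replicaScheduling fragments nest under `spec.placement` of a PropagationPolicy; a minimal sketch, where the policy name, workload, and cluster names are illustrative:

```yaml
apiVersion: policy.karmada.io/v1alpha1
kind: PropagationPolicy
metadata:
  name: nginx-weighted
spec:
  resourceSelectors:
    - apiVersion: apps/v1
      kind: Deployment
      name: nginx
  placement:
    clusterAffinity:
      clusterNames:
        - member1
        - member2
    # Split the Deployment's replicas 1:2 between member1 and member2
    replicaScheduling:
      replicaDivisionPreference: Weighted
      replicaSchedulingType: Divided
      weightPreference:
        staticWeightList:
          - targetCluster:
              clusterNames:
                - member1
            weight: 1
          - targetCluster:
              clusterNames:
                - member2
            weight: 2
```

With `spec.replicas: 9` on the Deployment, this placement would assign roughly 3 replicas to member1 and 6 to member2.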
Conclusion: Federation addresses two main challenges, single-cluster scalability limits and cross-AZ/region cluster management. As Kubernetes matures, multi-cluster and federation tooling (including projects like Karmada) is expected to become more robust, providing reliable solutions for large-scale, multi-region deployments.