KubeAdmiral 1.0.0: A New Cloud‑Native Multi‑Cluster Orchestration Engine

Version 1.0.0 of KubeAdmiral, ByteDance’s open‑source multi‑cluster orchestration engine, introduces native Kubernetes API compatibility, advanced scheduling policies, fault‑tolerant migration, global status aggregation, and extensive hybrid‑cloud support. At ByteDance it manages more than 210,000 machines across public and private clouds.

ByteDance Cloud Native

KubeAdmiral v1.0.0 is the first stable release of ByteDance’s open‑source multi‑cluster management engine, originally incubated internally and open‑sourced in July 2023. It now powers more than 210,000 machines and 10 million Pods for large‑scale services such as Douyin and Toutiao.

Multi‑cluster business background and KubeAdmiral’s evolution at ByteDance

ByteDance operates thousands of clusters across private data centers and multiple public‑cloud providers, leading to resource fragmentation, isolated clusters per business line, and complex operational overhead.

To address these challenges, the team first built on KubeFed v2 in 2019 but encountered limitations such as low resource utilization, inflexible scaling, limited scheduling semantics, and high integration cost.

Project Overview

KubeAdmiral (named after “Admiral”) extends Kubernetes with powerful multi‑cluster orchestration capabilities. It supports public‑cloud clusters (Volcengine, Alibaba Cloud, Huawei Cloud), private‑cloud clusters, and user‑managed clusters.

Architecture

The control plane runs in a host cluster and consists of:

Fed ETCD: stores federated Kubernetes resources.

Fed Kube Apiserver: native API server for federated resources.

Fed Kube Controller Manager: runs selected native controllers (e.g., namespace, garbage‑collector).

KubeAdmiral Controller: custom component handling cluster management, resource scheduling, fault migration, and status aggregation.

The KubeAdmiral Controller includes several sub‑controllers:

Federated Cluster Controller: manages the lifecycle of member clusters.

Federate Controller: creates a FederatedObject for each native resource.

Scheduler: decides replica distribution across clusters.

Sync Controller: propagates federated objects to member clusters.

Status Controller: collects resource status from all clusters.

Core Features

Unified Multi‑Cluster Management

Supports public‑cloud, private‑cloud, and self‑managed Kubernetes clusters.

Multi‑Cluster Application Distribution

Compatible with native resources (Deployment, StatefulSet, ConfigMap), CRDs, and Helm charts.

Provides static‑weight, dynamic‑weight, and replica‑based distribution modes.

Cluster selection via explicit list, label selector, or affinity rules.

Follower scheduling: dependent resources such as ConfigMap, Secret, Service, and Ingress are dispatched to the same clusters as the workload that uses them.

Configurable rescheduling policies and maximum cluster count.
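To make the static‑weight mode concrete, the sketch below divides a replica count across clusters in proportion to their weights. It is an illustration only: the function name `split_replicas` and the largest‑remainder rounding rule are assumptions made for this example, not KubeAdmiral's actual algorithm.

```python
def split_replicas(total, weights):
    """Divide `total` replicas across clusters in proportion to `weights`.

    Rounding uses the largest-remainder method (an assumption for this
    sketch) so that the per-cluster counts always sum to `total`.
    """
    if total < 0:
        raise ValueError("total must be non-negative")
    weight_sum = sum(weights.values())
    if weight_sum <= 0:
        raise ValueError("at least one cluster needs a positive weight")

    shares = {}
    remainders = []
    for name, weight in weights.items():
        numerator = total * weight
        # Whole replicas this cluster is entitled to.
        shares[name] = numerator // weight_sum
        # Fractional part that was cut off, kept as an integer remainder.
        remainders.append((numerator % weight_sum, name))

    # Give leftover replicas to the largest remainders; ties are broken
    # by cluster name so the result is deterministic.
    leftover = total - sum(shares.values())
    remainders.sort(key=lambda item: (-item[0], item[1]))
    for _, name in remainders[:leftover]:
        shares[name] += 1
    return shares


if __name__ == "__main__":
    weights = {"Cluster-01": 40, "Cluster-02": 30, "Cluster-03": 40}
    print(split_replicas(10, weights))
```

Weights are treated as relative here, so they do not need to sum to 100.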

Fault Migration

Automatic migration of unschedulable replicas.

Manual or automatic eviction of workloads from unhealthy or decommissioned clusters.
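The idea behind migrating unschedulable replicas can be sketched as a small rebalancing step. Everything below is hypothetical: the function `migrate_unschedulable`, its inputs, and the "most spare capacity first" rule are assumptions chosen for illustration, and the real controller may decide differently.

```python
def migrate_unschedulable(desired, unschedulable, spare_capacity):
    """Move replicas that cannot be scheduled to clusters with spare room.

    desired:        cluster name -> replicas currently assigned
    unschedulable:  cluster name -> replicas stuck pending in that cluster
    spare_capacity: cluster name -> extra replicas the cluster could accept
    Returns a new assignment. Replicas with nowhere to go stay in place.
    """
    result = dict(desired)
    spare = dict(spare_capacity)
    for source in sorted(unschedulable):
        to_move = min(unschedulable[source], result.get(source, 0))
        # Prefer targets with the most spare capacity; skip the source itself.
        targets = sorted(
            (name for name in spare if name != source),
            key=lambda name: (-spare[name], name),
        )
        for target in targets:
            if to_move == 0:
                break
            moved = min(to_move, spare[target])
            if moved == 0:
                continue
            result[source] -= moved
            result[target] = result.get(target, 0) + moved
            spare[target] -= moved
            to_move -= moved
    return result


if __name__ == "__main__":
    print(migrate_unschedulable(
        desired={"a": 5, "b": 3, "c": 2},
        unschedulable={"a": 3},
        spare_capacity={"b": 2, "c": 4},
    ))
```

The total replica count is preserved: the step only moves replicas between clusters and never creates or drops them.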

Cross‑Cluster Autoscaling

Supports native and custom HPA across clusters.

Global Status Aggregation

Centralized status collection via Status Controller.

Aggregated status presented on native resources for a unified view.

Real‑time monitoring and automated fault detection/recovery.
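As a rough picture of what status aggregation means for a Deployment, the sketch below sums the standard Deployment status counters reported by each member cluster into one global view. The function `aggregate_status` and the input shape are assumptions for this example; the Status Controller's real output format is not shown in this article.

```python
def aggregate_status(per_cluster):
    """Sum per-cluster Deployment status counters into one global status.

    per_cluster maps a cluster name to a dict shaped like the `status`
    block of a Deployment. Missing counters are treated as zero.
    """
    fields = ("replicas", "readyReplicas", "availableReplicas", "updatedReplicas")
    total = {name: 0 for name in fields}
    for status in per_cluster.values():
        for name in fields:
            total[name] += status.get(name, 0)
    return total


if __name__ == "__main__":
    print(aggregate_status({
        "member1": {"replicas": 4, "readyReplicas": 4, "availableReplicas": 4, "updatedReplicas": 4},
        "member2": {"replicas": 6, "readyReplicas": 5, "updatedReplicas": 6},
    }))
```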

Rich Scheduling Capabilities

Pluggable scheduler architecture (Filter, Score, Select, Replica) similar to kube‑scheduler.

Built‑in plugins implement policies defined in PropagationPolicy objects.

Extensible via HTTP‑based external plugins.
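The four stages named above (Filter, Score, Select, Replica) can be pictured as a simple pipeline. The sketch below is a toy model, not KubeAdmiral code: the `Cluster` class, the `free_cpu` metric, the label‑equality filter, and the even replica split are all assumptions picked to keep the example short.

```python
from dataclasses import dataclass, field


@dataclass
class Cluster:
    name: str
    labels: dict = field(default_factory=dict)
    free_cpu: int = 0  # hypothetical capacity metric used for scoring


def filter_clusters(clusters, selector):
    """Filter: keep clusters whose labels contain every selector pair."""
    return [c for c in clusters
            if all(c.labels.get(k) == v for k, v in selector.items())]


def score_clusters(clusters):
    """Score: in this sketch, more free CPU means a higher score."""
    return {c.name: c.free_cpu for c in clusters}


def select_clusters(scores, max_clusters):
    """Select: keep the top-scoring clusters, at most max_clusters of them."""
    ranked = sorted(scores, key=lambda name: (-scores[name], name))
    return ranked[:max_clusters]


def assign_replicas(selected, total):
    """Replica: spread replicas evenly; higher-ranked clusters get the extras."""
    if not selected:
        return {}
    base, extra = divmod(total, len(selected))
    return {name: base + (1 if i < extra else 0)
            for i, name in enumerate(selected)}


def schedule(clusters, selector, max_clusters, total):
    """Run the four stages in order and return cluster -> replica count."""
    feasible = filter_clusters(clusters, selector)
    scores = score_clusters(feasible)
    selected = select_clusters(scores, max_clusters)
    return assign_replicas(selected, total)


if __name__ == "__main__":
    fleet = [
        Cluster("a", {"region": "beijing"}, free_cpu=8),
        Cluster("b", {"region": "beijing"}, free_cpu=16),
        Cluster("c", {"region": "shanghai"}, free_cpu=32),
    ]
    print(schedule(fleet, {"region": "beijing"}, max_clusters=2, total=5))
```

Keeping each stage as a separate function mirrors the plugin idea: any one stage can be swapped without touching the others.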

Policy Configuration Examples

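<code>apiVersion: core.kubeadmiral.io/v1alpha1
kind: PropagationPolicy
metadata:
  name: mypolicy
  namespace: default
spec:
  # Explicit list of candidate clusters, each with a relative weight.
  placement:
    - cluster: Cluster-01
      preferences:
        weight: 40
    - cluster: Cluster-02
      preferences:
        weight: 30
    - cluster: Cluster-03
      preferences:
        weight: 40
  # Filter clusters by label.
  clusterSelector:
    IPv6: "true"
  # Filter clusters by affinity expressions.
  clusterAffinity:
    - matchExpressions:
        - key: region
          operator: In
          values:
            - beijing
  # Allow scheduling onto clusters that carry a matching taint.
  tolerations:
    - key: "key1"
      operator: "Equal"
      value: "value1"
      effect: "NoSchedule"
  # Divide splits replicas across the selected clusters.
  schedulingMode: Divide
  reschedulePolicy:
    # Rescheduling is switched off in this example.
    disableRescheduling: true
  # Limit propagation to a single cluster.
  maxClusters: 1
  # Follower scheduling stays enabled.
  disableFollowerScheduling: false</code>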
<code>apiVersion: core.kubeadmiral.io/v1alpha1
kind: OverridePolicy
metadata:
  name: example
  namespace: default
spec:
  overrideRules:
    - targetClusters:
        # Select target clusters by name.
        clusters:
          - member1
          - member2
        # Select target clusters by label.
        clusterSelector:
          region: beijing
          az: zone1
      overriders:
        # Modify the resource with JSON Patch operations.
        jsonpatch:
          - path: "/spec/template/spec/containers/0/image"
            operator: replace
            value: "nginx:test"
        # Modify one component of an image reference.
        image:
          - imagePath: "/spec/templates/0/container/image"
            operations:
              - imageComponent: Registry
                operator: addIfAbsent
                value: cluster.io</code>
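The `jsonpatch` overrider in the OverridePolicy example uses a `replace` operation on a JSON Pointer path. The sketch below shows what such a replace does to a resource held as nested dicts and lists. It is a minimal stand‑alone helper written for this article (the name `apply_replace` is hypothetical) and does not reflect how KubeAdmiral applies overrides internally.

```python
import copy


def apply_replace(document, path, value):
    """Return a copy of `document` with the value at JSON Pointer `path` replaced.

    As with a JSON Patch "replace", the target location must already exist.
    """
    # Split the pointer into tokens and undo the escapes (~1 is "/", ~0 is "~").
    tokens = [t.replace("~1", "/").replace("~0", "~") for t in path.split("/")[1:]]
    if not tokens:
        raise ValueError("path must point inside the document")

    result = copy.deepcopy(document)  # leave the caller's object untouched
    node = result
    for token in tokens[:-1]:
        node = node[int(token)] if isinstance(node, list) else node[token]

    last = tokens[-1]
    if isinstance(node, list):
        node[int(last)] = value  # raises IndexError if the index is missing
    else:
        if last not in node:
            raise KeyError(f"{path} does not exist")
        node[last] = value
    return result


if __name__ == "__main__":
    deployment = {
        "spec": {"template": {"spec": {"containers": [
            {"name": "web", "image": "nginx:1.25"},
        ]}}}
    }
    patched = apply_replace(
        deployment, "/spec/template/spec/containers/0/image", "nginx:test")
    print(patched["spec"]["template"]["spec"]["containers"][0]["image"])
```

Working on a deep copy keeps the original template intact, which matters when the same template is rendered differently for several clusters.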

Conclusion

KubeAdmiral v1.0.0 reflects a year of community and developer contributions, offering a production‑ready, cloud‑native multi‑cluster orchestration solution. It integrates tightly with Kubernetes APIs, supports extensive hybrid‑cloud scenarios, and provides extensible scheduling and fault‑tolerance mechanisms.
