How to Deploy and Test a Multi‑Cluster Istio Service Mesh with Kind and MetalLB
This guide explains why multi-cluster deployments matter for high availability, describes Istio's flat and non-flat network models with single or multiple control planes, and provides step-by-step scripts to create Kind clusters, install MetalLB, configure root CAs, deploy Istio, set up gateways, and verify regional load balancing and failover.
To achieve high concurrency and availability, enterprises often deploy applications across multiple regions, clusters, and even multi‑cloud or hybrid‑cloud environments, making multi‑cluster deployment and management a challenge that Istio addresses with several models.
Multi‑Cluster Models
Istio supports four multi‑cluster models based on network topology (flat vs. non‑flat) and control plane architecture (single vs. multiple control planes):
Flat network: all clusters share the same network, allowing direct service access without a gateway. Advantages: low latency. Disadvantages: complex network setup, the requirement for non-overlapping IP ranges, and weaker isolation between clusters.
Non‑flat network: clusters are isolated and require a gateway for cross‑cluster traffic. Advantages: higher security and simpler IP planning; Disadvantages: higher latency and limited gateway routing capabilities.
Single control plane: one logical control plane manages all clusters, simplifying configuration sharing but potentially impacting performance at large scale.
Multiple control planes: each cluster has its own control plane, improving performance and availability at the cost of increased operational complexity.
Overall, Istio currently supports these four models, with flat‑network single‑control‑plane being the simplest and non‑flat‑network multi‑control‑plane the most complex.
Flat Network Single Control Plane
Deploy the Istio control plane components in a primary cluster; all other clusters connect to this control plane for Service, Endpoint, and API configuration. The Istiod core component connects to each cluster’s kube‑apiserver, and all sidecars subscribe to the central control plane.
Flat networks require non‑overlapping Service IPs and Pod IPs across clusters to avoid discovery issues.
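In a Kind-based lab this constraint is easy to enforce at cluster creation time. A per-cluster config along these lines keeps the ranges disjoint (the subnet values are illustrative, not taken from the repository's create-cluster.sh):
<code># Hypothetical Kind config for cluster1; cluster2 would use different,
# non-overlapping subnets (e.g. 10.20.0.0/16 and 10.255.20.0/24).
kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
networking:
  podSubnet: "10.10.0.0/16"
  serviceSubnet: "10.255.10.0/24"</code>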
Non‑Flat Network Single Control Plane
When clusters are in separate networks, an east‑west gateway forwards cross‑cluster traffic. The control plane still connects to each cluster’s kube‑apiserver, and sidecars subscribe to the shared control plane.
Flat Network Multiple Control Planes
Each cluster runs its own Istio control plane but still discovers services in other clusters. Shared root CA certificates enable mTLS across clusters, and sidecars connect to their local control plane, improving reliability for large‑scale deployments.
Deploying multiple control planes increases resource usage and operational overhead.
Non‑Flat Network Multiple Control Planes
Similar to the flat‑network multi‑control‑plane model, but clusters remain isolated; east‑west gateways handle cross‑cluster traffic, and sidecars connect only to their local control plane.
Multi‑Cluster Installation
Select a model based on network topology and cluster size, then install Istio on each cluster. The example uses the non‑flat network multi‑control‑plane model with two Kind clusters (cluster1 on network1 and cluster2 on network2).
<code># Verify Docker and Kind versions
$ docker version
... (output omitted)
$ kind version
kind v0.20.0 go1.20.4 darwin/arm64</code>
Clone the helper repository and run the cluster creation script:
<code>cd kind-create
bash ./create-cluster.sh</code>
Install MetalLB to provide external IPs for Istio gateways:
<code># install-metallb.sh
... (script content) ...</code>
Generate a shared root CA and per-cluster intermediate certificates, labeling each Istio namespace with topology.istio.io/network=network${i}:
<code># install-cacerts.sh (script omitted)</code>
Install Istio on each cluster using a customized cluster.yaml that sets network and clusterName:
<code>apiVersion: install.istio.io/v1alpha1
kind: IstioOperator
spec:
  values:
    global:
      meshID: mesh{i}
      multiCluster:
        clusterName: cluster{i}
      network: network{i}</code>
<code>cd istio-create
bash ./install-istio.sh</code>
Verify that the three Istio components (istiod, istio-ingressgateway, istio-eastwestgateway) are running in each cluster and that MetalLB has assigned external IPs.
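The external IPs come from MetalLB's address pool; the install-metallb.sh script presumably applies resources along these lines (the pool name and address range here are illustrative):
<code># Hypothetical MetalLB Layer 2 configuration; the address range must
# fall inside the Docker network that the Kind nodes are attached to.
apiVersion: metallb.io/v1beta1
kind: IPAddressPool
metadata:
  name: kind-pool
  namespace: metallb-system
spec:
  addresses:
    - 172.18.255.200-172.18.255.250
---
apiVersion: metallb.io/v1beta1
kind: L2Advertisement
metadata:
  name: kind-l2
  namespace: metallb-system
spec:
  ipAddressPools:
    - kind-pool</code>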
<code>$ kubectl get pods -n istio-system
NAME ...
$ kubectl get svc -n istio-system
NAME ... EXTERNAL-IP ...</code>
Configure east-west gateways for cross-cluster traffic and expose services with a Gateway object:
<code>apiVersion: networking.istio.io/v1beta1
kind: Gateway
metadata:
  name: cross-network-gateway
spec:
  selector:
    istio: eastwestgateway
  servers:
    - port:
        number: 15443
        name: tls
        protocol: TLS
      tls:
        mode: AUTO_PASSTHROUGH
      hosts:
        - "*.local"</code>
Enable Istio to watch remote API servers by creating remote secrets:
<code>docker_ip=$(docker inspect -f '{{range.NetworkSettings.Networks}}{{.IPAddress}}{{end}}' "cluster${i}-control-plane")
istioctl create-remote-secret \
--context="cluster${i}" \
--server="https://${docker_ip}:6443" \
--name="cluster${i}" | \
kubectl apply --validate=false --context="cluster${j}" -f -</code>Multi‑Cluster Application Test
Deploy a sample
helloworldservice (v1 in cluster2, v2 in cluster1) and a
sleeppod to issue curl requests. Verify that requests are load‑balanced across both versions.
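For reference, deploy-app.sh follows the upstream Istio helloworld sample, whose Service exposes port 5000 on pods labeled app: helloworld (a sketch; the script itself is the source of truth):
<code># Service matching the upstream Istio helloworld sample; both the v1
# and v2 Deployments carry the app: helloworld label, so one Service
# fronts whichever versions exist in each cluster.
apiVersion: v1
kind: Service
metadata:
  name: helloworld
  namespace: sample
  labels:
    app: helloworld
spec:
  ports:
    - port: 5000
      name: http
  selector:
    app: helloworld</code>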
<code>cd testing
bash ./deploy-app.sh</code>
<code># Verify pods
$ kubectl get pods -n sample -l app=helloworld --context=cluster1
helloworld-v2-... Running
$ kubectl get pods -n sample -l app=helloworld --context=cluster2
helloworld-v1-... Running</code>
<code># Continuous curl from the sleep pod
while true; do curl -s "helloworld.sample:5000/hello"; done</code>
Responses alternate between v1 and v2, confirming a successful multi-cluster deployment.
Regional Load Balancing
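Locality-aware routing relies on topology labels attached to nodes; Istio derives each workload's locality from the node it runs on. On a labeled node, the relevant metadata looks like this (the values are illustrative):
<code># Node metadata excerpt showing the three locality labels
# (region/zone are standard Kubernetes labels; subzone is Istio-specific).
metadata:
  labels:
    topology.kubernetes.io/region: region1
    topology.kubernetes.io/zone: zone1
    topology.istio.io/subzone: subzone1</code>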
Define regions, zones, and sub-zones using the Kubernetes labels topology.kubernetes.io/region and topology.kubernetes.io/zone, plus Istio's custom label topology.istio.io/subzone. Create a dedicated gateway and virtual service for helloworld:
<code># helloworld-gateway.yaml
apiVersion: networking.istio.io/v1beta1
kind: Gateway
metadata:
  name: helloworld-gateway
spec:
  selector:
    istio: ingressgateway
  servers:
    - port:
        number: 80
        name: http
        protocol: HTTP
      hosts:
        - "*"
---
apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: helloworld
spec:
  hosts:
    - "*"
  gateways:
    - helloworld-gateway
  http:
    - match:
        - uri:
            exact: /hello
      route:
        - destination:
            host: helloworld
            port:
              number: 5000</code>
Deploy the gateway and virtual service in both clusters:
<code>kubectl apply -f helloworld-gateway.yaml -n sample --context=cluster1
kubectl apply -f helloworld-gateway.yaml -n sample --context=cluster2</code>
Configure weighted traffic distribution with a DestinationRule that uses localityLbSetting:
<code># locality-lb-weight.yaml
apiVersion: networking.istio.io/v1beta1
kind: DestinationRule
metadata:
  name: helloworld
  namespace: sample
spec:
  host: helloworld.sample.svc.cluster.local
  trafficPolicy:
    connectionPool:
      http:
        maxRequestsPerConnection: 1
    loadBalancer:
      simple: ROUND_ROBIN
      localityLbSetting:
        enabled: true
        distribute:
          - from: region1/*
            to:
              "region1/*": 80
              "region2/*": 20
          - from: region2/*
            to:
              "region2/*": 80
              "region1/*": 20
    outlierDetection:
      consecutive5xxErrors: 1
      interval: 1s
      baseEjectionTime: 1m</code>
Apply the rule in both clusters and observe traffic weighted according to the defined percentages:
<code>kubectl apply -f locality-lb-weight.yaml -n sample --context=cluster1
kubectl apply -f locality-lb-weight.yaml -n sample --context=cluster2</code>
Regional Failover
Configure failover using localityLbSetting.failover so that traffic from a failing region is routed to the other region:
<code># locality-lb-failover.yaml
apiVersion: networking.istio.io/v1beta1
kind: DestinationRule
metadata:
  name: helloworld
  namespace: sample
spec:
  host: helloworld.sample.svc.cluster.local
  trafficPolicy:
    connectionPool:
      http:
        maxRequestsPerConnection: 1
    loadBalancer:
      simple: ROUND_ROBIN
      localityLbSetting:
        enabled: true
        failover:
          - from: region1
            to: region2
          - from: region2
            to: region1
    outlierDetection:
      consecutive5xxErrors: 1
      interval: 1s
      baseEjectionTime: 1m</code>
Deploy the failover rule, then simulate a failure by draining the listeners of the region1 sidecar; once outlier detection triggers, traffic shifts to region2.
<code>$ kubectl --context=cluster1 exec -n sample -c istio-proxy \
    "$(kubectl get pod -n sample -l app=helloworld,version=v2 --context=cluster1 -o jsonpath='{.items[0].metadata.name}')" \
    -- curl -sSL -X POST 127.0.0.1:15000/drain_listeners</code>
Subsequent requests return the v1 response from region2, demonstrating successful regional failover.
Ops Development Stories
Maintained by a like‑minded team, covering both operations and development. Topics span Linux ops, DevOps toolchain, Kubernetes containerization, monitoring, log collection, network security, and Python or Go development. Team members: Qiao Ke, wanger, Dong Ge, Su Xin, Hua Zai, Zheng Ge, Teacher Xia.