How KubeGateway Solves kube‑apiserver Load‑Balancing and Traffic Governance
KubeGateway, a custom seven-layer gateway built by ByteDance, eliminates kube-apiserver load imbalance and adds comprehensive request governance, including routing, rate-limiting, and degradation. It does so by parsing HTTP/2 traffic, supporting flexible policies, and transparently proxying requests without client changes.
KubeGateway Overview
KubeGateway is a seven‑layer gateway designed by ByteDance to address the load‑balancing and governance challenges of kube‑apiserver traffic in Kubernetes clusters. It provides full request governance such as routing, splitting, rate‑limiting, and degradation, significantly improving cluster availability.
Why Build KubeGateway
kube‑apiserver is the entry point of a Kubernetes cluster; its high availability determines the overall cluster reliability. Deploying multiple instances behind a traditional load balancer often leads to uneven request distribution and loss of client TLS certificates, preventing proper authentication.
Four‑layer load balancers operate at the transport layer (OSI 4) using NAT, while seven‑layer load balancers work at the application layer (OSI 7) based on request URLs.
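A four-layer load balancer distributes TCP connections rather than individual requests. Because Kubernetes clients reuse long-lived connections, request load can become imbalanced across apiserver instances, and heavily loaded instances may run out of memory (OOM). A four-layer balancer also cannot inspect request content, so it offers no flexible request governance.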
KubeGateway Architecture
Transparent to clients; no modification required.
Supports proxying multiple clusters distinguished by domain or virtual IP.
Load balancing at the HTTP request level, solving imbalance.
Pluggable load‑balancing strategies (Round Robin, Random) with easy extensibility.
Rich routing based on resource, verb, user, namespace, API group, etc.
Configuration managed via standard Kubernetes APIs with hot‑update support.
Provides common gateway capabilities: rate‑limiting, degradation, service discovery, graceful shutdown, upstream health checks.
Request Processing Flow
The proxy workflow consists of five steps: request parsing, route matching, user authentication, traffic governance, and reverse proxy.
Request Parsing
KubeGateway distinguishes between resource requests (e.g., CRUD on Pods) and non‑resource requests (e.g., /healthz, /metrics) and extracts routing fields from URLs and headers.
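The parsing step can be illustrated with a minimal Go sketch. It is not KubeGateway's actual parser: the `RequestInfo` type and the `parseRequestInfo` function are hypothetical names, and the sketch only covers the common `/api/<version>/...` and `/apis/<group>/<version>/...` URL shapes, ignoring watch requests and subresources.

```go
package main

import "strings"

// RequestInfo holds the routing fields extracted from one request.
// The type and its field names are illustrative assumptions.
type RequestInfo struct {
	IsResourceRequest bool
	Verb              string
	APIGroup          string
	Namespace         string
	Resource          string
	Name              string
	Path              string
}

// parseRequestInfo classifies a request from its HTTP method and URL path.
// Watch requests and subresources are deliberately not handled.
func parseRequestInfo(method, path string) RequestInfo {
	info := RequestInfo{Path: path}
	parts := strings.Split(strings.Trim(path, "/"), "/")

	switch {
	case len(parts) >= 3 && parts[0] == "api":
		// Core group: /api/<version>/...; the group name stays empty.
		parts = parts[2:]
	case len(parts) >= 4 && parts[0] == "apis":
		// Named group: /apis/<group>/<version>/...
		info.APIGroup = parts[1]
		parts = parts[3:]
	default:
		// Non-resource request such as /healthz or /metrics.
		info.Verb = strings.ToLower(method)
		return info
	}
	info.IsResourceRequest = true

	// Namespaced form: namespaces/<namespace>/<resource>[/<name>]
	if len(parts) >= 3 && parts[0] == "namespaces" {
		info.Namespace = parts[1]
		parts = parts[2:]
	}
	info.Resource = parts[0]
	if len(parts) >= 2 {
		info.Name = parts[1]
	}

	// Map the HTTP method to a Kubernetes-style verb.
	switch method {
	case "GET":
		if info.Name == "" {
			info.Verb = "list"
		} else {
			info.Verb = "get"
		}
	case "POST":
		info.Verb = "create"
	case "PUT":
		info.Verb = "update"
	case "PATCH":
		info.Verb = "patch"
	case "DELETE":
		if info.Name == "" {
			info.Verb = "deletecollection"
		} else {
			info.Verb = "delete"
		}
	}
	return info
}
```

Under these assumptions, `parseRequestInfo("GET", "/api/v1/namespaces/default/pods")` yields a resource request with verb `list`, resource `pods`, and namespace `default`, while `/healthz` is classified as a non-resource request.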
Route Matching
Extracted multi‑dimensional fields enable powerful routing rules, such as matching all list‑pod requests by verb and resource, or isolating core control‑plane components by user/group.
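As a rough illustration of such rules, the sketch below matches a request against an ordered rule list, with the first match winning. The `Request` and `Rule` types, the first-match policy, and the upstream names used in the usage note are assumptions made for this example, not KubeGateway's configuration schema.

```go
package main

// Request holds the fields a parser has already extracted.
// The type is a hypothetical stand-in for the gateway's internal state.
type Request struct {
	User       string
	UserGroups []string
	Verb       string
	Resource   string
	Namespace  string
}

// Rule matches a request on several dimensions at once.
// An empty list for a dimension acts as a wildcard.
type Rule struct {
	Name       string
	Verbs      []string
	Resources  []string
	Users      []string
	UserGroups []string
	Upstream   string // name of the upstream group that receives matches
}

// matchOne reports whether value is allowed; an empty list allows anything.
func matchOne(allowed []string, value string) bool {
	if len(allowed) == 0 {
		return true
	}
	for _, a := range allowed {
		if a == value {
			return true
		}
	}
	return false
}

// matchAny reports whether at least one of values is allowed.
func matchAny(allowed, values []string) bool {
	if len(allowed) == 0 {
		return true
	}
	for _, v := range values {
		if matchOne(allowed, v) {
			return true
		}
	}
	return false
}

// Matches requires every non-empty dimension of the rule to match.
func (r Rule) Matches(req Request) bool {
	return matchOne(r.Verbs, req.Verb) &&
		matchOne(r.Resources, req.Resource) &&
		matchOne(r.Users, req.User) &&
		matchAny(r.UserGroups, req.UserGroups)
}

// Route returns the upstream of the first matching rule, or fallback.
func Route(rules []Rule, req Request, fallback string) string {
	for _, r := range rules {
		if r.Matches(req) {
			return r.Upstream
		}
	}
	return fallback
}
```

With one rule that lists the users `system:kube-scheduler` and `system:kube-controller-manager` and a second rule for verb `list` on resource `pods`, control-plane traffic and expensive list-pod requests would each land on their own upstream group, and everything else would fall through to the fallback.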
User Authentication
KubeGateway supports x509 client‑certificate authentication and Bearer‑Token authentication, forwarding the identified user information to kube‑apiserver via the Impersonate mechanism.
Impersonate-User: client username
Impersonate-Group: client group
Request Governance
Load Balancing: Selects an upstream server based on Round Robin or Random strategies; extensible to algorithms like Least Request.
Health Monitoring: Periodically probes /healthz; only healthy apiserver instances receive traffic.
Rate Limiting: Supports token-bucket and max-in-flight request limits, allowing fine-grained QPS control and protection against OOM.
Degradation: In case of apiserver or etcd failure, KubeGateway can reject all traffic to prevent cascading failures.
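Two of these mechanisms, round-robin selection over healthy upstreams and a max-in-flight limit, can be sketched in a few lines of Go. This is a simplified illustration under assumed names (`RoundRobin`, `InFlightLimiter`, `SetHealthy`); the health flag is assumed to be updated by a separate /healthz prober that is not shown, and KubeGateway's own implementation may differ.

```go
package main

import (
	"errors"
	"sync"
)

// ErrNoHealthyUpstream is returned when every upstream is marked unhealthy.
var ErrNoHealthyUpstream = errors.New("no healthy upstream")

// Upstream is one kube-apiserver instance (hypothetical type).
type Upstream struct {
	Addr    string
	Healthy bool
}

// RoundRobin cycles through upstreams and skips unhealthy ones.
type RoundRobin struct {
	mu        sync.Mutex
	upstreams []*Upstream
	next      int
}

// NewRoundRobin registers the given addresses, all initially healthy.
func NewRoundRobin(addrs ...string) *RoundRobin {
	rr := &RoundRobin{}
	for _, a := range addrs {
		rr.upstreams = append(rr.upstreams, &Upstream{Addr: a, Healthy: true})
	}
	return rr
}

// SetHealthy would be called by a health prober after each /healthz check.
func (rr *RoundRobin) SetHealthy(addr string, healthy bool) {
	rr.mu.Lock()
	defer rr.mu.Unlock()
	for _, u := range rr.upstreams {
		if u.Addr == addr {
			u.Healthy = healthy
		}
	}
}

// Pick returns the next healthy upstream in rotation order.
func (rr *RoundRobin) Pick() (*Upstream, error) {
	rr.mu.Lock()
	defer rr.mu.Unlock()
	n := len(rr.upstreams)
	for i := 0; i < n; i++ {
		idx := (rr.next + i) % n
		if rr.upstreams[idx].Healthy {
			rr.next = (idx + 1) % n
			return rr.upstreams[idx], nil
		}
	}
	return nil, ErrNoHealthyUpstream
}

// InFlightLimiter caps the number of requests processed concurrently.
type InFlightLimiter struct {
	slots chan struct{}
}

// NewInFlightLimiter creates a limiter that admits at most max requests.
func NewInFlightLimiter(max int) *InFlightLimiter {
	return &InFlightLimiter{slots: make(chan struct{}, max)}
}

// TryAcquire reserves a slot without blocking; false means "reject now".
func (l *InFlightLimiter) TryAcquire() bool {
	select {
	case l.slots <- struct{}{}:
		return true
	default:
		return false
	}
}

// Release frees a slot reserved by a successful TryAcquire.
func (l *InFlightLimiter) Release() {
	<-l.slots
}
```

A handler built on this sketch would call `TryAcquire` first, reject the request (for example with HTTP 429) when it returns false, and otherwise call `Pick` and defer `Release` until the proxied request completes. Rejecting instead of queueing is one possible design choice; it keeps memory use bounded when an upstream slows down.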
Reverse Proxy & Impersonation
After governance, KubeGateway forwards the request to the selected apiserver, adding the headers Impersonate-User and Impersonate-Group so that the original user identity is preserved. The apiserver then validates the impersonated user's permissions.
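The forwarding step can be approximated with Go's standard `httputil.ReverseProxy`. The sketch below is an assumption-laden illustration rather than KubeGateway's code: `UserInfo`, `setImpersonationHeaders`, `newImpersonatingProxy`, and the `userOf` callback (which stands in for the authentication step) are hypothetical names. In a real deployment the gateway would also need its own credentials for the upstream connection and permission to impersonate users.

```go
package main

import (
	"net/http"
	"net/http/httputil"
	"net/url"
)

// UserInfo is the identity established by the gateway's authentication step.
type UserInfo struct {
	Name   string
	Groups []string
}

// setImpersonationHeaders overwrites any client-supplied impersonation
// headers so that a client cannot claim an identity it did not prove.
func setImpersonationHeaders(h http.Header, u UserInfo) {
	h.Del("Impersonate-User")
	h.Del("Impersonate-Group")
	h.Set("Impersonate-User", u.Name)
	for _, g := range u.Groups {
		h.Add("Impersonate-Group", g) // one header value per group
	}
}

// newImpersonatingProxy forwards requests to upstream and attaches the
// identity returned by userOf, a hypothetical hook for the auth result.
func newImpersonatingProxy(upstream *url.URL, userOf func(*http.Request) UserInfo) *httputil.ReverseProxy {
	proxy := httputil.NewSingleHostReverseProxy(upstream)
	direct := proxy.Director
	proxy.Director = func(r *http.Request) {
		direct(r) // rewrite scheme, host, and path for the upstream
		setImpersonationHeaders(r.Header, userOf(r))
	}
	return proxy
}
```

Deleting the incoming headers before setting new ones matters in this sketch: without it, a client could send its own Impersonate-User header and have the gateway relay it with the gateway's privileges.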
HTTP/2 Multiplexing
KubeGateway uses HTTP/2 by default, allowing up to 250 concurrent streams per connection, dramatically reducing the number of TCP connections to upstream apiservers.
Forward & Exec Requests
For requests that require HTTP/1.1 (e.g., port-forward and exec), KubeGateway disables HTTP/2 and hijacks the underlying connection so that it can carry upgraded protocols such as SPDY or WebSocket.
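One plausible way to recognize such requests is to look for the HTTP/1.1 upgrade handshake, which exec and port-forward clients send through the `Connection` and `Upgrade` headers. The helper below, `needsHTTP1`, is a hypothetical name, and whether KubeGateway detects these requests this way is an assumption.

```go
package main

import (
	"net/http"
	"strings"
)

// needsHTTP1 reports whether the headers describe a protocol upgrade
// (for example to SPDY or WebSocket), which cannot be proxied over an
// HTTP/2 upstream connection.
func needsHTTP1(h http.Header) bool {
	if h.Get("Upgrade") == "" {
		return false
	}
	// The Connection header is a comma-separated token list.
	for _, v := range h.Values("Connection") {
		for _, token := range strings.Split(v, ",") {
			if strings.EqualFold(strings.TrimSpace(token), "upgrade") {
				return true
			}
		}
	}
	return false
}
```

A proxy using this check could route matching requests through an HTTP/1.1-only transport and then hijack the client connection to copy bytes in both directions, while all other requests keep using the multiplexed HTTP/2 connections.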
Production Results
Performance testing shows that KubeGateway adds only about 1 ms of latency. It handles over 200k QPS across all ByteDance clusters, eliminates kube-apiserver traffic imbalance, and provides robust request governance.
Future Roadmap
Extend to full seven‑layer gateway features such as black‑/white‑listing and caching.
Improve observability for faster issue diagnosis.
Explore federation by aggregating multiple clusters into a single logical cluster.
KubeGateway is open‑source on GitHub, and the community is invited to contribute.
ByteDance Cloud Native
Sharing ByteDance's cloud-native technologies, technical practices, and developer events.