How We Built a High‑Performance OpenResty API Gateway on Kubernetes
This article details the design and implementation of a Kubernetes‑native API Gateway built with OpenResty, covering its architecture, controller logic, HTTP/gRPC load balancing, custom ingress handling, rate‑limiting, service proxying, and future plans for service‑mesh integration.
Introduction
OpenResty is a high‑performance web platform based on Nginx and Lua, well suited to building dynamic web applications, services, and gateways that require massive concurrency and scale well.
We built an API Gateway for Kubernetes clusters using OpenResty. It serves as the entry point for all user traffic, handling authentication, routing, rate limiting, IP black/white lists, load balancing, traffic monitoring, and logging.
Architecture
The gateway consists of two parts: a controller that watches Kubernetes resources and writes state to Redis, and an OpenResty instance that performs reverse proxying and load balancing.
API Gateway Controller
Because there is no mature Lua Kubernetes client, we implemented the controller in Go to sync cluster state to Redis.
The controller watches the following resources:
ConfigMaps in specific namespaces (used to extend Ingress resources)
Services and Endpoints – it stores newly added or removed Services and their Endpoints in Redis. For ExternalName Services, it resolves the DNS name (or uses the IP directly) and stores the result as an Endpoint.
Pod health changes – updated Endpoints are also written to Redis.
API Gateway Service
The gateway Service is exposed as a NodePort rather than a LoadBalancer, to avoid automatic changes by the Alibaba Cloud controller manager.
External traffic first hits the Alibaba Cloud SLB, which terminates TLS and forwards HTTP requests to the NodePort. The gateway then routes based on host and path to the appropriate namespace Service.
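A Service manifest for this setup might look roughly like the following (names and port numbers are illustrative assumptions):

```yaml
apiVersion: v1
kind: Service
metadata:
  name: api-gateway
  namespace: gateway
spec:
  type: NodePort        # NodePort, not LoadBalancer, so the cloud controller
                        # manager does not reconcile it against an SLB
  selector:
    app: api-gateway
  ports:
    - port: 80
      targetPort: 80
      nodePort: 30080   # the Alibaba Cloud SLB terminates TLS and forwards
                        # plain HTTP to this port on every node
```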
Feature Overview
Load Balancing (HTTP/gRPC)
Initially the gateway accessed backend Pods via Service ClusterIP. Although iptables performed L4 load balancing, keepalive connections caused OpenResty to repeatedly talk to the same Pod, resulting in uneven distribution.
We first introduced Traefik as an ingress controller to achieve layer‑7 load balancing, then refactored the gateway to perform balanced upstream selection directly in OpenResty.
Implement a controller that stores each Service’s Endpoints in Redis.
Periodically pull the Endpoints from Redis and use balancer_by_lua_file to set the upstream peer dynamically.
Key Lua code snippets:
<code>local host = ngx.var.host
local matching_route = routers.find_matching_route(host, ngx.var.uri)
if not matching_route then
utils.exit_abnormally('no matching route: ' .. host, ngx.HTTP_NOT_FOUND)
end
local balancer = balancer_services.find_balancer(matching_route.service)
if balancer == nil then
utils.exit_abnormally('cannot find balancer', ngx.HTTP_SERVICE_UNAVAILABLE)
end
ngx.ctx.balancer = balancer</code>
In the balancer_by_lua phase:
<code>local picked = ngx.ctx.balancer:balance()
local ok, err = ngx_balancer.set_current_peer(picked)
if not ok then
utils.exit_abnormally('failed to set current peer: ' .. err, ngx.HTTP_SERVICE_UNAVAILABLE)
end</code>
After refactoring, we removed Traefik and achieved layer‑7 load balancing across Pods with OpenResty alone.
Performance impact: CPU usage dropped by about 30 cores (~50%) and memory usage by about 7 GB (~70%).
For gRPC, we faced similar load‑balancing challenges because HTTP/2 reuses a single TCP connection. Existing solutions (headless Service with periodic DNS resolution, kuberesolver, Service Mesh) were either slow or complex. Since Nginx 1.13.10 supports gRPC proxying, we implemented gRPC load balancing by passing a custom header in gRPC metadata and using balancer_by_lua_file to select the upstream.
Supported balancing strategies are round‑robin and sticky (hash by user ID or IP).
Ingress Controller
Native Ingress lacks expressive power for authentication, timeouts, rate limiting, and IP whitelists. We defined a custom resource called Route (stored in a ConfigMap) to express these requirements.
Routes can include a placeholder mark in the host name, which maps to a specific namespace, enabling simple domain‑based environment switching.
<code>hosts:
- foo{mark}.example.com
envs:
- mark: -prod
namespace: foo
- mark: -test
namespace: test</code>
When a user accesses foo-prod.example.com, traffic is routed to the foo namespace; foo-test.example.com routes to the test namespace.
Routes, Services, and Paths can each define authentication, timeout, and rate‑limit policies, with lower‑level settings overriding higher‑level ones.
<code>services:
- name: foo-service
port: 8080
access:
auth_type: public
paths:
- access:
auth_type: login
timeout: 10
uri: /headers
- access:
rate_limits:
- burst: 0
rate: 100
timeout: 5
uri: /</code>
This configuration means foo-service:8080/ has a 5 s timeout, open access, and a limit of 100 requests per second, while /headers requires login and has a 10 s timeout.
Rate‑Limiting Module
We implement coarse‑grained rate limiting using a rate_limit object stored in Redis. Selectors can be ip, user, service, or path, and limits can be defined at the route, service, or path level.
<code>{
rate: <int>, # requests per second
burst: <int>, # allowed burst above rate
selectors: [<str>]
}</code>
Service Proxy
To expose internal Services to other VMs in the same VPC, we added a service‑proxy feature. An OpenResty upstream with balancer_by_lua_file selects the target Service based on a specially formatted URL.
<code>upstream service_proxy_balancer {
server 127.0.0.1;
balancer_by_lua_file /path/to/balancer.lua;
}
location ~ ^/__(?<up_service>[a-z0-9\-]+)\.(?<up_namespace>[a-z0-9\-]+)\.(?<up_port>\d+)__(?<up_uri>.*) {
access_by_lua_file /path/to/access.lua;
proxy_pass http://service_proxy_balancer;
}</code>
Clients can reach a Service with a URL like http://my.intranet/__service-name.namespace.port__/path, and the gateway extracts the Service details to proxy the request.
Future Plans
We are exploring Service Mesh (e.g., Istio) to offload traffic control, load balancing, observability, and fault‑injection to the data plane, allowing the gateway to focus on routing and authentication.
With Istio, canary or blue‑green deployments could be automated by adjusting DestinationRule subsets and VirtualService weights, using metrics to drive rollbacks.
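A weight‑based canary split of that kind might look roughly like this (names and subsets are hypothetical; we have not deployed this):

```yaml
apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: foo-service
spec:
  hosts:
    - foo-service
  http:
    - route:
        - destination:
            host: foo-service
            subset: stable   # defined in a DestinationRule
          weight: 90
        - destination:
            host: foo-service
            subset: canary
          weight: 10         # shift gradually as metrics stay healthy
```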
We also consider replacing the Redis polling mechanism with an xDS‑style gRPC stream so the gateway receives configuration updates instantly.
Thank you for reading; we look forward to your feedback.
References
CoreDNS issue 2324
Link 2
kuberesolver
nginx gRPC module
gRPC metadata handling
Jike Tech Team