How Apex API Gateway Revolutionizes Tencent’s Web Services with Dynamic Routing & Adaptive Rate Limiting

This article details the design and implementation of the Apex API gateway at Tencent, covering its architecture, dynamic routing, multi‑protocol support, distributed rate limiting, circuit breaking, service discovery, data masking, orchestration, testing automation, push services, observability, and future roadmap.

Tencent Qidian Tech Team

Dec 28, 2021

Background

Tencent Qidian Web projects expose core business capabilities via dynamic pages, static pages, JSON APIs, and real‑time push APIs, requiring a unified, stable service layer. To isolate internal business systems from external calls, an API gateway is needed to provide common functions such as authentication, rate limiting, ACL, degradation, security, gray release, version control, and monitoring.

Previously, Qidian used Nginx as the web entry layer. Many teams maintained very large Nginx configuration files (up to 5,000 lines) that included extensions and Lua scripts. As the number of services grew, this approach showed several drawbacks:

Complex configuration files that are hard to maintain.

Business logic duplicated across many services.

Repeated implementation of WebSocket entry layers.

Various custom traffic‑gray‑release mechanisms.

Inability to enforce unified rate limiting, circuit breaking, or degradation.

Security risks from exposing arbitrary APIs.

Lack of full lifecycle management for APIs.

Therefore Qidian urgently needed an API gateway to unify web entry and abstract common logic.

API Gateway Architecture Design

The Apex gateway sits behind Tencent’s Secure Tencent Gateway (STGW), a layer‑7 proxy that provides security functions such as DDoS protection and WAF. STGW terminates HTTPS/WSS, allowing Apex to focus on HTTP/WS capabilities. For inbound traffic Apex supports HTTP, WebSocket, Socket.io, and SSE, and it can proxy to backends over HTTP, tRPC, gRPC, or private protocols.

The gateway is split into a data plane, which handles request traffic and reverse proxying to upstream services, and a control plane, which covers API management, statistics, dynamic configuration, monitoring, and operations.

The design leverages open‑source components and internal libraries, so only the gateway‑specific functionality in the middle layer had to be custom built.

Define API Description

APIs are described using Protobuf and consist of four parts: basic information, inbound request, reverse proxy, and outbound request. This structured description makes implementation clearer and facilitates plugin integration.

Plugin Mechanism

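The gateway adopts an onion‑model AOP plugin system: a request passes through the plugins from the outermost layer to the innermost, the business logic runs, and the response then travels back outward through the same layers in reverse order. Each plugin can therefore act both before the core logic (pre‑processing) and after it (post‑processing).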
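
The onion model can be sketched as ordinary middleware composition. The following is a minimal illustration, not Apex's actual plugin API: the names Handler, Plugin, Chain, and Tag are hypothetical, and requests are plain strings to keep the example small.

```go
package main

import "fmt"

// Handler processes a request and returns a response.
// Strings stand in for real request/response types (an assumption of this sketch).
type Handler func(req string) string

// Plugin wraps a Handler so it can run code before and after the inner layers.
type Plugin func(next Handler) Handler

// Chain composes plugins so that the first plugin listed is the outermost layer.
func Chain(core Handler, plugins ...Plugin) Handler {
	h := core
	for i := len(plugins) - 1; i >= 0; i-- {
		h = plugins[i](h)
	}
	return h
}

// Tag is a hypothetical plugin that records when it runs on the way in and out.
func Tag(name string, trace *[]string) Plugin {
	return func(next Handler) Handler {
		return func(req string) string {
			*trace = append(*trace, name+":in") // pre-processing
			resp := next(req)                   // inner layers and core logic
			*trace = append(*trace, name+":out") // post-processing
			return resp
		}
	}
}

func main() {
	var trace []string
	core := func(req string) string {
		trace = append(trace, "core")
		return "handled " + req
	}
	h := Chain(core, Tag("auth", &trace), Tag("ratelimit", &trace))
	fmt.Println(h("GET /ping"))
	fmt.Println(trace) // [auth:in ratelimit:in core ratelimit:out auth:out]
}
```

The recorded trace shows the symmetry of the model: the layer that sees the request first is the one that sees the response last.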

Dynamic Routing

Instead of reloading Nginx configs, Apex implements hot‑reload routing via a configuration‑center subscription model. Each node subscribes to route changes, builds a new routing object from the updated configuration, and swaps it in for the old one within milliseconds, achieving “changing tires on a highway” without downtime.
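
One common way to replace a routing object without blocking in-flight requests is to publish immutable snapshots through an atomic value. The sketch below assumes that approach; the article does not say how Apex performs the swap, and RouteTable, Router, Replace, and Lookup are hypothetical names. Replace would be called from whatever callback the configuration center invokes on a change.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// RouteTable is an immutable snapshot mapping a request path to an upstream name.
type RouteTable map[string]string

// Router holds the current snapshot; readers never take a lock.
type Router struct {
	current atomic.Value // always holds a RouteTable
}

// NewRouter creates a router serving the initial routes.
func NewRouter(initial RouteTable) *Router {
	r := &Router{}
	r.Replace(initial)
	return r
}

// Replace copies the new routes into a fresh table and swaps it in atomically.
// Requests already using the old table finish against it undisturbed.
func (r *Router) Replace(routes RouteTable) {
	next := make(RouteTable, len(routes))
	for path, upstream := range routes {
		next[path] = upstream
	}
	r.current.Store(next)
}

// Lookup reads from whichever snapshot is current at call time.
func (r *Router) Lookup(path string) (string, bool) {
	table := r.current.Load().(RouteTable)
	upstream, ok := table[path]
	return upstream, ok
}

func main() {
	r := NewRouter(RouteTable{"/orders": "orders-v1"})
	fmt.Println(r.Lookup("/orders")) // orders-v1 true

	// Simulates a route-change notification from the configuration center.
	r.Replace(RouteTable{"/orders": "orders-v2", "/users": "users-v1"})
	fmt.Println(r.Lookup("/orders")) // orders-v2 true
}
```

Because a table is never mutated after it is stored, readers need no locking, and a failed rebuild can simply be discarded without affecting live traffic.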

Service Discovery

Selectors choose backend nodes using a Registry and Strategy component. Registry registers services (using internal Polaris or third‑party solutions) and supports watching for changes. Strategies include round‑robin, random, metadata filtering, and sticky sessions.
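
The relationship between Selector, Registry, and Strategy might look like the following sketch. It is an illustration under assumptions: the interfaces, the in-memory StaticRegistry (standing in for Polaris or another registry), and the metadata-then-round-robin ordering are all hypothetical, and change watching is omitted.

```go
package main

import (
	"errors"
	"fmt"
	"sync/atomic"
)

// Node is one backend instance.
type Node struct {
	Addr     string
	Metadata map[string]string
}

// Registry returns the nodes currently registered for a service.
type Registry interface {
	Nodes(service string) []Node
}

// StaticRegistry is an in-memory stand-in for a real registry.
type StaticRegistry map[string][]Node

func (s StaticRegistry) Nodes(service string) []Node { return s[service] }

// Strategy picks one node from a non-empty candidate list.
type Strategy interface {
	Pick(nodes []Node) Node
}

// RoundRobin cycles through the candidates in order.
type RoundRobin struct{ counter uint64 }

func (rr *RoundRobin) Pick(nodes []Node) Node {
	n := atomic.AddUint64(&rr.counter, 1) - 1
	return nodes[n%uint64(len(nodes))]
}

// ErrNoNodes is returned when no node satisfies the request.
var ErrNoNodes = errors.New("no available nodes")

// Selector combines a Registry with a Strategy.
type Selector struct {
	Registry Registry
	Strategy Strategy
}

// Select filters nodes by required metadata, then delegates to the strategy.
func (s *Selector) Select(service string, required map[string]string) (Node, error) {
	var candidates []Node
	for _, n := range s.Registry.Nodes(service) {
		if matches(n, required) {
			candidates = append(candidates, n)
		}
	}
	if len(candidates) == 0 {
		return Node{}, ErrNoNodes
	}
	return s.Strategy.Pick(candidates), nil
}

// matches reports whether the node carries every required key/value pair.
func matches(n Node, required map[string]string) bool {
	for k, v := range required {
		if n.Metadata[k] != v {
			return false
		}
	}
	return true
}

func main() {
	reg := StaticRegistry{"orders": {
		{Addr: "10.0.0.1:80", Metadata: map[string]string{"env": "prod"}},
		{Addr: "10.0.0.2:80", Metadata: map[string]string{"env": "gray"}},
		{Addr: "10.0.0.3:80", Metadata: map[string]string{"env": "prod"}},
	}}
	sel := &Selector{Registry: reg, Strategy: &RoundRobin{}}
	for i := 0; i < 3; i++ {
		n, _ := sel.Select("orders", map[string]string{"env": "prod"})
		fmt.Println(n.Addr)
	}
}
```

Keeping the strategy behind an interface means random or sticky-session selection can be swapped in without touching the registry code.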

Multi‑Protocol Proxy

Apex abstracts a Proxy layer to support HTTP, gRPC, tRPC, and internal protocols. HTTP proxying uses Go’s standard‑library httputil.ReverseProxy, while gRPC and tRPC rely on Protobuf‑generated client and server stubs.
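
For the HTTP case, the standard library already does most of the work. The sketch below shows only the basic use of httputil.NewSingleHostReverseProxy with throwaway test servers; NewHTTPProxy and fetchViaProxy are hypothetical helper names, and Apex's own Proxy abstraction and the gRPC/tRPC paths are not shown.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"net/http/httputil"
	"net/url"
)

// NewHTTPProxy returns a handler that forwards every request to target.
func NewHTTPProxy(target string) (http.Handler, error) {
	u, err := url.Parse(target)
	if err != nil {
		return nil, err
	}
	return httputil.NewSingleHostReverseProxy(u), nil
}

// fetchViaProxy starts a throwaway backend and gateway on loopback,
// then requests path through the gateway and returns the response body.
func fetchViaProxy(path string) (string, error) {
	backend := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprintf(w, "backend saw %s", r.URL.Path)
	}))
	defer backend.Close()

	proxy, err := NewHTTPProxy(backend.URL)
	if err != nil {
		return "", err
	}
	gateway := httptest.NewServer(proxy)
	defer gateway.Close()

	resp, err := http.Get(gateway.URL + path)
	if err != nil {
		return "", err
	}
	defer resp.Body.Close()
	body, err := io.ReadAll(resp.Body)
	return string(body), err
}

func main() {
	body, err := fetchViaProxy("/hello")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(body) // backend saw /hello
}
```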

Distributed Rate Limiting

Apex implements token‑bucket and sliding‑window algorithms. Token‑bucket uses Redis+Lua with automatic fallback to local limiting when Redis is unavailable. Sliding‑window combines local counting with remote synchronization for high‑performance scenarios.

Circuit Breaking

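Apex adopts the client‑side throttling approach described in Google’s SRE book: it calculates a rejection probability from the number of requests attempted and the number that succeeded, so traffic to a failing backend is shed gradually and adaptively rather than through a hard cutoff.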
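
The SRE book gives the rejection probability as max(0, (requests − K × accepts) / (requests + 1)), where K is a tuning multiplier (the book suggests 2). The sketch below implements that formula; the Breaker type, its cumulative counters (the book uses a trailing time window), and the error values are simplifying assumptions, not Apex's implementation.

```go
package main

import (
	"errors"
	"fmt"
	"math"
	"math/rand"
	"sync"
)

// RejectProbability is the client-side throttling formula from the Google SRE book:
// max(0, (requests - k*accepts) / (requests + 1)).
func RejectProbability(requests, accepts, k float64) float64 {
	return math.Max(0, (requests-k*accepts)/(requests+1))
}

var (
	// ErrThrottled means the request was shed locally without reaching the backend.
	ErrThrottled = errors.New("request shed by breaker")
	// ErrBackendDown is a hypothetical backend failure used in the demo.
	ErrBackendDown = errors.New("backend down")
)

// Breaker tracks attempts and successes. Counters here are cumulative for
// brevity; a production version would use a sliding time window.
type Breaker struct {
	mu       sync.Mutex
	k        float64
	requests float64
	accepts  float64
	rnd      func() float64 // returns a value in [0, 1); injectable for tests
}

func NewBreaker(k float64, rnd func() float64) *Breaker {
	return &Breaker{k: k, rnd: rnd}
}

// Do runs call unless the breaker decides to shed the request locally.
// Locally shed requests still count as attempts, as in the SRE description.
func (b *Breaker) Do(call func() error) error {
	b.mu.Lock()
	p := RejectProbability(b.requests, b.accepts, b.k)
	b.requests++
	reject := b.rnd() < p
	b.mu.Unlock()
	if reject {
		return ErrThrottled
	}
	err := call()
	if err == nil {
		b.mu.Lock()
		b.accepts++
		b.mu.Unlock()
	}
	return err
}

func main() {
	fmt.Printf("healthy backend:  %.3f\n", RejectProbability(100, 100, 2))
	fmt.Printf("40%% success rate: %.3f\n", RejectProbability(100, 40, 2))

	b := NewBreaker(2, rand.Float64)
	calls := 0
	for i := 0; i < 1000; i++ {
		_ = b.Do(func() error { calls++; return ErrBackendDown })
	}
	fmt.Println("backend calls out of 1000 attempts:", calls)
}
```

With K = 2 nothing is rejected until fewer than half of recent requests succeed, and even a fully failing backend keeps receiving an occasional probe request, which is how the breaker notices recovery.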

Cache Management

Apex provides HTTP‑style caching for GET requests with a two‑level (local + remote) cache, configurable per request or per parameter, supporting TTL, status‑code filters, and fallback on upstream failures.
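
The two-level lookup order can be sketched as follows: check the local cache, then the remote cache, then the upstream, filling the faster levels on the way back. This is an illustration only. Store, MemoryStore, TwoLevel, and GetOrLoad are hypothetical names, an in-memory map stands in for the remote cache, and status-code filters and fallback on upstream failure are left out.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// Store is the minimal interface both cache levels satisfy.
type Store interface {
	Get(key string) (string, bool)
	Set(key, value string, ttl time.Duration)
}

type item struct {
	value   string
	expires time.Time
}

// MemoryStore is a TTL map; it plays the local level and, in this sketch,
// also stands in for the remote level.
type MemoryStore struct {
	mu    sync.Mutex
	items map[string]item
}

func NewMemoryStore() *MemoryStore {
	return &MemoryStore{items: make(map[string]item)}
}

// Get returns the value if present and not yet expired.
func (m *MemoryStore) Get(key string) (string, bool) {
	m.mu.Lock()
	defer m.mu.Unlock()
	it, ok := m.items[key]
	if !ok || time.Now().After(it.expires) {
		delete(m.items, key)
		return "", false
	}
	return it.value, true
}

// Set stores the value with an expiry of now plus ttl.
func (m *MemoryStore) Set(key, value string, ttl time.Duration) {
	m.mu.Lock()
	defer m.mu.Unlock()
	m.items[key] = item{value: value, expires: time.Now().Add(ttl)}
}

// TwoLevel layers a local store in front of a remote one.
type TwoLevel struct {
	local, remote Store
	ttl           time.Duration
}

func NewTwoLevel(local, remote Store, ttlSeconds int) *TwoLevel {
	return &TwoLevel{local: local, remote: remote, ttl: time.Duration(ttlSeconds) * time.Second}
}

// GetOrLoad checks local, then remote, then calls load (the upstream request).
// Successful loads are written to both levels; errors are not cached.
func (c *TwoLevel) GetOrLoad(key string, load func() (string, error)) (string, error) {
	if v, ok := c.local.Get(key); ok {
		return v, nil
	}
	if v, ok := c.remote.Get(key); ok {
		c.local.Set(key, v, c.ttl)
		return v, nil
	}
	v, err := load()
	if err != nil {
		return "", err
	}
	c.remote.Set(key, v, c.ttl)
	c.local.Set(key, v, c.ttl)
	return v, nil
}

func main() {
	remote := NewMemoryStore()
	loads := 0
	load := func() (string, error) { loads++; return "payload", nil }

	nodeA := NewTwoLevel(NewMemoryStore(), remote, 60)
	nodeB := NewTwoLevel(NewMemoryStore(), remote, 60)
	nodeA.GetOrLoad("GET /items?id=1", load) // miss everywhere: hits upstream
	nodeA.GetOrLoad("GET /items?id=1", load) // local hit
	nodeB.GetOrLoad("GET /items?id=1", load) // remote hit on a different node
	fmt.Println("upstream loads:", loads)    // upstream loads: 1
}
```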

Automated Testing

Traffic can be mirrored to an intelligent testing platform (Nemean) which automatically generates test cases from captured requests, eliminating manual test case maintenance.

Server Push

In the unified push system, clients hold long‑lived connections to the gateway, and the gateway stores each connection’s information in Redis so that the push service can locate the gateway node holding a given client’s connection. The system supports synchronous and asynchronous push modes, connection caching, and graceful degradation.

API Documentation Generation

API definitions in the gateway can automatically generate Swagger JSON, which is imported into a unified documentation platform (Tolstoy) and can also be exported for external consumption.

Custom Alerts

Users can configure alert policies to monitor API health metrics such as failure count, latency, error rate, and QPS.

Observability

Metrics are collected with Prometheus and visualized in Grafana, logs are handled by ELK, and traces by TpsTelemetry, giving operators comprehensive visibility.

UI Management Console

The console offers a user‑friendly interface for API creation, configuration, debugging, and real‑time traffic monitoring.

Performance Data

On a 4‑core, 8‑GB machine, Apex achieves 11k req/s in HTTP proxy mode and 20k req/s in tRPC mode. In production, daily traffic reaches billions of requests, with peak QPS exceeding 100k and average QPS above 20k, and the gateway has had zero downtime since launch.

Future Roadmap

Custom conditional routing and gray release.

Protocol‑as‑API: mount RESTful interfaces described by Google API Protobuf.

Complete multi‑language SDKs with signing, error handling, and reporting.

Explore adaptive rate limiting and circuit breaking.

Business‑level resource isolation (gateway deployment, Redis usage).

Support both centralized and private deployments.

Expand internal and external open‑source contributions.
