Designing a High‑Performance Microservice API Gateway: Routing, Load Balancing, and Resilience
This article outlines a comprehensive design for a microservice API gateway, covering functional aspects such as routing, load‑balancing algorithms, service aggregation, authentication, traffic control, circuit breaking, and caching, as well as non‑functional concerns like high performance, high availability, scalability, statelessness, and monitoring.
Gateway Functional Design
Routing
Typical services expose RESTful APIs, so the routing module forwards requests to target services based on host, URL, and other rules. Because routing rules often change, hot‑deployment is desirable. Two main implementation options are:
Database‑based: Store routing rules in a database; the gateway queries the DB at request time, optionally caching rules in Redis for performance. This requires synchronization logic between the DB and the cache, plus an admin UI.
Configuration‑file‑based: Load rules from configuration files at startup. To avoid restarts, a configuration server can push real‑time updates, which also reduces development effort.
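As a rough illustration of host and URL based routing with hot reload, the following Python sketch keeps rules in memory and swaps them in one step. The `Route` and `RouteTable` names, the `"*"` wildcard, and the service names are hypothetical and not part of the original design; a real gateway would receive the new rules from the database or configuration server described above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Route:
    host: str         # exact host to match, or "*" for any host (assumed convention)
    path_prefix: str  # URL path prefix, e.g. "/orders"
    target: str       # upstream service name (hypothetical)


class RouteTable:
    """Holds the active rules; reload() replaces them without a restart."""

    def __init__(self, routes):
        self.reload(routes)

    def reload(self, routes):
        # Longest prefix first, so "/orders/export" wins over "/orders".
        # The new list is assigned in a single step.
        self._routes = sorted(routes, key=lambda r: len(r.path_prefix), reverse=True)

    def match(self, host: str, path: str) -> Optional[str]:
        for r in self._routes:
            if r.host in ("*", host) and path.startswith(r.path_prefix):
                return r.target
        return None  # no rule matched; the caller would answer 404


table = RouteTable([
    Route("*", "/orders", "order-service"),
    Route("*", "/orders/export", "report-service"),
])
```

Calling `table.reload(new_rules)` is the hook a configuration listener would use when rules change.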
Load Balancing
Common algorithms include:
Random: Select a service instance at random; load may be uneven.
Weighted Random: Assign weights to instances to influence random selection.
Round‑Robin: Distribute requests to instances in sequence.
Weighted Round‑Robin: Combine weights with round‑robin.
Least Connections: Choose the instance with the fewest active connections.
Source‑IP Hash: Hash the client IP so that the same client is consistently routed to the same instance.
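For microservices, this design prefers Source‑IP Hash: it pins each client to one instance, which avoids session‑sharing issues, and it is simple to implement. Because a plain modulo hash remaps most clients whenever instances are added or removed, it suits relatively stable service sets.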
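Two of the algorithms above can be sketched in a few lines of Python. The function names and instance addresses are assumptions made for illustration; the hash uses a simple modulo, so the mapping only stays stable while the instance list is unchanged.

```python
import hashlib
import itertools


def source_ip_hash(client_ip: str, instances: list) -> str:
    """Map a client IP to one instance using a hash modulo the instance count."""
    digest = hashlib.sha256(client_ip.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(instances)
    return instances[index]


def weighted_round_robin(weights: dict):
    """Return an endless iterator that yields each instance 'weight' times per cycle."""
    expanded = [name for name, weight in weights.items() for _ in range(weight)]
    return itertools.cycle(expanded)
```

For example, `weighted_round_robin({"a": 2, "b": 1})` yields `a, a, b, a, a, b, ...`. A production balancer would usually interleave weighted picks more smoothly and skip unhealthy instances.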
Service Aggregation
Two approaches:
GraphQL: Place a GraphQL server in front of the gateway, embed GraphQL in the gateway itself (changes then require a gateway restart), or add an aggregation layer behind the gateway.
Coding: Implement aggregation logic directly in the gateway, assembling responses from multiple services. This is chosen when GraphQL learning cost is high and aggregation volume is low.
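The coding approach can be pictured as a handler that calls several services concurrently and merges the results. In this minimal Python sketch, `fetch_user` and `fetch_orders` are hypothetical stand-ins for real HTTP calls, and the response shape is an assumption.

```python
import asyncio


async def fetch_user(user_id: int) -> dict:
    await asyncio.sleep(0)  # placeholder for a real call to a user service
    return {"id": user_id, "name": "demo"}


async def fetch_orders(user_id: int) -> list:
    await asyncio.sleep(0)  # placeholder for a real call to an order service
    return [{"order_id": 1, "user_id": user_id}]


async def user_overview(user_id: int) -> dict:
    # Call both services concurrently, then assemble one combined response.
    user, orders = await asyncio.gather(fetch_user(user_id), fetch_orders(user_id))
    return {"user": user, "orders": orders}
```

Running `asyncio.run(user_overview(7))` returns a single dictionary with both parts. A real aggregator would also decide what to return when one of the calls fails.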
Authentication & Authorization
Most systems adopt RBAC (Role‑Based Access Control). RBAC0 provides basic role‑to‑user and role‑to‑permission mapping and is sufficient for typical internet projects. Spring Security is selected as the Java authentication framework.
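The RBAC0 idea reduces to two mappings and one lookup. The sketch below is plain Python with made-up users, roles, and permission strings; it illustrates the model only and is not how Spring Security is configured.

```python
# User -> roles and role -> permissions: the two mappings RBAC0 needs.
# All names below are hypothetical sample data.
USER_ROLES = {"alice": {"admin"}, "bob": {"viewer"}}
ROLE_PERMISSIONS = {
    "admin": {"order:read", "order:write"},
    "viewer": {"order:read"},
}


def is_allowed(user: str, permission: str) -> bool:
    """A user holds a permission if any of the user's roles grants it."""
    roles = USER_ROLES.get(user, set())
    return any(permission in ROLE_PERMISSIONS.get(role, set()) for role in roles)
```

With this data, `is_allowed("bob", "order:write")` is false while `is_allowed("alice", "order:write")` is true.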
Overload Protection
Traffic Control
Rate limiting is the usual form of traffic control for microservices. The token‑bucket and leaky‑bucket algorithms are both commonly used; they can be applied at the ingress layer (e.g., Nginx) or inside the gateway via dedicated libraries.
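A token bucket can be sketched as follows. The class name, parameters, and the injectable `clock` are assumptions for illustration; the sketch is single-threaded and would need a lock, or a shared store, before being used across gateway threads or instances.

```python
import time


class TokenBucket:
    def __init__(self, rate: float, capacity: int, clock=time.monotonic):
        self.rate = rate              # tokens added per second
        self.capacity = capacity      # maximum burst size
        self.tokens = float(capacity)
        self.clock = clock            # injectable so the behavior can be tested
        self.last = clock()

    def allow(self) -> bool:
        now = self.clock()
        # Refill according to elapsed time, never exceeding the capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1          # spend one token for this request
            return True
        return False                  # bucket empty: reject or queue the request
```

With `rate=1` and `capacity=2`, two requests pass immediately, the third is rejected, and one more passes after a second has elapsed.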
Circuit Breaking
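When the failure count exceeds a threshold, the circuit opens and requests fail immediately. After a cooldown period it enters the half‑open state, allowing limited traffic through to test recovery: if those trial requests succeed the circuit closes again, otherwise it reopens.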
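The closed, open, and half-open states can be modeled as a small state machine. This Python sketch uses assumed names and a consecutive-failure counter that trips once the threshold is reached; for brevity it does not cap how many trial requests pass while half-open, which a real breaker would do.

```python
import time


class CircuitBreaker:
    def __init__(self, failure_threshold: int, cooldown: float, clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.cooldown = cooldown      # seconds to stay open before probing
        self.clock = clock
        self.state = "closed"
        self.failures = 0
        self.opened_at = 0.0

    def allow_request(self) -> bool:
        if self.state == "open":
            if self.clock() - self.opened_at >= self.cooldown:
                self.state = "half_open"  # cooldown over: let a trial through
                return True
            return False                  # still open: fail fast
        return True

    def record_success(self):
        self.failures = 0
        self.state = "closed"

    def record_failure(self):
        self.failures += 1
        # A failed trial, or reaching the threshold, (re)opens the circuit.
        if self.state == "half_open" or self.failures >= self.failure_threshold:
            self.state = "open"
            self.opened_at = self.clock()
```

The gateway would call `allow_request()` before forwarding and then report the outcome with `record_success()` or `record_failure()`.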
Service Degradation
A blocking‑queue‑based scheme can throttle requests and reject excess traffic according to configurable degradation rules.
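One way to read the queue-based scheme: a bounded queue holds pending requests, and anything beyond its capacity is rejected so the caller can return a degraded response. The `DegradingQueue` name and the fixed `max_pending` limit are assumptions; the original does not specify the rule format.

```python
import queue


class DegradingQueue:
    def __init__(self, max_pending: int):
        # Bounded queue: the size limit plays the role of the degradation rule.
        self._pending = queue.Queue(maxsize=max_pending)

    def submit(self, request) -> bool:
        try:
            self._pending.put_nowait(request)
            return True
        except queue.Full:
            return False  # over capacity: caller sends a fallback response

    def take(self):
        # A worker would normally use a blocking get(); get_nowait() keeps the sketch simple.
        return self._pending.get_nowait()
```

Making `max_pending` configurable per endpoint would be one way to express the configurable rules mentioned above.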
Cache
Use a centralized distributed cache (e.g., Redis) for all gateway caching needs. Typical workflow: lookup cache → if hit, return; if miss, forward to target service, cache the response, then return.
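Handle the usual cache failure modes: avalanche (many keys expiring at the same time), breakdown (a hot key expiring under heavy load), and penetration (repeated requests for keys that exist in neither the cache nor the backend).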
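The lookup workflow, together with two common mitigations, might look like the sketch below. An in-memory dictionary stands in for Redis, and the class and parameter names are hypothetical. Random TTL jitter spreads out expiry times (against avalanche), and caching a "missing" marker absorbs repeated lookups of absent keys (against penetration); protecting a hot key at expiry (breakdown) would additionally need something like a per-key lock, which is not shown.

```python
import random
import time

_MISSING = object()  # marker cached for keys the backend does not have


class GatewayCache:
    def __init__(self, loader, ttl=60.0, jitter=10.0, clock=time.monotonic):
        self.loader = loader   # function(key) -> value, or None if absent
        self.ttl = ttl
        self.jitter = jitter
        self.clock = clock
        self._store = {}       # key -> (expires_at, value); stand-in for Redis

    def get(self, key):
        entry = self._store.get(key)
        if entry and entry[0] > self.clock():
            # Cache hit, possibly a cached "missing" marker.
            return None if entry[1] is _MISSING else entry[1]
        value = self.loader(key)  # cache miss: forward to the target service
        expires_at = self.clock() + self.ttl + random.uniform(0, self.jitter)
        self._store[key] = (expires_at, _MISSING if value is None else value)
        return value
```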
Service Retry
Retry requires configuration (which endpoints, retry count, timeouts) and execution logic that respects those settings. Dynamic configuration is recommended.
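A configuration-driven retry could be sketched like this. `RetryPolicy`, `RETRY_CONFIG`, and the sample endpoint are hypothetical; in the design above the table would be refreshed from dynamic configuration. Retries are generally only safe for idempotent endpoints, which is one reason to configure them per endpoint.

```python
import time
from dataclasses import dataclass


@dataclass
class RetryPolicy:
    max_attempts: int = 3
    backoff_seconds: float = 0.1
    retry_on: tuple = (ConnectionError, TimeoutError)


# Per-endpoint settings; endpoints without an entry are not retried.
RETRY_CONFIG = {"/orders": RetryPolicy(max_attempts=3, backoff_seconds=0.0)}


def call_with_retry(endpoint, func, sleep=time.sleep):
    policy = RETRY_CONFIG.get(endpoint, RetryPolicy(max_attempts=1))
    for attempt in range(1, policy.max_attempts + 1):
        try:
            return func()
        except policy.retry_on:
            if attempt == policy.max_attempts:
                raise  # attempts exhausted: surface the last error
            sleep(policy.backoff_seconds * attempt)  # linear backoff
```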
Logging
Follow project logging standards; access logs can initially be captured by the ingress layer (e.g., Nginx) and later customized if needed.
Management
Non‑core management features can be deferred.
Gateway Non‑Functional Design
High Performance
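Thread‑per‑request models scale poorly under high concurrency, because each in‑flight request ties up a thread while it waits on I/O. Adopt a Reactor model (e.g., a Netty‑based server) for asynchronous, event‑driven processing.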
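The event-driven idea can be illustrated with Python's `asyncio` event loop, used here only as an analogy to a Netty-style Reactor and not as a model of Netty's API. One thread serves many requests because waiting on the upstream does not block it; `handle` and `serve_all` are hypothetical names.

```python
import asyncio


async def handle(request_id: int, upstream_delay: float) -> str:
    # Waiting for the upstream yields control to the event loop instead of
    # blocking a thread, so other requests make progress in the meantime.
    await asyncio.sleep(upstream_delay)
    return f"response-{request_id}"


async def serve_all(n: int, upstream_delay: float) -> list:
    # All n requests are in flight at once on a single thread.
    return await asyncio.gather(*(handle(i, upstream_delay) for i in range(n)))
```

Under these assumptions, `asyncio.run(serve_all(100, 0.05))` should take roughly one delay period in total, where a single blocking thread would need about a hundred.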
High Availability
Include traffic control, circuit breaking, and degradation, plus graceful startup (slow start) and graceful shutdown for both services and the gateway itself.
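Slow start is often implemented by ramping an instance's load-balancing weight during a warm-up window. The linear ramp and the function name below are assumptions chosen for illustration; the original does not prescribe a formula.

```python
def slow_start_weight(full_weight: int, uptime_s: float, warmup_s: float) -> int:
    """Scale the weight linearly from 1 up to full_weight over the warm-up window."""
    if uptime_s >= warmup_s:
        return full_weight
    # Never return 0, so a new instance still receives a little traffic.
    return max(1, int(full_weight * uptime_s / warmup_s))
```

An instance with a full weight of 100 and a 60 second warm-up would carry weight 50 after 30 seconds.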
Scalability
Implement global and business‑specific interceptors with priority ordering; allow dynamic enable/disable via API or configuration server.
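An interceptor chain with priorities and runtime toggles might be structured as in this sketch. The `Interceptor` and `InterceptorChain` names, the "lower priority runs first" rule, and the dictionary-based request are all assumptions.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Interceptor:
    name: str
    priority: int                   # lower value runs first (assumed convention)
    handle: Callable[[dict], dict]  # takes a request, returns the (modified) request
    enabled: bool = True


class InterceptorChain:
    def __init__(self):
        self._items = {}

    def register(self, interceptor: Interceptor):
        self._items[interceptor.name] = interceptor

    def set_enabled(self, name: str, enabled: bool):
        # Would be driven by an admin API or configuration server update.
        self._items[name].enabled = enabled

    def run(self, request: dict) -> dict:
        for item in sorted(self._items.values(), key=lambda i: i.priority):
            if item.enabled:
                request = item.handle(request)
        return request
```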
Elasticity
Design the gateway as stateless; use token‑based authentication to decouple user state, enabling easy horizontal scaling.
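Token-based authentication keeps the gateway stateless because any instance can verify a token without a session lookup. The sketch below signs a small payload with HMAC-SHA256 using only the Python standard library; the token layout, the `SECRET` value, and the function names are assumptions, and a real system would more likely use an established format such as JWT via a maintained library.

```python
import base64
import hashlib
import hmac
import json
import time

SECRET = b"demo-secret"  # assumption: the same key is configured on every gateway instance


def _b64(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).decode("ascii")


def _sign(payload: str) -> str:
    return _b64(hmac.new(SECRET, payload.encode("utf-8"), hashlib.sha256).digest())


def issue_token(user: str, ttl_s: int, now=None) -> str:
    now = time.time() if now is None else now
    payload = _b64(json.dumps({"sub": user, "exp": now + ttl_s}).encode("utf-8"))
    return f"{payload}.{_sign(payload)}"


def verify_token(token: str, now=None):
    """Return the user name if the token is authentic and unexpired, else None."""
    now = time.time() if now is None else now
    try:
        payload, sig = token.split(".")
    except ValueError:
        return None  # not in "payload.signature" form
    if not hmac.compare_digest(sig.encode("utf-8"), _sign(payload).encode("utf-8")):
        return None  # signature mismatch: tampered or foreign token
    claims = json.loads(base64.urlsafe_b64decode(payload))
    return claims["sub"] if claims["exp"] > now else None
```

Since verification depends only on the shared key and the token itself, new gateway instances can be added without replicating session state.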
Monitoring
Leverage existing APM tools such as SkyWalking or Pinpoint for service monitoring, and ELK stack for log aggregation and analysis.
ITFLY8 Architecture Home
