
How Meituan’s Shepherd API Gateway Achieves High‑Performance, Cloud‑Native Scalability

This article examines the background, design principles, high‑availability techniques, extensibility, and future roadmap of Meituan’s internally built Shepherd API gateway, showing how it streamlines microservice API management, improves developer productivity, and evolves toward a cloud‑native architecture.

Meituan Technology Team

Background

API gateway definition

An API gateway is the traffic entry point between external clients and internal microservices. It handles cross‑cutting concerns such as protocol conversion, authentication, rate limiting, parameter validation, and monitoring.

Motivation for Shepherd

Before Shepherd, each business line had to maintain a separate web application to expose HTTP APIs, resulting in low development efficiency and poor resource utilization. Shepherd provides a unified, high‑performance, highly available gateway that can be configured without writing code.

Architecture

Overall structure

The system consists of three modules:

Control plane – management console and monitoring center for API lifecycle and metric collection.

Configuration center – stores DSL‑based API definitions in the Lion configuration service and pushes updates to the data plane, which hot‑reloads them.

Data plane – receives requests after Nginx load balancing, executes built‑in and custom components, and forwards calls to backend RPC/HTTP/function services.

Figure: Overall architecture

Control plane

The management console handles API creation, DSL generation, versioning, approval workflow and configuration distribution. The monitoring center aggregates request metrics and triggers alerts.

Figure: Control plane

Configuration center

API definitions are written in a domain‑specific language (DSL) that describes:

Name, Group – identifier and logical grouping.

Request – domain, path, query and header parameters.

Response – result assembly, error handling, headers, cookies.

Filters, FilterConfigs – functional components (e.g., auth, rate‑limit) and their parameters.

Invokers – backend invocation rules for RPC, HTTP or function services.
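To make the shape of a definition concrete, the sections above can be modelled as plain Java records. This is only a sketch: the article does not show the real DSL schema, so every type and field name below (`RequestSpec`, `ResponseSpec`, `InvokerSpec`, `paramMappings`, and so on) is a hypothetical stand-in that mirrors the listed sections.

```java
import java.util.List;
import java.util.Map;

// Hypothetical in-memory model of one API definition. Field names mirror the
// DSL sections listed above; the real Shepherd schema is not shown in the article.
record RequestSpec(String domain, String path, List<String> queryParams, List<String> headerParams) {}

record ResponseSpec(String resultTemplate, Map<String, String> headers) {}

record InvokerSpec(String type, String target, Map<String, String> paramMappings) {}

record ApiDefinition(
        String name,
        String group,
        RequestSpec request,
        ResponseSpec response,
        List<String> filters,
        Map<String, Map<String, String>> filterConfigs,
        List<InvokerSpec> invokers) {

    // Sanity check: every entry in filterConfigs must refer to a listed filter.
    boolean hasConsistentFilters() {
        return filters.containsAll(filterConfigs.keySet());
    }
}
```

Keeping `filters` and `filterConfigs` separate, as the DSL does, lets a component be enabled with default settings and only be given parameters when it needs them.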

Figure: Configuration DSL

Data plane

Routing is performed using two structures for low‑latency lookup:

A direct map for static paths (key = full host + path, value = API configuration), so an exact match needs a single lookup.

A prefix tree (trie) for paths containing variables, enabling fast prefix matching without regular expressions.
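The two‑tier lookup can be sketched as follows. This is a minimal illustration, not Shepherd's actual router: the class name `Router`, the `{name}` syntax for path variables, and the choice to build the trie per path segment are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;

// Two-tier router: exact-match map first, then a segment trie for variable paths.
class Router {
    private static final class Node {
        final Map<String, Node> children = new HashMap<>();
        Node variableChild;   // matches any single segment, e.g. {id}
        String apiName;       // non-null when a route ends at this node
    }

    private final Map<String, String> staticRoutes = new HashMap<>();
    private final Node root = new Node();   // first trie level is keyed by host

    void register(String host, String path, String apiName) {
        if (!path.contains("{")) {
            staticRoutes.put(host + path, apiName);
            return;
        }
        Node node = root.children.computeIfAbsent(host, h -> new Node());
        for (String segment : segments(path)) {
            if (segment.startsWith("{")) {
                if (node.variableChild == null) node.variableChild = new Node();
                node = node.variableChild;
            } else {
                node = node.children.computeIfAbsent(segment, s -> new Node());
            }
        }
        node.apiName = apiName;
    }

    Optional<String> route(String host, String path) {
        String exact = staticRoutes.get(host + path);
        if (exact != null) return Optional.of(exact);
        Node hostNode = root.children.get(host);
        if (hostNode == null) return Optional.empty();
        return Optional.ofNullable(match(hostNode, segments(path), 0));
    }

    // Depth-first match; literal segments take priority over variables.
    private String match(Node node, List<String> segs, int i) {
        if (i == segs.size()) return node.apiName;
        Node literal = node.children.get(segs.get(i));
        if (literal != null) {
            String hit = match(literal, segs, i + 1);
            if (hit != null) return hit;
        }
        return node.variableChild == null ? null : match(node.variableChild, segs, i + 1);
    }

    // Splits on '/' by hand (no regex) and skips empty segments.
    private static List<String> segments(String path) {
        List<String> out = new ArrayList<>();
        int start = 0;
        for (int i = 0; i <= path.length(); i++) {
            if (i == path.length() || path.charAt(i) == '/') {
                if (i > start) out.add(path.substring(start, i));
                start = i + 1;
            }
        }
        return out;
    }
}
```

With `/orders/list` and `/orders/{id}` both registered, a request for `/orders/list` is answered from the map, while `/orders/42` falls through to the trie and matches the variable segment.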

Figure: Routing structures

After routing, a chain of functional components is executed, including tracing, real‑time monitoring, access logging, parameter validation, authentication, rate limiting, circuit breaking and gray release.
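A component chain of this kind is commonly built as an ordered list of filters that share a request context. The sketch below assumes that shape; `GatewayContext`, `GatewayFilter`, `FilterChain`, and the two sample filters are hypothetical names, and the article does not describe Shepherd's internal interfaces.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical request context passed along the chain.
class GatewayContext {
    final Map<String, String> headers;
    final List<String> trace = new ArrayList<>();   // names of filters that ran
    Integer rejectedStatus;                         // set by a filter to stop the chain

    GatewayContext(Map<String, String> headers) { this.headers = headers; }
}

interface GatewayFilter {
    String name();
    void apply(GatewayContext ctx);
}

class FilterChain {
    private final List<GatewayFilter> filters;

    FilterChain(List<GatewayFilter> filters) { this.filters = List.copyOf(filters); }

    // Runs filters in order; stops as soon as one rejects the request.
    boolean execute(GatewayContext ctx) {
        for (GatewayFilter filter : filters) {
            filter.apply(ctx);
            ctx.trace.add(filter.name());
            if (ctx.rejectedStatus != null) return false;
        }
        return true;
    }
}

class AccessLogFilter implements GatewayFilter {
    public String name() { return "accessLog"; }
    public void apply(GatewayContext ctx) { /* a real filter would write a log line here */ }
}

class AuthFilter implements GatewayFilter {
    public String name() { return "auth"; }
    public void apply(GatewayContext ctx) {
        // Placeholder rule: only checks that the header is present.
        if (!ctx.headers.containsKey("Authorization")) ctx.rejectedStatus = 401;
    }
}
```

Ordering matters in such a chain: placing cheap rejecting filters (authentication, rate limiting) early avoids spending work on requests that will be refused anyway.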

Figure: Functional components

Protocol conversion uses JsonPath expressions to map HTTP request fields to backend service parameters and vice‑versa.
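The mapping idea can be illustrated with a deliberately small stand‑in for JsonPath. The resolver below understands only dotted paths such as `$.query.userId` over nested maps; real JsonPath supports much more (arrays, filters, wildcards). `ParamMapper` and the request layout (`query`, `headers`) are assumptions made for this example.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class ParamMapper {
    // Resolves a dotted path such as "$.query.userId" against nested maps.
    // Only this simple subset is handled; it is not a JsonPath implementation.
    static Object resolve(Map<String, Object> root, String path) {
        if (!path.startsWith("$")) {
            throw new IllegalArgumentException("path must start with $: " + path);
        }
        Object current = root;
        for (String key : path.substring(1).split("\\.")) {
            if (key.isEmpty()) continue;                        // skips the part before the first dot
            if (!(current instanceof Map<?, ?> map)) return null;
            current = map.get(key);
        }
        return current;
    }

    // Builds backend call arguments from (parameter name -> path) mappings.
    static Map<String, Object> mapParams(Map<String, Object> request, Map<String, String> mappings) {
        Map<String, Object> args = new LinkedHashMap<>();
        mappings.forEach((param, path) -> args.put(param, resolve(request, path)));
        return args;
    }
}
```

A mapping of `uid -> $.query.userId` would then copy the `userId` query parameter of the HTTP request into the `uid` argument of the backend call; the reverse direction works the same way with the backend response as the root object.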

Figure: Protocol conversion

High‑availability design

Asynchronous processing

Request handling is asynchronous end to end. Requests arrive on Jetty (later Netty) IO threads, are handed off to a business thread pool, and backend calls use asynchronous RPC/HTTP clients, so no thread blocks while waiting on a backend. After tuning Nginx keep‑alive (long‑lived) connections and switching from Jetty to Netty, end‑to‑end QPS increased from ~2,000 to >15,000.
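The hand‑off pattern can be sketched with `CompletableFuture`. This is an assumption‑laden illustration rather than Shepherd's code: `AsyncGateway` is a hypothetical class, the backend is modelled as a function returning a future in place of a real asynchronous RPC/HTTP client, and the pool size and one‑second timeout are arbitrary.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.function.Function;

class AsyncGateway {
    private final ExecutorService businessPool = Executors.newFixedThreadPool(4);
    // Stand-in for an asynchronous RPC/HTTP client: returns a future immediately.
    private final Function<String, CompletableFuture<String>> backend;

    AsyncGateway(Function<String, CompletableFuture<String>> backend) {
        this.backend = backend;
    }

    // Intended to be called from an IO thread; returns at once and never
    // blocks while the backend call is in flight.
    CompletableFuture<String> handle(String rawRequest) {
        return CompletableFuture
                .supplyAsync(() -> rawRequest.trim(), businessPool)   // pre-processing off the IO thread
                .thenCompose(backend)                                  // non-blocking backend call
                // Response assembly; no JSON escaping, illustration only.
                .thenApplyAsync(body -> "{\"data\":\"" + body + "\"}", businessPool)
                .orTimeout(1, TimeUnit.SECONDS)
                .exceptionally(err -> "{\"error\":\"backend unavailable\"}");
    }

    void shutdown() { businessPool.shutdown(); }
}
```

Because every stage is a callback, the number of in‑flight requests is no longer bounded by the number of threads, which is the property the throughput figures above depend on.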

Figure: Async processing

Service isolation

Isolation is achieved on two levels:

Cluster isolation – separate clusters per business line, optionally dedicated deployments for critical services.

Request‑level thread‑pool isolation – fast and slow pools; APIs whose processing time exceeds a configurable threshold are routed to the slow pool to protect the fast pool.
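Request‑level isolation can be sketched as a dispatcher that tracks per‑API latency and picks a pool accordingly. The article does not say how Shepherd measures processing time, so the moving average, the pool sizes, and the name `IsolatingDispatcher` are assumptions for illustration.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class IsolatingDispatcher {
    private final ExecutorService fastPool = Executors.newFixedThreadPool(8);
    private final ExecutorService slowPool = Executors.newFixedThreadPool(2);
    private final Map<String, Double> avgLatencyMs = new ConcurrentHashMap<>();
    private final double slowThresholdMs;

    IsolatingDispatcher(double slowThresholdMs) { this.slowThresholdMs = slowThresholdMs; }

    // Exponentially weighted moving average of observed latency per API
    // (assumed metric; the first sample is stored as-is).
    void record(String api, double latencyMs) {
        avgLatencyMs.merge(api, latencyMs, (prev, latest) -> 0.8 * prev + 0.2 * latest);
    }

    // APIs with no samples yet are treated as fast.
    boolean isSlow(String api) {
        return avgLatencyMs.getOrDefault(api, 0.0) > slowThresholdMs;
    }

    ExecutorService poolFor(String api) { return isSlow(api) ? slowPool : fastPool; }

    Future<?> submit(String api, Runnable task) { return poolFor(api).submit(task); }

    void shutdown() {
        fastPool.shutdown();
        slowPool.shutdown();
    }
}
```

The point of the split is containment: if a slow backend saturates the slow pool, requests for well‑behaved APIs still find free threads in the fast pool.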

Figure: Thread pool isolation

Stability mechanisms

Shepherd provides a set of safeguards:

Multi‑dimensional flow control (UUID, app, IP, cluster).

Request caching for idempotent, read‑heavy APIs.

Per‑API timeout management with fast‑fail handling.

A circuit breaker that trips when the failure rate crosses a configurable threshold and returns fallback values.
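As one example of these safeguards, a failure‑rate circuit breaker can be sketched as below. It is a simplified model under stated assumptions: it counts calls since the last reset rather than using a sliding window, it has no half‑open probing state, and the names `CircuitBreaker`, `minCalls`, and `openMillis` are hypothetical. The clock is injected so the behavior can be exercised without sleeping.

```java
import java.util.function.LongSupplier;
import java.util.function.Supplier;

class CircuitBreaker {
    private final int minCalls;                 // calls needed before the rate is evaluated
    private final double failureRateThreshold;  // e.g. 0.5 for 50 %
    private final long openMillis;              // how long to stay open before retrying
    private final LongSupplier clock;           // current time in milliseconds
    private int calls;
    private int failures;
    private long openedAt = -1;                 // -1 means closed

    CircuitBreaker(int minCalls, double failureRateThreshold, long openMillis, LongSupplier clock) {
        this.minCalls = minCalls;
        this.failureRateThreshold = failureRateThreshold;
        this.openMillis = openMillis;
        this.clock = clock;
    }

    synchronized boolean isOpen() {
        if (openedAt < 0) return false;
        if (clock.getAsLong() - openedAt >= openMillis) {   // cool-down elapsed: close and start over
            openedAt = -1;
            calls = 0;
            failures = 0;
            return false;
        }
        return true;
    }

    // Runs the action unless the breaker is open; failures and open state yield the fallback.
    <T> T call(Supplier<T> action, Supplier<T> fallback) {
        if (isOpen()) return fallback.get();
        try {
            T result = action.get();
            record(false);
            return result;
        } catch (RuntimeException e) {
            record(true);
            return fallback.get();
        }
    }

    private synchronized void record(boolean failed) {
        calls++;
        if (failed) failures++;
        if (calls >= minCalls && (double) failures / calls >= failureRateThreshold) {
            openedAt = clock.getAsLong();
        }
    }
}
```

While the breaker is open the backend is not called at all, which gives a struggling service time to recover and keeps gateway threads from piling up behind it.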

Figure: Stability features

Gray release support

Traffic can be split at the API level, downstream service level, or both. Strategies include percentage‑based rollout and condition‑based routing.
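Both strategies can be combined in a small decision function. The sketch assumes that a matching condition takes precedence over the percentage rule and that percentage buckets are derived from a hash of the user ID so a given user consistently sees the same version; neither detail is stated in the article, and `GrayRouter` is a hypothetical name.

```java
import java.util.Map;
import java.util.function.Predicate;

class GrayRouter {
    private final int grayPercent;                           // 0..100
    private final Predicate<Map<String, String>> condition;  // e.g. a header or attribute match

    GrayRouter(int grayPercent, Predicate<Map<String, String>> condition) {
        if (grayPercent < 0 || grayPercent > 100) {
            throw new IllegalArgumentException("grayPercent must be within 0..100");
        }
        this.grayPercent = grayPercent;
        this.condition = condition;
    }

    // Condition match wins; otherwise a stable hash of the user ID picks a
    // bucket in 0..99, so the same user keeps hitting the same version.
    String choose(String userId, Map<String, String> attributes) {
        if (condition.test(attributes)) return "gray";
        int bucket = Math.floorMod(userId.hashCode(), 100);
        return bucket < grayPercent ? "gray" : "stable";
    }
}
```

The same function could be applied at the API level, at the downstream service level, or at both, by evaluating it once per level with that level's own configuration.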

Figure: Gray release

Usability features

Automatic DSL generation

Developers provide service interface information (AppKey, service name, method). Shepherd fetches the latest JSON Schema from the service framework, generates mock data, and produces a DSL with parameter mappings, eliminating manual DSL authoring.

Figure: DSL generation

API creation acceleration

Quick‑create wizards support four API types (RPC, HTTP, SSO callback, Nest). Batch operations allow bulk updates of common configurations (filters, error codes, CORS). Import/export tools enable moving API definitions between environments.

Extensibility

Custom components

Through a plugin model, developers can load custom components (e.g., signature verification, custom result handling). The component implements getName() and invoke() methods and is registered via SPI.
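A plugin contract of this kind might look like the following. Only the method names `getName()` and `invoke()` come from the article; the parameter and return types, the `ComponentRegistry` class, and the sample `SignatureCheckComponent` are assumptions. Discovery uses the standard Java `ServiceLoader`, which for classpath providers expects a public implementation class with a public no‑argument constructor, listed in a `META-INF/services` file named after the interface.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.ServiceLoader;

// Hypothetical plugin contract; only getName() and invoke() are named in the
// article, the signatures are assumed.
interface CustomComponent {
    String getName();
    Map<String, Object> invoke(Map<String, Object> context);
}

class SignatureCheckComponent implements CustomComponent {
    public String getName() { return "signatureCheck"; }

    public Map<String, Object> invoke(Map<String, Object> context) {
        Map<String, Object> out = new HashMap<>(context);
        // Placeholder check only; a real component would verify a cryptographic signature.
        out.put("verified", "expected-signature".equals(context.get("signature")));
        return out;
    }
}

class ComponentRegistry {
    private final Map<String, CustomComponent> components = new HashMap<>();

    // Discovers providers declared through the standard Java SPI mechanism.
    void loadFromSpi() {
        for (CustomComponent c : ServiceLoader.load(CustomComponent.class)) {
            register(c);
        }
    }

    void register(CustomComponent c) { components.put(c.getName(), c); }

    Map<String, Object> run(String name, Map<String, Object> context) {
        CustomComponent c = components.get(name);
        if (c == null) throw new IllegalArgumentException("unknown component: " + name);
        return c.invoke(context);
    }
}
```

Looking components up by the value of `getName()` is what would let an API definition refer to a custom component by name in its filter list without the gateway knowing the implementing class.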

Figure: Custom component example

Service orchestration

Shepherd integrates with the internal Pirate middleware to orchestrate multiple backend calls. Shepherd sends orchestration configuration to Pirate via RPC; Pirate executes the composed calls and returns aggregated results, keeping the gateway lightweight.

Figure: Service orchestration

Future roadmap

Cloud‑native migration

Shepherd will be migrated to Meituan’s Serverless platform Nest, and core functionality will be extracted into an SDK. This reduces the WAR package size, improves security, and enables elastic scaling.

Figure: Cloud‑native architecture

Static site hosting

Shepherd will offer a high‑availability static‑site hosting service with custom domain support, authentication, CI/CD integration and scalable storage.

Figure: Static site hosting

Component marketplace

A marketplace will allow teams to publish reusable custom components, fostering an ecosystem and avoiding duplicate development.

Figure: Component marketplace
Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contact admin@besthub.dev and we will review it promptly.