Meituan Serverless Platform: Architecture, Practices, and Optimization
Meituan’s Nest Serverless platform, built on native Kubernetes with Knative‑inspired components, provides elastic scaling, reduced cold‑start latency, multi‑region high availability, and integrated developer tools. The reported results are higher resource utilization, lower costs, and roughly 40 % higher development efficiency across diverse business scenarios.
Serverless has become a hot topic in the industry, with major cloud providers and internet companies actively building Serverless products. This article shares Meituan's practical experience in building its Serverless platform, covering technology selection, detailed system design, stability optimization, ecosystem construction, and real‑world deployment results.
1. Background
The term “Serverless” was introduced in 2012 and gained wide recognition after AWS launched Lambda in 2014. Serverless abstracts away server management, allowing developers to focus solely on business logic while the platform handles provisioning, scaling, and fault tolerance.
Meituan started its Serverless effort in early 2019 under the internal project name “Nest”. The development has gone through three major stages: rapid MVP validation, core‑technology optimization for stability, and ecosystem enrichment.
2. Rapid MVP Validation
Key decisions included:
Technical selection: FaaS vs. BaaS vs. application‑level Serverless.
Infrastructure: native Kubernetes rather than the internal Hulk platform.
Programming language: Java (the most widely used language within Meituan).
The MVP delivered basic capabilities such as build, publish, elastic scaling, trigger integration, and function execution, and was quickly piloted in several business scenarios.
3. Architecture Design
Nest is built on native Kubernetes and incorporates ideas from Knative. Its core components are:
Event Gateway – receives traffic from external sources and routes it to function instances.
Elastic Scaling – calculates desired replica counts based on metrics and adjusts resources via Kubernetes.
Controller – implements custom resource definitions (CRDs) for managing functions.
Function Instance – runs user code inside a pod.
Governance Platform – provides UI/API for building, versioning, and publishing functions.
Figures in the original article illustrate the FaaS flow and the overall Nest architecture.
4. Process Design
The CI/CD lifecycle consists of four stages: Build (generate image or binary), Version (immutable artifact), Deploy (publish), and Scale (elastic adjustment). Unlike traditional deployments, Serverless hides machines from users; scaling and deployment happen automatically based on traffic.
5. Function Trigger & Execution
Triggers are handled through a multi‑layer routing mechanism (SET, lane, region) followed by version routing, which supports canary and blue‑green releases. Functions run inside Kubernetes pods; the platform ensures that internal Meituan services (OCTO, Cellar, DB, etc.) are available to the function code.
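The two routing steps can be illustrated with a minimal sketch. It assumes a hypothetical in-memory routing table keyed by (SET, lane, region) whose values are per-version traffic weights. The names `Request`, `ROUTES`, and `pick_version` are illustrative only and do not come from Nest.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Request:
    set_name: str  # SET label carried by the request
    lane: str
    region: str


# Hypothetical routing table: (SET, lane, region) -> {version: weight}.
ROUTES = {
    ("set-a", "main", "beijing"): {"v1": 90, "v2": 10},  # canary: about 10% to v2
    ("set-a", "main", "shanghai"): {"v2": 100},          # blue-green: fully cut over
}


def pick_version(req, routes=ROUTES, rng=random):
    """Resolve the (SET, lane, region) layer first, then choose a version by weight."""
    weights = routes.get((req.set_name, req.lane, req.region))
    if not weights:
        raise LookupError(f"no route for {req}")
    versions = list(weights)
    return rng.choices(versions, weights=[weights[v] for v in versions], k=1)[0]
```

In this sketch a canary release is a weight split between two versions, and a blue‑green switch is a single table update that moves all weight to the new version.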
6. Elastic Scaling
Scaling decisions consider three aspects: timing, algorithm, and speed. The algorithm uses the formula:
desired_instances = concurrent_requests / per_instance_threshold
Advanced features include scaling to zero, minimum‑instance reservation, and an activation buffer to handle cold‑start traffic.
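A short sketch of the formula follows. Rounding up to a whole instance and the optional `max_instances` cap are assumptions added for illustration; the source gives only the division. `min_instances` models minimum‑instance reservation, and leaving it at 0 allows scaling to zero.

```python
import math


def desired_instances(concurrent_requests, per_instance_threshold,
                      min_instances=0, max_instances=None):
    """Compute a replica count from current concurrency.

    Rounding up and the max_instances cap are assumptions, not from the source.
    """
    if per_instance_threshold <= 0:
        raise ValueError("per_instance_threshold must be positive")
    if concurrent_requests < 0:
        raise ValueError("concurrent_requests must be non-negative")
    raw = math.ceil(concurrent_requests / per_instance_threshold)
    count = max(raw, min_instances)  # minimum-instance reservation
    if max_instances is not None:
        count = min(count, max_instances)
    return count
```

For example, 25 concurrent requests with a threshold of 10 per instance yield 3 instances, and zero traffic with no reservation yields 0.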
Optimizations to reduce frequent scaling and improve responsiveness include sliding‑window smoothing, delayed shrinkage, and QPS‑based policies.
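Sliding‑window smoothing and delayed shrinkage can be sketched as below. The window length, the shrink delay counted in observation ticks, and the class name `SmoothedScaler` are illustrative assumptions; the source does not give the actual parameters or policy details.

```python
import math
from collections import deque


class SmoothedScaler:
    """Scale up immediately, but shrink only after sustained low load."""

    def __init__(self, per_instance_threshold, window=6, shrink_delay=3):
        self.threshold = per_instance_threshold
        self.samples = deque(maxlen=window)  # sliding window of concurrency samples
        self.shrink_delay = shrink_delay     # ticks the target must stay below current
        self.current = 0
        self._below = 0

    def observe(self, concurrent_requests):
        self.samples.append(concurrent_requests)
        average = sum(self.samples) / len(self.samples)
        target = math.ceil(average / self.threshold)
        if target > self.current:
            self.current = target            # scale out right away
            self._below = 0
        elif target < self.current:
            self._below += 1
            if self._below >= self.shrink_delay:
                self.current = target        # shrink only after the delay
                self._below = 0
        else:
            self._below = 0
        return self.current
```

Averaging over the window damps short spikes, and the delay keeps a brief dip from releasing instances that would have to be cold‑started again moments later.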
7. Core‑Technology Optimizations
Three phases of cold‑start reduction were implemented:
Image startup optimization – reduced container I/O and agent overhead, cutting startup from 42 s to ~12 s.
Resource‑pool optimization – cached pre‑warmed instances, lowering startup to ~3 s.
Critical‑path optimization – parallel download/decompression (LZ4, Zstd) and pre‑loading of common libraries, achieving end‑to‑end startup of ~2 s (platform side) and sub‑second for user code.
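The resource‑pool phase can be pictured with a minimal sketch of a pre‑warmed pool. It assumes a hypothetical `start_instance` callable that performs the slow startup work; `WarmPool` and its methods are illustrative names, not Nest APIs.

```python
from collections import deque


class WarmPool:
    """Keep a few already-started instances so requests can skip startup."""

    def __init__(self, size, start_instance):
        self._size = size
        self._start = start_instance  # hypothetical: performs the slow cold start
        self._idle = deque(start_instance() for _ in range(size))

    def acquire(self):
        """Return (instance, was_warm)."""
        if self._idle:
            return self._idle.popleft(), True
        return self._start(), False   # pool exhausted: pay the cold-start cost

    def refill(self):
        """Top the pool back up, for example from a background task."""
        while len(self._idle) < self._size:
            self._idle.append(self._start())
```

The startup cost is still paid, but it is paid ahead of time rather than on the request path, which is why a pool of this kind can shorten the observed startup time.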
8. High‑Availability Guarantees
Both platform and user functions are protected:
Architecture layer – master‑slave design for stateful services, multi‑region and multi‑cluster isolation.
Service layer – event gateway includes rate‑limiting and async processing.
Monitoring layer – comprehensive metrics, alerting, and fault‑injection drills.
Business layer – automatic instance isolation, multi‑zone disaster recovery, and real‑time health checks.
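As an illustration of rate limiting at the event gateway, here is a minimal token‑bucket sketch. The token bucket is an assumed technique chosen for the example; the source does not say which algorithm Nest uses, and `TokenBucket` is a hypothetical name.

```python
import time


class TokenBucket:
    """Allow bursts up to `capacity`, refilled at `rate` tokens per second."""

    def __init__(self, rate, capacity, clock=time.monotonic):
        self.rate = rate
        self.capacity = capacity
        self._clock = clock           # injectable so the limiter can be tested
        self._tokens = float(capacity)
        self._last = clock()

    def allow(self):
        now = self._clock()
        elapsed = now - self._last
        self._last = now
        # Refill according to elapsed time, never beyond capacity.
        self._tokens = min(self.capacity, self._tokens + elapsed * self.rate)
        if self._tokens >= 1:
            self._tokens -= 1
            return True
        return False
```

A gateway could keep one bucket per function and reject or queue a request when `allow()` returns False, which protects function instances from sudden bursts.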
9. Container Stability
To avoid resource contention during scaling, Meituan introduced lightweight containers with sidecar agents, separating management tasks from user processes and trimming unnecessary agents.
10. Ecosystem Enrichment
Developer tools (CLI, WebIDE) accelerate the entire lifecycle. Integration with existing pipelines, OpenAPI exposure, and support for merged deployments (multiple functions sharing a pod) improve resource utilization, especially for low‑frequency workloads.
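Merged deployment can be pictured as one runtime process hosting several function handlers and dispatching by name. The sketch below is an assumption about the general shape only; `MergedRuntime` and its methods are hypothetical and not part of Nest.

```python
class MergedRuntime:
    """Host several low-traffic functions in one process (one pod)."""

    def __init__(self):
        self._handlers = {}

    def register(self, name, handler):
        if name in self._handlers:
            raise ValueError(f"function {name!r} is already registered")
        self._handlers[name] = handler

    def invoke(self, name, event):
        try:
            handler = self._handlers[name]
        except KeyError:
            raise LookupError(f"function {name!r} is not deployed here") from None
        return handler(event)


runtime = MergedRuntime()
runtime.register("resize", lambda event: {"width": event["width"] // 2})
runtime.register("echo", lambda event: event)
```

Functions that are seldom called then share one pod's baseline memory and CPU instead of each keeping its own instance alive.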
11. Deployment Scenarios & Benefits
Nest is used across BFF, CSR/SSR, admin consoles, scheduled jobs, and data‑processing pipelines. Reported benefits include 40‑50 % higher resource utilization for high‑frequency services, significant cost reduction for low‑frequency functions, and roughly a 40 % boost in development efficiency.
12. Future Plans
Upcoming work focuses on scenario‑based templates, Serverless‑ification of traditional Java micro‑services (via ServiceMesh integration), further cold‑start reduction (AppCDS, GraalVM), richer developer tooling, tighter ecosystem integration, and continued container lightweighting.
Meituan Technology Team
Over 10,000 engineers powering China’s leading lifestyle services e‑commerce platform. Supporting hundreds of millions of consumers, millions of merchants across 2,000+ industries. This is the public channel for the tech teams behind Meituan, Dianping, Meituan Waimai, Meituan Select, and related services.
