How Alibaba Scales Millions of Apps with Serverless: Inside CSE’s Architecture
This article explores how Alibaba's Cloud Service Engine (CSE) overcomes the limitations of AWS Lambda by enabling rapid, millisecond‑level startup for legacy online services, delivering elastic scaling, cost‑per‑request billing, and seamless migration to a serverless architecture.
Serverless touches many stages of the application lifecycle, including code management, testing, deployment, operations, and scaling. Alibaba’s Cloud Service Engine (CSE) is a middleware product designed to deliver the advantages of AWS Lambda while solving the problems that Lambda users run into.
What is Serverless?
AWS defines Serverless as a set of services that allow developers to run code without provisioning or managing servers, offering features like on-demand scaling and pay-per-request billing.
While AWS Lambda provides a complete Serverless solution, it does not address how existing applications can migrate to a Serverless architecture. Lambda is primarily suited to new applications written as Functions.
Value of Serverless to Cloud Computing
Serverless frees users from managing servers, eliminates capacity planning, enables virtually unlimited scaling by sharing a resource pool, and allows billing based on request count, execution time, and memory size.
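The three billing dimensions named above (request count, execution time, memory size) can be combined into a simple cost formula. The following is only a sketch: the class name and both unit prices are hypothetical placeholders chosen to keep the arithmetic readable, not actual CSE or AWS pricing.

```java
public class RequestBillingSketch {
    // Hypothetical unit prices (assumptions, not real CSE or AWS rates).
    static final double PRICE_PER_MILLION_REQUESTS = 0.20;
    static final double PRICE_PER_GB_SECOND = 0.00001;

    /**
     * cost = request charge + compute charge, where the compute charge is
     * billed in GB-seconds: requests * average seconds * memory in GB.
     */
    static double cost(long requests, double avgDurationMs, int memoryMb) {
        double requestCharge = requests / 1_000_000.0 * PRICE_PER_MILLION_REQUESTS;
        double gbSeconds = requests * (avgDurationMs / 1000.0) * (memoryMb / 1024.0);
        double computeCharge = gbSeconds * PRICE_PER_GB_SECOND;
        return requestCharge + computeCharge;
    }

    public static void main(String[] args) {
        // 5 million requests, 100 ms each, 512 MB of memory.
        System.out.println("estimated cost: " + cost(5_000_000, 100, 512));
    }
}
```

With these assumed prices, 5 million requests cost 1.00 in request charges plus 2.50 for 250,000 GB-seconds of compute. An idle service costs nothing, which is the property that distinguishes this model from paying for reserved machines.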
CSE aims to provide a scalable solution similar to AWS Lambda but capable of smoothly migrating existing applications.
Challenges for Existing Online Services
Typical characteristics of legacy online applications include resource allocation measured in minutes and startup times exceeding ten minutes. Reducing startup time to the millisecond level would enable a new distributed architecture.
How to Achieve Millisecond‑Level Startup?
Applications typically initialize many components during startup. Alibaba’s internal SOA and micro‑service environment often leads to startup times of ten minutes or more due to loading shared SDKs.
Solution 1: Cold‑Start Resource Compression
L1 achieves high-density deployment by pre-starting multiple instances on a single physical machine and then freezing their containers, so that CPU usage drops to zero and RAM usage falls to 1/20 of the original. Resuming a frozen instance provides millisecond-level elasticity.
L2 dumps the in‑memory state of a started application to disk, allowing rapid horizontal scaling by copying the snapshot to other machines, typically completing in about five seconds.
Combining L1 and L2 can start any application within 10‑50 ms under burst traffic.
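One way to picture how the two levels combine is as a fallback chain: use a frozen instance if one is available locally, otherwise restore from a snapshot, otherwise fall back to a full cold start. The sketch below models only that selection logic. All class, method, and enum names are hypothetical, and it does not freeze or restore real containers.

```java
import java.util.ArrayDeque;
import java.util.Deque;

public class TieredStartSketch {
    // Fastest to slowest: resume a frozen instance, restore a snapshot, full start.
    enum Tier { L1_FROZEN, L2_SNAPSHOT, COLD_START }

    // Pre-started, frozen instances held on this machine (L1).
    private final Deque<String> frozen = new ArrayDeque<>();
    // Whether an on-disk snapshot of the started application exists (L2).
    private final boolean snapshotAvailable;

    TieredStartSketch(int frozenCount, boolean snapshotAvailable) {
        for (int i = 0; i < frozenCount; i++) {
            frozen.push("frozen-" + i);
        }
        this.snapshotAvailable = snapshotAvailable;
    }

    /** Picks the fastest tier that can supply one more instance right now. */
    Tier acquire() {
        if (!frozen.isEmpty()) {
            frozen.pop(); // consume one frozen instance
            return Tier.L1_FROZEN;
        }
        if (snapshotAvailable) {
            return Tier.L2_SNAPSHOT;
        }
        return Tier.COLD_START;
    }

    public static void main(String[] args) {
        TieredStartSketch pool = new TieredStartSketch(2, true);
        for (int i = 0; i < 3; i++) {
            System.out.println(pool.acquire());
        }
    }
}
```

In this model the frozen instances absorb the first requests of a burst, and the snapshot path supplies further capacity once they are used up.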
Solution 2: Hot‑Copy Startup Acceleration
L1 uses a fork-based seed process (fork2) that can specify a PID, enabling fast instance creation.
pid_t fork2(pid_t pid);
L2 allows a single snapshot to spawn multiple processes, establishing a one-to-many relationship.
Comparison of the Two Approaches
Solution 1: No UUID issue, but requires language‑specific VM customization; slightly higher cost.
Solution 2: Language‑agnostic and lower cost, but may inherit a UUID problem where forked instances share the same static UUID.
Overall, Solution 1 is broader in applicability, while Solution 2 is better suited for FaaS and NBF scenarios.
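The UUID problem mentioned for Solution 2 can be illustrated with a small model. In the sketch below, `copy()` merely stands in for what a fork or snapshot does to in-memory state (it is not a real fork), and the `onRestore()` hook is a hypothetical mitigation, not a CSE API.

```java
import java.util.UUID;

public class ForkedUuidSketch {
    /** Stand-in for application state captured in a seed process or snapshot. */
    static final class InstanceState {
        // Assigned once at startup, as many applications do for an instance ID.
        String instanceId = UUID.randomUUID().toString();

        /** Models a fork: the child inherits the parent's memory, ID included. */
        InstanceState copy() {
            InstanceState child = new InstanceState();
            child.instanceId = this.instanceId;
            return child;
        }

        /** Hypothetical post-fork hook that regenerates per-instance identity. */
        void onRestore() {
            instanceId = UUID.randomUUID().toString();
        }
    }

    public static void main(String[] args) {
        InstanceState seed = new InstanceState();
        InstanceState a = seed.copy();
        InstanceState b = seed.copy();
        System.out.println("same ID after copy: " + a.instanceId.equals(b.instanceId));

        a.onRestore();
        b.onRestore();
        System.out.println("same ID after hook: " + a.instanceId.equals(b.instanceId));
    }
}
```

Any value that is computed once at startup and assumed unique (IDs, random seeds, cached host details) is affected in the same way, so an application started from a shared seed needs some point at which to refresh it.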
Comparison with AWS Lambda
Lambda requires applications to be developed as Functions and loads each Function dynamically. CSE, by contrast, shares the same instruction set across multiple instances and only loads the differences, making it compatible with mainstream stacks such as Spring Boot, PHP, Java, Python, and Node.js.
Theoretical Model
Serverless applications dynamically adjust the number of instances, allowing multiple services to share the same machine during off‑peak periods.
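The saving from sharing machines can be made concrete with a small calculation. The sketch below compares two sizing rules: dedicated capacity, where each service is provisioned for its own peak, and a shared pool, which only has to cover the highest combined demand at any one time. The demand figures are invented for illustration and the class name is hypothetical.

```java
public class SharedPoolSketch {
    /** Dedicated sizing: every service keeps enough machines for its own peak. */
    static int dedicatedMachines(int[][] demand) {
        int total = 0;
        for (int[] service : demand) {
            int peak = 0;
            for (int d : service) {
                peak = Math.max(peak, d);
            }
            total += peak;
        }
        return total;
    }

    /** Shared sizing: the pool covers the largest combined demand in any time slot. */
    static int sharedMachines(int[][] demand) {
        int slots = demand[0].length; // assumes every service has the same number of slots
        int peak = 0;
        for (int t = 0; t < slots; t++) {
            int sum = 0;
            for (int[] service : demand) {
                sum += service[t];
            }
            peak = Math.max(peak, sum);
        }
        return peak;
    }

    public static void main(String[] args) {
        // Rows are services, columns are time slots; values are machines needed.
        int[][] demand = {
            {8, 2, 1},
            {1, 7, 2},
            {2, 1, 9},
        };
        System.out.println("dedicated: " + dedicatedMachines(demand));
        System.out.println("shared:    " + sharedMachines(demand));
    }
}
```

For this invented data, dedicated sizing needs 8 + 7 + 9 = 24 machines, while the shared pool needs max(11, 10, 12) = 12. The gap depends entirely on the services peaking at different times; if all peaks coincide, the two numbers are equal.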
Quantitative Analysis
Serverless cost advantages can be combined with CPU‑share and mixed‑workload scheduling techniques to deliver superior overall cost efficiency.
CSE Code Samples
HSF Demo
package com.test.pandora.hsf;

import com.alibaba.boot.hsf.annotation.HSFProvider;

// Registers this class as the HSF provider for the HelloWorldService interface,
// which is assumed to be declared in the same package.
@HSFProvider(serviceInterface = HelloWorldService.class)
public class HelloWorldServiceImpl implements HelloWorldService {

    @Override
    public String sayHello(String name) {
        return "hello : " + name;
    }
}

Spring Boot Demo
package com.example.java.gettingstarted;

import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RestController;

@SpringBootApplication
@RestController
public class HelloworldApplication {

    // Root endpoint: returns a fixed greeting.
    @RequestMapping("/")
    public String home() {
        return "Hello World!";
    }

    // Health-check endpoint.
    @RequestMapping("/health")
    public String healthy() {
        // Message body required though ignored
        return "Still surviving.";
    }

    public static void main(String[] args) {
        SpringApplication.run(HelloworldApplication.class, args);
    }
}

CSE Production Practices
In an e‑commerce scenario, Serverless migration reduced the number of machines from 11 to 2, with elastic scaling handling traffic spikes from thousands to over a hundred thousand requests, then scaling back down after the peak.
Other e‑commerce services saw similar reductions in machine count and dynamic scaling between 1 and 4 instances.
Alibaba Cloud Developer
Alibaba's official tech channel, featuring all of its technology innovations.
