
Ant Group FaaS: Architecture, Performance Optimizations, and Security Practices

This article explains the concept of Function-as-a-Service (FaaS), why it emerged, and the scenarios it suits. It then details Ant Group's serverless architecture, performance‑tuning techniques, security isolation mechanisms, and developer experience, and closes with an outlook on combining FaaS with AI as a new programming paradigm.

AntTech

Overview – Function-as-a-Service (FaaS) is a cloud computing model that lets developers write and deploy functions without managing underlying infrastructure.

FaaS Rise – Traditional monolithic or micro‑service applications suffer from heavy middleware coupling, complex deployment, and costly capacity planning; FaaS addresses these issues by treating functions as the programming unit, reducing operational overhead and improving cost efficiency.

Typical Use Cases – Backend‑for‑frontend glue code, event‑driven workloads (e.g., video transcoding, file uploads), and middle‑platform services such as algorithm operators benefit from the short‑lived, isolated nature of functions.

Technical Challenges – Performance concerns (function call latency, cold‑start time, scaling latency) and security isolation (preventing container escape, protecting host resources) must be solved for production‑grade FaaS.

Ant FaaS Architecture – Built on three core principles: (1) traffic‑driven container creation rather than metric‑based scaling, (2) sub‑100 ms cold‑start targets, and (3) strong security isolation via runSC sandbox (NanoVisor). The system consists of a Function Gateway, Container Scheduling Engine, Function Runtime, and Function Container.
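The first principle, traffic‑driven container creation, can be illustrated with a minimal sketch. It assumes a simplified model in which each instance serves one request at a time; the class and method names are hypothetical and are not taken from Ant's implementation. The point is only that the arriving request itself triggers creation, instead of a metrics window crossing a threshold.

```python
from dataclasses import dataclass


@dataclass
class FunctionInstances:
    """Tracks instances of one function; all names here are hypothetical."""
    idle: int = 0
    busy: int = 0
    cold_starts: int = 0

    def on_request(self) -> str:
        """Traffic-driven: the arriving request itself triggers creation."""
        if self.idle > 0:
            self.idle -= 1
            self.busy += 1
            return "warm"
        # No idle instance: create one now rather than waiting for a
        # sampled metric (CPU, average QPS) to cross a scaling threshold.
        self.cold_starts += 1
        self.busy += 1
        return "cold"

    def on_response(self) -> None:
        """A finished request returns its instance to the idle set."""
        self.busy -= 1
        self.idle += 1


fn = FunctionInstances()
first = fn.on_request()    # no instance exists yet
fn.on_response()
second = fn.on_request()   # reuses the instance created above
third = fn.on_request()    # concurrent request, no idle instance left
```

Because every request that finds no idle instance pays the cold‑start cost directly, this model only works well when cold start is fast, which is why the second principle sets a sub‑100 ms target.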

Component Details – The Function Gateway forwards requests and triggers container scheduling; the Scheduler allocates Pods from a pre‑warmed pool; the Runtime implements OCI‑compatible fast start; the Container runs user code inside a secure sandbox.
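The request path through these four components might look roughly like the sketch below. It is a toy in‑process model under stated assumptions: the class interfaces, Pod names, and the `double` function are invented for illustration, and the "container" is just a Python callable rather than a sandboxed process.

```python
from collections import deque


class Scheduler:
    """Hands out Pods from a pre-warmed pool (hypothetical interface)."""

    def __init__(self, pod_names):
        self.pool = deque(pod_names)

    def allocate(self) -> str:
        if not self.pool:
            raise RuntimeError("pre-warmed pool exhausted")
        return self.pool.popleft()


class Runtime:
    """Stands in for the OCI-compatible runtime that starts a container."""

    def start(self, pod: str, handler):
        # A real runtime would set up a sandboxed container in the Pod;
        # here the "container" is just the handler bound to the Pod name.
        return lambda event: {"pod": pod, "result": handler(event)}


class Gateway:
    """Forwards requests; triggers scheduling when no container exists."""

    def __init__(self, scheduler, runtime):
        self.scheduler = scheduler
        self.runtime = runtime
        self.handlers = {}     # function name -> user code
        self.containers = {}   # function name -> running container

    def register(self, name, handler):
        self.handlers[name] = handler

    def invoke(self, name, event):
        if name not in self.containers:
            pod = self.scheduler.allocate()
            self.containers[name] = self.runtime.start(pod, self.handlers[name])
        return self.containers[name](event)


gateway = Gateway(Scheduler(["pod-a", "pod-b"]), Runtime())
gateway.register("double", lambda event: event["n"] * 2)
reply1 = gateway.invoke("double", {"n": 21})   # schedules a Pod, then runs
reply2 = gateway.invoke("double", {"n": 5})    # reuses the same container
```

Only the first invocation touches the scheduler; the second is forwarded straight to the existing container, so one Pod remains in the pool.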

Performance Optimizations – The gateway was migrated to run on Envoy, with Go integrated through cgo; the scheduler (HUSE) was redesigned for sub‑50 ms dispatch; warm‑pool overhead was eliminated; all non‑CPU resources are cached; a read‑only file system (ROFS) is used; and checkpoint‑restore replaces create‑start. With these changes, cold start takes roughly 90 ms.
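The idea behind checkpoint‑restore can be shown with an application‑level analogy. This is not how a container runtime checkpoints a process; it is a sketch that uses `pickle` to snapshot already‑initialized state, with a made‑up `create_and_start` function standing in for expensive initialization. It shows why restoring a snapshot avoids repeating the create‑start work.

```python
import pickle

# Counts how often the expensive path runs (dict so helpers can mutate it).
init_calls = {"count": 0}


def create_and_start() -> dict:
    """Create-start path: pay the full initialization cost every time."""
    init_calls["count"] += 1
    # Stand-in for loading a language runtime, dependencies, and config.
    return {"routes": {f"/api/{i}": i for i in range(1000)}, "ready": True}


# Checkpoint once, after initialization has finished.
checkpoint = pickle.dumps(create_and_start())


def restore() -> dict:
    """Restore path: rebuild state from the snapshot, skipping init."""
    return pickle.loads(checkpoint)


instance_a = restore()
instance_b = restore()
```

Both restored instances carry the initialized state, yet initialization ran only once. Each restore yields an independent copy, so one instance cannot mutate another's state.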

Security Capabilities – Each function runs in an isolated runSC sandbox with ACL‑controlled virtual NICs, eBPF‑based network filtering, and seccomp restrictions. NanoVisor provides process‑level isolation and a lightweight VMM, ensuring both vertical and horizontal security.
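The rule‑matching logic behind an ACL on a function's virtual NIC can be sketched in user space. The article describes eBPF‑based filtering, which runs in the kernel; the code below is only an illustration of first‑match, default‑deny evaluation, and the rule format and addresses are assumptions.

```python
import ipaddress

# Hypothetical per-function egress rules: first match wins, default deny.
# A port of None matches any destination port.
ACL = [
    ("deny", ipaddress.ip_network("10.1.0.0/16"), None),
    ("allow", ipaddress.ip_network("10.0.0.0/8"), 443),
]


def egress_allowed(dst_ip: str, dst_port: int, rules=ACL) -> bool:
    """Return True only if the first matching rule is an allow rule."""
    addr = ipaddress.ip_address(dst_ip)
    for action, network, port in rules:
        if addr in network and port in (None, dst_port):
            return action == "allow"
    return False  # nothing matched: deny by default


ok_internal = egress_allowed("10.2.3.4", 443)    # matches the allow rule
blocked_subnet = egress_allowed("10.1.3.4", 443)  # hits the deny rule first
blocked_port = egress_allowed("10.2.3.4", 80)    # no rule covers port 80
blocked_public = egress_allowed("8.8.8.8", 443)  # outside every network
```

Placing the narrower deny rule before the broader allow rule matters under first‑match semantics; reversing the order would let traffic to `10.1.0.0/16` through.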

Developer & Operations Experience – Function creation, coding, and deployment happen within seconds, eliminating traditional build‑packaging steps; the platform offers built‑in observability, monitoring, and alerting, delivering a true serverless, zero‑ops experience.

Conclusion & Outlook – Ant FaaS delivers a lower memory footprint, faster startup, strong isolation, and a seamless developer workflow. Future directions include sub‑10 ms cold starts via fork technology and the use of AI‑generated content (AIGC) for code generation to further boost development efficiency, pointing toward a new era of FaaS+AI programming.

Extension to Alipay Cloud Development – The mature FaaS stack powers Alipay’s cloud development product. Developers are invited to try the service through the official website and community channels.
