Practices of Building a Cloud‑Native PaaS Platform for Financial Scenarios
This article details Ant Financial's cloud‑native PaaS platform construction, covering financial‑industry challenges, Kubernetes‑based extensions such as CafeDeployment, fine‑grained release strategies, lossless traffic draining, and the open‑source OpenKruise UnitedDeployment that together enable stable, scalable innovation for banking services.
From February 19 to 26, Ant Financial hosted a digital classroom series titled “Fighting the Pandemic Together, Technological Breakthroughs,” inviting senior experts to share practical experiences on cloud‑native, R&D efficiency, and databases, and to discuss PaaS implementation in financial scenarios, mobile elastic architecture, and OceanBase 2.2 features.
Yu Renjie, a SOFAStack product expert, described how Ant Financial built its cloud‑native application PaaS platform, emphasizing its role in connecting cloud‑native architecture with the fast‑changing demands of financial services.
The talk highlighted the challenges of adopting cloud‑native technologies in finance. It traced the industry's progression from Cloud‑Based to Cloud‑Ready and finally to Cloud‑Native, and argued that containers, service mesh, and serverless are needed to decouple business logic from complex infrastructure.
While Kubernetes provides a solid orchestration foundation, it alone does not constitute a full PaaS; additional capabilities are required at the application layer to manage the entire lifecycle, especially in production‑grade financial environments.
Ant Financial introduced a “three‑axe” change‑control principle—gray release, monitoring, and emergency handling—and described how their PaaS implements fine‑grained release strategies such as group, beta, blue‑green, and canary deployments to meet stringent risk‑control requirements.
Because the native Kubernetes Deployment does not offer enough control for gray or canary releases, the team created a custom resource (CRD) called CafeDeployment. This extension adds topology awareness and a “Cell” deployment‑unit model, enabling high availability, disaster recovery, and precise placement of pods across zones.
The talk walked through a staged group release of ten pods across two availability zones: the pods are rolled out group by group, and each step pauses for verification before the next one begins, so every change stays controlled and observable.
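The staged release described above can be sketched in a few lines of Python. This is a minimal simulation, not CafeDeployment's actual algorithm: the names `Pod`, `plan_batches`, and `rollout`, the choice of three groups, and the rule that later groups absorb any remainder are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pod:
    name: str
    zone: str


def plan_batches(pods, batch_count):
    """Split pods into batch_count groups, spreading every zone across all groups."""
    by_zone = {}
    for pod in pods:
        by_zone.setdefault(pod.zone, []).append(pod)

    batches = [[] for _ in range(batch_count)]
    for zone_pods in by_zone.values():
        base, extra = divmod(len(zone_pods), batch_count)
        start = 0
        for i in range(batch_count):
            # Assumption: later groups absorb the remainder, so the first
            # group stays the smallest and limits the initial blast radius.
            size = base + (1 if i >= batch_count - extra else 0)
            batches[i].extend(zone_pods[start:start + size])
            start += size
    return batches


def rollout(batches, upgrade, verify):
    """Upgrade group by group; stop at the first group that fails verification.

    Returns (verified_pods, paused_group_number_or_None).
    """
    verified = []
    for number, batch in enumerate(batches, start=1):
        for pod in batch:
            upgrade(pod)
        if not verify(number, batch):
            return verified, number  # paused here, awaiting a decision
        verified.extend(batch)
    return verified, None


# Ten pods, five in each of two hypothetical zones, released in three groups.
pods = [Pod(f"pod-{zone}-{i}", zone) for zone in ("zone-a", "zone-b") for i in range(5)]
for number, batch in enumerate(plan_batches(pods, 3), start=1):
    print(number, [pod.name for pod in batch])
```

In this sketch the `verify` callback stands in for the manual or automated check performed at each pause; returning `False` leaves the remaining groups on the old version.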
For lossless traffic draining during upgrades, the platform relies on finalizers and a readiness gate (ReadinessGate): the pod is first marked unready, the associated services withdraw its traffic, the upgrade proceeds, and traffic is restored only once the new version passes its health checks.
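The ordering of those steps is what makes the upgrade lossless, and it can be illustrated with a small in-memory simulation. This sketch makes no Kubernetes API calls; `Pod`, `Service`, `gate_ready`, and `upgrade_without_loss` are hypothetical names, and the `sync` method only approximates how an endpoints controller reacts to a readiness-gate condition.

```python
from dataclasses import dataclass, field


@dataclass
class Pod:
    name: str
    version: str
    gate_ready: bool = True  # stands in for the readiness-gate condition


@dataclass
class Service:
    endpoints: set = field(default_factory=set)

    def sync(self, pods):
        """Route traffic only to pods whose readiness gate is open."""
        self.endpoints = {pod.name for pod in pods if pod.gate_ready}


def upgrade_without_loss(pod, pods, service, new_version, health_check):
    """Drain, upgrade, then restore traffic only if the health check passes."""
    events = []

    # 1. Close the gate and let the service withdraw traffic first.
    pod.gate_ready = False
    service.sync(pods)
    events.append(("drained", pod.name not in service.endpoints))

    # 2. Upgrade only after the pod no longer receives requests.
    pod.version = new_version
    events.append(("upgraded", pod.version))

    # 3. Reopen the gate only when the new version is healthy.
    if health_check(pod):
        pod.gate_ready = True
    service.sync(pods)
    events.append(("serving", pod.name in service.endpoints))
    return events


pods = [Pod("app-0", "v1"), Pod("app-1", "v1")]
service = Service()
service.sync(pods)
print(upgrade_without_loss(pods[0], pods, service, "v2", lambda p: p.version == "v2"))
```

If the health check fails, the pod stays out of the endpoint set, so a bad version never receives production traffic.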
The open‑source project OpenKruise, particularly its UnitedDeployment controller, serves as the community counterpart of CafeDeployment, offering multi‑unit pod management, replica preservation, and extended workload features beyond native Kubernetes capabilities.
Looking ahead, Ant Financial aims to commercialize and open these capabilities to more financial institutions, building multi‑cloud federation, unitized architectures, and hybrid‑cloud solutions that provide unified release control and disaster‑recovery across regions.
In summary, CAFE (Cloud Application Fabric Engine) is Ant Financial’s cloud‑native PaaS built on SOFAStack, tightly integrated with middleware, ServiceMesh, and Alibaba Cloud ACK, delivering comprehensive application lifecycle management, risk‑controlled change processes, and scalable hybrid‑cloud support for the financial sector.
AntTech
Technology is the core driver of Ant's future creation.