How Ant Financial Scales Payments with Distributed Microservices and Database Sharding

This article explains Ant Financial's practical implementation of a distributed architecture—including micro‑service migration, modular development, database vertical and horizontal sharding, high‑availability mechanisms, task‑scheduling platforms, gray‑release strategies, and full‑link stress testing—to achieve reliable, scalable payment processing.

21CTO

1. Advantages and Concepts of Distributed Architecture

Traditional monolithic applications are quick to develop and deploy at first, but as they grow they suffer from slow compilation, slow startup, and frequent code-merge conflicts, all of which make releases unreliable. As system complexity increases, the monolith becomes inefficient, prompting a shift to service-oriented designs.

2. Microservices vs. Monolith

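Once a system's complexity passes a certain scale, splitting the monolith into services improves productivity, because each service can be built, tested, and released on its own. Independent services also let the system keep evolving as business requirements change unpredictably.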

3. Modular Development

Starting from a top‑level business design, modules are separated across presentation, logic, and data layers, ensuring continuity and data integrity during the transition from monolith to services.

4. Load‑Balancing Advantages of Microservices

Gateways replace traditional LVS/F5 as the access layer, providing lightweight load balancing, protocol conversion, and authentication. Service‑governance frameworks (e.g., Dubbo) handle registration, discovery, and isolation.
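The registration and discovery role described above can be sketched in a few lines. This is a toy in-memory registry with round-robin selection, not Dubbo's API; the class name `ServiceRegistry`, its methods, and the addresses are all hypothetical.

```python
import itertools
from collections import defaultdict


class ServiceRegistry:
    """Toy in-memory registry: providers register, consumers discover."""

    def __init__(self):
        self._instances = defaultdict(list)  # service name -> list of addresses
        self._cursors = {}                   # service name -> round-robin counter

    def register(self, service, address):
        if address not in self._instances[service]:
            self._instances[service].append(address)

    def deregister(self, service, address):
        # Isolation: a failing instance is taken out of rotation.
        if address in self._instances[service]:
            self._instances[service].remove(address)

    def pick(self, service):
        """Return the next live address for a service, round-robin."""
        instances = self._instances.get(service)
        if not instances:
            raise LookupError(f"no live instance for {service!r}")
        counter = self._cursors.setdefault(service, itertools.count())
        return instances[next(counter) % len(instances)]
```

A real governance framework would add health checks, weights, and a replicated registry; the sketch only shows why callers no longer need a fixed hardware load balancer in front of every service.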

5. Solving Data‑Access Bottlenecks – Vertical Sharding

Databases are split by user, transaction, or accounting domains, reducing storage and access pressure; read/write separation can also be applied.
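As a rough illustration of domain-based splitting combined with read/write separation, the sketch below routes each request to a database chosen by business domain, sending writes to the primary and spreading reads across replicas. The endpoint names and the `route` function are assumptions made for the example, not details from Ant Financial's setup.

```python
import itertools

# Hypothetical mapping of business domain -> database endpoints.
DOMAIN_DATABASES = {
    "user": {"primary": "user-db-primary", "replicas": ["user-db-r1", "user-db-r2"]},
    "transaction": {"primary": "txn-db-primary", "replicas": ["txn-db-r1"]},
    "accounting": {"primary": "acct-db-primary", "replicas": []},
}

_replica_counters = {domain: itertools.count() for domain in DOMAIN_DATABASES}


def route(domain, is_write):
    """Pick the endpoint for a request in the given business domain."""
    config = DOMAIN_DATABASES[domain]
    replicas = config["replicas"]
    # Writes always go to the primary; so do reads when no replica exists.
    if is_write or not replicas:
        return config["primary"]
    # Reads rotate across the replicas of that domain only.
    return replicas[next(_replica_counters[domain]) % len(replicas)]
```

Because each domain has its own endpoints, load on the transaction database does not compete with user or accounting queries.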

6. Solving Data‑Access Bottlenecks – Horizontal Sharding

Large tables are partitioned by transaction time. Complex cross‑shard queries are handled via Elasticsearch, multi‑step ID‑based retrieval, or distributed databases such as OceanBase.
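Time-based partitioning can be sketched as a function from transaction date to table name. The monthly granularity, the `base_YYYYMM` naming scheme, and the `trade_order` table are assumptions for illustration; the article only states that tables are partitioned by transaction time.

```python
from datetime import date


def shard_table(base, txn_date):
    """Name of the monthly shard that holds a transaction on txn_date."""
    return f"{base}_{txn_date.year:04d}{txn_date.month:02d}"


def shards_for_range(base, start, end):
    """List every monthly shard that overlaps [start, end], oldest first."""
    if start > end:
        raise ValueError("start must not be after end")
    tables = []
    year, month = start.year, start.month
    while (year, month) <= (end.year, end.month):
        tables.append(f"{base}_{year:04d}{month:02d}")
        month += 1
        if month > 12:
            year, month = year + 1, 1
    return tables
```

A query bounded by time touches only the shards returned by `shards_for_range`. Queries that filter on other fields have no such bound, which is why they are pushed to a search index or a distributed database instead.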

7. Distributed TA System Practice

The traditional TA system suffers from serial clearing and low scalability. The distributed TA architecture introduces an access layer, business service layer, SOFAStack components, LAAS, and operational toolchains.

Key components include:

Access layer: protocol conversion, access control, file transfer, operations console.

Business service layer: core services like account, transaction, billing, clearing.

SOFAStack: open‑source microservice framework, distributed transaction, scheduling, messaging, data proxy, tracing.

Challenges addressed: efficient distributed clearing, fault tolerance, and correctness assurance.

8. Distributed Task Scheduling Platform

Supports custom sharding, pause/resume, forced cancellation, and retry mechanisms to ensure overall task success.
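The combination of sharding, resume, and retry can be sketched as follows. This is a single-process illustration under assumed names (`split_into_shards`, `run_job`) and an assumed round-robin sharding rule; it does not reflect the platform's actual interface, and forced cancellation is left out.

```python
def split_into_shards(items, shard_count):
    """Custom sharding rule (assumed): deal items round-robin across shards."""
    shards = [[] for _ in range(shard_count)]
    for index, item in enumerate(items):
        shards[index % shard_count].append(item)
    return shards


def run_job(shards, handler, max_attempts=3, done=None):
    """Run every shard not already in `done`, retrying failures per shard.

    Passing a previously returned `done` set back in resumes a paused or
    partly failed job without redoing finished shards.
    """
    done = set() if done is None else set(done)
    failed = {}
    for shard_id, shard in enumerate(shards):
        if shard_id in done:
            continue
        for attempt in range(1, max_attempts + 1):
            try:
                handler(shard_id, shard)
                done.add(shard_id)
                failed.pop(shard_id, None)
                break
            except Exception as error:  # sketch only: retry on any failure
                failed[shard_id] = f"attempt {attempt}: {error}"
    return done, failed
```

The job as a whole succeeds only when `failed` is empty, which mirrors the goal of retrying individual shards until the overall task completes.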

9. Gray Release Mechanism

Processes include beta release, group release, gray traffic, and full rollout. In clearing, gray releases can be applied per‑user shard to shorten rollout time.
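Per-shard gray release can be illustrated with a stable hash that assigns each user to a shard, plus a stage table that widens the set of shards receiving the new version. The shard count, the stage ranges, and the function names are assumptions; the article names the stages but gives no percentages.

```python
import zlib

SHARD_COUNT = 100  # assumed number of user shards

# Assumed stage -> shard ranges; each stage is a superset of the previous one.
ROLLOUT_STAGES = {
    "beta": range(0, 1),
    "group": range(0, 10),
    "gray": range(0, 50),
    "full": range(0, SHARD_COUNT),
}


def user_shard(user_id):
    # CRC32 is stable across processes, unlike Python's built-in hash() for str.
    return zlib.crc32(user_id.encode("utf-8")) % SHARD_COUNT


def pick_version(user_id, stage):
    """Return which code version serves this user at the given rollout stage."""
    return "new" if user_shard(user_id) in ROLLOUT_STAGES[stage] else "stable"
```

Because a user always maps to the same shard, a user who sees the new version in an early stage keeps seeing it in later stages, and a problem can be contained by rolling back only the affected shards.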

10. Full‑Link Stress Testing

Production‑environment shadow tables are used for load testing without affecting live data, providing reliable results and table‑level isolation.
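The table-level isolation can be sketched with SQLite standing in for the production database: requests flagged as stress-test traffic are redirected to a shadow table with the same schema. The `_shadow` suffix, the `payment` table, and the `is_stress_test` flag are hypothetical names chosen for the example.

```python
import sqlite3

SHADOW_SUFFIX = "_shadow"  # assumed naming convention for shadow tables


def resolve_table(table, is_stress_test):
    """Redirect stress-test traffic to the shadow copy of a table."""
    return table + SHADOW_SUFFIX if is_stress_test else table


def record_payment(conn, order_id, amount_cents, is_stress_test):
    table = resolve_table("payment", is_stress_test)
    conn.execute(
        f"INSERT INTO {table} (order_id, amount_cents) VALUES (?, ?)",
        (order_id, amount_cents),
    )


def count_rows(conn, table):
    return conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]


def make_demo_db():
    """In-memory database holding a live table and its shadow twin."""
    conn = sqlite3.connect(":memory:")
    for table in ("payment", "payment" + SHADOW_SUFFIX):
        conn.execute(
            f"CREATE TABLE {table} (order_id TEXT PRIMARY KEY, amount_cents INTEGER)"
        )
    return conn
```

In a full-link test the flag would have to travel with the request through every service and message queue, so that each hop applies the same redirection; the sketch covers only the final data-access step.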

11. OceanBase High‑Availability Mechanism

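OceanBase keeps three replicas of the data and synchronizes them with the Paxos protocol. This provides strong consistency, continuous availability, and automatic master-slave failover; depending on where the replicas are placed, the system can survive single-machine, data-center, or city-level failures.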

Deployment options include same‑city three‑data‑center, two‑city three‑center, and same‑city active‑active disaster‑recovery architectures.
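The availability of a three-replica deployment follows from majority-quorum arithmetic, shown below. This is not an implementation of Paxos or of OceanBase's replication, only the counting rule that majority-based protocols rely on; the function names are illustrative.

```python
def quorum_size(replicas):
    """Smallest number of replicas that forms a majority."""
    return replicas // 2 + 1


def tolerated_failures(replicas):
    """How many replicas can be lost while a majority still exists."""
    return replicas - quorum_size(replicas)


def is_committed(acks, replicas):
    """A write counts as committed once a majority has acknowledged it."""
    return acks >= quorum_size(replicas)
```

With three replicas the quorum is two, so one replica can fail without blocking writes. Whether that lost replica corresponds to a machine, a data center, or a city depends on how the replicas are distributed across sites.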

12. Summary

Ant Financial's technical middle‑platform demonstrates how distributed architecture, microservices, modular design, sharding strategies, robust scheduling, gray releases, and high‑availability databases collectively enable scalable, reliable payment processing.
