
Evolution of Service Mesh at Ant Financial: Design, Selection, and Implementation of SOFA Mesh

This article describes the decade‑long evolution of Ant Financial's service‑oriented architecture, the challenges posed by multi‑language and legacy systems, the evaluation of Istio, Linkerd, and Conduit, and the design and deployment of the in‑house SOFA Mesh, which uses a Golang sidecar and an EdgeSidecar to provide cloud‑native, highly available service mesh capabilities.

AntTech

Ant Financial has accumulated years of experience in service‑oriented architecture, supporting massive traffic peaks such as Double‑Eleven. In recent years, Service Mesh has become a hot topic, and the company needed a practical path to evolve from classic service‑oriented designs to a mesh‑based architecture.

The organization faced multi‑language integration issues (Java, Node.js, C++, Python) and high maintenance costs, because each middleware client had to be reimplemented and maintained per language. Service Mesh was seen as a way to move most of that functionality into a sidecar, reducing client duplication and improving stability.

When selecting a mesh framework, Ant Financial weighed three criteria: integration with the existing architecture, production‑grade stability, and performance. Istio offered a complete control plane and data plane, but its Mixer was a single point of failure and throughput was limited (≈1,700 TPS). Linkerd was mature, but it ran on the JVM with a heavy memory footprint (≈100 MB) and lacked a control plane. Conduit, written in Rust, was less mature and had low adoption.

Because none of the existing solutions fully met the requirements, Ant Financial decided to develop its own mesh, SOFA Mesh, reusing proven concepts from Istio (Pilot and Auth) while moving Mixer into the sidecar and implementing the sidecar in Golang to achieve a low memory footprint (≈11 MB) and better cloud‑native compatibility.
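To make the "Mixer in the sidecar" idea concrete, here is a minimal sketch of a policy check evaluated in‑process rather than through a remote call on every request. It is an illustration only: the source does not say which policies SOFA Mesh evaluates or how, the real sidecar is written in Golang, and the class name `LocalQuotaPolicy`, the fixed‑window quota rule, and the service names are all assumptions made for this example.

```python
import time
from dataclasses import dataclass, field
from typing import Callable, Dict, Tuple


@dataclass
class LocalQuotaPolicy:
    """Hypothetical fixed-window request quota checked inside the sidecar process."""

    limits: Dict[str, int]                       # assumed rule: max requests per window, per target service
    window_seconds: float = 1.0
    clock: Callable[[], float] = time.monotonic  # injectable so the check can be tested deterministically
    _counters: Dict[str, Tuple[int, int]] = field(default_factory=dict)

    def allow(self, service: str) -> bool:
        limit = self.limits.get(service)
        if limit is None:
            return True  # no rule configured for this service: let the call through
        window = int(self.clock() // self.window_seconds)
        seen_window, count = self._counters.get(service, (window, 0))
        if seen_window != window:
            count = 0  # a new window has started, so reset the counter
        if count >= limit:
            self._counters[service] = (window, count)
            return False  # quota exhausted for the current window
        self._counters[service] = (window, count + 1)
        return True


if __name__ == "__main__":
    policy = LocalQuotaPolicy(limits={"payment-service": 2})
    # The decision is a dictionary lookup in local memory, with no network hop.
    print([policy.allow("payment-service") for _ in range(3)])
```

The point of the sketch is the placement of the check, not the rule itself: because the decision is made from state held in the sidecar, a request does not depend on a separate policy service being reachable.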

SOFA Mesh adapts Pilot to work with the internal SOFARegistry service discovery, synchronizes only necessary data to reduce memory pressure, and introduces an EdgeSidecar role to handle cross‑environment service calls for isolated business units.
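The two ideas in this paragraph, pushing each sidecar only the data it needs and routing cross‑environment calls through an EdgeSidecar, can be sketched roughly as follows. This is not SOFA Mesh code and does not use the SOFARegistry or Pilot APIs: the `Endpoint` type, the `zone` label, the function names, and the addresses are hypothetical, chosen only to show the shape of the decision.

```python
from dataclasses import dataclass
from typing import Dict, List, Set


@dataclass(frozen=True)
class Endpoint:
    address: str
    zone: str  # hypothetical label for an isolated environment


def filter_for_sidecar(
    registry: Dict[str, List[Endpoint]], subscribed: Set[str]
) -> Dict[str, List[Endpoint]]:
    """Keep only the services this sidecar's application actually calls."""
    return {name: eps for name, eps in registry.items() if name in subscribed}


def next_hop(
    service: str, local_zone: str, view: Dict[str, List[Endpoint]], edge_sidecar: str
) -> str:
    """Prefer a direct endpoint in the local zone, else hand off to the EdgeSidecar."""
    endpoints = view.get(service, [])
    for ep in endpoints:
        if ep.zone == local_zone:
            return ep.address
    if endpoints:
        return edge_sidecar  # the service exists only in other zones
    raise LookupError(f"no endpoints known for {service!r}")


if __name__ == "__main__":
    registry = {
        "pay": [Endpoint("10.0.0.1:12200", "zone-a")],
        "risk": [Endpoint("10.1.0.7:12200", "zone-b")],
        "audit": [Endpoint("10.0.0.9:12200", "zone-a")],
    }
    view = filter_for_sidecar(registry, {"pay", "risk"})  # "audit" is never synced
    print(next_hop("pay", "zone-a", view, "edge.zone-a:12220"))
    print(next_hop("risk", "zone-a", view, "edge.zone-a:12220"))
```

Filtering before distribution keeps each sidecar's memory proportional to what its application depends on rather than to the size of the whole registry, and the EdgeSidecar hand‑off keeps ordinary sidecars unaware of how other environments are reached.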

SOFA Mesh has already been deployed to dozens of systems, where it addresses multi‑language communication and legacy system integration and decouples infrastructure teams from business teams. Future plans include open‑sourcing SOFA Mesh and exploring additional mesh concepts such as Message Mesh and DB Mesh.

The article also includes a Q&A section from the GIAC Global Internet Architecture Conference, covering topics like high‑availability, security, multi‑version routing, and the importance of a control plane.

Finally, a recruitment notice invites Rust and middleware engineers to join Ant Financial's efforts, and community links are provided for further discussion.

Tags: Cloud Native, architecture, microservices, service mesh, Ant Financial, SOFA Mesh