Shared Machine Learning: Tackling Data Islands with Trusted Execution Environments and Multi‑Party Computation
The article explains how data islands and privacy concerns hinder AI development and describes Ant Financial's shared machine learning approach, which combines Trusted Execution Environments (TEE) and Multi‑Party Computation (MPC) to enable secure, privacy‑preserving data sharing and collaborative model training across organizations.
In the era of big data, the quality and quantity of data are crucial for machine learning models. However, fears of privacy leakage and data misuse discourage organizations from sharing data, creating isolated data islands that limit AI progress.
Two main technical routes address these challenges: Trusted Execution Environments (TEE) such as Intel SGX, AMD SEV, and ARM TrustZone, which provide hardware‑based secure enclaves, and Multi‑Party Computation (MPC), which relies on cryptographic techniques like garbled circuits, secret sharing, and homomorphic encryption.
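To make the MPC route concrete, here is a minimal sketch of additive secret sharing, one of the cryptographic primitives listed above. The field modulus and the three-party split are illustrative choices, not details from the article: each party holds one random-looking share, and parties can add their local shares to obtain shares of a sum without any plaintext exchange.

```python
import random

PRIME = 2**61 - 1  # illustrative field modulus for share arithmetic

def share(value, n_parties=3):
    """Split an integer into n additive shares that sum to value mod PRIME."""
    shares = [random.randrange(PRIME) for _ in range(n_parties - 1)]
    shares.append((value - sum(shares)) % PRIME)
    return shares

def reconstruct(shares):
    """Recover the secret by summing all shares mod PRIME."""
    return sum(shares) % PRIME

# Each party holds one share; no single share reveals the secret.
salary_shares = share(85000)
bonus_shares = share(15000)

# Parties add their local shares -> shares of the sum, no plaintext exchanged.
sum_shares = [(a + b) % PRIME for a, b in zip(salary_shares, bonus_shares)]
assert reconstruct(sum_shares) == 100000
```

Addition is "free" under additive sharing; secure multiplication requires extra machinery (e.g. Beaver triples), which is why real MPC frameworks layer protocols on top of this primitive.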
Ant Financial proposes a "Shared Machine Learning" paradigm that integrates both TEE and MPC to enable secure, privacy‑preserving collaborative learning across multiple parties, especially in the financial sector.
TEE‑based shared learning uses Intel SGX to create enclaves where encrypted data can be processed without exposing raw data. A cluster‑management framework registers enclaves, synchronizes keys via remote attestation, and provides load balancing, fault‑tolerance, and dynamic scaling for online prediction services.
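The key-synchronization step above can be illustrated with a pure-Python simulation. Everything here is hypothetical: an HMAC over the enclave's code measurement stands in for a real SGX quote, and a real deployment would verify quotes through Intel's attestation service rather than a shared secret. The point is the control flow: the key server releases the data key only to an enclave whose attestation verifies.

```python
import hashlib
import hmac
import os

# Hypothetical stand-in for SGX attestation: the attestation service and
# key server share a verification secret. Real remote attestation verifies
# a hardware-signed quote instead of an HMAC.
ATTESTATION_SECRET = os.urandom(32)

def make_quote(enclave_measurement):
    """Enclave side: produce a 'quote' binding its code measurement."""
    return hmac.new(ATTESTATION_SECRET, enclave_measurement, hashlib.sha256).digest()

def release_key(enclave_measurement, quote, key):
    """Key server: release the data key only if the quote verifies."""
    expected = hmac.new(ATTESTATION_SECRET, enclave_measurement, hashlib.sha256).digest()
    return key if hmac.compare_digest(expected, quote) else None

measurement = hashlib.sha256(b"trusted-model-code-v1").digest()
data_key = os.urandom(16)

# A correctly attested enclave receives the key; a tampered one does not.
assert release_key(measurement, make_quote(measurement), data_key) == data_key
assert release_key(hashlib.sha256(b"tampered").digest(),
                   make_quote(measurement), data_key) is None
```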
The framework supports algorithms such as LR, GBDT, and XGBoost, and can be extended to other models. It also works around the roughly 128 MB enclave page cache (EPC) limit of first-generation SGX through algorithmic optimizations and distributed processing.
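One common pattern for staying under an enclave memory budget is to stream the dataset through in fixed-size chunks so only a bounded working set is resident at once. The sketch below is a generic illustration of that idea, not the article's actual implementation; the function and parameter names are invented.

```python
def train_in_chunks(sample_stream, chunk_rows, init_state, update):
    """Stream samples through a memory-budgeted computation: at most
    chunk_rows rows are resident at once, keeping the working set small
    (relevant when the enclave page cache is limited, as on SGX1)."""
    state, chunk = init_state, []
    for row in sample_stream:
        chunk.append(row)
        if len(chunk) == chunk_rows:
            state = update(state, chunk)  # processed inside the enclave
            chunk = []
    if chunk:  # flush the final partial chunk
        state = update(state, chunk)
    return state

# Toy usage: a running sum over a feature, 1000 rows at a time.
rows = ({"x": i} for i in range(5000))
total = train_in_chunks(rows, 1000, 0,
                        lambda s, c: s + sum(r["x"] for r in c))
assert total == sum(range(5000))
```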
MPC‑based shared learning is organized into three layers: a security‑technology layer (secret sharing, homomorphic encryption, garbled circuits, differential privacy, etc.), a basic‑operator layer (secure intersection, matrix operations, sigmoid/ReLU, etc.), and a secure‑ML‑algorithm layer (implementations of LR, GBDT, GNN, etc.).
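As a feel for the basic-operator layer, here is a toy sketch of a secure-intersection-style operator: two parties blind their user IDs under a shared HMAC key and compare only the blinded values. This is illustrative only; production PSI protocols (e.g. Diffie–Hellman-based) avoid even the shared key assumed here, and the party names are invented.

```python
import hashlib
import hmac
import os

# Assumption for this toy: both parties somehow hold a common blinding key.
shared_key = os.urandom(32)

def blind(ids):
    """Map each ID to its keyed hash; only hashes are exchanged."""
    return {hmac.new(shared_key, i.encode(), hashlib.sha256).hexdigest(): i
            for i in ids}

bank_ids = blind(["u1", "u2", "u3"])
insurer_ids = blind(["u2", "u3", "u4"])

# Each side learns only the common IDs, not the other party's full set.
common = [bank_ids[h] for h in bank_ids.keys() & insurer_ids.keys()]
assert sorted(common) == ["u2", "u3"]
```

Secure intersection like this is typically the first step of vertical collaboration: parties align on shared users before jointly training on their respective feature columns.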
The MPC workflow encrypts data at its source, uploads the ciphertexts to cloud storage, and runs distributed training in which workers exchange only encrypted intermediate results under secure protocols, so plaintext data never leaves its owner's domain.
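The "exchange encrypted intermediate results" step can be sketched with a simple secure-aggregation scheme (the specific protocol here is an illustrative assumption, not the article's): each pair of workers agrees on a random mask, one adds it and the other subtracts it, so the masks cancel in the sum and the aggregator sees only the total.

```python
import random

M = 2**31  # illustrative modulus for masked arithmetic

def aggregate(grads):
    """Secure-aggregation sketch: pairwise masks cancel in the sum,
    so the server learns only the aggregate of the workers' values."""
    n = len(grads)
    # Each unordered pair (i, j) with i < j shares one random mask.
    masks = {(i, j): random.randrange(M)
             for i in range(n) for j in range(i + 1, n)}
    masked = []
    for i, g in enumerate(grads):
        m = g
        for j in range(n):
            if i < j:
                m = (m + masks[(i, j)]) % M  # lower index adds the mask
            elif j < i:
                m = (m - masks[(j, i)]) % M  # higher index subtracts it
        masked.append(m)  # each worker uploads only this masked value
    return sum(masked) % M  # masks cancel pairwise

# Three workers' toy gradient contributions; only 15 is revealed.
assert aggregate([3, 7, 5]) == 15
```

Individual masked uploads are uniformly random, yet their sum equals the true aggregate, which is exactly the property distributed training needs from its intermediate exchanges.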
Compared with federated learning, shared learning supports both TEE‑based centralized and MPC‑based decentralized approaches, accommodates heterogeneous participant roles, and covers a broader range of scenarios.
Future outlook emphasizes that while shared machine learning shows promise for secure AI in finance, many techniques remain immature, and ongoing research aims to improve performance, algorithm diversity, and real‑world deployment.
AntTech