
Secure Data Hub: Alibaba's Marketing Privacy Computing Platform

Alibaba’s Secure Data Hub (SDH) is a privacy‑preserving data clean‑room platform that uses secure multi‑party computation and privacy‑enhancing machine learning to let advertisers, ad platforms, and auditors jointly analyze marketing data via a simple SQL API while keeping raw data encrypted, column‑level protected, and confined to each party’s private domain.

Alimama Tech

As personal data protection regulations tighten worldwide, data security and user privacy have become critical in online advertising. Secure Data Hub (SDH) is Alimama's Data Clean Room solution; it enables privacy-preserving data fusion, computation, and joint modeling for advertisers, ad platforms, and third-party auditors.

SDH leverages Secure Multi‑Party Computation (MPC) and Privacy‑Preserving Machine Learning (PPML) to allow data to be used safely throughout the advertising lifecycle—tracking, collection, activation, and measurement—while keeping raw data within each party’s private domain.

Core capabilities include:

Data usable but invisible: MPC metadata management enforces column-level privacy protection, and network traffic is encrypted (RSA, among other schemes).

Simple SQL API: hides distributed execution and cryptographic details, lowering development and ops cost.

General marketing analysis components: reusable modules for audience insight, attribution, and effectiveness measurement.

Lightweight cloud deployment: supports Alibaba Cloud, third‑party clouds, and private clouds.

The technical architecture consists of three layers: console management, an agent proxy, and a Flink-based compute engine. The console handles metadata, task scheduling, and permissions; the agent provides authentication and instance lifecycle APIs; the compute engine executes logical plans in a private environment and is itself split into driver, scheduler, engine, and storage layers.

The system implements a SQL interface that automatically performs legality checks and plan rewriting based on data availability and visibility rules. During plan rewriting, join operations are transformed into RemoteJoinProbe and RemoteJoinBuild nodes that compute encrypted join keys.

Example of a split‑and‑rewrite SQL task:

INSERT INTO result
SELECT a.id
FROM a JOIN b
ON a.id = b.id;
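
One way to picture the rewrite of the join above is as a small transformation on a logical plan. The sketch below is illustrative only: the `Scan` and `Join` classes, the `owner` field, the party names, and the choice of which side becomes the probe and which the build are assumptions made for this example, not SDH's actual plan representation. Only the node names RemoteJoinProbe and RemoteJoinBuild come from the text.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Scan:
    table: str
    owner: str  # hypothetical: the party whose private domain holds the table


@dataclass(frozen=True)
class Join:
    left: Scan
    right: Scan
    key: str


@dataclass(frozen=True)
class RemoteJoinProbe:
    source: Scan
    key: str
    peer: str  # party that runs the matching build side


@dataclass(frozen=True)
class RemoteJoinBuild:
    source: Scan
    key: str
    peer: str  # party that runs the matching probe side


def rewrite_join(join: Join) -> Tuple[RemoteJoinProbe, RemoteJoinBuild]:
    """Split a cross-party join into one sub-plan per party."""
    if join.left.owner == join.right.owner:
        raise ValueError("both inputs are local; no remote join is needed")
    # Assumption: the left input probes and the right input builds.
    probe = RemoteJoinProbe(join.left, join.key, peer=join.right.owner)
    build = RemoteJoinBuild(join.right, join.key, peer=join.left.owner)
    return probe, build


plan = Join(Scan("a", owner="advertiser"), Scan("b", owner="platform"), key="id")
probe, build = rewrite_join(plan)
print(probe)
print(build)
```

Each resulting node refers only to a table in its own party's domain, which matches the idea that the join key is exchanged in encrypted form rather than the rows themselves.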

For inequality conditions, SDH uses an expression execution engine that supports both plaintext and ciphertext operations. Example:

INSERT INTO result
SELECT a.id, a.time, a.value
FROM a JOIN b
ON a.id = b.id
AND a.value < b.value
AND 2 * a.value >= b.value;
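
To make the ciphertext side of such a predicate concrete, here is a minimal sketch using two-party additive secret sharing, which is one of the building blocks the article names. It assumes integer values, a toy modulus, and hypothetical helper names (`share`, `linear`, `join_predicate`). The linear parts `b.value - a.value` and `2 * a.value - b.value` are computed on shares locally. For brevity the differences are then opened to read their sign, which leaks them; a real engine would use a secure comparison protocol at that step.

```python
import secrets
from typing import Tuple

MODULUS = 2**61 - 1  # toy ring size, an assumption for this sketch


def share(value: int) -> Tuple[int, int]:
    """Split value into two additive shares modulo MODULUS."""
    r = secrets.randbelow(MODULUS)
    return r, (value - r) % MODULUS


def reconstruct(s0: int, s1: int) -> int:
    """Recombine two shares and map the result back to a signed integer."""
    v = (s0 + s1) % MODULUS
    return v - MODULUS if v > MODULUS // 2 else v


def linear(coeff_a: int, share_a: int, coeff_b: int, share_b: int) -> int:
    """Compute one party's share of coeff_a * a + coeff_b * b, with no communication."""
    return (coeff_a * share_a + coeff_b * share_b) % MODULUS


def join_predicate(a_value: int, b_value: int) -> bool:
    """Evaluate a.value < b.value AND 2 * a.value >= b.value on shared inputs."""
    a0, a1 = share(a_value)
    b0, b1 = share(b_value)
    # Party 0 holds (a0, b0) and party 1 holds (a1, b1); each works on its own shares.
    b_minus_a = reconstruct(linear(-1, a0, 1, b0), linear(-1, a1, 1, b1))
    two_a_minus_b = reconstruct(linear(2, a0, -1, b0), linear(2, a1, -1, b1))
    # Simplification: the differences are opened here to read their sign.
    return b_minus_a > 0 and two_a_minus_b >= 0


print(join_predicate(5, 8))   # 5 < 8 and 10 >= 8
print(join_predicate(5, 11))  # 5 < 11 but 10 < 11
```

Multiplying by the public constant 2 and adding or subtracting shares needs no interaction, which is why linear expressions are cheap in this setting.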

Supported secure operators include:

Join operators (Shuffle Hash Join with ECDH‑encrypted keys and PSI for set intersection).

Inequality operators evaluated by the expression engine.

Plain‑ciphertext arithmetic and logical operators (AND, OR, <, <=, ==, !=, >=, >, +, -, *, /) built on ECDH, secret sharing, and homomorphic encryption.
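
The join operator in the list above relies on Diffie-Hellman-style key blinding for set intersection. The sketch below shows the general idea under loud assumptions: it uses modular exponentiation over a prime field from the Python standard library instead of elliptic curves, the group parameters are toy choices, and the function names are hypothetical. It is not SDH's protocol and is not secure as written. The property it demonstrates is that blinding commutes: an ID blinded by A and then B equals the same ID blinded by B and then A.

```python
import hashlib
import secrets
from typing import Iterable, List

P = 2**521 - 1  # a Mersenne prime; toy group for illustration only


def hash_to_group(item: str) -> int:
    """Hash an ID to a group element (squared to land in the quadratic residues)."""
    digest = hashlib.sha256(item.encode("utf-8")).digest()
    return pow(int.from_bytes(digest, "big") % P, 2, P)


def new_secret() -> int:
    """Pick a random private exponent in [2, P - 2]."""
    return secrets.randbelow(P - 3) + 2


def blind(items: Iterable[str], secret: int) -> List[int]:
    """First blinding pass: H(item) ** secret mod P."""
    return [pow(hash_to_group(x), secret, P) for x in items]


def reblind(blinded: Iterable[int], secret: int) -> List[int]:
    """Second blinding pass applied to the other party's blinded values."""
    return [pow(v, secret, P) for v in blinded]


def intersect(a_ids: List[str], b_ids: List[str]) -> List[str]:
    """Simulate both parties in one process; returns A's IDs that B also holds."""
    ka, kb = new_secret(), new_secret()
    a_once = blind(a_ids, ka)           # A sends to B
    b_once = blind(b_ids, kb)           # B sends to A
    a_twice = reblind(a_once, kb)       # B returns these in A's original order
    b_twice = set(reblind(b_once, ka))  # A computes these locally
    return [x for x, v in zip(a_ids, a_twice) if v in b_twice]


print(intersect(["u1", "u2", "u3"], ["u3", "u4", "u1"]))
```

Neither party sees the other's raw IDs, only blinded values, and only the double-blinded values of shared IDs coincide.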

Privacy protection covers metadata (table‑level access control), field‑level visibility (column‑level policies), and data protection (local data never leaves the private domain; network traffic is encrypted).
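
A column-level policy of this kind could feed the SQL legality check described earlier. The sketch below is a hypothetical model, not SDH's metadata schema: the three visibility levels, the `Query` shape, and the rule that unknown columns default to hidden are all assumptions chosen to show how a query might be rejected before any data is touched.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict, List, Tuple

Column = Tuple[str, str]  # (table, column)


class Visibility(Enum):
    PLAINTEXT = "plaintext"  # may appear in query results
    JOIN_ONLY = "join_only"  # may be used only as an encrypted join key
    HIDDEN = "hidden"        # may not be referenced at all


@dataclass
class Query:
    select: List[Column]
    join_keys: List[Column]


def check_legality(query: Query, policies: Dict[Column, Visibility]) -> List[str]:
    """Return a list of policy violations; an empty list means the query is legal."""
    violations = []
    for col in query.select:
        # Columns without a policy are treated as hidden (assumed default).
        if policies.get(col, Visibility.HIDDEN) is not Visibility.PLAINTEXT:
            violations.append(f"{col[0]}.{col[1]} cannot be output")
    for col in query.join_keys:
        if policies.get(col, Visibility.HIDDEN) is Visibility.HIDDEN:
            violations.append(f"{col[0]}.{col[1]} cannot be used as a join key")
    return violations


policies = {
    ("a", "id"): Visibility.PLAINTEXT,
    ("b", "id"): Visibility.JOIN_ONLY,
    ("b", "value"): Visibility.HIDDEN,
}
ok = Query(select=[("a", "id")], join_keys=[("a", "id"), ("b", "id")])
bad = Query(select=[("b", "value")], join_keys=[("a", "id"), ("b", "id")])
print(check_legality(ok, policies))
print(check_legality(bad, policies))
```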

Business applications such as UniDesk demonstrate how SDH uses MPC and federated learning to enable multi-party audience insight, joint modeling, and effectiveness measurement without exposing raw data.

In summary, SDH provides a privacy-enhanced big-data processing and machine-learning platform that supports secure, scalable joint analytics for marketing scenarios. It is accompanied by open-source federated learning solutions (EFLS), and its roadmap includes elastic deployment and higher-complexity joint statistics.
