Privacy-Preserving Federated Learning for Financial Risk Control Using Homomorphic Encryption
Tencent Shield‑Federated Computing lets banks jointly train Gradient Boosted Decision Trees and Logistic Regression models with external data owners. Homomorphic encryption protects the variable and split‑point searches, gradient aggregation, and model updates, delivering near‑centralized accuracy, speed gains of up to 70%, and full data confidentiality for financial risk control.
In the financial sector, banks and other institutions often need to collaborate with data‑rich parties (e.g., telecom operators, card networks) to build predictive models, but commercial and regulatory constraints prevent them from sharing raw data. This creates a large‑scale data‑sharing bottleneck.
Tencent Shield‑Federated Computing addresses this dilemma by reorganizing the computation processes of two widely used algorithms—Gradient Boosted Decision Trees (GBDT) and Logistic Regression (LR)—and applying homomorphic encryption to protect data during collaborative modeling.
The article uses the plot of the Hong Kong crime film "The Stool Pigeon" (线人) as an analogy: a detective (the bank) and an informant (the data provider) must cooperate without fully trusting each other. The informant possesses valuable clues (e.g., drug‑purchase amounts), while the detective holds labels (criminal vs. law‑abiding). Homomorphic encryption allows the informant to encrypt his clues, perform computations on the ciphertext, and return results that the detective can decrypt, thereby revealing aggregate statistics without exposing raw data.
The core technique relies on the property that the sum of encrypted values, when decrypted, equals the sum of the original plaintexts. This enables the calculation of group‑wise crime ratios without revealing individual records.
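The additive property can be illustrated with a toy Paillier scheme. The sketch below is not Tencent Shield's implementation: the tiny primes, the helper names (encrypt, decrypt, add), the sample data, and the choice to have the detective encrypt the labels under its own key are all assumptions made for illustration.

```python
import math
import random
from functools import reduce

# Toy Paillier key built from two small Mersenne primes. INSECURE, demo only.
P, Q = 2**31 - 1, 2**61 - 1
N = P * Q
N2 = N * N
LAM = (P - 1) * (Q - 1) // math.gcd(P - 1, Q - 1)  # private key: lcm(P-1, Q-1)
MU = pow(LAM, -1, N)  # valid because the generator is fixed to g = N + 1

def encrypt(m):
    """Encrypt a non-negative integer m < N under the public key N."""
    r = random.randrange(1, N)
    while math.gcd(r, N) != 1:
        r = random.randrange(1, N)
    return (1 + m * N) % N2 * pow(r, N, N2) % N2

def decrypt(c):
    """Recover the plaintext with the private key (LAM, MU)."""
    return (pow(c, LAM, N2) - 1) // N * MU % N

def add(c1, c2):
    """Multiplying two ciphertexts adds the underlying plaintexts."""
    return c1 * c2 % N2

# Detective: encrypts the 0/1 labels (1 = criminal) and sends only ciphertexts.
labels = [1, 0, 1, 1, 0, 0]
enc_labels = [encrypt(y) for y in labels]

# Informant: groups the ciphertexts by its own feature, never seeing a label.
amounts = [900, 50, 700, 40, 30, 800]
high = [c for c, a in zip(enc_labels, amounts) if a >= 500]
low = [c for c, a in zip(enc_labels, amounts) if a < 500]
enc_high = reduce(add, high, encrypt(0))
enc_low = reduce(add, low, encrypt(0))

# Detective: decrypts one sum per group; group sizes are sent in plaintext here.
ratio_high = decrypt(enc_high) / len(high)
ratio_low = decrypt(enc_low) / len(low)
print(ratio_high, ratio_low)
```

In this made-up data, two of the three high-spending individuals and one of the three low-spending individuals are labeled criminal. The detective learns only those two sums, and the informant sees only ciphertexts.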
Building on this primitive, the federated decision‑tree algorithm (federated GBDT) performs variable search and split‑point search in a privacy‑preserving manner. The process includes:
Variable search: testing each feature to find the one that best distinguishes criminals from law‑abiding individuals.
Split‑point search: determining the optimal threshold for each feature.
Greedy search: fixing the best variable and split at each tree level to reduce the combinatorial explosion.
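The three steps above can be sketched in plaintext, leaving encryption aside. This is a minimal illustration under assumptions: Gini impurity is used as the split criterion (the article does not name one), and the function names, feature names, and data are hypothetical.

```python
def gini(pos, total):
    """Gini impurity of a group with `pos` positives out of `total` samples."""
    if total == 0:
        return 0.0
    p = pos / total
    return 2 * p * (1 - p)

def best_split(features, labels):
    """Return (feature name, threshold, impurity reduction) of the best split."""
    n = len(labels)
    parent = gini(sum(labels), n)
    best = (None, None, 0.0)
    for name, values in features.items():        # variable search
        for t in sorted(set(values))[1:]:        # split-point search: x < t goes left
            left = [y for x, y in zip(values, labels) if x < t]
            right = [y for x, y in zip(values, labels) if x >= t]
            child = (len(left) * gini(sum(left), len(left))
                     + len(right) * gini(sum(right), len(right))) / n
            if parent - child > best[2]:
                best = (name, t, parent - child)
    return best

def build_tree(features, labels, depth=2):
    """Greedy search: fix the best split at this level, then recurse on each side."""
    name, t, _ = best_split(features, labels)
    if depth == 0 or name is None:
        return sum(labels) / len(labels)         # leaf: crime ratio of the group
    def subset(keep_left):
        idx = [i for i, x in enumerate(features[name]) if (x < t) == keep_left]
        return ({k: [v[i] for i in idx] for k, v in features.items()},
                [labels[i] for i in idx])
    return (name, t,
            build_tree(*subset(True), depth - 1),
            build_tree(*subset(False), depth - 1))

features = {
    "purchase_amount": [900, 50, 700, 40, 30, 800],
    "calls_per_day": [5, 6, 7, 5, 6, 7],
}
labels = [1, 0, 1, 0, 0, 1]
split = best_split(features, labels)
tree = build_tree(features, labels)
print(split)
print(tree)
```

Because each level commits to a single feature and threshold, the cost grows with the number of candidate splits per level rather than with the number of possible trees.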
To avoid excessive communication rounds, the system aggregates ciphertexts for all possible split points into a "homomorphic ciphertext gradient histogram," which the detective decrypts once to select the most discriminative feature and threshold.
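A rough sketch of such a histogram follows, reusing a toy Paillier scheme with insecure demo primes. Several things here are assumptions rather than details from the article: 0/1 labels stand in for real gradients (which would need fixed-point encoding), bin counts are sent in plaintext, the score is a simple difference in crime ratio, and all function names, bin edges, and data are hypothetical.

```python
import bisect
import math
import random

# Toy Paillier key built from two small Mersenne primes. INSECURE, demo only.
P, Q = 2**31 - 1, 2**61 - 1
N = P * Q
N2 = N * N
LAM = (P - 1) * (Q - 1) // math.gcd(P - 1, Q - 1)
MU = pow(LAM, -1, N)  # valid because the generator is fixed to g = N + 1

def encrypt(m):
    r = random.randrange(1, N)
    while math.gcd(r, N) != 1:
        r = random.randrange(1, N)
    return (1 + m * N) % N2 * pow(r, N, N2) % N2

def decrypt(c):
    return (pow(c, LAM, N2) - 1) // N * MU % N

def encrypted_histogram(enc_labels, values, edges):
    """Informant: one ciphertext sum and one sample count per bin of a feature."""
    sums = [encrypt(0) for _ in range(len(edges) + 1)]
    counts = [0] * (len(edges) + 1)
    for c, x in zip(enc_labels, values):
        b = bisect.bisect_right(edges, x)   # bin b holds edges[b-1] <= x < edges[b]
        sums[b] = sums[b] * c % N2          # homomorphic addition
        counts[b] += 1
    return sums, counts

def pick_split(histograms, edges_by_feature):
    """Detective: decrypt every bin once, then scan all thresholds locally."""
    best = (None, None, -1.0)
    for name, (sums, counts) in histograms.items():
        pos = [decrypt(c) for c in sums]
        total_pos, total = sum(pos), sum(counts)
        left_pos = left_n = 0
        for k, edge in enumerate(edges_by_feature[name]):
            left_pos += pos[k]
            left_n += counts[k]
            right_pos, right_n = total_pos - left_pos, total - left_n
            if left_n == 0 or right_n == 0:
                continue
            score = abs(left_pos / left_n - right_pos / right_n)
            if score > best[2]:
                best = (name, edge, score)
    return best

labels = [1, 0, 1, 0, 0, 1]                           # detective only
enc_labels = [encrypt(y) for y in labels]             # sent to the informant
features = {"purchase_amount": [900, 50, 700, 40, 30, 800],
            "calls_per_day": [5, 6, 7, 5, 6, 7]}      # informant only
edges = {"purchase_amount": [100, 500, 850], "calls_per_day": [6, 7]}
histograms = {name: encrypted_histogram(enc_labels, values, edges[name])
              for name, values in features.items()}   # one message back
best = pick_split(histograms, edges)
print(best)
```

The informant sends every feature's histogram in a single message, so the number of communication rounds no longer depends on how many candidate thresholds are evaluated.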
Multiple trees can be combined to form ensemble models such as federated Random Forests or Gradient Boosted Trees, achieving performance comparable to centralized training while preserving data confidentiality.
Federated Logistic Regression applies the same homomorphic property to gradient computation. The gradient (the product of feature values and prediction error) is kept encrypted, so the detective can send encrypted gradients to the informant, who updates the model without learning the underlying labels.
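One gradient step without a trusted third party might look like the sketch below. It is a simplified arrangement assumed for illustration, not the protocol described in the article: only the informant holds features, its linear scores are shared in plaintext, values use a fixed-point SCALE chosen arbitrarily, and a random additive mask hides the gradient from the detective during decryption. The Paillier primes are toy-sized and insecure, and all names are hypothetical.

```python
import math
import random

# Toy Paillier key built from two small Mersenne primes. INSECURE, demo only.
P, Q = 2**31 - 1, 2**61 - 1
N = P * Q
N2 = N * N
LAM = (P - 1) * (Q - 1) // math.gcd(P - 1, Q - 1)
MU = pow(LAM, -1, N)  # valid because the generator is fixed to g = N + 1
SCALE = 1000          # fixed-point precision (assumed)

def encrypt(m):
    """Encrypt an integer; negative values are represented modulo N."""
    r = random.randrange(1, N)
    while math.gcd(r, N) != 1:
        r = random.randrange(1, N)
    return (1 + (m % N) * N) % N2 * pow(r, N, N2) % N2

def decrypt(c):
    """Decrypt and map the upper half of the range back to negative integers."""
    m = (pow(c, LAM, N2) - 1) // N * MU % N
    return m - N if m > N // 2 else m

def encrypted_gradient(enc_residuals, X):
    """Informant: Enc(sum_i d_i * x_ij) for each feature j via ciphertext powers."""
    grads = []
    for j in range(len(X[0])):
        acc = encrypt(0)
        for c, row in zip(enc_residuals, X):
            k = round(row[j] * SCALE) % N       # plaintext scalar, possibly negative
            acc = acc * pow(c, k, N2) % N2      # Enc(d)^k = Enc(d * k)
        grads.append(acc)
    return grads

X = [[0.5, 1.2], [1.5, -0.3], [-1.0, 0.8], [2.0, 0.1]]   # informant's features
y = [0, 1, 0, 1]                                         # detective's labels
w = [0.0, 0.0]                                           # informant's weights

# Informant shares linear scores (a simplification); detective forms residuals.
scores = [sum(wj * xj for wj, xj in zip(w, row)) for row in X]
residuals = [1 / (1 + math.exp(-z)) - yi for z, yi in zip(scores, y)]
enc_residuals = [encrypt(round(d * SCALE)) for d in residuals]

# Informant: builds the encrypted gradient and masks it before asking for decryption.
enc_grad = encrypted_gradient(enc_residuals, X)
masks = [random.randrange(-10**9, 10**9) for _ in enc_grad]
masked = [g * encrypt(m) % N2 for g, m in zip(enc_grad, masks)]

# Detective: decrypts masked values only. Informant: removes masks and updates.
opened = [decrypt(c) for c in masked]
grad = [(v - m) / (SCALE * SCALE * len(X)) for v, m in zip(opened, masks)]
w = [wj - 0.1 * gj for wj, gj in zip(w, grad)]
print(grad, w)
```

In this arrangement the informant sees only the aggregated gradient, never an individual label or residual, and the detective sees only masked sums. The small mask range is for readability; a real protocol would need masks and encodings sized for its security and precision requirements.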
These techniques have been integrated into Tencent Shield’s suite, delivering up to 70% speed improvements for Paillier‑based homomorphic encryption and eliminating the need for a trusted third party in logistic regression. The solution enables secure, efficient collaboration between financial institutions and external data owners, facilitating risk control, fraud detection, and credit scoring without exposing sensitive customer information.
Future work will continue to enhance the bridge between data islands, expanding the applicability of privacy‑preserving federated learning across more financial use cases.
Tencent Cloud Developer