
Vertical Federated Learning: Characteristics, Research Directions, and Performance Optimization

This article introduces federated learning, traces its evolution, compares horizontal and vertical federated learning, analyzes the unique computational traits of vertical FL, and presents practical performance‑optimization techniques such as offline computation, sparse‑data handling, communication compression, and homomorphic encryption integration.

DataFunSummit

Federated learning (FL) is a privacy‑preserving machine learning paradigm that allows multiple participants to jointly train a model while keeping their raw data locally.

The term was coined by Google around 2016, but its roots lie in earlier research on privacy‑preserving data mining, analysis, and machine learning.

FL is commonly categorized into horizontal FL, vertical FL, and federated transfer learning. Horizontal FL suits parties whose datasets share the same feature space but cover different samples, while vertical FL targets parties that share the same sample IDs but hold different feature dimensions.
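The difference between the two settings can be sketched with a toy matrix (illustrative shapes and party names are mine, not from the talk):

```python
import numpy as np

# Toy dataset: 4 samples (rows) x 4 features (columns), plus sample IDs.
sample_ids = np.array([101, 102, 103, 104])
X = np.arange(16).reshape(4, 4)

# Horizontal FL: parties hold DIFFERENT samples over the SAME feature space.
bank_a_rows, bank_b_rows = X[:2, :], X[2:, :]      # row-wise split

# Vertical FL: parties hold the SAME samples but DIFFERENT feature columns,
# aligned by sample ID (e.g. a telecom and a bank describing the same users).
telecom_cols, bank_cols = X[:, :2], X[:, 2:]       # column-wise split
```

Stacking the vertical shards column-wise (or the horizontal shards row-wise) recovers the full matrix, which is exactly what neither party is allowed to do in the clear.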

Vertical FL is widely needed in industries such as telecom, finance, and advertising, where combining complementary feature sets can improve risk‑control or recommendation models.

Research on vertical FL is relatively scarce; open challenges include achieving “lossless” accuracy (matching what centralized training would produce), the lack of provable security guarantees, and significant communication and computation overhead from extensive ciphertext operations.

Typical vertical FL algorithms include logistic regression and XGBoost. Both require heavy encrypted computations, frequent inter‑party communication, and large data transfer.

Performance‑optimization practices presented include:

Offline computation: pre‑compute expensive operations (e.g., Paillier exponentiation) offline to accelerate online training.
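To illustrate the idea, here is a toy Paillier sketch (deliberately tiny, insecure parameters; the names are mine, not the talk's implementation). The modular exponentiation r^n mod n² that randomizes each ciphertext does not depend on the message, so a pool of these obfuscators can be computed before training starts, leaving only one cheap multiplication on the online path:

```python
import math
import random

# Toy Paillier key (insecure size, for illustration; real keys are 2048+ bits).
p, q = 293, 433
n, n_sq = p * q, (p * q) ** 2

def make_obfuscator():
    """The expensive, message-independent part of encryption: r^n mod n^2."""
    r = random.randrange(2, n)
    while math.gcd(r, n) != 1:
        r = random.randrange(2, n)
    return pow(r, n, n_sq)

# Offline phase: pre-compute a pool of obfuscators before training starts.
pool = [make_obfuscator() for _ in range(64)]

def encrypt(m):
    # Online phase: with g = n + 1, g^m mod n^2 == 1 + n*m, so encrypting
    # costs just one multiplication once r^n has been pre-computed.
    return ((1 + n * m) * pool.pop()) % n_sq
```

Paillier's additive homomorphism is unaffected: multiplying two such ciphertexts mod n² still encrypts the sum of the plaintexts.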

Sparse data computation: exploit sparsity in high‑dimensional data with optimized sparse matrix multiplication and histogram techniques.
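A minimal sketch of the sparsity idea (plain Python, toy data of my own): when features are stored sparsely, the gradient accumulation X^T·d touches only non-zero entries, and in the encrypted vertical FL setting every skipped zero saves a costly ciphertext operation:

```python
# Each sample's features stored as {feature_index: value}; zeros are omitted.
samples = [
    {0: 1.0, 7: 2.0},
    {3: 4.0},
    {0: 0.5, 3: 1.0, 9: 3.0},
]
residuals = [0.2, -0.1, 0.4]
n_features = 10

def sparse_gradient(samples, residuals, n_features):
    """Accumulate X^T * residuals while touching only non-zero entries."""
    grad = [0.0] * n_features
    for row, d in zip(samples, residuals):
        for j, v in row.items():   # iterate stored entries only
            grad[j] += v * d
    return grad
```

The same skip-the-zeros principle underlies histogram-based tree building, where gradient statistics are accumulated only into the bins that non-zero feature values actually hit.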

Communication compression: pack multiple plaintexts or ciphertexts (e.g., Paillier packing) to reduce transmission volume.
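The packing idea can be sketched as follows (slot width and names are my assumptions): several small non-negative integers are laid into disjoint bit ranges of one big integer, so a single Paillier encryption and transmission carries all of them, and integer addition of two packed values adds slot-wise as long as no slot overflows:

```python
SLOT_BITS = 32  # fixed slot width; must leave headroom for accumulated sums

def pack(values, slot_bits=SLOT_BITS):
    """Pack small non-negative integers into one big integer, low slot first."""
    packed = 0
    for i, v in enumerate(values):
        assert 0 <= v < (1 << slot_bits), "value must fit its slot"
        packed |= v << (i * slot_bits)
    return packed

def unpack(packed, count, slot_bits=SLOT_BITS):
    """Recover the slot values from a packed integer."""
    mask = (1 << slot_bits) - 1
    return [(packed >> (i * slot_bits)) & mask for i in range(count)]
```

Because Paillier addition acts on the underlying integers, adding two packed ciphertexts adds every slot in one homomorphic operation.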

Fully homomorphic encryption (FHE): use SIMD slot packing and ciphertext bundling to batch many values into a single ciphertext and amortize the cost of encrypted calculations; lattice‑based FHE schemes additionally offer post‑quantum security.

Multi‑technology fusion: combine MPC primitives with machine‑learning operators and integrate local plaintext computation with HE/SS techniques, aiming for “no‑third‑party” solutions.
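One MPC primitive in such fusions can be sketched as additive secret sharing (toy modulus; a sketch under my own assumptions, not the talk's implementation). Each party holds a random-looking share, no single share reveals the secret, and shares of two secrets can be added locally so that reconstruction yields the sum, with no third party involved:

```python
import random

PRIME = 2**61 - 1  # toy modulus; real deployments size it to the value range

def share(secret, n_parties=2):
    """Split a value into additive shares that sum to it modulo PRIME."""
    shares = [random.randrange(PRIME) for _ in range(n_parties - 1)]
    shares.append((secret - sum(shares)) % PRIME)
    return shares

def reconstruct(shares):
    """Recombine shares; only the full set reveals the secret."""
    return sum(shares) % PRIME
```

Local additions on shares compose with HE-encrypted or plaintext computation, which is what makes mixing the primitives per operator attractive.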

These optimizations collectively address the three main computational characteristics of vertical FL—intensive ciphertext computation, extensive inter‑party communication, and large communication payloads—resulting in noticeable speed‑ups and reduced bandwidth usage.

The article concludes with a brief speaker bio and a note that the presentation is part of DataFunTalk’s ongoing series on privacy‑computing technologies.

Tags: Performance optimization, machine learning, Federated Learning, privacy computing, Vertical FL
Written by

DataFunSummit

Official account of the DataFun community, dedicated to sharing big data and AI industry summit news and speaker talks, with regular downloadable resource packs.
