Boosting Ads Revenue: LFM4Ads’ Full‑Representation Multi‑Granular Transfer Raises GMV 2.45%

Tencent's LFM4Ads introduces a full‑representation, multi‑granular knowledge transfer framework that moves user, item, and cross representations from a large foundation model to downstream tasks, achieving up to 2.45% platform GMV uplift across more than ten advertising scenarios.

Tencent Advertising Technology

Industry Pain Points and Solution Idea

Current recommendation systems follow a "foundation‑expert" paradigm: a large foundation model is trained on massive data, and only its user representations are transferred to downstream expert models. This approach suffers from incomplete representation transfer, difficulty in migrating cross representations, and a single, limited pattern of downstream usage.

Incomplete transfer: only user representation (UR) is transferred, ignoring item representation (IR) and cross representation (CR).

Cross‑representation difficulty: CR links users and items at a fine granularity, making alignment with downstream samples hard.

Single downstream usage: upstream representations are merely added as extra features.

To overcome these limits, LFM4Ads proposes a full‑representation, multi‑granular transfer framework that moves UR, IR, and CR together and offers three downstream usage granularities.

Figure 1: Comparison with existing work

Model Design and Representation Extraction

LFM4Ads adopts a three‑tower architecture: a user tower extracts UR, an item tower extracts IR, and a mixing tower combines them. In the mixing tower, UR and IR interact, pass through an MLP and task head, and the intermediate MLP layer output becomes the cross representation (CR). Two branches handle content and ad samples separately; only the ad branch’s CR is used for ad recommendation.
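
As a concrete illustration, below is a minimal PyTorch-style sketch of such a three‑tower layout. The tower widths, the concatenation-based interaction, and the choice of which MLP layer's output is taken as the CR are illustrative assumptions, not the production design.

```python
import torch
import torch.nn as nn

class ThreeTowerSketch(nn.Module):
    """Illustrative three-tower layout: user tower -> UR, item tower -> IR,
    mixing tower -> CR (taken from an intermediate MLP layer) plus a task logit."""

    def __init__(self, user_dim: int, item_dim: int, hidden: int = 256, cr_dim: int = 128):
        super().__init__()
        self.user_tower = nn.Sequential(nn.Linear(user_dim, hidden), nn.ReLU(),
                                        nn.Linear(hidden, hidden))
        self.item_tower = nn.Sequential(nn.Linear(item_dim, hidden), nn.ReLU(),
                                        nn.Linear(hidden, hidden))
        # Mixing tower: UR and IR interact, then pass through an MLP.
        self.mix_mlp = nn.Sequential(nn.Linear(2 * hidden, hidden), nn.ReLU(),
                                     nn.Linear(hidden, cr_dim), nn.ReLU())
        self.task_head = nn.Linear(cr_dim, 1)  # e.g. a pCTR head

    def forward(self, user_feats, item_feats):
        ur = self.user_tower(user_feats)           # coarse-grained user representation
        ir = self.item_tower(item_feats)           # coarse-grained item representation
        interaction = torch.cat([ur, ir], dim=-1)  # simple concat interaction (assumption)
        cr = self.mix_mlp(interaction)             # intermediate MLP output used as CR
        logit = self.task_head(cr)
        return ur, ir, cr, logit
```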

During training, UR, IR, and CR are stored for downstream use. UR/IR are coarse‑grained, summarizing comprehensive user/item features, while CR is fine‑grained, capturing user‑item interactions. Transferring all three enriches downstream knowledge, improves performance, simplifies model design, and reduces inference cost.

Figure 2: Model architecture

Enhancing Transferability of Cross Representations

CR is natively a sample‑level representation tied to a specific user‑item pair, which makes alignment with downstream samples difficult and the number of CRs massive. LFM4Ads therefore aggregates CRs into user‑level and item‑level representations, reducing their count and enabling pre‑computation.

The aggregation uses a time‑aware exponential moving average algorithm, updating stored representations with a decaying weight based on the time elapsed since the last update, thus adapting to active and inactive users/items.
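
A minimal sketch of what such a time‑aware exponential moving average could look like is shown below; the half‑life parameterization and the in‑memory key/value store are assumptions for illustration, not the exact production formula.

```python
import math
import time

class TimeAwareEMAStore:
    """Aggregates sample-level CRs into a per-key (user or item) vector with a
    time-decayed exponential moving average: the stored state is down-weighted
    the longer the key has gone without an update."""

    def __init__(self, half_life_seconds: float = 86400.0):
        self.half_life = half_life_seconds  # assumed decay horizon (1 day)
        self.state = {}                     # key -> (vector, last_update_timestamp)

    def update(self, key, new_vec, now=None):
        now = time.time() if now is None else now
        if key not in self.state:
            self.state[key] = (list(new_vec), now)
            return self.state[key][0]
        old_vec, last_ts = self.state[key]
        # Decay weight shrinks with elapsed time since the last update.
        decay = math.exp(-math.log(2.0) * (now - last_ts) / self.half_life)
        merged = [decay * o + (1.0 - decay) * n for o, n in zip(old_vec, new_vec)]
        self.state[key] = (merged, now)
        return merged
```

Under this update rule an active user's stored vector changes smoothly across frequent updates, while a long‑inactive user's vector is largely replaced by the fresh CR, which is how the aggregation adapts to both active and inactive users/items.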

Multi‑Granular Downstream Usage

LFM4Ads defines three ways to exploit upstream representations in downstream tasks:

Feature‑level: Directly combine upstream representations with downstream features, using an adapter to bridge semantic gaps for CR.

Module‑level: Transfer the upstream interaction module and MLP as a homologous module in the downstream model, allowing joint fine‑tuning.

Model‑level: Compute cosine similarity between UR and IR as a lightweight recall model, optionally enhanced with an adapter and an InfoNCE loss (sketched after this list).
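
Below is a minimal PyTorch-style sketch of the model‑level usage: UR passes through a small adapter, cosine similarity against IR scores candidates, and in‑batch negatives drive an InfoNCE loss. The adapter shape, temperature, and in‑batch negative sampling are assumptions for illustration.

```python
import torch
import torch.nn.functional as F
from torch import nn

class CosineRecallWithAdapter(nn.Module):
    """Model-level usage sketch: adapted UR scored against IR by cosine
    similarity, trained with InfoNCE using in-batch negatives (assumed setup)."""

    def __init__(self, dim: int, temperature: float = 0.07):
        super().__init__()
        self.adapter = nn.Sequential(nn.Linear(dim, dim), nn.ReLU(), nn.Linear(dim, dim))
        self.temperature = temperature

    def forward(self, ur: torch.Tensor, ir: torch.Tensor) -> torch.Tensor:
        u = F.normalize(self.adapter(ur), dim=-1)  # adapted, unit-norm user vectors
        v = F.normalize(ir, dim=-1)                # unit-norm item vectors
        return u @ v.t() / self.temperature        # pairwise cosine-similarity logits

    def infonce_loss(self, ur: torch.Tensor, ir: torch.Tensor) -> torch.Tensor:
        logits = self.forward(ur, ir)              # [batch, batch] similarity matrix
        labels = torch.arange(logits.size(0), device=logits.device)
        return F.cross_entropy(logits, labels)     # diagonal (matched) pairs are positives
```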

Figure 3: Feature/Module/Model usage

Upstream‑Downstream Workflow

After deploying LFM4Ads, the workflow proceeds as follows:

When a user request and a candidate ad reach the downstream system, both the upstream and downstream models start inference.

The storage module supplies the latest UR, IR, and CR to downstream.

New upstream representations replace old ones in storage and are aggregated.

User feedback on ads is used as labels to update both upstream and downstream parameters.
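
The serving loop can be summarized in a short sketch; every name below (the storage, upstream, and downstream objects and their methods) is a placeholder for illustration, not a real API.

```python
def serve_request(user_id, ad_id, store, upstream, downstream):
    """Schematic version of the upstream-downstream serving loop."""
    # 1. The storage module supplies the latest pre-computed UR, IR, and CR.
    ur, ir, cr = store.lookup(user_id, ad_id)

    # 2. The downstream model scores the ad using its own features plus the
    #    transferred upstream representations.
    score = downstream.predict(user_id, ad_id, ur=ur, ir=ir, cr=cr)

    # 3. The upstream model also runs inference; its fresh representations
    #    replace the stored ones and are folded into the time-aware aggregates.
    new_ur, new_ir, new_cr = upstream.infer(user_id, ad_id)
    store.write_and_aggregate(user_id, ad_id, new_ur, new_ir, new_cr)

    return score
```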

Figure 4: Workflow diagram

Online Deployment Scale

Each day, hundreds of billions of samples (80% content, 20% ads) are collected, each containing roughly 1,800 features and 50 behavior sequences spanning months of history. The final LFM4Ads model is 4 TB in size, 48% larger than the largest downstream model, handling 63 billion sparse features with 1.45 billion FLOPs and 500K QPS.

Online Business Improvements

Since Q4 2024, LFM4Ads has been launched in over ten scenarios, raising platform GMV by 2.45%. Specific gains include:

Feature‑level: +0.42% pCTR in Moments, +2.53% pCVR in Moments, +0.70% recall in Video, +1.75% recall in Search, +0.76% pCTR in Ad Network, +0.93% pLTV in Internet Services.

Module‑level: Overall GMV increase of 1.88% across multiple pCTR/pCVR tasks.

Model‑level: In Video u2i recall A/B tests, CTR +1.83%, CTCTR +3.34%, dwell time +1.66%, fast‑scroll rate –0.36%.

Future Outlook

The team will continue scaling LFM4Ads, exploring larger models, more efficient transfer mechanisms, and cross‑modal data fusion, while collaborating openly with the industry to drive further commercial and societal impact.

Tags: representation learning, large-scale data, foundation model, knowledge transfer, ads recommendation