Exploring Large‑Model Applications in Advertising: The COPE and LEARN Frameworks at Kuaishou
This article reviews Kuaishou's two‑year exploration of large‑model and multimodal techniques for advertising, detailing the challenges of content‑domain ad estimation, the COPE unified product representation framework, and the LEARN LLM knowledge‑transfer approach, and reports the resulting business gains.
Over the past two years, Kuaishou has investigated large-model technologies in advertising scenarios. The article first outlines the challenges of content-domain ad estimation, such as user behavior scattered sparsely across short videos, live streams, and product pages, and the limitations of ID-centric pipelines, and then describes how multimodal and LLM techniques can strengthen the ad system.
The COPE (Content‑Unified Product Embedding) framework tackles cross-domain behavior in three steps: it introduces a stable SPU ID system that aggregates semantically similar items, builds a multimodal dataset spanning short video, live streaming, and product detail pages, and trains a visual encoder with a cross-domain contrastive loss. Later versions (COPE 1.1/Ampere) incorporate LLM-extracted high-signal text to further improve the multimodal representation.
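The cross-domain contrastive objective can be illustrated with a short sketch. This is not Kuaishou's published code; it assumes a symmetric InfoNCE-style loss where row i of each batch holds embeddings of the same SPU observed in two different domains (e.g., a short-video frame and a product-page image), with the rest of the batch serving as in-batch negatives:

```python
import numpy as np

def l2_normalize(x, eps=1e-8):
    return x / (np.linalg.norm(x, axis=-1, keepdims=True) + eps)

def cross_domain_infonce(video_emb, page_emb, temperature=0.07):
    """Symmetric InfoNCE sketch: row i of each matrix is the same SPU
    seen in two domains; other rows in the batch act as negatives."""
    v = l2_normalize(video_emb)
    p = l2_normalize(page_emb)
    logits = v @ p.T / temperature           # (B, B) similarity matrix
    labels = np.arange(len(v))               # positives on the diagonal

    def ce(lg):
        # numerically stable cross-entropy against the diagonal targets
        lg = lg - lg.max(axis=1, keepdims=True)
        logp = lg - np.log(np.exp(lg).sum(axis=1, keepdims=True))
        return -logp[labels, labels].mean()

    # average of video->page and page->video directions
    return 0.5 * (ce(logits) + ce(logits.T))
```

Aligned pairs from two domains pull together on the diagonal while unrelated SPUs push apart, which is what lets one encoder serve short-video, live, and product-page content.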
COPE's unified product embeddings feed both the recall and ranking stages: they enrich dense features, enable item-to-item retrieval, and extend user behavior sequences, which has delivered consistent online revenue gains.
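The item-to-item retrieval use can be sketched as nearest-neighbor search over the unified embeddings. This is a minimal brute-force illustration (a production system would use an approximate-nearest-neighbor index); the function names are hypothetical:

```python
import numpy as np

def build_i2i_index(item_embs):
    """Pre-normalize the catalog so cosine similarity is a dot product."""
    norms = np.linalg.norm(item_embs, axis=1, keepdims=True)
    return item_embs / np.maximum(norms, 1e-8)

def i2i_retrieve(index, query_id, k=3):
    """Return the k items most similar to `query_id`, excluding itself."""
    sims = index @ index[query_id]
    sims[query_id] = -np.inf              # never retrieve the query item
    top = np.argpartition(-sims, k)[:k]   # candidate set, unordered
    return top[np.argsort(-sims[top])]    # sort candidates by similarity
```

Because the embeddings are content-based rather than ID-based, the same lookup works for cold-start items that have no interaction history yet.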
The LEARN framework focuses on transferring external knowledge from large language models into the recommendation system. It evolved from simple prompt-based feature extraction, through heavy fine-tuning of LLMs, to a knowledge-transfer paradigm that combines offline dual-tower training with lightweight online adaptation, using a dense all-action loss to capture stable long-term interests.
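The dense all-action objective can be sketched as follows. This is an illustrative reading, not the paper's code: the user tower's embedding is trained to score high against every item the user acts on within a future window, with the rest of the batch catalog as negatives, so the representation reflects stable long-term interest rather than the single next action:

```python
import numpy as np

def all_action_loss(user_emb, item_embs, future_pos, temperature=0.1):
    """All-action loss sketch: sum a softmax cross-entropy term for
    *each* future positive item per user (`future_pos[i]` is the list
    of item indices user i acts on in the future window)."""
    u = user_emb / np.linalg.norm(user_emb, axis=1, keepdims=True)
    v = item_embs / np.linalg.norm(item_embs, axis=1, keepdims=True)
    logits = u @ v.T / temperature                     # (U, I) scores
    m = logits.max(axis=1, keepdims=True)              # for stability
    logZ = (m + np.log(np.exp(logits - m).sum(axis=1, keepdims=True))).ravel()

    loss, count = 0.0, 0
    for i, positives in enumerate(future_pos):
        for j in positives:
            loss += logZ[i] - logits[i, j]  # -log softmax per positive
            count += 1
    return loss / count
```

Averaging over all future positives is what makes the signal "dense": a user with many future interactions contributes many gradient terms, smoothing out the noise of any single action.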
Offline experiments on Kuaishou's proprietary dataset and on public benchmarks (e.g., Amazon Reviews) demonstrate significant gains in click-through and conversion rates, especially for cold-start and long-tail items, outperforming both ID-based and ID-plus-text SOTA methods.
The article concludes with a summary of the achieved business impact and an invitation for collaboration on large‑model and AIGC applications in advertising, providing a contact email for interested parties.
DataFunSummit
Official account of the DataFun community, dedicated to sharing big data and AI industry summit news and speaker talks, with regular downloadable resource packs.