Exploring Large‑Model Applications in Advertising: The COPE and LEARN Frameworks at Kuaishou
This article reviews Kuaishou's two‑year exploration of large‑model and multimodal techniques for advertising, detailing the challenges of content‑domain ad estimation, the COPE unified product representation framework, and the LEARN LLM knowledge‑transfer approach, and reports the resulting business gains.
Over the past two years, Kuaishou has investigated large-model technologies in advertising scenarios. The article first outlines the challenges of content-domain ad estimation, such as user behavior scattered sparsely across short videos, live streams, and product pages, and the limitations of ID-centric pipelines, and then describes how multimodal and LLM techniques can strengthen the ad system.
The COPE (Content‑Unified Product Embedding) framework tackles cross-domain behavior in three steps: it introduces a stable SPU ID system that aggregates semantically similar items, builds a multimodal dataset spanning short video, live streaming, and product detail pages, and trains a visual encoder with a cross-domain contrastive loss. Later versions (COPE 1.1/Ampere) incorporate LLM-extracted high-signal text to further improve the multimodal representation.
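The cross-domain contrastive objective can be illustrated with a short sketch. This is not Kuaishou's published code; it assumes a symmetric InfoNCE-style loss where row i of each batch holds embeddings of the same SPU observed in two different domains (e.g., a short-video frame and a product-page image), with the rest of the batch serving as in-batch negatives:

```python
import numpy as np

def l2_normalize(x, eps=1e-8):
    return x / (np.linalg.norm(x, axis=-1, keepdims=True) + eps)

def cross_domain_infonce(video_emb, page_emb, temperature=0.07):
    """Symmetric InfoNCE sketch: row i of each matrix is the same SPU
    seen in two domains; other rows in the batch act as negatives."""
    v = l2_normalize(video_emb)
    p = l2_normalize(page_emb)
    logits = v @ p.T / temperature           # (B, B) similarity matrix
    labels = np.arange(len(v))               # positives on the diagonal

    def ce(lg):
        # numerically stable cross-entropy against the diagonal targets
        lg = lg - lg.max(axis=1, keepdims=True)
        logp = lg - np.log(np.exp(lg).sum(axis=1, keepdims=True))
        return -logp[labels, labels].mean()

    # average of video->page and page->video directions
    return 0.5 * (ce(logits) + ce(logits.T))
```

Aligned pairs from two domains pull together on the diagonal while unrelated SPUs push apart, which is what lets one encoder serve short-video, live, and product-page content.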
COPE's unified product embeddings feed both the recall and ranking stages: they enrich dense features, enable item-to-item retrieval, and extend user behavior sequences, which has delivered consistent online revenue gains.
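The item-to-item retrieval use can be sketched as nearest-neighbor search over the unified embeddings. This is a minimal brute-force illustration (a production system would use an approximate-nearest-neighbor index); the function names are hypothetical:

```python
import numpy as np

def build_i2i_index(item_embs):
    """Pre-normalize the catalog so cosine similarity is a dot product."""
    norms = np.linalg.norm(item_embs, axis=1, keepdims=True)
    return item_embs / np.maximum(norms, 1e-8)

def i2i_retrieve(index, query_id, k=3):
    """Return the k items most similar to `query_id`, excluding itself."""
    sims = index @ index[query_id]
    sims[query_id] = -np.inf              # never retrieve the query item
    top = np.argpartition(-sims, k)[:k]   # candidate set, unordered
    return top[np.argsort(-sims[top])]    # sort candidates by similarity
```

Because the embeddings are content-based rather than ID-based, the same lookup works for cold-start items that have no interaction history yet.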
The LEARN framework focuses on transferring external knowledge from large language models into the recommendation system. It evolved from simple prompt-based feature extraction, through heavy fine-tuning of LLMs, to a knowledge-transfer paradigm that combines offline dual-tower training with lightweight online adaptation, using a dense all-action loss to capture stable long-term interests.
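The dense all-action objective can be sketched as follows. This is an illustrative reading, not the paper's code: the user tower's embedding is trained to score high against every item the user acts on within a future window, with the rest of the batch catalog as negatives, so the representation reflects stable long-term interest rather than the single next action:

```python
import numpy as np

def all_action_loss(user_emb, item_embs, future_pos, temperature=0.1):
    """All-action loss sketch: sum a softmax cross-entropy term for
    *each* future positive item per user (`future_pos[i]` is the list
    of item indices user i acts on in the future window)."""
    u = user_emb / np.linalg.norm(user_emb, axis=1, keepdims=True)
    v = item_embs / np.linalg.norm(item_embs, axis=1, keepdims=True)
    logits = u @ v.T / temperature                     # (U, I) scores
    m = logits.max(axis=1, keepdims=True)              # for stability
    logZ = (m + np.log(np.exp(logits - m).sum(axis=1, keepdims=True))).ravel()

    loss, count = 0.0, 0
    for i, positives in enumerate(future_pos):
        for j in positives:
            loss += logZ[i] - logits[i, j]  # -log softmax per positive
            count += 1
    return loss / count
```

Averaging over all future positives is what makes the signal "dense": a user with many future interactions contributes many gradient terms, smoothing out the noise of any single action.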
Offline experiments on Kuaishou's proprietary dataset and on public benchmarks (e.g., Amazon Reviews) demonstrate significant gains in click-through and conversion rates, especially for cold-start and long-tail items, outperforming both ID-based and ID-plus-text SOTA methods.
The article concludes with a summary of the achieved business impact and an invitation for collaboration on large‑model and AIGC applications in advertising, providing a contact email for interested parties.
DataFunSummit
Official account of the DataFun community, dedicated to sharing big data and AI industry summit news and speaker talks, with regular downloadable resource packs.