
How Kuaishou Boosted Ad Performance with Multimodal LLMs: COPE & LEARN Frameworks

This article reviews Kuaishou's two‑year exploration of large‑model techniques in advertising, detailing the challenges of content‑domain ad estimation, the use of multimodal and LLM technologies to harness full‑scope user behavior and external knowledge, and the COPE and LEARN frameworks that delivered measurable business gains.

DataFunSummit

Nov 9, 2025

Introduction

This article summarizes Kuaishou's exploratory work over the past two years on applying large‑model technology to advertising. It covers the motivations behind the data and model design, the challenges of content‑domain ad estimation, and how multimodal and LLM techniques were used to improve the ad system and deliver concrete business benefits. The two key algorithms introduced are the COPE (Commodity‑Content Unified Representation) framework and the LEARN (LLM Knowledge Transfer) framework.

Challenges in Content‑Domain Ad Estimation

Kuaishou combines content and e‑commerce across several media: images, short videos, live streams, product detail pages, and landing pages. User behavior is scattered across these heterogeneous contexts, so it is sparse within any single scenario. The existing recommendation system relies on ID‑centric pipelines (Video ID, Item ID, Live ID, etc.) whose ID spaces are not interoperable, which hinders cross‑domain interest modeling. Content types also differ in lifecycle, which makes stable interests harder to capture. Advertising data is especially sparse, and without external knowledge the system falls into an information cocoon, repeatedly reinforcing the patterns it has already seen.

Leveraging Full‑Scope Behavior: COPE Framework

To address cross‑media sparsity, Kuaishou first built an SPU (Standard Product Unit) ID system that aggregates items with identical attributes, providing a more stable identifier for cross‑scenario modeling. However, SPU IDs still lack semantic richness. The COPE framework extends this by compressing multimodal content (video, live, product pages) into a unified, robust representation. This unified embedding enriches item features, expands user behavior sequences, and reduces dependence on any single medium's interaction data, thereby alleviating insufficient feature learning.
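The two ideas above, a shared SPU identifier and a unified content embedding, can be illustrated with a minimal sketch. Everything in it is an assumption for illustration: the catalog entries, the attribute tuples, the function names, and especially the mean pooling, which merely stands in for whatever learned compression COPE actually uses.

```python
from collections import defaultdict

# Hypothetical catalog rows: (media_type, media_id, attributes, embedding).
# The embeddings stand in for the output of some multimodal encoder.
CATALOG = [
    ("video", "v1", ("brandA", "phone", "128GB"), [1.0, 0.0]),
    ("live", "l7", ("brandA", "phone", "128GB"), [0.0, 1.0]),
    ("item", "i3", ("brandB", "shoes", "42"), [0.5, 0.5]),
]


def build_spu_index(catalog):
    """Give each distinct attribute tuple one SPU ID and pool its embeddings."""
    spu_of_attrs = {}
    spu_of_media = {}
    vectors = defaultdict(list)
    for media_type, media_id, attrs, emb in catalog:
        # Items with identical attributes share the same SPU ID.
        spu_id = spu_of_attrs.setdefault(attrs, f"spu_{len(spu_of_attrs)}")
        spu_of_media[(media_type, media_id)] = spu_id
        vectors[spu_id].append(emb)
    # Assumption: mean pooling as a placeholder for a learned unified encoder.
    spu_embedding = {
        spu: [sum(col) / len(vecs) for col in zip(*vecs)]
        for spu, vecs in vectors.items()
    }
    return spu_of_media, spu_embedding


def unify_behaviors(events, spu_of_media):
    """Map a mixed-media event sequence to one SPU sequence, skipping unknown IDs."""
    return [spu_of_media[e] for e in events if e in spu_of_media]


spu_of_media, spu_embedding = build_spu_index(CATALOG)
sequence = unify_behaviors(
    [("video", "v1"), ("item", "i3"), ("live", "l7")], spu_of_media
)
print(sequence)
print(spu_embedding)
```

In this toy setup the video and the live stream resolve to the same SPU, so a user who watched one and then the other yields two events for a single product rather than two unrelated IDs, which is the sense in which a shared identifier lengthens the usable behavior sequence.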

LLM Knowledge Transfer: LEARN Framework

To break the information cocoon, Kuaishou draws on the open‑domain knowledge that large language models acquire during pretraining. The LEARN framework transfers this world knowledge and reasoning ability into the advertising model, improving its ability to generalize beyond the limited historical ad signals.
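As a rough illustration of why externally pretrained knowledge helps with sparse ad data, the sketch below ranks candidates using text embeddings alone, so an ad with no interaction history can still be scored. It is not the LEARN architecture: the embedding table, the item names, and the mean‑pooled user vector are all hypothetical, and the hard‑coded vectors stand in for the output of a frozen LLM encoder.

```python
import math

# Hypothetical frozen LLM embeddings of item text descriptions.
LLM_EMBEDDING = {
    "running shoes": [0.9, 0.1, 0.0],
    "trail sneakers": [0.8, 0.2, 0.1],
    "phone case": [0.0, 0.1, 0.9],
}


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def user_vector(history):
    """Assumption: represent a user as the mean embedding of past items."""
    vecs = [LLM_EMBEDDING[item] for item in history]
    return [sum(col) / len(vecs) for col in zip(*vecs)]


def rank_candidates(history, candidates):
    """Order candidates by similarity to the user vector, best first."""
    u = user_vector(history)
    return sorted(
        candidates, key=lambda c: cosine(u, LLM_EMBEDDING[c]), reverse=True
    )


# "trail sneakers" has no click history here; only its text embedding is used.
ranking = rank_candidates(["running shoes"], ["phone case", "trail sneakers"])
print(ranking)
```

The semantic proximity between the two footwear descriptions comes entirely from the pretrained embeddings, not from ad logs, which is the kind of signal an ID‑only model cannot provide for unseen items.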

Business Impact

Integrating COPE and LEARN into the ad recommendation pipeline improved click‑through and conversion metrics, which shows that multimodal large‑model techniques carry practical value in a production advertising system.



