How LLMs Transform Recommendation Systems: Insights from Kuaishou’s LEARN Paper
This article analyzes Kuaishou’s May 2024 paper on LLM‑driven recommendation, detailing its dual‑tower architecture, contrastive learning of user and item embeddings, and a CVR auxiliary task that together improve cold‑start handling and lift both offline and online AUC.
TL;DR
The paper introduces LEARN, a dual‑tower framework that freezes a large language model (Baichuan2‑7B) to extract text embeddings, trains user and item embeddings via contrastive learning, and adds a CVR auxiliary task to align these embeddings with ranking objectives, yielding notable gains in cold‑start scenarios.
Background
Traditional recommender systems rely on ID embeddings, which ignore semantic information in item descriptions and struggle with sparse interaction data for new users or items. Large language models excel at capturing semantic knowledge, prompting the idea of using them as feature extractors to alleviate cold‑start and long‑tail problems.
Method
The proposed framework, LEARN (LLM‑driven Knowledge Adaptive Recommendation), adopts a dual‑tower architecture. Each tower consists of a Content Embedding Generation (CEG) module and a Preference Comprehension (PCH) module.
CEG: Uses a frozen Baichuan2‑7B model to encode item text (title, category, brand, price, keywords, attributes). Token‑level hidden states from the final layer are averaged to form the item representation.
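As a concrete illustration, here is a minimal sketch of the CEG step using Hugging Face `transformers`. The mean‑pooling over final‑layer hidden states follows the paper’s description; the prompt format, the `item_embedding` helper, and the max length are assumptions made for illustration, not the paper’s exact pipeline.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Frozen LLM used purely as a feature extractor: no gradients flow into it.
MODEL = "baichuan-inc/Baichuan2-7B-Base"
tokenizer = AutoTokenizer.from_pretrained(MODEL, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(MODEL, trust_remote_code=True)
model.eval()

@torch.no_grad()
def item_embedding(item: dict) -> torch.Tensor:
    # Concatenate the item's text fields into one prompt (illustrative format).
    text = " | ".join(f"{k}: {v}" for k, v in item.items())
    inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=512)
    out = model(**inputs, output_hidden_states=True)
    last_hidden = out.hidden_states[-1]               # (1, seq_len, hidden)
    mask = inputs["attention_mask"].unsqueeze(-1)     # zero out padding tokens
    return (last_hidden * mask).sum(1) / mask.sum(1)  # mean pool -> (1, hidden)

emb = item_embedding({"title": "wireless earbuds", "category": "electronics",
                      "brand": "Acme", "price": "59.9"})
```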
PCH: Aligns the LLM‑generated embeddings with the recommendation task via self‑supervised contrastive learning. User interaction sequences are fed into a causal‑attention transformer, producing the user (or item) embedding.
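The PCH module could look like the following sketch: a small causal transformer over the sequence of CEG embeddings, with the last position read out as the embedding. The dimensions, depth, and last‑token readout are illustrative assumptions; the paper’s summary does not fix them.

```python
import torch
import torch.nn as nn

class PCH(nn.Module):
    """Causal-attention transformer over a sequence of CEG item embeddings.
    Hyperparameters here are illustrative, not taken from the paper."""
    def __init__(self, dim: int = 512, n_layers: int = 4, n_heads: int = 8):
        super().__init__()
        layer = nn.TransformerEncoderLayer(
            d_model=dim, nhead=n_heads, batch_first=True, norm_first=True
        )
        self.encoder = nn.TransformerEncoder(layer, num_layers=n_layers)

    def forward(self, seq: torch.Tensor) -> torch.Tensor:
        # seq: (batch, seq_len, dim) LLM-derived item embeddings, in time order
        causal = nn.Transformer.generate_square_subsequent_mask(seq.size(1))
        hidden = self.encoder(seq, mask=causal.to(seq.device))
        return hidden[:, -1]  # last position as the user (or item) embedding
```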
Three variants of the item tower were explored; Variant 1 (identical structure and weights to the user tower) performed best and was adopted.
Training proceeds by sampling a user’s historical behavior sequence, splitting it into two parts, and feeding one part to the user tower and the other to the item tower. Positives are the next interacted items; negatives are items drawn from other users. The training objective is the InfoNCE loss.
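In code, this objective reduces to standard in‑batch InfoNCE, assuming the negatives come from the other users in the same batch; the temperature value below is an illustrative choice, not the paper’s.

```python
import torch
import torch.nn.functional as F

def info_nce(user_emb: torch.Tensor, item_emb: torch.Tensor,
             temperature: float = 0.07) -> torch.Tensor:
    """Row i of user_emb should match row i of item_emb; every other row in
    the batch (i.e., items from other users) serves as a negative."""
    user_emb = F.normalize(user_emb, dim=-1)
    item_emb = F.normalize(item_emb, dim=-1)
    logits = user_emb @ item_emb.T / temperature           # (batch, batch)
    labels = torch.arange(logits.size(0), device=logits.device)
    return F.cross_entropy(logits, labels)
```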
Experiments
Offline Evaluation
LLM‑derived embeddings outperform traditional ID embeddings and BERT‑generated features.
On the public MovieLens dataset, LEARN surpasses state‑of‑the‑art methods HSTU and SASRec.
Freezing the LLM and adding a transformer encoder yields better results than LoRA fine‑tuning.
Online Evaluation
An auxiliary CVR task is added alongside the ranking model. The learned user and item embeddings are concatenated and passed through an MLP, and the intermediate vector (mid‑emb) is combined with existing features for the final ranking model. This two‑stage alignment, first in the LEARN pre‑training and then in the CVR task, ensures that the embeddings carry both semantic knowledge and recommendation‑specific signals.
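A sketch of that auxiliary head: concatenate the pre‑trained user and item embeddings, pass them through an MLP trained on the CVR label, and expose an intermediate layer as the mid‑emb feature for the main ranker. The layer widths and two‑layer depth are assumptions for illustration.

```python
import torch
import torch.nn as nn

class CVRAuxHead(nn.Module):
    """Auxiliary CVR tower; layer widths are illustrative assumptions."""
    def __init__(self, emb_dim: int = 512, mid_dim: int = 128):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Linear(2 * emb_dim, 256), nn.ReLU(),
            nn.Linear(256, mid_dim), nn.ReLU(),
        )
        self.cvr_logit = nn.Linear(mid_dim, 1)

    def forward(self, user_emb: torch.Tensor, item_emb: torch.Tensor):
        mid_emb = self.backbone(torch.cat([user_emb, item_emb], dim=-1))
        # mid_emb is concatenated with existing features in the ranking model;
        # the logit trains the auxiliary CVR objective.
        return self.cvr_logit(mid_emb), mid_emb
```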
Online A/B tests show increased profit and AUC, especially for long‑tail and cold‑start users/items, confirming that LLM‑infused semantics effectively mitigate sparsity issues.
Analysis
The improvements stem from two factors: (1) rich semantic information from the LLM provides an informative boost for users/items with limited interaction data; (2) the two‑stage alignment forces the embeddings to be useful for the downstream ranking objective, reducing noise that typically plagues raw pretrained features.
Overall, the study demonstrates a practical pathway—LLM‑to‑Rec—where large language models serve as feature extractors rather than generative recommenders, making them suitable for large‑scale industrial recommendation systems.