CoderRec: Latent Reasoning Boosts Sequential Recommendation
CoderRec, a new sequential recommendation framework jointly developed by Tencent Advertising Technology and Tsinghua University, combines domain‑specific latent reasoning with cross‑scale model collaboration. It captures implicit user intent, fuses large‑language‑model semantics with traditional recommender signals, and achieves state‑of‑the‑art performance on multiple Amazon datasets.
Background
Sequential recommendation is increasingly embedded in everyday digital experiences, predicting the next item a user may want based on historical clicks, views, and purchases. Recent attempts to incorporate large language models (LLMs) have shown promise but face two major challenges: the scarcity of standardized reasoning data and the loss of rich semantic information when compressing LLM outputs into discrete IDs.
Challenges in Existing Methods
Reasoning data scarcity: Unlike mathematical reasoning, user behavior logic is highly contextual, subjective, and difficult to formalize, making high‑quality reasoning chains hard to obtain.
Insufficient semantic utilization: Current pipelines often compress LLM‑derived semantic embeddings into discrete IDs (e.g., via RQVAE), discarding much of the original semantic richness and limiting downstream recommendation performance.
CoderRec Overview
To address these issues, Tencent Advertising Technology and Tsinghua University propose CoderRec, a sequential recommendation framework that integrates domain‑specific latent reasoning and cross‑scale model collaboration. The core innovations are:
Latent reasoning mechanism: Enables the model to capture implicit user intent without manual annotation by activating reasoning processes in a latent space.
Cross‑scale model collaboration: Bridges high‑dimensional LLM semantics with low‑dimensional recommender representations, allowing mutual knowledge transfer.
Two‑stage training and representation alignment: Aligns LLM semantic knowledge with recommender signals through a dedicated loss.
Cross‑Scale Model Collaboration
LLMs such as Llama‑3 8B (4096‑dimensional representations) and Qwen‑3 4B (2560‑dimensional) vastly exceed the dimensionality of typical recommender models (≤128). Direct linear mapping leads to representation collapse. CoderRec adopts RQVAE as a hierarchical quantization bridge, compressing LLM embeddings while preserving essential semantics. Unlike prior work that only compresses, CoderRec also reconstructs semantic IDs, enabling bidirectional information flow between large and small models.
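To make the hierarchical quantization concrete, here is a minimal residual‑quantization sketch in the spirit of RQVAE. The codebook sizes, depth, and dimensions are illustrative assumptions, not CoderRec's actual configuration, and the codebooks here are random rather than learned:

```python
import numpy as np

def residual_quantize(x, codebooks):
    """Quantize x with a stack of codebooks; return per-level IDs and the reconstruction."""
    residual = x.astype(np.float64)
    ids, recon = [], np.zeros_like(residual)
    for cb in codebooks:                              # cb: (codebook_size, dim)
        idx = int(np.argmin(np.linalg.norm(cb - residual, axis=1)))
        ids.append(idx)                               # one semantic-ID digit per level
        recon += cb[idx]
        residual = residual - cb[idx]                 # next level quantizes what is left
    return ids, recon

rng = np.random.default_rng(0)
dim, levels, size = 8, 3, 16
codebooks = [rng.normal(size=(size, dim)) for _ in range(levels)]
x = rng.normal(size=dim)                              # stand-in for an LLM item embedding
ids, recon = residual_quantize(x, codebooks)
# `ids` is the hierarchical semantic ID; `recon` is the decompressed embedding,
# which is what enables the bidirectional (compress-and-reconstruct) flow.
```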
To avoid conflicts where different items share similar early RQVAE dimensions, CoderRec combines raw item IDs with semantic IDs into a cross‑scale ID, jointly embedding both sources:
Item ID embedding: Standard embedding of each product ID.
Semantic ID embedding: Multi‑layer embedding table where each layer corresponds to an RQVAE codebook.
Fusion: A lightweight linear fusion layer merges the two embeddings into a unified item representation.
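The three components above can be sketched as follows. Table sizes, dimensions, and the concatenate-then-project fusion form are assumptions for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
n_items, n_codes, n_levels, d = 1000, 256, 3, 64

item_table = rng.normal(size=(n_items, d))                             # item-ID embedding
sem_tables = [rng.normal(size=(n_codes, d)) for _ in range(n_levels)]  # one table per RQVAE codebook
W_fuse = rng.normal(size=(d * (1 + n_levels), d)) * 0.01               # lightweight linear fusion

def cross_scale_embed(item_id, semantic_ids):
    """Look up the item-ID embedding and one embedding per semantic-ID level, then fuse."""
    parts = [item_table[item_id]] + [t[i] for t, i in zip(sem_tables, semantic_ids)]
    return np.concatenate(parts) @ W_fuse                              # unified representation, shape (d,)

v = cross_scale_embed(42, [7, 199, 3])
```

Because the raw item ID contributes its own embedding, two items that happen to share early RQVAE codes still receive distinct fused representations.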
Domain‑Specific Latent Reasoning
Inspired by Quiet‑STaR, CoderRec introduces a domain‑specific latent reasoning mechanism. User interaction sequences are treated as sentences, and hidden “thought trajectories” are inferred to model the implicit decision logic behind item transitions. These latent trajectories are injected into the LLM via a special <think> token and trained with a parallel attention mask that restricts each thought token’s attention to its own trajectory and preceding items.
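The parallel attention mask can be sketched as below. The token layout (each item followed by a fixed-length thought trajectory) and the rule that item tokens ignore thought tokens are assumptions made for illustration:

```python
import numpy as np

def parallel_thought_mask(n_items, n_thought):
    """Token order: [item_0, thoughts_0..., item_1, thoughts_1..., ...].
    mask[q, k] = True means query token q may attend to key token k."""
    block = 1 + n_thought
    n = n_items * block
    owner = np.arange(n) // block                 # which item each token belongs to
    is_item = (np.arange(n) % block) == 0
    mask = np.zeros((n, n), dtype=bool)
    for q in range(n):
        for k in range(q + 1):                    # causal: no attention to future tokens
            if is_item[k]:
                mask[q, k] = True                 # every token sees preceding items
            elif owner[k] == owner[q] and not is_item[q]:
                mask[q, k] = True                 # thought tokens see only their own trajectory
    return mask

m = parallel_thought_mask(n_items=3, n_thought=2)
```

Thought trajectories for different items therefore never attend to one another, so all trajectories can be computed in parallel.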
A reasoning fusion module learns to combine the LLM’s raw output h, latent thought representation l, and domain‑specific thought representation, with the fusion weights initialized to zero to ensure stable early‑stage training.
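A minimal sketch of such a zero‑initialized fusion, assuming an additive gated form (the exact fusion function is not specified in the post):

```python
import numpy as np

class ReasoningFusion:
    """Fuse raw LLM output h with latent thought l and domain thought s via learned gates."""
    def __init__(self, dim):
        self.g_latent = np.zeros(dim)   # gate for the latent thought representation
        self.g_domain = np.zeros(dim)   # gate for the domain-specific thought representation

    def __call__(self, h, l, s):
        return h + self.g_latent * l + self.g_domain * s

fuse = ReasoningFusion(4)
h = np.array([1.0, 2.0, 3.0, 4.0])
out = fuse(h, np.ones(4), np.ones(4))
# With zero-initialized gates, out == h: the thought branches contribute nothing
# at the start of training, which keeps early-stage optimization stable.
```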
Training Strategy
Because of the representation gap between recommender and LLM components, training proceeds in two phases:
Pre‑training (warm‑up): Train the recommender head on the downstream task to obtain a solid baseline.
Joint training: Simultaneously optimize the recommendation head (cross‑entropy loss) and the token‑prediction head (reconstructing semantic IDs via RQVAE), weighted by hyper‑parameters λ₁ and λ₂.
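The joint objective in the second phase can be sketched as a weighted sum of the two losses. The λ values and the averaging over semantic‑ID levels are illustrative assumptions:

```python
import numpy as np

def cross_entropy(logits, target):
    """Numerically stable cross-entropy of a single logit vector against a class index."""
    z = logits - logits.max()
    log_probs = z - np.log(np.exp(z).sum())
    return -log_probs[target]

def joint_loss(rec_logits, rec_target, tok_logits, tok_targets, lam1=1.0, lam2=0.1):
    rec = cross_entropy(rec_logits, rec_target)                        # recommendation head
    tok = np.mean([cross_entropy(l, t)                                 # semantic-ID token head
                   for l, t in zip(tok_logits, tok_targets)])
    return lam1 * rec + lam2 * tok

rng = np.random.default_rng(0)
loss = joint_loss(rng.normal(size=10), 3,                              # next-item prediction
                  [rng.normal(size=256) for _ in range(3)],            # one logit vector per level
                  [5, 17, 200])                                        # target code at each level
```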
Experimental Results
Experiments on three Amazon sub‑datasets (Beauty, Sports & Outdoors, Musical Instruments) show that CoderRec consistently outperforms baselines such as SASRec, BERT4Rec, and recent LLM‑enhanced recommenders. The cross‑scale collaboration and latent reasoning each contribute significant gains, with latent reasoning improving both large‑scale and small‑scale models.
Conclusion
CoderRec is the first framework to embed latent reasoning into LLM‑based sequential recommendation, leveraging cross‑scale model collaboration to fuse semantic richness with domain‑specific signals. Extensive experiments validate its superiority, and future work will explore more efficient semantic alignment and extensions to multi‑intent or long‑term conversational scenarios.
Tencent Advertising Technology