Long Sequence Modeling for Advertising Recommendation: TIN, Disentangled Side‑Info TIN, Stacked TIN, and Target‑aware SASRec
This article presents a comprehensive solution for heterogeneous long‑behavior sequence modeling in advertising recommendation, introducing the TIN backbone, Disentangled Side‑Info TIN, Stacked TIN, and Target‑aware SASRec, along with platform‑level optimizations that enable million‑scale sequences while delivering significant online performance gains.
Background: Sequential modeling of user behavior is a core research direction in recommender systems. While recent works extend behavior sequences to thousands or even millions of items, advertising scenarios suffer from sparse ad‑side actions even over multi‑year windows. To address this, a heterogeneous long‑behavior sequence is constructed that mixes content‑domain commercial intents with ad‑domain actions.
Model Optimization – 2.1 Backbone TIN: Building on the DIN architecture, the TIN model adds Target‑aware Temporal Encoding (TTE) and Target‑aware Representation (TR) to explicitly capture temporal relations and cross‑feature interactions between historical actions and candidate ads, enhancing discriminative power at the representation layer.
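The two TIN components can be illustrated with a minimal numpy sketch. This is a hypothetical simplification, not the production model: temporal encoding is shown as additive position embeddings on both query and keys, and the target‑aware representation as an element‑wise product with the target before pooling.

```python
import numpy as np

def tin_attention(seq_emb, seq_pos_emb, target_emb, target_pos_emb):
    """Illustrative TIN pooling (assumed simplification).

    seq_emb:        (L, d) historical behavior embeddings
    seq_pos_emb:    (L, d) temporal/position embeddings per behavior
    target_emb:     (d,)   candidate ad embedding
    target_pos_emb: (d,)   temporal embedding of the target slot
    """
    # Target-aware Temporal Encoding: positions enter both sides, so the
    # attention score depends on the temporal relation to the target.
    q = target_emb + target_pos_emb            # (d,)
    k = seq_emb + seq_pos_emb                  # (L, d)
    scores = k @ q / np.sqrt(len(q))           # (L,)
    w = np.exp(scores - scores.max())
    w = w / w.sum()                            # attention weights
    # Target-aware Representation: element-wise product injects explicit
    # behavior-x-target cross features before pooling.
    tr = k * q                                 # (L, d)
    return w @ tr                              # (d,) pooled user interest
```

The element‑wise product in the value path is what pushes explicit second‑order cross features into the pooled representation, rather than leaving them for the MLP to learn implicitly.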
2.2 Disentangled Side‑Info TIN (DI‑TIN): To mitigate noise from heterogeneous side‑information (scene ID, action type, etc.), multiple TINs are instantiated, each selecting a subset of side‑info features for attention computation while still leveraging the fused representation of all side‑info, thereby reducing interference.
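A sketch of the disentangling idea, under the assumption that "selecting a subset for attention" means each TIN instance builds its query/key path from only its assigned side‑info fields, while the value path uses the fused representation of all fields (names and fusion-by-addition are illustrative choices):

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def di_tin(item_emb, side_embs, target_emb, groups):
    """Disentangled side-info attention sketch (hypothetical).

    item_emb:  (L, d) behavior item embeddings
    side_embs: dict name -> (L, d) side-info embeddings (scene, action, ...)
    target_emb: (d,) candidate ad embedding
    groups: list of side-info name subsets, one per TIN instance
    """
    # Value path: fused representation of ALL side info is still used.
    fused = item_emb + sum(side_embs.values())
    outs = []
    for names in groups:
        # Key path: only this instance's side-info subset, so a noisy
        # field cannot perturb the attention of the other instances.
        k = item_emb + sum(side_embs[n] for n in names)
        w = softmax(k @ target_emb / np.sqrt(len(target_emb)))
        outs.append(w @ fused)
    return np.concatenate(outs)  # concatenated per-instance interests
```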
2.3 Stacked TIN Deep Sequence Model: A multi‑layer target‑attention architecture stacks Temporal Interest Modules, allowing each layer to increase the order of interaction (the L‑th layer captures L‑th order cross‑features). This design enables high‑order feature interactions beyond the second‑order capability of traditional DIN/TIN.
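One way to see how stacking raises the interaction order: if each layer multiplies the hidden sequence element‑wise with the target once more before the next attention step, layer L pools features containing L factors of the target. This is a speculative sketch of that mechanism only, not the paper's exact layer definition:

```python
import numpy as np

def stacked_tin(seq_emb, target_emb, num_layers=3):
    """Stacked target-attention sketch (illustrative assumption):
    each layer applies one more element-wise cross with the target,
    so layer L pools L-th order behavior-x-target interactions."""
    h = seq_emb                                   # (L_seq, d)
    pooled = []
    for _ in range(num_layers):
        scores = h @ target_emb / np.sqrt(len(target_emb))
        w = np.exp(scores - scores.max())
        w = w / w.sum()
        h = h * target_emb                        # raise interaction order
        pooled.append(w @ h)                      # pool this order's crosses
    return np.concatenate(pooled)                 # all orders, concatenated
```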
2.4 Target‑aware SASRec for Coarse Ranking: To compress long sequences for efficient retrieval, category‑stratified sampling ensures each item category is represented. Multiple user‑tower embeddings are generated per category, and a target‑aware SASRec (Transformer‑based) captures temporal correlations, yielding embeddings with clear category clusters.
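The category‑stratified compression step can be sketched as follows. The allocation rule (at least one slot per category, remaining budget split by category share, most recent items kept) is an assumption for illustration; the article does not specify the exact quota scheme.

```python
from collections import defaultdict

def stratified_sample(seq, budget):
    """Category-stratified sequence compression sketch (hypothetical).

    seq:    list of (item_id, category) in time order
    budget: target number of behaviors to keep
    """
    by_cat = defaultdict(list)
    for item_id, cat in seq:
        by_cat[cat].append(item_id)
    kept = []
    for cat, items in by_cat.items():
        # Every category keeps at least one behavior; the rest of the
        # budget is split proportionally to the category's share.
        quota = max(1, budget * len(items) // len(seq))
        kept.extend((i, cat) for i in items[-quota:])  # keep most recent
    return kept
```

Note the min‑1 rule can overshoot the budget slightly when there are many rare categories; that is the price of guaranteeing every category is represented in the compressed sequence.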
Platform Challenges – Long Sequence Modeling Evolution: Scaling to million‑level sequences imposes latency and storage burdens. The system adopts hard‑search, soft‑search, and TWIN two‑stage retrieval architectures, supporting flexible training and inference pipelines while keeping resource growth near zero.
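The first stage of such two‑stage retrieval is typically a cheap general search unit that shrinks the million‑scale sequence before any attention runs. A minimal hard‑search sketch, assuming category match as the filter and recency as the tie‑breaker (field names are illustrative):

```python
def hard_search(seq, target_category, top_k=100):
    """Hard-search GSU sketch: keep only behaviors sharing the candidate
    ad's category, truncated to the most recent top_k. The exact retrieval
    rule in production (hard/soft/TWIN) is richer than this."""
    hits = [b for b in seq if b["category"] == target_category]
    return hits[-top_k:]  # seq is in time order; tail = most recent
```

Soft‑search and TWIN replace the exact category match with embedding similarity, at higher cost but better recall of relevant behaviors.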
Performance Optimizations:
• Sparse embedding prefetch uses incremental swapping, LRU eviction, and high‑frequency residency.
• TF Dataset processing is accelerated with GPU kernels, pinned memory, and pipelining.
• Embedding lookup leverages shared memory to reduce global‑memory accesses.
• Multi‑stream execution and CUDA Graphs reduce kernel‑launch overhead.
• FlashAttention integration and Welford‑based LayerNorm accelerate Transformer components.
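The Welford‑based LayerNorm mentioned above computes mean and variance in a single numerically stable pass instead of two passes over the data. A single‑threaded Python sketch of the algorithm (the actual GPU kernel would run this per thread block and merge partial statistics):

```python
import numpy as np

def layernorm_welford(x, eps=1e-5):
    """LayerNorm over a 1-D vector using Welford's one-pass
    mean/variance recurrence (no learnable gain/bias for brevity)."""
    mean, m2 = 0.0, 0.0
    for n, v in enumerate(x, start=1):
        delta = v - mean
        mean += delta / n           # running mean
        m2 += delta * (v - mean)    # running sum of squared deviations
    var = m2 / len(x)               # population variance
    return (x - mean) / np.sqrt(var + eps)
```

Compared with the naive `E[x^2] - E[x]^2` formulation, Welford's recurrence avoids catastrophic cancellation when the mean is large relative to the variance, which matters in low‑precision training.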
Online Effects: Deployed across major Tencent Ads traffic, the long‑sequence models increase GMV by 4.22% in video channels and 1.96% in friend‑circle feeds, while achieving 5× inference speedup and substantial resource savings.
References: The article cites recent KDD, WWW, ICDM, SIGIR, ICLR, and CIKM papers on recommendation, sequential modeling, and efficient Transformer implementations.
Tencent Advertising Technology