Advances in Click‑Through Rate (CTR) Modeling: Overview of Recent SIGIR Papers and Optimization Paths
This article reviews recent advances in click‑through‑rate modeling from Alibaba Mama's search advertising team, classifying optimizations across the three‑layer CTR architecture and highlighting three SIGIR papers: GIN's graph‑based user intent modeling, PCF's pre‑trained GNN for explicit cross‑feature semantics, and FSCD's compute‑cost‑guided automatic feature selection. Each improves both prediction accuracy and system efficiency.
Click‑through rate (CTR) prediction plays a critical role in search, recommendation, and advertising systems. With the rapid development of deep learning, a multitude of new CTR modeling approaches have emerged in both academia and industry. This article classifies optimization directions according to the three‑layer structure of CTR models and highlights several interesting works from Alibaba Mama’s search advertising team.
CTR model architecture: A CTR model consists of three layers – (1) an Embedding Layer that maps high‑dimensional categorical features to low‑dimensional dense vectors, (2) Hidden Layers that provide strong non‑linear fitting capability, and (3) an Output Layer that expresses the specific prediction target.
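The three layers above can be sketched end to end in a few lines. This is a minimal illustrative forward pass, not any of the papers' actual architectures; the vocabulary size, dimensions, and single hidden layer are arbitrary assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# --- Embedding layer: map each categorical feature ID to a dense vector ---
VOCAB_SIZE, EMB_DIM, NUM_FEATURES = 1000, 8, 3
embedding_table = rng.normal(scale=0.01, size=(VOCAB_SIZE, EMB_DIM))

# --- Hidden layer: one fully connected ReLU layer for non-linear fitting ---
W1 = rng.normal(scale=0.1, size=(NUM_FEATURES * EMB_DIM, 16))
b1 = np.zeros(16)

# --- Output layer: sigmoid expressing the prediction target (click probability) ---
W2 = rng.normal(scale=0.1, size=(16, 1))
b2 = np.zeros(1)

def predict_ctr(feature_ids):
    """feature_ids: int array of shape (batch, NUM_FEATURES) of categorical IDs."""
    emb = embedding_table[feature_ids]            # (batch, NUM_FEATURES, EMB_DIM)
    x = emb.reshape(len(feature_ids), -1)         # concatenate the feature embeddings
    h = np.maximum(0.0, x @ W1 + b1)              # hidden non-linearity
    logit = h @ W2 + b2
    return 1.0 / (1.0 + np.exp(-logit))           # click probability in (0, 1)

batch = np.array([[3, 42, 7], [11, 0, 999]])
probs = predict_ctr(batch)
print(probs.shape)  # (2, 1)
```

In production models the embedding table dominates the parameter count, which is why (as discussed later) feature selection concentrates on the Embedding layer.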
Different layers admit distinct optimization paths. The original article includes a figure showing a high‑level taxonomy of these paths along with classic innovations at each layer (not reproduced here). The three SIGIR papers discussed below focus on the Hidden Layers and Embedding Layers.
Paper 1 – GIN (SIGIR 2019): Introduces a Graph Intention Network that integrates graph learning with CTR prediction, enabling richer user interest exploration and expansion. Deployed in Alibaba’s sponsored search, it yields noticeable business improvements.
Paper 2 – PCF (SIGIR 2021): Proposes a pre‑trained Graph Neural Network for explicit semantic cross‑feature learning, addressing the challenges of feature interaction modeling, model compression, and generalization.
Paper 3 – FSCD (SIGIR 2021): Presents a learnable feature‑selection method based on a compute‑factor prior, deriving a new coarse‑ranking model that better balances effectiveness and efficiency.
GIN – User Behavior Spatio‑Temporal Modeling
In search scenarios, users express intent through queries, but short queries often fail to capture the full intent. Implicit behavioral feedback (both private and public) provides richer signals. Alibaba Mama combines sequence learning (private behavior) and graph learning (public behavior) to model these interactions. The GIN architecture expands each historical behavior item with graph‑based topological connections and applies multi‑layer graph convolutions to enhance interest representation. The model has been launched at scale in the “Direct Train” advertising product.
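The graph‑expansion idea above can be sketched as follows. The toy co‑occurrence graph, item IDs, and mean‑aggregation operator are illustrative assumptions; GIN's actual graph construction and convolution operators are specified in the paper.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy item co-occurrence graph built from public behavior (assumed structure):
# item -> items that users commonly interact with around it.
graph = {0: [1, 2], 1: [0, 3], 2: [0], 3: [1, 2]}
item_emb = rng.normal(size=(4, 8))  # base item embeddings

def graph_conv(embeddings, graph):
    """One propagation step: average each item with its graph neighbors."""
    out = np.empty_like(embeddings)
    for item, neighbors in graph.items():
        out[item] = embeddings[[item] + neighbors].mean(axis=0)
    return out

# Expand each historical behavior item with two layers of graph convolution,
# then pool the user's behavior sequence into one enriched interest vector.
h = graph_conv(graph_conv(item_emb, graph), graph)
user_behaviors = [0, 3]                      # historical clicked items
interest_vector = h[user_behaviors].mean(axis=0)
print(interest_vector.shape)  # (8,)
```

The key point is that each behavior item's representation now carries information from topologically related items, so a short query plus a sparse behavior history still yields a rich interest signal.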
PCF – Explicit Cross‑Feature Semantic Modeling
Cross features are vital for CTR models and can be modeled implicitly (embedding of co‑occurring IDs) or explicitly (historical statistics such as 14‑day CTR). Most existing methods focus on implicit modeling; explicit signals are often ignored despite their strong predictive power. Directly using raw statistics faces two challenges: poor algorithmic generalization and huge storage overhead. PCF‑GNN treats features as graph nodes and cross‑feature statistics as edges, learning edge weights via link prediction. Experiments on internal and public datasets show substantial storage compression and improved generalization.
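A minimal sketch of the link‑prediction idea: instead of storing one explicit statistic per feature pair (quadratic storage), learn an embedding per feature node and predict the edge statistic on demand. The node counts, dimensions, and the dot‑product‑plus‑sigmoid head are assumptions for illustration, not PCF‑GNN's exact formulation.

```python
import numpy as np

rng = np.random.default_rng(2)

# Nodes are atomic feature values (e.g. user and ad ID buckets); the explicit
# cross-feature statistic (e.g. a 14-day CTR of the pair) is the edge label
# that the link-prediction head is trained to reproduce.
NUM_NODES, DIM = 100, 16
node_emb = rng.normal(scale=0.1, size=(NUM_NODES, DIM))

def predict_edge_stat(u, v):
    """Link-prediction head: squash the node-pair score into (0, 1),
    standing in for a historical CTR statistic."""
    return 1.0 / (1.0 + np.exp(-node_emb[u] @ node_emb[v]))

# Storage comparison: full pairwise statistic table vs. node embeddings.
pairwise_floats = NUM_NODES * NUM_NODES   # 10,000 stored statistics
embedding_floats = NUM_NODES * DIM        # 1,600 stored parameters
print(pairwise_floats, embedding_floats)

# Generalization: a pair never seen together still gets a score
# from its node embeddings, rather than a missing table entry.
score = predict_edge_stat(3, 97)
```

This shows why the approach addresses both stated challenges at once: storage shrinks from O(N²) edge statistics to O(N·d) node parameters, and unseen pairs generalize through shared node embeddings.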
FSCD – Automatic Feature Selection for Coarse‑Ranking
Large‑scale search, recommendation, and advertising systems typically use a multi‑stage cascade (recall → coarse‑ranking → fine‑ranking → re‑ranking). Coarse‑ranking must provide a unified scoring rule for upstream recall while easing the load for downstream fine‑ranking. Traditional coarse‑ranking models rely on a representation‑focused (RF) vector‑dot architecture, which is fast but less accurate than interaction‑focused (IF) fine‑ranking models. To bridge this gap, FSCD reuses fine‑ranking data pipelines and learns a feature‑selection mask in the Embedding layer, producing an IF‑style coarse‑ranking model. By incorporating a compute‑factor prior, the method balances efficiency and effectiveness, achieving significant gains when deployed in Alibaba Mama’s search advertising.
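The compute‑factor prior can be sketched as a per‑feature learnable gate whose regularization is weighted by that feature's compute/storage cost, so expensive but low‑value features are dropped first. The loss shape, costs, and threshold below are a simplified reading of the FSCD idea, not the paper's exact objective.

```python
import numpy as np

# One learnable gate logit per candidate feature; the penalty for keeping a
# feature is proportional to its compute/storage cost (the compute-factor prior).
feature_cost = np.array([1.0, 5.0, 0.5, 8.0])   # assumed per-feature cost prior
gate_logits = np.array([2.0, -1.0, 3.0, -2.5])  # learned jointly with the CTR loss

keep_prob = 1.0 / (1.0 + np.exp(-gate_logits))  # probability of keeping each feature

lam = 0.1
complexity_penalty = lam * np.sum(feature_cost * keep_prob)
# total loss during training would be: ctr_loss + complexity_penalty

# After training, threshold the gates to fix the feature subset
# actually used by the interaction-focused coarse-ranking model.
selected = keep_prob > 0.5
print(selected)  # [ True False  True False]
```

Because the gate sits in the Embedding layer, discarding a feature removes both its embedding parameters and its online feature‑computation cost, which is exactly the efficiency/effectiveness trade‑off the method optimizes.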
The authors also note two practical reasons for focusing on feature selection: (1) the majority of parameters in large sparse models reside in the Embedding layer, and (2) feature computation consumes a large portion of online inference resources, especially with widespread GPU usage.
Summary and Outlook
Over the past year, Alibaba Mama’s algorithm team has continuously iterated on Hidden‑Layer feature‑interaction modeling and Embedding‑Layer feature selection, supporting rapid business growth while distilling the work into research contributions. This article is only an appetizer; interested readers are encouraged to consult the three SIGIR papers for deeper details. Work on Output‑Layer innovations will be shared in a future post.
Follow “Alibaba Mama Technology” and reply with 【SIGIR】 to receive download links for all three papers.