
Advances in Click‑Through Rate Prediction: Deep Spatio‑Temporal Networks, Memory Networks, and Feature Expression Learning

This article reviews recent innovations in CTR prediction for an intelligent marketing platform, covering deep spatio‑temporal networks, deep memory networks, and a feature‑expression‑assisted learning framework, with system architecture details, experimental results, and references to KDD and IJCAI papers.

DataFunTalk

May 11, 2020


The talk, presented by Xiu Wu, a senior algorithm expert at Alibaba, opens with the business background of an intelligent marketing platform that serves multiple traffic sources and ad types. The platform must deliver ads with high quality and efficiency, and it faces challenges across the ad retrieval funnel, user profiling, low‑quality traffic, and anti‑fraud modeling.

It outlines the machine‑learning methodology required for the platform: extracting features, defining model structures, training, decision making, and continuous feedback loops, while stressing the three core requirements of accuracy (AUC), speed, and stability.
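Because accuracy is measured by AUC, a small illustration may help. AUC can be read as the probability that a randomly chosen clicked impression is scored above a randomly chosen non‑clicked one. The sketch below computes it by brute‑force pair counting. The function name `auc` and the toy data are illustrative assumptions, not material from the talk.

```python
def auc(labels, scores):
    """Pairwise AUC: fraction of (clicked, non-clicked) pairs ranked correctly.

    Ties count as half a correct pair. This is O(P * N) and meant for
    illustration only; production code would use a sort-based method.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = click, 0 = no click.
toy_labels = [1, 0, 1, 0]
toy_scores = [0.9, 0.8, 0.4, 0.3]
toy_auc = auc(toy_labels, toy_scores)  # 3 of 4 pairs are ordered correctly
```

A model that ranks every clicked impression above every non‑clicked one scores 1.0, and random scoring averages about 0.5.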

1. Deep Spatio‑Temporal Network (DSTN) – To better exploit user historical behavior, same‑screen competing ads, and ad‑ad relationships, DSTN feeds auxiliary ads into the model alongside the target ad. Three designs for aggregating the auxiliary ads are explored: simple sum pooling, attention‑based pooling, and interactive attention, which weights each auxiliary ad jointly with the target ad. Experiments on public and internal datasets show that the interactive‑attention variant performs best.
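To make the difference between sum pooling and interactive attention concrete, here is a minimal sketch over plain Python lists. It is not the talk's implementation. The scoring form `exp(h . relu(W [target; aux]))`, the unnormalized weights, and all names (`sum_pool`, `interactive_attention_pool`, `W`, `h`) are assumptions chosen for illustration; a softmax over the scores would be another reasonable choice.

```python
import math


def sum_pool(aux_ads):
    """Element-wise sum of auxiliary ad embeddings; the target ad is ignored."""
    dim = len(aux_ads[0])
    return [sum(ad[k] for ad in aux_ads) for k in range(dim)]


def interactive_weight(target, aux, W, h):
    """Weight one auxiliary ad using both the target ad and the auxiliary ad."""
    joint = target + aux  # concatenate the two embeddings
    hidden = [max(0.0, sum(w * x for w, x in zip(row, joint))) for row in W]
    return math.exp(sum(hk * v for hk, v in zip(h, hidden)))


def interactive_attention_pool(target, aux_ads, W, h):
    """Weighted sum of auxiliary ads, with weights conditioned on the target."""
    dim = len(aux_ads[0])
    weights = [interactive_weight(target, ad, W, h) for ad in aux_ads]
    pooled = [sum(w * ad[k] for w, ad in zip(weights, aux_ads))
              for k in range(dim)]
    return pooled, weights


# Hypothetical 2-d embeddings: the first auxiliary ad resembles the target.
target = [1.0, 0.0]
aux_ads = [[1.0, 0.0], [0.0, 1.0]]
W = [[1.0, 0.0, 1.0, 0.0],   # responds to dimension 0 of target and aux
     [0.0, 1.0, 0.0, 1.0]]   # responds to dimension 1 of target and aux
h = [1.0, -1.0]

pooled, weights = interactive_attention_pool(target, aux_ads, W, h)
```

With these toy parameters the auxiliary ad that matches the target receives the larger weight, whereas `sum_pool` treats both auxiliary ads identically.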

2. Deep Memory Network (MA‑DNN) – Designed to address the high online inference cost of DSTN, this architecture keeps memory vectors for each user, one for clicked ads and one for non‑clicked ads, and updates them through a write‑control mechanism. Online inference stays efficient while the model still captures historical behavior. Joint training with log loss and MSE loss yields consistent AUC gains.
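One way to read the joint objective and the write control is sketched below. This is an interpretation under stated assumptions, not the talk's formulation: the click label selects which memory vector is written, the MSE term pulls that memory toward a hidden representation of the current sample, and the names `joint_loss`, `write_memory`, `weight`, and `step` are hypothetical.

```python
import math


def log_loss(y, p, eps=1e-12):
    """Binary cross-entropy for one sample with label y and prediction p."""
    p = min(max(p, eps), 1.0 - eps)
    return -(y * math.log(p) + (1 - y) * math.log(1.0 - p))


def memory_mse(y, hidden, mem_click, mem_nonclick):
    """MSE between the label-selected memory vector and the hidden vector."""
    mem = mem_click if y == 1 else mem_nonclick
    return sum((m - z) ** 2 for m, z in zip(mem, hidden)) / len(hidden)


def joint_loss(y, p, hidden, mem_click, mem_nonclick, weight=1.0):
    """Log loss on the click prediction plus a weighted memory MSE term."""
    return log_loss(y, p) + weight * memory_mse(y, hidden, mem_click, mem_nonclick)


def write_memory(y, hidden, mem_click, mem_nonclick, step=0.5):
    """Write control: only the memory selected by the label moves toward hidden."""
    def move(mem):
        return [m + step * (z - m) for m, z in zip(mem, hidden)]
    if y == 1:
        return move(mem_click), mem_nonclick
    return mem_click, move(mem_nonclick)


# Hypothetical sample: a click (y = 1) with a 2-d hidden representation.
hidden = [1.0, 0.0]
mem_click, mem_nonclick = [0.0, 0.0], [0.0, 0.0]
before = memory_mse(1, hidden, mem_click, mem_nonclick)
new_click, new_nonclick = write_memory(1, hidden, mem_click, mem_nonclick)
after = memory_mse(1, hidden, new_click, new_nonclick)
```

Under this reading, the memory vectors can be looked up at serving time instead of re‑processing the user's behavior history, which is consistent with the efficiency motivation described above.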

3. Deep Matching, Correlation, and Prediction (DeepMCP) – A three‑part model: a main prediction network, a matching sub‑network (similar to DSSM) that models user‑ad relevance, and a correlation sub‑network (similar to word2vec) that models ad‑ad similarity. The two sub‑networks improve both prediction accuracy and the quality of the feature embeddings, with negligible online overhead.
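The three parts can be combined in a single training objective. The sketch below assumes a weighted sum of a prediction log loss, a DSSM‑style matching loss on the user and ad vectors, and a word2vec‑style (skip‑gram with negative sampling) correlation loss on ad vectors. The weights `alpha` and `beta`, the function names, and the toy vectors are illustrative assumptions rather than details given in the talk.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def log_loss(y, p, eps=1e-12):
    """Binary cross-entropy for one sample."""
    p = min(max(p, eps), 1.0 - eps)
    return -(y * math.log(p) + (1 - y) * math.log(1.0 - p))


def matching_loss(y, user_vec, ad_vec):
    """User-ad relevance: the dot product of the two vectors should track clicks."""
    return log_loss(y, sigmoid(dot(user_vec, ad_vec)))


def correlation_loss(ad_vec, context_vec, negative_vecs):
    """Ad-ad similarity: pull a context ad closer, push sampled negatives away."""
    loss = -math.log(sigmoid(dot(ad_vec, context_vec)))
    for neg in negative_vecs:
        loss -= math.log(sigmoid(-dot(ad_vec, neg)))
    return loss


def deepmcp_loss(y, p, user_vec, ad_vec, context_vec, negative_vecs,
                 alpha=0.1, beta=0.1):
    """Prediction loss plus weighted matching and correlation losses."""
    return (log_loss(y, p)
            + alpha * matching_loss(y, user_vec, ad_vec)
            + beta * correlation_loss(ad_vec, context_vec, negative_vecs))


# Hypothetical 2-d vectors for one clicked impression.
ad = [1.0, 0.0]
user = [1.0, 0.0]
aligned = correlation_loss(ad, [1.0, 0.0], [[-1.0, 0.0]])
misaligned = correlation_loss(ad, [-1.0, 0.0], [[1.0, 0.0]])
total = deepmcp_loss(1, 0.8, user, ad, [1.0, 0.0], [[-1.0, 0.0]])
```

If only the prediction network is evaluated at serving time, the auxiliary losses shape the shared embeddings during training without adding inference cost, which would be consistent with the negligible online overhead noted above.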

Experimental results show that each model (DSTN, MA‑DNN, and DeepMCP) improves CTR prediction over baselines such as LR, FM, DNN, Wide&Deep, and GRU, with DeepMCP achieving the highest overall gain.

The presentation concludes that the intelligent marketing platform continues to innovate in model techniques to balance effectiveness and efficiency, and invites interested engineers to join the exploration.


Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contact us and we will review it promptly.


