
Advances in Click‑Through Rate Prediction: Deep Spatio‑Temporal Networks, Memory Networks, and Feature Expression Learning

This article reviews recent innovations in CTR prediction for an intelligent marketing platform, covering deep spatio‑temporal networks, deep memory networks, and a feature‑expression‑assisted learning framework, with system architecture details, experimental results, and references to KDD and IJCAI papers.

DataFunTalk

The talk, presented by Xiu Wu, a senior algorithm expert at Alibaba, introduces the business background of an intelligent marketing platform that serves multiple traffic sources and ad types. It emphasizes the need for high‑quality, efficient ad delivery and describes the challenges posed by the ad retrieval funnel, user profiling, low‑quality traffic, and anti‑fraud modeling.

The talk then outlines the machine‑learning workflow the platform requires: feature extraction, model‑structure definition, training, decision making, and a continuous feedback loop. Three core requirements are stressed throughout: accuracy (measured by AUC), speed, and stability.
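Since accuracy is measured by AUC, it may help to recall what that number means: the probability that a randomly chosen clicked impression is scored above a randomly chosen non‑clicked one. The following is a minimal pure‑Python sketch of that definition, not the platform's evaluation code. The function name `auc` and the toy data are assumptions made for illustration, and the quadratic pairwise loop is only suitable for small samples.

```python
def auc(labels, scores):
    """Pairwise AUC: share of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy data (hypothetical): 1 = click, 0 = no click.
labels = [1, 0, 1, 0]
scores = [0.9, 0.1, 0.4, 0.6]
print(auc(labels, scores))  # 3 of the 4 positive/negative pairs are ordered correctly
```

Production systems typically compute the same quantity from a sorted ranking rather than by enumerating pairs.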

1. Deep Spatio‑Temporal Network (DSTN) – To better exploit user historical behavior, same‑screen competing ads, and ad‑ad relationships, three network designs are explored: simple sum‑pooling, attention‑based pooling, and interactive attention that jointly considers the target ad and auxiliary ads. Experiments on public and internal datasets show the interactive‑attention variant achieves the best performance.
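The three ways of aggregating auxiliary ads can be contrasted in a small sketch. This is an illustration under simplifying assumptions, not the DSTN implementation: embeddings are plain Python lists, the attention scores are bare dot products, and the names `sum_pool`, `attention_pool`, `interactive_attention_pool`, and the `query` vector are hypothetical. The point it tries to show is that only the interactive variant lets the target ad decide how much each auxiliary ad contributes.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def weighted_sum(vectors, weights):
    """Combine equally sized vectors using one scalar weight per vector."""
    dim = len(vectors[0])
    return [sum(w * v[d] for w, v in zip(weights, vectors)) for d in range(dim)]


def sum_pool(aux):
    """Variant 1: every auxiliary ad contributes equally."""
    return weighted_sum(aux, [1.0] * len(aux))


def attention_pool(aux, query):
    """Variant 2: softmax weights computed from the auxiliary ads alone.

    `query` stands in for a learned parameter vector (an assumption here).
    """
    scores = [dot(query, a) for a in aux]
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]  # shift for numerical stability
    total = sum(exps)
    return weighted_sum(aux, [e / total for e in exps])


def interactive_attention_pool(aux, target):
    """Variant 3: each weight depends on the target ad AND the auxiliary ad.

    Weights are not normalised across ads, so irrelevant auxiliary ads
    can all receive small weights at the same time.
    """
    weights = [1.0 / (1.0 + math.exp(-dot(target, a))) for a in aux]
    return weighted_sum(aux, weights)


# Two auxiliary ad embeddings and one target ad embedding (toy values).
aux_ads = [[1.0, 0.0], [0.0, 1.0]]
target_ad = [10.0, -10.0]

print(sum_pool(aux_ads))
print(attention_pool(aux_ads, query=[0.0, 0.0]))
print(interactive_attention_pool(aux_ads, target_ad))
```

In this toy case the target ad aligns with the first auxiliary ad only, so the interactive variant keeps almost all of the first embedding and suppresses the second, while the other two variants cannot make that distinction.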

2. Deep Memory Network (MA‑DNN, a memory‑augmented DNN) – Designed to address the high online inference cost of DSTN, this architecture keeps two memory vectors per user, one for clicked ads and one for non‑clicked ads, which are updated through a write‑control mechanism. At serving time the model reads these vectors instead of processing the full behavior history, so online inference stays efficient while historical behavior is still captured. Joint training with log‑loss and MSE loss yields consistent AUC gains.
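A rough sketch of the memory idea follows. It is written under several assumptions and should not be read as the MA‑DNN implementation: the class `UserMemory`, the moving‑average write rule, the learning rate `lr`, and the loss `weight` are all hypothetical choices. What it tries to convey is that the click label controls which memory slot is written, and that the training objective adds an MSE term (pulling the selected memory toward the network's hidden representation) to the usual log‑loss.

```python
import math


def log_loss(y, p, eps=1e-12):
    """Binary cross-entropy for a single example."""
    p = min(max(p, eps), 1.0 - eps)
    return -(y * math.log(p) + (1 - y) * math.log(1.0 - p))


def mse(u, v):
    return sum((a - b) ** 2 for a, b in zip(u, v)) / len(u)


class UserMemory:
    """Two fixed-size vectors per user: one for clicks, one for non-clicks."""

    def __init__(self, dim):
        self.clicked = [0.0] * dim
        self.non_clicked = [0.0] * dim

    def read(self):
        # At serving time the model only needs this concatenated vector,
        # not the user's full behaviour history.
        return self.clicked + self.non_clicked

    def write(self, hidden, label, lr=0.1):
        # Write control: the label selects which slot is updated.
        slot = self.clicked if label == 1 else self.non_clicked
        for i, h in enumerate(hidden):
            slot[i] += lr * (h - slot[i])  # move the slot toward `hidden`


def joint_loss(label, p_click, hidden, memory, weight=1.0):
    """Log-loss on the prediction plus MSE between hidden state and memory."""
    target = memory.clicked if label == 1 else memory.non_clicked
    return log_loss(label, p_click) + weight * mse(hidden, target)


memory = UserMemory(dim=2)
memory.write(hidden=[1.0, 2.0], label=1, lr=0.5)
print(memory.read())
print(joint_loss(label=1, p_click=0.5, hidden=[0.5, 1.0], memory=memory))
```

Because the memory has a fixed size, the read cost at inference does not grow with the length of the user's history, which is the efficiency argument made above.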

3. Deep Matching, Correlation, and Prediction (DeepMCP) – A three‑part system comprising a main prediction network, a matching sub‑network (similar to DSSM) for user‑ad relevance, and a correlation sub‑network (similar to word2vec) for ad‑ad similarity. The architecture enhances both prediction accuracy and feature embedding quality, with negligible online overhead.
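The way the three parts might be combined during training can be sketched as a weighted sum of losses. The code below is an assumption‑laden illustration rather than the DeepMCP code: the matching term is a DSSM‑style log‑loss on the sigmoid of a user‑ad dot product, the correlation term is a word2vec‑style skip‑gram loss with negative sampling, and the weights `alpha` and `beta` as well as all function names are hypothetical.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def log_loss(y, p, eps=1e-12):
    p = min(max(p, eps), 1.0 - eps)
    return -(y * math.log(p) + (1 - y) * math.log(1.0 - p))


def matching_loss(user_vec, ad_vec, label):
    """User-ad relevance: clicked pairs should have a high dot product."""
    return log_loss(label, sigmoid(dot(user_vec, ad_vec)))


def correlation_loss(ad_vec, context_vec, negative_vecs):
    """Ad-ad similarity: skip-gram with negative sampling.

    `context_vec` is an ad seen near `ad_vec` in a user's history;
    `negative_vecs` are randomly sampled ads.
    """
    loss = -math.log(sigmoid(dot(ad_vec, context_vec)))
    for neg in negative_vecs:
        loss -= math.log(sigmoid(-dot(ad_vec, neg)))
    return loss


def joint_loss(pred_loss, match_loss, corr_loss, alpha=0.1, beta=0.1):
    """Main prediction loss plus weighted auxiliary losses."""
    return pred_loss + alpha * match_loss + beta * corr_loss


pred = log_loss(1, 0.7)                        # main network's click prediction
match = matching_loss([0.2, 0.1], [0.5, 0.3], 1)
corr = correlation_loss([0.5, 0.3], [0.4, 0.2], [[-0.3, 0.1]])
print(joint_loss(pred, match, corr))
```

In a design of this shape the auxiliary losses only shape the shared embeddings during training, so serving needs just the prediction network, which would be consistent with the negligible online overhead noted above.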

Experimental results demonstrate that each component (DSTN, MA‑DNN, DeepMCP) improves CTR prediction over baseline models such as LR, FM, DNN, Wide&Deep, and GRU, with DeepMCP achieving the highest overall gain.

The presentation concludes that the intelligent marketing platform continues to innovate in model techniques to balance effectiveness and efficiency, and invites interested engineers to join the exploration.

Tags: advertising, feature engineering, deep learning, CTR prediction, memory network, spatio-temporal network
Written by

DataFunTalk

Dedicated to sharing and discussing big data and AI technology applications, aiming to empower a million data scientists. Regularly hosts live tech talks and curates articles on big data, recommendation/search algorithms, advertising algorithms, NLP, intelligent risk control, autonomous driving, and machine learning/deep learning.
