How Alibaba’s Deep Learning Transformed CTR Prediction: From MLR to Multi‑Interest Networks
This article recounts Alibaba‑Mama researcher Jing Shi’s presentation on the evolution of deep learning for click‑through‑rate (CTR) estimation, covering the shift from handcrafted features and linear models to piecewise linear MLR, end‑to‑end neural networks, multi‑interest user modeling, and large‑scale distributed training challenges.
Jing Shi, a researcher at Alibaba‑Mama, introduced his background in computer vision, recommendation systems, and computational advertising, and outlined the agenda: the evolution of deep learning under internet‑scale data, its application in advertising and search, and future challenges.
Internet‑Scale Data Characteristics
Internet data is massive, high‑dimensional, and rich in internal relationships, exemplified by the many‑to‑many connections between users and items in e‑commerce.
Importance of CTR Prediction
CTR estimation is a core technology for advertising, recommendation, and search, directly influencing platform revenue and serving as a fertile ground for deep‑learning research.
Traditional CTR Methods
Two classic approaches dominate: manually engineered strong features combined with GBDT, and high‑dimensional sparse inputs fed into large‑scale logistic regression (a generalized linear model).
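As a rough illustration of the second approach, the sketch below scores one impression with logistic regression over sparse one‑hot features. The feature names, weights, and bias are hypothetical and are not taken from the talk; the point is only that with one‑hot inputs the dot product reduces to summing the weights of the features that are present.

```python
import math

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

def predict_ctr(active_features, weights, bias=0.0):
    """Score one impression described by the names of its non-zero features."""
    # With one-hot inputs, w.x is just the sum of the weights of active features.
    # Features never seen in training contribute nothing.
    logit = bias + sum(weights.get(name, 0.0) for name in active_features)
    return sigmoid(logit)

# Hypothetical weights; a production model would hold vastly more entries.
weights = {"user_gender=f": 0.2, "item_cat=shoes": 0.5, "hour=21": -0.1}
impression = ["user_gender=f", "item_cat=shoes", "hour=21", "unseen_feature"]
print(predict_ctr(impression, weights, bias=-3.0))
```

Because the model is linear in these features, any interaction between them (for example, a category that only appeals at certain hours) has to be engineered by hand as a new feature.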
From Linear to Piecewise Linear Models
To capture non‑linearity, Alibaba‑Mama introduced MLR (mixed logistic regression), a piecewise linear model that partitions the feature space into multiple regions, fits a separate linear predictor in each region, and blends the regional predictions smoothly so that the overall model can approximate complex non‑linear functions.
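The piecewise linear idea can be sketched as a softmax gate that softly assigns an input to regions, combined with one logistic predictor per region. This is one common way to write such a model and is an assumption here; the talk summary does not give the exact formula, and the function and parameter names below are illustrative.

```python
import math

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

def softmax(scores):
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def mlr_predict(x, gate_weights, region_weights):
    """p(click | x) = sum_i softmax_i(u_i . x) * sigmoid(w_i . x)."""
    gates = softmax([dot(u, x) for u in gate_weights])     # soft region membership
    preds = [sigmoid(dot(w, x)) for w in region_weights]   # one linear predictor per region
    return sum(g * p for g, p in zip(gates, preds))

# Two regions on a 1-D input: the gate splits at x = 0, and each side has its own slope.
gate_weights = [[10.0], [-10.0]]
region_weights = [[2.0], [-2.0]]
for x in (-1.0, 0.0, 1.0):
    print(x, mlr_predict([x], gate_weights, region_weights))
```

In this toy setup the prediction is high at both x = -1 and x = 1 and lower at x = 0, a V‑shaped response that a single logistic regression on x cannot represent.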
End‑to‑End Deep Learning
At first, embedding vectors produced by MLR were fed into a multilayer perceptron (MLP), but because the two stages were trained separately, the gains were limited. Training the embeddings and the MLP jointly, end to end, yielded significant CTR and GMV improvements and became Alibaba‑Mama’s first deep learning network.
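What joint training means in practice can be shown with a very small network: the click loss is back‑propagated through the MLP and into the embedding rows of the features that were active. The sketch below is a toy under stated assumptions (sum pooling, one tanh hidden layer, plain SGD, hypothetical class and parameter names) and is not the production architecture.

```python
import math
import random

def sigmoid(z):
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

class TinyCTRNet:
    """Embedding table + one hidden layer, trained jointly with SGD."""

    def __init__(self, vocab_size, emb_dim, hidden_dim, seed=0):
        rng = random.Random(seed)
        init = lambda: rng.uniform(-0.1, 0.1)
        self.emb = [[init() for _ in range(emb_dim)] for _ in range(vocab_size)]
        self.w1 = [[init() for _ in range(emb_dim)] for _ in range(hidden_dim)]
        self.b1 = [0.0] * hidden_dim
        self.w2 = [init() for _ in range(hidden_dim)]
        self.b2 = 0.0

    def forward(self, ids):
        dim = len(self.emb[0])
        pooled = [sum(self.emb[i][k] for i in ids) for k in range(dim)]  # sum pooling
        hidden = [math.tanh(sum(w * p for w, p in zip(row, pooled)) + b)
                  for row, b in zip(self.w1, self.b1)]
        prob = sigmoid(sum(w * h for w, h in zip(self.w2, hidden)) + self.b2)
        return pooled, hidden, prob

    def train_step(self, ids, label, lr=0.1):
        pooled, hidden, prob = self.forward(ids)
        d_logit = prob - label  # gradient of log loss w.r.t. the logit
        d_pre = [d_logit * w * (1.0 - h * h) for w, h in zip(self.w2, hidden)]
        d_pooled = [sum(d_pre[j] * self.w1[j][k] for j in range(len(self.w1)))
                    for k in range(len(pooled))]
        # Update the MLP ...
        self.w2 = [w - lr * d_logit * h for w, h in zip(self.w2, hidden)]
        self.b2 -= lr * d_logit
        for j, row in enumerate(self.w1):
            self.w1[j] = [w - lr * d_pre[j] * p for w, p in zip(row, pooled)]
            self.b1[j] -= lr * d_pre[j]
        # ... and, in the same step, the embeddings of the active features.
        for i in ids:
            self.emb[i] = [e - lr * g for e, g in zip(self.emb[i], d_pooled)]
        return -math.log(prob if label == 1 else 1.0 - prob)

net = TinyCTRNet(vocab_size=10, emb_dim=4, hidden_dim=8)
losses = [net.train_step([1, 3], label=1) for _ in range(50)]
print(losses[0], losses[-1])
```

In a two‑stage setup the last loop would be missing: the embeddings would stay frozen at whatever the upstream model produced, so they could never adapt to the click objective.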
Modeling Multi‑Peak User Interests
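Because user interests are multi‑peaked, with one user often interested in several unrelated categories at once, a deep network was proposed that extracts the subsequences of the behavior log that are relevant to a given candidate item. The resulting multiple interest representations better predict CTR for candidate items.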
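One simple way to picture "extracting the relevant subsequence" is to weight each past behavior by its relevance to the candidate and pool the weighted behaviors into an interest vector. The sketch below uses dot‑product relevance with a softmax; that scoring choice, along with all names and vectors, is an assumption made for illustration, since the summary does not specify how relevance is computed.

```python
import math

def softmax(scores):
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def interest_vector(behaviors, candidate):
    """Pool behavior embeddings, weighted by relevance to the candidate item."""
    weights = softmax([dot(b, candidate) for b in behaviors])
    dim = len(candidate)
    pooled = [sum(w * b[k] for w, b in zip(weights, behaviors)) for k in range(dim)]
    return pooled, weights

# Hypothetical 2-D embeddings: two "shoe-like" behaviors and one "book-like" behavior.
behaviors = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
shoe_candidate = [1.0, 0.0]
book_candidate = [0.0, 1.0]
print(interest_vector(behaviors, shoe_candidate)[1])
print(interest_vector(behaviors, book_candidate)[1])
```

The same behavior log yields a different interest vector for each candidate, which is how a single user can be represented by several interest peaks instead of one averaged profile.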
Challenges with Rich Media Features
Incorporating image features dramatically increases data volume. This calls for a new distributed training architecture in which the parameter servers also host sub‑models that process the images where they are stored, reducing communication overhead.
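The communication argument can be made concrete with a toy comparison. In the sketch below, a server either ships raw image feature vectors to the worker or runs a small sub‑model locally and ships only its low‑dimensional output. The class, the linear projection used as the sub‑model, and the dimensions are all hypothetical; the talk summary does not describe the actual sub‑model or its sizes.

```python
class ImageServer:
    """Toy parameter server that stores image features and a small sub-model."""

    def __init__(self, raw_features, projection):
        self.raw = raw_features        # image_id -> raw feature vector
        self.projection = projection   # sub-model: out_dim rows of raw_dim weights

    def fetch_raw(self, image_ids):
        # Baseline: the worker pulls full raw vectors and embeds them itself.
        return [self.raw[i] for i in image_ids]

    def fetch_embedded(self, image_ids):
        # Sub-model runs on the server; only compact embeddings leave it.
        return [[sum(w * x for w, x in zip(row, self.raw[i])) for row in self.projection]
                for i in image_ids]

def floats_transferred(vectors):
    return sum(len(v) for v in vectors)

# Hypothetical sizes: 8-dim raw features compressed to 2 dims.
raw = {i: [float(i + k) for k in range(8)] for i in range(5)}
projection = [[0.1] * 8, [-0.1] * 8]
server = ImageServer(raw, projection)
batch = [0, 2, 4]
print(floats_transferred(server.fetch_raw(batch)))
print(floats_transferred(server.fetch_embedded(batch)))
```

The saving scales with the ratio of raw to embedded dimension, so it matters most when each user behavior drags in many images with large raw representations.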
Full‑Library Retrieval via Tree‑Structured Search
Organizing the item library as a tree and searching it level by level replaces exhaustive scoring of billions of items with a few dozen node evaluations. This makes deep‑learning‑driven relevance scoring feasible over the full library and dramatically improves recall.
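A level‑by‑level beam search over an item tree might look like the sketch below. The tree layout, the `beam_search` function, and the toy scoring rule (an internal node scores as its best descendant leaf) are assumptions for illustration; in the real system a deep model would score each node, and how that model and the tree are built is not covered here.

```python
import heapq

def beam_search(children, root, score, beam_width):
    """Keep the best `beam_width` nodes per level; return reached leaves and #scores."""
    frontier, leaves, evaluated = [root], [], 0
    while frontier:
        candidates = []
        for node in frontier:
            kids = children.get(node, [])
            if kids:
                candidates.extend(kids)
            else:
                leaves.append(node)  # a leaf is an actual item
        scored = [(score(c), c) for c in candidates]  # one model call per candidate
        evaluated += len(scored)
        best = heapq.nlargest(beam_width, scored, key=lambda sc: sc[0])
        frontier = [c for _, c in best]
    return leaves, evaluated

# Toy complete binary tree: node i has children 2i and 2i+1; leaves 8..15 are items.
children = {i: [2 * i, 2 * i + 1] for i in range(1, 8)}
leaf_score = {8: 0.1, 9: 0.9, 10: 0.2, 11: 0.3, 12: 0.8, 13: 0.4, 14: 0.5, 15: 0.6}

def toy_score(node):
    # Stand-in for the deep model.
    if node in leaf_score:
        return leaf_score[node]
    return max(toy_score(k) for k in children[node])

print(beam_search(children, 1, toy_score, beam_width=2))  # → ([9, 12], 10)
```

With beam width k and branching factor b, each level costs at most k·b evaluations, so the total grows with the depth of the tree, which is logarithmic in the number of items for a balanced tree. The toy tree is too small to show a saving, but the gap widens quickly as the library grows.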
Future Directions
Open problems include handling missing labels for user‑experience metrics, improving recommendation diversity, breaking recommendation self‑reinforcement loops, and developing evaluation metrics beyond recall, such as new‑category recall.
Alibaba Cloud Developer
Alibaba's official tech channel, featuring all of its technology innovations.
