CSCNN: Category‑Specific Convolutional Neural Network for Visual CTR Prediction in JD E‑commerce Advertising

This article presents CSCNN, a category‑specific convolutional neural network that integrates visual priors into the click‑through‑rate (CTR) models behind JD.com’s e‑commerce advertising. It covers the motivation, the architecture, the engineering optimizations, the offline and online training strategies, and the measured gains on both public and industrial datasets.

DataFunTalk

JD.com’s search advertising platform relies heavily on CTR models to rank ads. Because product listings are dominated by images, using image information in those models has become an active direction. The talk introduces CSCNN, an ad‑ranking model that incorporates visual cues into CTR prediction.

The presentation first outlines the background of JD’s 9NAI platform, the challenges of optimizing eCPM in e‑commerce, and the four‑fold feature space (query, user, product, context) used in CTR modeling. It then discusses the limitations of traditional CNNs for this domain, such as weak supervision, overfitting on sparse features, and engineering bottlenecks.
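As a rough illustration of why CTR accuracy matters for eCPM, the sketch below ranks candidate ads by expected revenue per thousand impressions. It assumes cost‑per‑click billing, where eCPM = 1000 × pCTR × bid; the talk summary does not state JD's exact auction formula, and the ad records and field names here are hypothetical.

```python
def ecpm(p_ctr, cpc_bid):
    """Expected revenue per 1000 impressions, assuming cost-per-click billing."""
    return 1000.0 * p_ctr * cpc_bid

# Hypothetical candidates: predicted CTR from the model, bid from the advertiser.
ads = [
    {"ad_id": "a", "p_ctr": 0.020, "cpc_bid": 1.0},
    {"ad_id": "b", "p_ctr": 0.010, "cpc_bid": 3.0},
    {"ad_id": "c", "p_ctr": 0.050, "cpc_bid": 0.3},
]

# Rank by eCPM, highest first. An error in p_ctr directly reorders this list.
ranked = sorted(ads, key=lambda ad: ecpm(ad["p_ctr"], ad["cpc_bid"]), reverse=True)
ranking = [ad["ad_id"] for ad in ranked]
print(ranking)
```

Under this assumption the ad with the highest predicted CTR ("c") can still rank last, which is why the ranking depends on well‑calibrated CTR estimates rather than on CTR ordering alone.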

To address these issues, the authors propose a multi‑modal feature pipeline that combines manual features, text features, user‑side interaction features, and image features. They highlight the need for visual priors, meaning category information that guides CNN learning, so that the network focuses on category‑relevant details and ignores irrelevant background noise.

CSCNN builds on a category‑specific attention mechanism similar to SENet (squeeze‑and‑excitation). After each convolutional layer, a channel attention module and a spatial attention module receive both the feature map and a category embedding, and they output a refined feature map that is biased toward the given product category. The same image can therefore be read differently depending on its category, which turns the CNN into a category‑aware feature extractor.
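The channel half of this idea can be sketched in a few lines. The code below is a minimal illustration, not the authors' implementation: it assumes a squeeze‑and‑excitation‑style gate in which the pooled channel statistics are concatenated with the category embedding before a small shared MLP. The function names, the weight matrices `W1` and `W2`, and all dimensions are hypothetical, and the spatial attention module is omitted.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def matvec(weights, vec):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vec)) for row in weights]

def shared_mlp(w1, w2, vec):
    """Two-layer MLP with a ReLU in between (no biases, for brevity)."""
    hidden = [max(0.0, h) for h in matvec(w1, vec)]
    return matvec(w2, hidden)

def category_channel_attention(fmap, cat_emb, w1, w2):
    """fmap: C x H x W nested lists; cat_emb: length-D list.

    Returns the re-weighted feature map and the per-channel gates.
    """
    # Squeeze: one average and one max statistic per channel.
    avg_pool = [sum(map(sum, ch)) / (len(ch) * len(ch[0])) for ch in fmap]
    max_pool = [max(map(max, ch)) for ch in fmap]
    # Inject the category prior by concatenating the embedding to each descriptor.
    logits_avg = shared_mlp(w1, w2, avg_pool + cat_emb)
    logits_max = shared_mlp(w1, w2, max_pool + cat_emb)
    gates = [sigmoid(a + m) for a, m in zip(logits_avg, logits_max)]
    # Excite: scale every pixel of channel c by gate c.
    refined = [[[g * px for px in row] for row in ch] for g, ch in zip(gates, fmap)]
    return refined, gates

# Toy input: C=2 channels of 2x2 pixels, D=2 category embedding, 2 hidden units.
fmap = [[[1.0, 2.0], [3.0, 4.0]], [[0.0, 1.0], [1.0, 0.0]]]
W1 = [[0.5, -0.5, 1.0, 0.0], [0.0, 0.5, 0.0, 1.0]]  # hidden x (C + D)
W2 = [[1.0, 0.0], [0.0, 1.0]]                       # C x hidden

refined_a, gates_a = category_channel_attention(fmap, [1.0, 0.0], W1, W2)
refined_b, gates_b = category_channel_attention(fmap, [0.0, 1.0], W1, W2)
print(gates_a, gates_b)
```

The two calls use the same feature map and the same weights, yet produce different channel gates because only the category embedding changes. That is the sense in which the extractor becomes category‑aware.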

The talk also describes engineering measures that reduce training and serving costs: offline pre‑computation of image embeddings, aggregation of requests for the same product so that each image is processed once, synchronized multi‑GPU updates, and lookup‑table‑based online serving that keeps latency below 20 ms on CPU.
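Two of these measures, aggregating identical product requests and serving from a precomputed lookup table, can be illustrated with a small sketch. Everything here is hypothetical: `fake_image_encoder` stands in for the CNN, the item ids are made up, and a plain dict stands in for whatever storage the production system uses.

```python
embed_calls = 0  # counts how often the (expensive) encoder runs

def fake_image_encoder(image):
    """Hypothetical stand-in for the CNN: maps image bytes to a tiny vector."""
    global embed_calls
    embed_calls += 1
    return [float(len(image)), float(sum(image))]

def encode_batch_aggregated(requested_item_ids, images_by_item):
    """Encode each distinct product once, then fan results back out per request."""
    unique_ids = list(dict.fromkeys(requested_item_ids))  # dedupe, keep order
    encoded = {i: fake_image_encoder(images_by_item[i]) for i in unique_ids}
    return [encoded[i] for i in requested_item_ids]

def build_embedding_table(images_by_item):
    """Offline step: precompute one embedding per product."""
    return {i: fake_image_encoder(img) for i, img in images_by_item.items()}

def serve(table, requested_item_ids, default=(0.0, 0.0)):
    """Online step: pure dictionary lookups, no CNN forward pass."""
    return [table.get(i, list(default)) for i in requested_item_ids]

images_by_item = {"sku1": b"\x01\x02", "sku2": b"\x03"}

# Four requests, two distinct products: the encoder runs twice, not four times.
batch = encode_batch_aggregated(["sku1", "sku2", "sku1", "sku1"], images_by_item)
calls_after_aggregation = embed_calls

# Offline table build, then online serving with a fallback for unseen products.
table = build_embedding_table(images_by_item)
calls_after_build = embed_calls
served = serve(table, ["sku2", "sku9"])
print(calls_after_aggregation, served, embed_calls == calls_after_build)
```

The design point is that the expensive forward pass is moved out of the request path: serving cost becomes a key lookup, and encoder cost scales with the number of distinct products rather than with traffic.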

Extensive experiments on a public Amazon dataset and JD’s massive industrial logs (hundreds of billions of samples, thousands of categories) demonstrate significant AUC improvements over baseline models, late‑fusion approaches, and other attention‑based methods. The CSCNN model has been deployed at scale, serving billions of users daily.
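For readers less familiar with the metric, AUC can be read as the probability that a randomly chosen clicked impression is scored above a randomly chosen non‑clicked one. The sketch below computes it by direct pair counting; the labels and scores are made‑up toy values, not results from the talk, and production evaluation would use a sort‑based implementation rather than this quadratic loop.

```python
def auc(labels, scores):
    """Pairwise AUC: fraction of (positive, negative) pairs ranked correctly.

    Ties count as half a correct pair.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))

# Toy data: 1 = clicked, 0 = not clicked.
toy_labels = [1, 0, 1, 0, 0]
toy_scores = [0.9, 0.3, 0.4, 0.5, 0.1]
toy_auc = auc(toy_labels, toy_scores)
print(toy_auc)
```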

In conclusion, integrating category‑specific visual priors into CNNs substantially boosts CTR prediction performance in e‑commerce advertising, and the proposed system architecture enables practical, large‑scale deployment.
