
Model Compression and Feature Optimization for Large-Scale CTR Prediction in Advertising

Alimama's advertising team shrank multi‑terabyte CTR models to just tens of gigabytes by applying row‑dimension embedding compression, multi‑hash embeddings, graph‑based relationship networks, PCF‑GNN pre‑training, and Droprank feature selection, preserving accuracy while halving training time, doubling online QPS, and retiring hundreds of servers.

Alimama Tech

With the emergence of billion‑parameter language models such as GPT‑3, the “brute‑force” approach of scaling model size has also become the dominant paradigm for click‑through‑rate (CTR) prediction in search, recommendation, and advertising. However, such massive models impose huge storage and compute demands, which are increasingly unsustainable as available compute resources plateau.

The Alimama advertising team systematically reduced the size of their CTR models while preserving prediction accuracy. The original models, which occupied several terabytes, were compressed to a few dozen gigabytes, achieving a "small‑but‑powerful" solution.

Two major optimization paths are identified: feature optimization and model‑structure optimization. Feature optimization includes enriching multimodal features, upgrading high‑order features, and introducing dynamic features. Model‑structure optimization involves Transformer‑based sequence modeling and graph‑neural‑network (GNN)‑based architectures. Over years of hardware growth, the CTR models grew both wider and deeper, eventually reaching multi‑terabyte scales.

To make further progress under limited resources, the team focused on compressing the embedding layer, which stores the majority of parameters. Three compression directions are explored:

Row‑dimension (feature‑space) reduction

Column‑dimension (embedding‑vector) reduction

Value‑precision quantization (e.g., FP16/Int8)

The article concentrates on row‑dimension compression, which can be performed online during training and yields orders‑of‑magnitude reduction without loss of accuracy.
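A back‑of‑the‑envelope calculation makes clear why the row dimension is the highest‑leverage target. The row count, embedding width, and compression ratio below are illustrative assumptions, not Alimama's actual figures:

```python
# Rough embedding-table footprint under each compression axis.
# All sizes here are illustrative assumptions, not production figures.
rows = 10_000_000_000     # distinct feature IDs (row dimension)
dim = 32                  # embedding width (column dimension)

baseline_gb = rows * dim * 4 / 1e9           # FP32 values
hashed_gb = (rows // 1000) * dim * 4 / 1e9   # 1000x row reduction via hashing
quant_gb = rows * dim * 1 / 1e9              # Int8 value-precision quantization

print(f"baseline:       {baseline_gb:,.0f} GB")   # 1,280 GB
print(f"row-compressed: {hashed_gb:,.2f} GB")     # 1.28 GB
print(f"int8-quantized: {quant_gb:,.0f} GB")      # 320 GB
```

Because the three axes multiply, a large row reduction dominates anything achievable on the column or precision axis alone.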

Key practical techniques applied in the production environment include:

Relationship Network: Replaces implicit ID‑type cross features with a graph‑based network that models feature interactions using a self‑attention‑like mechanism.
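The interaction step of such a network can be sketched as scaled‑dot‑product attention over per‑field embeddings — learned explicit mixing instead of enumerated ID cross features. This is a generic NumPy sketch; the shapes and the absence of learned projection matrices are simplifying assumptions, not Alimama's production design:

```python
import numpy as np

def feature_interaction(E: np.ndarray) -> np.ndarray:
    """Self-attention-style mixing over per-field embeddings.

    E has shape (num_fields, dim); the output has the same shape, with
    each field's vector replaced by an affinity-weighted mix of all fields.
    """
    d = E.shape[1]
    scores = E @ E.T / np.sqrt(d)                  # pairwise field affinities
    scores -= scores.max(axis=1, keepdims=True)    # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=1, keepdims=True)  # row-wise softmax
    return weights @ E

E = np.random.default_rng(0).normal(size=(8, 16))  # 8 fields, dim 16
print(feature_interaction(E).shape)                # (8, 16)
```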

Graph‑based Pre‑training (PCF‑GNN): Represents features as nodes and interaction statistics as edges, learning explicit cross‑semantic embeddings via edge‑weight prediction.
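The pre‑training objective can be sketched as regressing embedding inner products onto observed edge weights. The toy graph, dimensions, and learning rate below are assumptions for illustration; a real PCF‑GNN additionally aggregates over graph neighborhoods:

```python
import numpy as np

# Toy edge-weight-prediction pre-training: features are nodes, and each
# edge carries an interaction statistic (e.g. a co-occurrence CTR).
# Embeddings are fitted so that an inner product predicts the edge weight.
rng = np.random.default_rng(0)
num_nodes, dim, lr = 6, 4, 0.05
edges = [(0, 1, 0.9), (1, 2, 0.2), (2, 3, 0.7), (4, 5, 0.5)]  # (u, v, weight)
Z = rng.normal(scale=0.1, size=(num_nodes, dim))

for _ in range(500):
    for u, v, w in edges:
        err = Z[u] @ Z[v] - w                      # prediction error on edge
        gu, gv = 2 * err * Z[v], 2 * err * Z[u]    # gradients of squared error
        Z[u] -= lr * gu
        Z[v] -= lr * gv

loss = sum((Z[u] @ Z[v] - w) ** 2 for u, v, w in edges)
print(f"final squared error across edges: {loss:.4f}")
```

After fitting, the node vectors serve as explicit cross‑semantic embeddings that a downstream CTR model can consume directly.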

Multi‑Hash Embedding: Uses multiple hash functions to compress core ID features, achieving near‑collision‑free embeddings while keeping model size in the tens of gigabytes.
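The mechanism can be sketched as follows: each ID is mapped through several independent hash functions into one small table, and its embedding is the sum of the selected rows, so two IDs collide fully only if they agree under every hash. The table size, number of hashes, and mixing constants below are illustrative assumptions:

```python
import numpy as np

TABLE, K, DIM = 100_003, 2, 8             # small prime-sized table, 2 hashes
SEEDS = (0x9E3779B1, 0x85EBCA77)          # arbitrary odd mixing constants
table = np.random.default_rng(0).normal(scale=0.01, size=(TABLE, DIM))

def bucket(feature_id: int, seed: int) -> int:
    # Cheap integer hash; any decent universal hash family works here.
    return ((feature_id * seed) ^ (feature_id >> 16)) % TABLE

def embed(feature_id: int) -> np.ndarray:
    # Sum of K rows: a full collision needs agreement under all K hashes.
    return sum(table[bucket(feature_id, s)] for s in SEEDS[:K])

print(embed(123_456_789).shape)           # (8,)
```

The memory cost is `TABLE * DIM` floats regardless of how many distinct IDs exist, which is what keeps the table at gigabyte rather than terabyte scale.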

Droprank Feature Selection: Integrates dropout‑based feature ranking into model training, enabling simultaneous optimization of feature selection and model performance. An enhanced version (FSCD) incorporates system‑resource priors for balanced efficiency and effectiveness.
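As a simplified stand‑in for the Droprank idea, the sketch below ranks features by how much a fitted model's loss degrades when each feature is dropped (zeroed out). The synthetic data and the least‑squares "model" are assumptions for illustration; the production method folds this ranking into training itself rather than running it post hoc:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 5))
w_true = np.array([2.0, 0.0, 1.0, 0.0, 0.5])   # features 1 and 3 are noise
y = (X @ w_true + rng.normal(scale=0.1, size=1000) > 0).astype(float)

w = np.linalg.lstsq(X, y, rcond=None)[0]       # stand-in for a trained model

def loss(Xm: np.ndarray) -> float:
    return float(np.mean((Xm @ w - y) ** 2))

base = loss(X)
scores = []
for j in range(X.shape[1]):
    Xd = X.copy()
    Xd[:, j] = 0.0                              # "drop" feature j
    scores.append(loss(Xd) - base)              # importance = loss increase

ranking = np.argsort(scores)[::-1]
print("features by importance:", ranking.tolist())  # noise features rank last
```

Low‑ranked features can then be pruned, shrinking the embedding table's row count directly — which is how feature selection connects back to row‑dimension compression.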

Combined with column‑dimension upgrades, heterogeneous‑compute optimizations, and incremental feature pipelines, the CTR model size was reduced from terabyte‑level to dozens of gigabytes. This resulted in a 50% reduction in training time, a 100% increase in online QPS, and the decommissioning of hundreds of machines.

In summary, the systematic engineering practice demonstrates that “small‑and‑beautiful” models are feasible for large‑scale advertising CTR prediction, provided that resource‑aware feature and model compression strategies are employed.

Tags: advertising, CTR prediction, model compression, Graph Neural Networks, embedding reduction, feature selection, large-scale ML
Written by

Alimama Tech

Official Alimama tech channel, showcasing all of Alimama's technical innovations.
