Boosting Guaranteed Display Ads with Large‑Scale Personalized Allocation and Real‑Time Pacing
This article presents a large‑scale distributed algorithm for guaranteed‑delivery advertising that incorporates user‑level personalization, solves a bipartite matching problem with complex constraints, and introduces a real‑time pacing module to smooth delivery while maximizing platform revenue and advertiser satisfaction.
1. Introduction
Guaranteed Delivery (GD) ads are a common brand‑display buying method. Existing technical solutions usually abstract the problem at the audience granularity, which ignores differences in user behavior within the same audience and cannot precisely control user‑level constraints.
Academic research typically models the GD allocation as a bipartite matching problem between contracts and supply, but current strategies remain at audience and tag granularity, requiring orthogonal audience and tag partitions and suffering from several limitations.
Allocating only at the audience level fails to match personalized user behavior to the correct ad, reducing advertiser ROI and platform revenue. Advertisers also impose complex delivery controls, such as per‑user frequency caps, making traditional audience‑level allocation inefficient for modern contract advertising products.
We propose a large‑scale distributed contract‑ad allocation algorithm that introduces user‑personalized delivery metrics, considers user interaction behavior, and performs allocation at the user granularity. The algorithm handles complex constraints like ad priority, frequency caps, and slot capacity, and includes a real‑time budget‑smoothing strategy to further optimize cost‑per‑click (CPC). Our system currently serves billions of offline computation tasks for Alibaba’s brand display ads and is validated with both offline and online experiments.
2. Problem Definition
The allocation problem can be represented as a bipartite graph between contracts (indexed by j) and users (indexed by k). Each targeted audience i carries a forecast traffic supply, and each contract carries a traffic demand constraint. The goal is to determine the proportion of traffic from audience i allocated to contract j.
This classic online assignment problem can be solved with Hungarian, network‑flow, or mixed‑integer programming methods when the data scale is small and global information is known. However, in internet‑scale ad systems the data is massive and per‑request global information is unavailable.
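To make the small-scale case concrete, the toy sketch below brute-forces the optimal one-to-one assignment that a Hungarian or min-cost-flow solver would find (the score matrix is made up for illustration):

```python
from itertools import permutations

# Toy bipartite matching: value[i][j] is a hypothetical score for
# sending supply unit i to contract j. At this scale we can simply
# enumerate all assignments; the Hungarian algorithm or a min-cost
# flow solver reaches the same answer more efficiently.
value = [
    [0.9, 0.2, 0.4],
    [0.1, 0.8, 0.3],
    [0.5, 0.6, 0.7],
]
n = len(value)
best = max(permutations(range(n)),
           key=lambda a: sum(value[i][a[i]] for i in range(n)))
print(best)  # supply i -> contract best[i]
print(round(sum(value[i][best[i]] for i in range(n)), 2))  # total matched value
```

At production scale neither enumeration nor exact solvers are feasible, which is exactly why the dual decomposition is needed.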
Industry practice transforms the problem via Lagrangian duality to obtain offline dual variables, then recovers a compact online solution; Yahoo's HWM and SHALE algorithms are classic examples. We further refine the model by splitting the supply side to the user granularity, introducing, for each user, variables for the request count and for the number of ads that can be served per request. Additional variables capture each user–contract preference score, the contract-side shortage, and the penalty weight on that shortage. The objective combines three parts: an L2 regularization term that keeps allocation probabilities consistent across ads, a shortage penalty term, and a revenue term weighted by a hyper-parameter λ that trades off completion rate against profit.
The model is subject to constraints covering demand satisfaction, supply-side probability bounds, per-user frequency caps, and slot capacity limits.
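A hedged sketch of the three-part objective may help; every name below (the allocation `x`, the reference distribution `theta`, the toy numbers) is an assumption for illustration, not the paper's exact notation:

```python
# Illustrative three-part objective: an L2 term keeping allocation
# x[k][j] near a reference theta[k][j], a shortage penalty, and a
# click-revenue term weighted by lam (the lambda hyper-parameter).
def objective(x, theta, supply, demand, penalty, ctr, lam):
    l2 = sum(supply[k] * (x[k][j] - theta[k][j]) ** 2 / 2
             for k in x for j in x[k])
    # shortage: unmet demand for each contract under allocation x
    shortage = {j: max(0.0, demand[j] - sum(supply[k] * x[k][j] for k in x))
                for j in demand}
    pen = sum(penalty[j] * shortage[j] for j in demand)
    revenue = sum(supply[k] * x[k][j] * ctr[k][j] for k in x for j in x[k])
    return l2 + pen - lam * revenue

# toy instance: two users, two contracts
supply = {"u1": 100.0, "u2": 50.0}
demand = {"c1": 80.0, "c2": 40.0}
x = {"u1": {"c1": 0.5, "c2": 0.2}, "u2": {"c1": 0.4, "c2": 0.4}}
theta = {"u1": {"c1": 0.4, "c2": 0.3}, "u2": {"c1": 0.4, "c2": 0.3}}
penalty = {"c1": 2.0, "c2": 1.0}
ctr = {"u1": {"c1": 0.05, "c2": 0.02}, "u2": {"c1": 0.01, "c2": 0.06}}
print(objective(x, theta, supply, demand, penalty, ctr, lam=1.0))
# approximately 16.95 (l2 1.25 + penalty 20 - revenue 4.3)
```

Raising `lam` rewards clickable allocations more, which is the completion-rate-versus-profit trade-off described above.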
3. System Implementation
3.1 System Overview
The architecture consists of a CRM management system for advertiser orders, an offline optimization framework, and an online engine. Offline, contract information is synchronized with logs and solved using a PS‑based distributed optimizer; results are stored in a model and pushed to the online system. Online, a Merger component handles RTP, pacing, and model services to deliver ads to the front‑end.
3.2 Offline Optimization
Stage 1 transforms the primal problem into its dual form and solves for the dual variables in parallel using the KKT conditions, similar to SHALE. User interaction metrics are incorporated, and two coupled equations are solved iteratively; the approximate updates are controlled by a tunable learning rate.
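As a rough illustration of stage 1, the toy dual-ascent loop below is a stand-in for the paper's KKT-derived updates (the clipped-linear allocation form, the variable names, and the instance are all assumptions); it shows the role the learning rate plays:

```python
def solve_duals(supply, demand, score, eta=0.5, iters=500):
    """Toy dual ascent: alpha[j] pushes contract j toward its demand,
    beta[k] keeps each user's total allocation probability <= 1."""
    clip = lambda v: min(1.0, max(0.0, v))
    alpha = {j: 0.0 for j in demand}
    beta = {k: 0.0 for k in supply}
    for _ in range(iters):
        for j in demand:
            # traffic contract j would receive under the current duals
            got = sum(s * clip(score[k][j] + alpha[j] - beta[k])
                      for k, s in supply.items())
            alpha[j] += eta * (demand[j] - got) / demand[j]
        for k in supply:
            total = sum(clip(score[k][j] + alpha[j] - beta[k]) for j in demand)
            beta[k] = max(0.0, beta[k] + eta * (total - 1.0))
    return alpha, beta

supply = {"u1": 100.0, "u2": 100.0}
demand = {"c1": 60.0, "c2": 80.0}
score = {"u1": {"c1": 0.3, "c2": 0.1}, "u2": {"c1": 0.1, "c2": 0.4}}
alpha, beta = solve_duals(supply, demand, score)
print({j: round(a, 3) for j, a in alpha.items()})
# duals settle at {'c1': 0.1, 'c2': 0.15} for this toy instance
```

A larger `eta` closes the demand gap in fewer iterations but overshoots more, matching the convergence-versus-stability trade-off reported in the experiments.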
Stage 2 accelerates priority handling by constructing a DAG of contracts based on non-overlapping audience groups, performing a topological sort, and executing updates in parallel batches, which reduces the number of sequential update rounds (e.g., from 8 to 4 batches).
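The batching idea can be sketched with a level-by-level topological sort (Kahn's algorithm); the contract names and precedence edges below are hypothetical:

```python
from collections import defaultdict

# Contracts with no unprocessed predecessors form one batch and can be
# updated in parallel; each pass peels off the next level of the DAG.
def parallel_batches(edges, nodes):
    indeg = {n: 0 for n in nodes}
    succ = defaultdict(list)
    for a, b in edges:          # edge (a, b): a must be processed before b
        succ[a].append(b)
        indeg[b] += 1
    batches, ready = [], [n for n in nodes if indeg[n] == 0]
    while ready:
        batches.append(sorted(ready))
        nxt = []
        for n in ready:
            for m in succ[n]:
                indeg[m] -= 1
                if indeg[m] == 0:
                    nxt.append(m)
        ready = nxt
    return batches

nodes = ["c1", "c2", "c3", "c4", "c5"]
edges = [("c1", "c3"), ("c2", "c3"), ("c3", "c5"), ("c2", "c4")]
print(parallel_batches(edges, nodes))
# five contracts collapse into three sequential batches
```

Sequential cost then scales with the depth of the DAG rather than the number of contracts.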
3.3 Online Pacing
Traditional GD systems use roulette-wheel or greedy online selection, which fixes delivery probabilities and makes delivery sensitive to traffic-forecast errors, causing shortage or over-delivery. Our pacing module instead adjusts delivery rates in real time at the minute level to match the observed traffic distribution: a dynamic threshold determines whether a contract's allocation probability is set to zero for the current request, ensuring smooth pacing.
The final online probability is computed by combining the offline dual variables, a real-time score, the supply-demand ratio, the priority vector, and a hyper-parameter, and is then fed into the pacing service.
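A minimal sketch of how these signals might combine follows; the blending formula, the threshold rule, and every parameter name are assumptions for illustration, since the article does not reproduce the actual formula:

```python
# Hypothetical online step: blend the offline dual with a real-time
# score, scale by priority and the supply-demand ratio, and let a
# minute-level pacing check zero out contracts running ahead of plan.
def serve_probability(alpha_j, rt_score, sd_ratio, priority,
                      gamma=0.5, delivered=0.0, planned=1.0,
                      pace_threshold=1.05):
    if planned > 0 and delivered / planned > pace_threshold:
        return 0.0                      # ahead of schedule: skip this request
    base = alpha_j + gamma * rt_score   # offline dual + real-time signal
    return max(0.0, min(1.0, base * priority / max(sd_ratio, 1e-6)))

print(serve_probability(0.3, 0.4, 1.0, 1.0))                              # on pace
print(serve_probability(0.3, 0.4, 1.0, 1.0, delivered=120, planned=100))  # paused
```

The key property is that the offline solution sets the baseline while the pacing check reacts to realized traffic within the minute.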
4. Experimental Results
4.1 Offline Experiments
We evaluated on Taobao banner and “Guess You Like” contract datasets, comparing against HWM and SHALE. Metrics include delivery completion rate, penalty, L2‑norm, and average CPC. Our method reduced CPC by ~50 % while incurring modest losses in completion rate and L2‑norm. Parallel DAG scheduling accelerated dual‑variable computation by up to 14× on large tasks.
Learning‑rate tuning showed that rates between 0.5 and 0.7 balance convergence speed and stability. Hyper‑parameter λ controls the trade‑off: larger λ lowers CPC but can hurt completion rate; we typically keep completion loss within 1‑3 %.
4.2 Online Experiments
Live A/B tests in the mobile Taobao scenario compared our method with a greedy baseline, SHALE, and SHALE‑CLICK (click‑aware without pacing). Our pacing‑enhanced approach achieved the smoothest delivery curve, the highest cumulative click‑through rate, and a 22.7 % CTR improvement over baseline (21.2 % over SHALE, 10.6 % over SHALE‑CLICK).
5. Conclusion
By integrating user‑ad interaction metrics into the allocation objective and designing both a distributed offline optimizer and a real‑time pacing controller, we significantly improve guaranteed‑delivery ad efficiency and contract fulfillment. The online bucket experiments confirm that our solution delivers higher platform revenue while enhancing advertiser brand impact. This work was accepted as a long paper at ICDM’19 under the title “Large‑Scale Personalized Delivery for Guaranteed Display Advertising with Real‑Time Pacing”.
Alibaba Cloud Developer
Alibaba's official tech channel, featuring all of its technology innovations.
