
AI‑Driven Element Selection for Advertising Video Creative Generation

This article explains how Tencent's advertising system leverages multimedia AI techniques—including multi‑armed bandit, pairwise learning, and DeepFM models—to automatically select optimal templates, music, and stickers for image and video assets, thereby reducing production cost, improving creative quality, and boosting ad performance.

Tencent Advertising Technology

Background : With the rise of short‑video platforms, advertising assets have shifted from text to images and now to video. Video engages users more richly but costs more to produce. In response, Tencent upgraded the understanding and computation capabilities at the foundation of its ad system and launched an intelligent video creation engine, which combines original assets with templates, stickers, music, and effects to generate multiple creative variations.

Element Selection : Manually pairing assets with elements is subjective and labor‑intensive. An AI‑driven element selection capability automatically matches the most suitable elements to original assets, enhancing creative quality and scaling production.

2.1 Multi‑Armed Bandit (MAB) : Template selection is framed as a recommendation task in which each template is an arm. The reward is defined by post‑deployment metrics such as CTR or win‑rate. Thompson sampling maintains a Beta distribution per template, draws one sample from each, and picks the template with the highest draw, which balances exploration of new or less‑used templates against exploitation of proven ones. Models can be customized per industry or traffic segment.
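The Thompson sampling loop can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the production implementation: the reward is treated as a binary click, every template starts from a uniform Beta(1, 1) prior, and the names `BetaThompsonSampler` and `simulate`, as well as the click-through rates in the demo, are hypothetical.

```python
import random


class BetaThompsonSampler:
    """Thompson sampling over templates with one Beta posterior per template."""

    def __init__(self, template_ids, seed=None):
        self.rng = random.Random(seed)
        # Assumption: Beta(1, 1) (uniform) prior, so unseen templates still get explored.
        self.alpha = {t: 1.0 for t in template_ids}
        self.beta = {t: 1.0 for t in template_ids}

    def select(self):
        # Draw one plausible success rate per template and pick the largest draw.
        draws = {
            t: self.rng.betavariate(self.alpha[t], self.beta[t]) for t in self.alpha
        }
        return max(draws, key=draws.get)

    def update(self, template_id, successes, failures):
        # Conjugate update: successes add to alpha, failures add to beta.
        self.alpha[template_id] += successes
        self.beta[template_id] += failures

    def posterior_mean(self, template_id):
        a, b = self.alpha[template_id], self.beta[template_id]
        return a / (a + b)


def simulate(true_ctr, rounds=5000, seed=0):
    """Run the sampler against hypothetical per-template click-through rates."""
    sampler = BetaThompsonSampler(list(true_ctr), seed=seed)
    env = random.Random(seed + 1)
    pulls = {t: 0 for t in true_ctr}
    for _ in range(rounds):
        chosen = sampler.select()
        clicked = env.random() < true_ctr[chosen]
        sampler.update(chosen, int(clicked), int(not clicked))
        pulls[chosen] += 1
    return sampler, pulls


if __name__ == "__main__":
    # Hypothetical rates, for illustration only.
    _, pulls = simulate({"t_low": 0.02, "t_mid": 0.05, "t_high": 0.20})
    print(pulls)
```

Keeping one sampler instance per industry or traffic segment is one simple way to get the per-segment customization mentioned above.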

2.2 Pairwise Learning : To capture fine‑grained compatibility (e.g., color harmony, semantic fit), a pairwise contrastive model is trained on pairs of creatives generated from the same source asset with two different templates. The loss reflects the relative performance difference between the two creatives, so the model learns a nuanced matching score that ranks templates for a given asset.
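The article does not give the exact loss, so the following is only one plausible form: a logistic (RankNet-style) pairwise loss whose weight grows with the observed performance gap. The function name `pairwise_logistic_loss` and the gap weighting are assumptions made for illustration.

```python
import math


def pairwise_logistic_loss(score_a, score_b, perf_a, perf_b):
    """Loss for two creatives built from the same asset with templates A and B.

    score_*: model matching scores for (asset, template A) and (asset, template B).
    perf_*:  observed performance of the two creatives (e.g. CTR).
    Assumption: the loss is weighted by the absolute performance gap, so pairs
    with a clear winner contribute more than near-ties.
    """
    gap = perf_a - perf_b
    if gap == 0:
        return 0.0  # a tie carries no ordering information
    sign = 1.0 if gap > 0 else -1.0
    # Positive margin means the model ranks the better creative higher.
    margin = sign * (score_a - score_b)
    # Numerically stable log(1 + exp(-margin)).
    softplus = max(-margin, 0.0) + math.log1p(math.exp(-abs(margin)))
    return abs(gap) * softplus


if __name__ == "__main__":
    # Template A performed better; the loss is lower when the model agrees.
    print(pairwise_logistic_loss(2.0, 0.0, perf_a=0.3, perf_b=0.1))
    print(pairwise_logistic_loss(0.0, 2.0, perf_a=0.3, perf_b=0.1))
```

Because both creatives in a pair share the same source asset, the asset's own quality cancels out and the loss isolates the effect of the template.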

2.3 Rich Multimodal Input : Beyond visual embeddings of assets and templates, textual metadata (titles, OCR) and audio features (music) are incorporated. Sparse features such as industry, traffic, and advertiser are fed into a DeepFM architecture, enabling second‑order interactions between dense multimodal embeddings and categorical attributes.
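The second-order interaction that the FM component of DeepFM contributes can be illustrated as follows. The sketch assumes every field, whether a sparse ID such as industry or a dense multimodal embedding, has already been mapped to a vector of the same dimension; how the production system performs that mapping is not described in the article. The function names are hypothetical, and the deep component of DeepFM is omitted.

```python
from itertools import combinations


def fm_second_order(field_embeddings):
    """Sum of dot products over all field pairs, computed in O(n * k).

    Uses the standard FM identity, applied per embedding dimension:
    sum_{i<j} v_i * v_j = 0.5 * ((sum_i v_i)^2 - sum_i v_i^2)
    """
    dim = len(field_embeddings[0])
    total = 0.0
    for d in range(dim):
        column_sum = sum(e[d] for e in field_embeddings)
        column_sq_sum = sum(e[d] * e[d] for e in field_embeddings)
        total += 0.5 * (column_sum * column_sum - column_sq_sum)
    return total


def fm_second_order_naive(field_embeddings):
    """Reference O(n^2 * k) version: explicit dot product for every field pair."""
    return sum(
        sum(a * b for a, b in zip(u, v))
        for u, v in combinations(field_embeddings, 2)
    )


if __name__ == "__main__":
    # Hypothetical 2-dimensional vectors for three fields,
    # e.g. industry ID, asset image embedding, template embedding.
    fields = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
    print(fm_second_order(fields), fm_second_order_naive(fields))
```

In a full DeepFM, this term is added to a first-order term and to the output of a feed-forward network that shares the same embeddings.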

2.4 Optimization Objective : Instead of optimizing solely for CTR, the model is trained on the creative‑level selection pass‑rate. This target correlates more directly with downstream exposure, and switching to it yields a ten‑fold increase in training samples.

2.5 Confidence Evaluation : Offline validation uses consumption uplift as the metric. Because an observed uplift can arise from random variance alone, its statistical significance is assessed with Monte‑Carlo simulations and hypothesis testing, and the result is complemented by evaluation on a separate test set.
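The article does not say which simulation is used. One common Monte-Carlo approach to this kind of question is a permutation test, sketched below under the assumption that consumption is available as one number per creative (or per ad) in a model-selected group and a baseline group. The function name and the sample values are hypothetical.

```python
import random
from statistics import fmean


def permutation_p_value(treatment, control, n_permutations=10000, seed=0):
    """One-sided Monte-Carlo permutation test for mean(treatment) > mean(control).

    Null hypothesis: group labels are irrelevant, so shuffling them should
    produce uplifts as large as the observed one reasonably often.
    Returns (observed_uplift, p_value).
    """
    rng = random.Random(seed)
    observed = fmean(treatment) - fmean(control)
    pooled = list(treatment) + list(control)
    n_treatment = len(treatment)
    at_least_as_large = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)
        uplift = fmean(pooled[:n_treatment]) - fmean(pooled[n_treatment:])
        if uplift >= observed:
            at_least_as_large += 1
    # The +1 terms count the observed labelling itself and keep the estimate above zero.
    return observed, (at_least_as_large + 1) / (n_permutations + 1)


if __name__ == "__main__":
    # Hypothetical consumption values, for illustration only.
    model_selected = [10, 11, 12, 13, 14, 15, 16, 17]
    rule_based = [1, 2, 3, 4, 5, 6, 7, 8]
    print(permutation_p_value(model_selected, rule_based, n_permutations=2000))
```

A small p-value indicates that random relabelling rarely reproduces the observed uplift; the threshold for calling it significant is a choice left to the evaluator.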

2.6 Model Auto‑Update : A daily pipeline automatically collects new data, retrains the models, evaluates their performance, and deploys the updates. This closed loop keeps the models relevant as new templates and assets appear.
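The collect, retrain, evaluate, and deploy steps can be wired together as in the sketch below. Everything here is hypothetical: the four callables stand in for pipeline stages the article does not detail, and the rule of deploying only when the candidate matches or beats the current model on held-out data is an assumed safeguard, not something the source states.

```python
from dataclasses import dataclass
from typing import Any, Callable, Tuple


@dataclass
class UpdateResult:
    deployed: bool
    candidate_metric: float
    baseline_metric: float


def run_daily_update(
    collect: Callable[[], Tuple[Any, Any]],
    train: Callable[[Any], Any],
    evaluate: Callable[[Any, Any], float],
    deploy: Callable[[Any], None],
    current_model: Any,
    min_gain: float = 0.0,
) -> UpdateResult:
    """Run one iteration of the closed loop.

    collect  returns (train_set, holdout_set) built from newly logged data.
    train    fits a candidate model on train_set.
    evaluate scores a model on holdout_set (higher is better).
    deploy   pushes the candidate to serving.
    """
    train_set, holdout_set = collect()
    candidate = train(train_set)
    candidate_metric = evaluate(candidate, holdout_set)
    baseline_metric = evaluate(current_model, holdout_set)
    # Assumed gate: skip deployment if the candidate regresses on held-out data.
    should_deploy = candidate_metric >= baseline_metric + min_gain
    if should_deploy:
        deploy(candidate)
    return UpdateResult(should_deploy, candidate_metric, baseline_metric)
```

A scheduler would call `run_daily_update` once per day, passing the currently served model as `current_model`.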

Summary : Through iterative improvements in modeling, objectives, input richness, and evaluation, Tencent built a performance‑driven element selection system that powers the intelligent creation engine and delivers higher‑quality ad creatives. In online A/B tests, it lifts consumption by more than 5% over rule‑based generation.

Future Directions : Incorporate user‑level targeting signals, explore multi‑task learning that balances selection pass‑rate, CTR, and consumption, and further enrich contextual features to refine element matching.

Tags: multimedia, pairwise learning, DeepFM, advertising AI, element selection, MAB
Written by

Tencent Advertising Technology

Official hub of Tencent Advertising Technology, sharing the team's latest cutting-edge achievements and advertising technology applications.
