Intelligent Creative System at Hello: Business Background, Architecture, Implementation, and Reflections
This article presents Hello's Intelligent Creative project: its business motivation and system architecture; the generation methods considered (seq2seq, VAE, GAN, and pre‑trained models); the implementation of the material library, tagging, recall strategies, and creative racing model; the measured performance gains; and the open challenges.
The talk, given by Pan Yunfeng, a senior algorithm engineer at Hello, introduces the Intelligent Creative project, which aims to automatically generate, label, store, and distribute advertising creatives by leveraging deep learning and recommendation techniques.
Business background: modern marketing ads on platforms such as Taobao and Pinduoduo rely on highly dynamic creatives that cannot be designed by hand at scale. Algorithms are therefore needed to mix and order creative elements and to predict which creative each user is likely to be interested in.
Content generation methods explored include seq2seq (with attention), VAE, GAN, and large pre‑trained models (e.g., GPT) to produce short, high‑information ad copy.
Content understanding covers multi‑modal classification, tag extraction, and quality detection for text and images, forming the basis for downstream recall and ranking.
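Tag extraction of this kind is commonly framed as multi-label classification, where each tag gets its own independent probability and a creative can carry several tags at once. The following is a minimal sketch of the decoding step only, assuming a classifier that already emits one raw score (logit) per tag. The tag names, the scores, and the 0.5 threshold are illustrative assumptions, not values from the talk.

```python
import math
from typing import Dict, List


def sigmoid(x: float) -> float:
    """Map a raw score to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def decode_tags(logits: Dict[str, float], threshold: float = 0.5) -> List[str]:
    """Keep every tag whose independent probability clears the threshold.

    Unlike softmax-based single-label decoding, the tags do not compete:
    zero, one, or many tags may be returned for the same creative.
    """
    scored = [(tag, sigmoid(score)) for tag, score in logits.items()]
    kept = [(tag, prob) for tag, prob in scored if prob >= threshold]
    kept.sort(key=lambda pair: pair[1], reverse=True)  # most confident first
    return [tag for tag, _ in kept]


# Hypothetical per-tag scores for one piece of ad copy.
logits = {"discount": 2.1, "bike": 0.4, "travel": -1.3, "new_user": -0.2}
tags = decode_tags(logits)
```

Lowering the threshold trades precision for coverage, which can matter when the tags feed a recall stage that needs enough candidates.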
Creative selection combines CTR estimation models and cold‑start strategies to match the right creative to each user.
System architecture: a pipeline that integrates material libraries, tag‑based retrieval, dynamic phrase banks, and a creative racing mechanism (online competition between candidate creatives, similar in spirit to A/B testing). The algorithmic components are highlighted in orange in the diagram.
Implementation details: a unified material library with admission rules, automated tagging via multi‑label classifiers, dynamic phrase construction, and a racing model that uses GBM and an explore‑and‑exploit (E&E) strategy to surface high‑CTR creatives.
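As a rough illustration of how dynamic phrase construction and admission rules might fit together, the sketch below fills a copy template from a phrase bank and keeps only the results that pass a simple check. Everything here is hypothetical: the slot names, the example phrases, and the length-based admission rule are stand-ins, since the talk summary does not specify the actual rules.

```python
from itertools import product
from string import Template
from typing import Dict, List


def admit(text: str, max_len: int) -> bool:
    """Hypothetical admission rule: non-empty copy within a length budget."""
    stripped = text.strip()
    return 0 < len(stripped) <= max_len


def build_phrases(template: str, phrase_bank: Dict[str, List[str]],
                  max_len: int = 32) -> List[str]:
    """Fill every slot combination into the template and keep admitted copy."""
    slots = sorted(phrase_bank)  # fixed slot order gives a stable output order
    admitted = []
    for combo in product(*(phrase_bank[slot] for slot in slots)):
        text = Template(template).substitute(dict(zip(slots, combo)))
        if admit(text, max_len):
            admitted.append(text)
    return admitted


# Illustrative phrase bank; the wording is invented for this example.
phrase_bank = {
    "benefit": ["50% off rides", "Free unlock"],
    "audience": ["new riders", "commuters"],
}
phrases = build_phrases("$benefit for $audience today", phrase_bank, max_len=32)
```

With a 32-character budget, only the two shorter combinations survive. Raising `max_len` admits all four.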
Modeling challenges include short‑text classification and limited labeled samples. These are addressed with transfer learning on an ALBERT backbone, comparing three approaches: freeze‑all, feature extraction, and fine‑tuning.
Recall strategies comprise business‑line recall, category recall, hot‑item recall, and manual configuration, which together ensure a sufficiently large candidate pool.
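One way to combine several recall channels into a single candidate pool is sketched below: manually configured creatives go first, then the automatic channels are interleaved rank by rank with duplicates dropped. The merge policy, the `merge_recall` name, and the creative IDs are assumptions for illustration; the talk summary lists the channels but not how they are merged.

```python
from typing import Dict, List


def merge_recall(channels: Dict[str, List[str]], manual: List[str],
                 limit: int) -> List[str]:
    """Merge recall channels into one de-duplicated candidate list.

    Manually configured creatives are placed first. The remaining channels
    are interleaved round-robin so that no single channel fills the pool.
    """
    seen = set()
    merged: List[str] = []

    def add(creative_id: str) -> None:
        if creative_id not in seen and len(merged) < limit:
            seen.add(creative_id)
            merged.append(creative_id)

    for creative_id in manual:
        add(creative_id)

    longest = max((len(items) for items in channels.values()), default=0)
    for rank in range(longest):
        for items in channels.values():  # dict preserves insertion order
            if rank < len(items):
                add(items[rank])
    return merged


# Hypothetical channel outputs, each already sorted best-first.
channels = {
    "business_line": ["c1", "c2", "c3"],
    "category": ["c2", "c4"],
    "hot": ["c5", "c1"],
}
pool = merge_recall(channels, manual=["c9"], limit=5)
```

Interleaving by rank is only one option. A production system might instead assign per-channel quotas or weights.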
The creative racing model selects the best creative per user, using GBM for CTR prediction and handling sparse feature scenarios.
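The interplay between CTR prediction and exploration can be sketched with an epsilon-greedy rule: most of the time serve the creative with the highest predicted CTR, and occasionally serve a random candidate so that new or rarely shown creatives still collect feedback. This is a simplified stand-in, not the production design. The `predict_ctr` callable is a hypothetical placeholder for the GBM model's output, and epsilon-greedy is one common explore-and-exploit strategy among several; the source does not say which one Hello uses.

```python
import random
from typing import Callable, List, Optional


def choose_creative(candidates: List[str],
                    predict_ctr: Callable[[str], float],
                    epsilon: float = 0.1,
                    rng: Optional[random.Random] = None) -> str:
    """Pick one creative with an epsilon-greedy explore/exploit rule."""
    if not candidates:
        raise ValueError("candidate pool is empty")
    rng = rng or random.Random()
    if rng.random() < epsilon:
        # Explore: give any candidate a chance, including cold-start ones.
        return rng.choice(candidates)
    # Exploit: serve the creative with the highest predicted CTR.
    return max(candidates, key=predict_ctr)


# Hypothetical predicted CTRs standing in for the GBM model's scores.
scores = {"c1": 0.021, "c2": 0.034, "c3": 0.012}
best = choose_creative(list(scores), scores.get, epsilon=0.0)
```

Setting `epsilon` to 0 disables exploration entirely, while larger values spend more traffic on learning about under-exposed creatives at the cost of short-term CTR.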
The system UI, although visually modest, gives operators fine‑grained control over creative elements, dynamic phrase insertion, and recommendation tools.
Results show a 200‑300% lift in key metrics across nine business lines, covering 60% of traffic, and a 32% increase in per‑user revenue for specific campaigns.
Reflections highlight the remaining issues: material scarcity, insufficient cross‑feature interaction between users and creatives, and the lack of robust image classification and quality assessment.
The Q&A section addresses pre‑training effectiveness on limited data, manual versus automated material creation, and the role of operators in the creative workflow.
DataFunSummit
Official account of the DataFun community, dedicated to sharing big data and AI industry summit news and speaker talks, with regular downloadable resource packs.