Applying Large Language Models to Search Advertising Satisfaction: From DNN to ERNIE and Prompt Learning
The article details how Baidu's Fengchao search-advertising team applies large language models: the transition from DNN embeddings to ERNIE, the introduction of multi-level tokenization and discrete core-word inputs, and the use of prompt learning and AIGC techniques to improve search advertising satisfaction and industry-specific relevance modeling.
The presentation explains the gap between industrial and research practice, emphasizing that technology choices must directly address real business problems such as search advertising satisfaction, which measures how well ads meet user intent and client service quality.
It describes the traditional advertising CTR pipeline—log parsing, DNN embedding of massive sparse features, and training of a dense prediction model—and how the team migrated from DNN to the ERNIE language model to handle long, noisy landing‑page texts.
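To make the traditional pipeline concrete, here is a minimal sketch of the "massive sparse features → embedding → dense prediction" pattern. Everything here (feature names, vocabulary size, the single logistic layer) is illustrative, not Baidu's actual architecture; a production DNN would have many embedding slots, deep hidden layers, and learned rather than random weights.

```python
import math
import random

random.seed(0)

EMB_DIM = 8
VOCAB = 1000  # hashed sparse-feature vocabulary size (illustrative)

# Embedding table for hashed sparse features (e.g. query terms, ad IDs).
emb = [[random.uniform(-0.1, 0.1) for _ in range(EMB_DIM)] for _ in range(VOCAB)]
w = [random.uniform(-0.1, 0.1) for _ in range(EMB_DIM)]

def embed(features):
    """Hash each sparse feature to a slot and sum-pool the embeddings."""
    pooled = [0.0] * EMB_DIM
    for f in features:
        row = emb[hash(f) % VOCAB]
        pooled = [p + r for p, r in zip(pooled, row)]
    return pooled

def predict_ctr(features):
    """Dense head: a single logistic layer over the pooled embedding."""
    z = sum(x * wi for x, wi in zip(embed(features), w))
    return 1.0 / (1.0 + math.exp(-z))

p = predict_ctr(["query:flights", "ad:travel_123", "ua:mobile"])
```

The key property the talk builds on is that this embedding-of-ID-features view struggles with long, noisy landing-page text, which is what motivates the move to a language model such as ERNIE.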
Key challenges include high-noise, fragmented landing-page content and the quadratic growth of Transformer self-attention compute with input length; conventional remedies (GPU migration, model distillation, pruning) proved insufficient, leading to two efficiency measures: feeding discrete core-word sets, adapted for sequence models, in place of full text, and designing a multi-level tokenization hierarchy that shortens token sequences while preserving semantics.
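The core-word idea can be sketched as follows: instead of feeding an entire noisy page to the model, keep only a small set of salient words. The salience criterion below (frequency after stopword removal) is a stand-in assumption; the talk's actual core-word extraction is not specified at this level of detail.

```python
from collections import Counter

STOPWORDS = {"the", "a", "and", "of", "to", "in", "for", "is", "your"}  # illustrative

def core_words(text, k=5):
    """Keep the k most salient tokens (here: by frequency after stopword
    removal) as a discrete core-word set, shortening the model's input."""
    tokens = [t.lower() for t in text.split() if t.isalpha()]
    counts = Counter(t for t in tokens if t not in STOPWORDS)
    return [word for word, _ in counts.most_common(k)]

page = ("Book cheap flights and cheap hotels for your trip trip trip "
        "to the best destinations in the world")
top = core_words(page, 3)
```

The sequence-length saving is the point: a page of hundreds of tokens collapses to a handful of core words, which sidesteps much of the quadratic attention cost.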
Prompt learning is introduced to achieve industry isolation: soft prompt tokens representing industry IDs are added to the model, masked during pre‑training, and injected during fine‑tuning, enabling the model to adapt to changing industry standards without sacrificing overall performance.
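A minimal sketch of the soft-prompt mechanism described here: each industry ID maps to a few trainable prompt vectors that are prepended to the token-embedding sequence. The industry names, dimensions, and random initialization below are illustrative assumptions; in the real system these vectors would be learned during pre-training (where they are masked) and fine-tuning.

```python
import random

random.seed(1)

EMB_DIM = 4
N_SOFT = 2  # soft-prompt tokens per industry (illustrative)

# One set of trainable soft-prompt vectors per industry ID. Tuning only
# these (with a shared frozen backbone) is what isolates industries:
# one industry's standards can change without disturbing the others.
industry_prompts = {
    "finance": [[random.gauss(0, 0.02) for _ in range(EMB_DIM)] for _ in range(N_SOFT)],
    "travel":  [[random.gauss(0, 0.02) for _ in range(EMB_DIM)] for _ in range(N_SOFT)],
}

def with_industry_prompt(token_embs, industry):
    """Prepend the industry's soft-prompt vectors to the input sequence."""
    return industry_prompts[industry] + token_embs

tokens = [[0.1] * EMB_DIM, [0.2] * EMB_DIM]  # stand-in token embeddings
seq = with_industry_prompt(tokens, "travel")
```

The backbone then attends over prompt vectors and content tokens alike, so industry-specific behavior is carried entirely by the prepended vectors rather than by separate per-industry models.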
The talk also covers model architecture choices—single‑tower versus dual‑tower relevance models—and how virtual prompts can align dual‑tower training with single‑tower pre‑training objectives.
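The single-tower/dual-tower distinction can be illustrated with a toy dual tower: each side is encoded independently and compared with a cheap similarity, which is what lets ad vectors be precomputed offline. The hashed bag-of-words "encoder" below is a deliberate stand-in for an ERNIE/BERT tower, not the system's actual encoder.

```python
import math

def encode(text, dim=8):
    """Toy encoder: normalized hashed bag-of-words vector
    (a stand-in for one ERNIE/BERT tower)."""
    v = [0.0] * dim
    for tok in text.lower().split():
        v[hash(tok) % dim] += 1.0
    norm = math.sqrt(sum(x * x for x in v)) or 1.0
    return [x / norm for x in v]

def dual_tower_score(query, ad):
    """Dual tower: encode each side independently, then score with
    cosine similarity; ad vectors can be cached ahead of time."""
    q, a = encode(query), encode(ad)
    return sum(x * y for x, y in zip(q, a))

s = dual_tower_score("cheap flights paris", "book cheap flights to paris today")
```

A single-tower model would instead feed query and ad jointly through one encoder with full cross-attention, which is more accurate but cannot precompute either side; the virtual-prompt trick mentioned in the talk is a way to narrow that gap.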
Finally, the potential of AIGC is discussed, including automated ad‑material generation, explainable debugging tools, and system‑level LLM reward models, all aimed at creating a virtuous loop that improves ad quality and business outcomes.
A Q&A segment clarifies implementation details such as how industry isolation is achieved, the role of prompts in training data, handling of long texts, and the comparative benefits of tokenization optimizations.
DataFunTalk
Dedicated to sharing and discussing big data and AI technology applications, aiming to empower a million data scientists. Regularly hosts live tech talks and curates articles on big data, recommendation/search algorithms, advertising algorithms, NLP, intelligent risk control, autonomous driving, and machine learning/deep learning.