
Applying Large Language Models for AIGC Advertising: Content Generation, Multimodal Understanding, and Creative Optimization at Ximalaya

Ximalaya uses large language models and AI‑generated content (AIGC) to automate ad creative production, multimodal semantic understanding, and creative selection. Reported results include a per‑image production cost of 0.2 CNY, CTR lifts of up to 3.5 %, revenue and eCPM gains of over 2 %, and a fivefold increase in material diversity.

Ximalaya Technology Team

The rapid development of artificial intelligence and large language models (LLMs) has transformed many industries. In Ximalaya's advertising business, AIGC (Artificial Intelligence Generated Content) is used to produce diverse ad creatives, improve content understanding, and personalize material distribution.

Background : Large language models enable machines to understand and generate human language with high precision. This has given rise to AIGC as a new content production paradigm alongside traditional professionally generated content (PGC) and user‑generated content (UGC), with the potential to surpass both. AIGC can generate text, images, audio, video, and 3D content, and supports multimodal fusion and cross‑modal generation.

AIGC Ad Creative Production : Traditional ad creative production is costly (100‑200 CNY per image) and slow. Ximalaya uses LLMs to generate prompts, negative prompts to keep outputs safe, and AIGC models to generate the text and images themselves; the generated image materials achieved a 3.4 % increase in CTR. Two image‑expansion solutions were introduced: (1) template separation with AI‑generated backgrounds, and (2) edge expansion, which preserves the original artistic elements while adapting the image to various aspect ratios. Image expansion yielded an additional 3.5 % CTR lift. Production cost per image dropped to 0.2 CNY, and material volume increased fivefold.
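The geometric part of edge expansion can be illustrated with a minimal sketch. It only computes the canvas an image would need to reach a target aspect ratio and where the untouched original would sit on it; the generative step that fills the margins is not shown. The names `Canvas` and `expansion_canvas` are hypothetical and not taken from Ximalaya's system, and centering the original is an assumption.

```python
import math
from dataclasses import dataclass
from fractions import Fraction


@dataclass(frozen=True)
class Canvas:
    width: int      # width of the expanded canvas in pixels
    height: int     # height of the expanded canvas in pixels
    offset_x: int   # left edge of the original image on the canvas
    offset_y: int   # top edge of the original image on the canvas


def expansion_canvas(width: int, height: int, ratio_w: int, ratio_h: int) -> Canvas:
    """Return the smallest canvas with aspect ratio ratio_w:ratio_h that
    contains the original image without scaling or cropping it.

    Assumption: the original is centered, so the margins to be filled by a
    generative model are split evenly between the two sides.
    """
    target = Fraction(ratio_w, ratio_h)
    if Fraction(width, height) < target:
        # Original is too narrow for the target ratio: keep height, widen.
        new_w, new_h = math.ceil(height * target), height
    else:
        # Original is too wide (or already matches): keep width, grow taller.
        new_w, new_h = width, math.ceil(width / target)
    return Canvas(new_w, new_h, (new_w - width) // 2, (new_h - height) // 2)


if __name__ == "__main__":
    # A square creative adapted to a landscape and a portrait slot.
    print(expansion_canvas(1080, 1080, 16, 9))
    print(expansion_canvas(1080, 1080, 9, 16))
```

Exact fractions are used instead of floats so that an image already at the target ratio is not padded by a stray pixel from rounding.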

Multimodal Content Understanding : A semantic understanding system was built to process ad text, images, and landing pages. LLM‑based prompt engineering (using the CO‑STAR framework) reduced tag‑generation time from days to two hours. The resulting text tags are fed into the CTR and CVR models, improving offline AUC by 3 ‰ (CTR) and 1 ‰ (CVR). Image understanding now relies on multimodal LLM inference, which replaces separate pipelines for segmentation, detection, classification, and OCR, and achieved a 1.2 % CTR gain. Landing‑page comprehension switched from custom crawlers to prompting an LLM with a long screenshot of the page, extracting over 30 semantic tags and boosting CVR AUC by 1.3 ‰.
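CO‑STAR structures a prompt into six parts: Context, Objective, Style, Tone, Audience, and Response. A minimal sketch of a prompt builder for ad‑text tagging might look like the following. The class name, the section delimiters, and all field values are illustrative assumptions; the article does not publish the actual prompts.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CoStarPrompt:
    """Holds the six CO-STAR sections and renders them in canonical order."""
    context: str
    objective: str
    style: str
    tone: str
    audience: str
    response: str

    def render(self) -> str:
        sections = [
            ("CONTEXT", self.context),
            ("OBJECTIVE", self.objective),
            ("STYLE", self.style),
            ("TONE", self.tone),
            ("AUDIENCE", self.audience),
            ("RESPONSE", self.response),
        ]
        # One delimited block per section, separated by blank lines.
        return "\n\n".join(f"# {name} #\n{text.strip()}" for name, text in sections)


if __name__ == "__main__":
    # Hypothetical tagging prompt; the wording is an example only.
    prompt = CoStarPrompt(
        context="You label advertising copy for an audio platform.",
        objective="Assign semantic tags (industry, product, selling point) to the ad text.",
        style="Concise taxonomy labels.",
        tone="Neutral.",
        audience="A downstream CTR/CVR model that consumes the tags as features.",
        response='JSON object: {"industry": str, "product": str, "selling_points": [str]}',
    )
    print(prompt.render())
```

Pinning the Response section to a machine‑readable format is what makes the output usable as model features without manual cleanup.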

Creative Selection (Creative Optimisation) : The system evolved through four stages. Stage 1 introduced post‑ranking UCB (upper confidence bound) exploration to address cold‑start and performance bottlenecks, increasing self‑served revenue by 2.7 % and eCPM by 2.8 %. Stage 2 replaced UCB with a dedicated ranking model, focusing on a lightweight single‑tower architecture and generalized features. Stage 3 moved the selection model upstream and adopted a dual‑tower DSSM to handle the larger candidate set, delivering +1.8 % revenue and +2.4 % eCPM. Stage 4 added AIGC material management, versioning, and automated retirement/backfill logic for materials, resulting in a further 1.3 % CTR lift. Overall, the pipeline increased material diversity fivefold, grew overall consumption by more than 7 %, and improved eCPM by 6 %, with negligible latency impact (99th‑percentile latency of 3.13 ms).
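The Stage 1 idea of UCB exploration can be sketched with the textbook UCB1 rule, treating each creative as a bandit arm and a click as the reward. This is an assumption about the general technique, not a description of Ximalaya's production logic; the class name `UCB1Selector` and the `exploration` parameter are hypothetical.

```python
import math
from typing import Dict, Iterable


class UCB1Selector:
    """Picks the creative with the highest upper confidence bound on CTR."""

    def __init__(self, creative_ids: Iterable[str], exploration: float = 2.0) -> None:
        self.impressions: Dict[str, int] = {cid: 0 for cid in creative_ids}
        self.clicks: Dict[str, int] = {cid: 0 for cid in self.impressions}
        self.exploration = exploration

    def select(self) -> str:
        # Cold start: every creative gets one impression before scoring begins.
        for cid, shown in self.impressions.items():
            if shown == 0:
                return cid
        total = sum(self.impressions.values())

        def score(cid: str) -> float:
            shown = self.impressions[cid]
            mean_ctr = self.clicks[cid] / shown
            # The bonus shrinks as a creative accumulates impressions.
            bonus = math.sqrt(self.exploration * math.log(total) / shown)
            return mean_ctr + bonus

        return max(self.impressions, key=score)

    def update(self, cid: str, clicked: bool) -> None:
        self.impressions[cid] += 1
        self.clicks[cid] += int(clicked)


if __name__ == "__main__":
    selector = UCB1Selector(["creative_a", "creative_b"])
    first = selector.select()
    selector.update(first, clicked=False)
    second = selector.select()
    selector.update(second, clicked=True)
    print(selector.select())
```

The exploration bonus is what lets a newly generated creative with few impressions compete against established ones, which is the cold‑start problem the stage targets.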

Summary and Outlook : The integration of LLMs and AIGC into Ximalaya’s advertising system has demonstrated significant gains in creative diversity, model performance, and revenue. Future directions include joint multimodal representation learning, end‑to‑end generative ad selection, deeper LLM reasoning for new ads and users, and generative recall techniques.
