
Improving Advertisement Image Generation with a Multimodal Reliable Feedback Network (ECCV 2024)

The paper introduces a Multimodal Reliable Feedback Network (RFNet) and a consistency-condition regularization technique. Together they raise the usable rate of automatically generated advertisement images while preserving visual quality. The work is supported by a new dataset of over one million annotated images and by extensive experiments reported in the ECCV 2024 paper.

JD Retail Technology

In e‑commerce, attractive advertisement images are crucial, but generated images often fail to meet advertising standards, leading to costly manual review.

This work, accepted at ECCV 2024, proposes a Multimodal Reliable Feedback Network (RFNet) that automatically evaluates generated ad images and integrates its feedback into a cyclic generation process, markedly increasing the proportion of usable images without sacrificing visual appeal.
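The cyclic process described above can be pictured as a generate-and-check loop. The following is a minimal sketch, not the authors' implementation: `generate_until_usable`, `fake_generate`, `fake_score`, the 0.5 threshold, and the attempt limit are all hypothetical names and values chosen for illustration, and the toy functions stand in for a real generator and an RFNet-style checker.

```python
from typing import Callable, Optional, Tuple


def generate_until_usable(
    generate: Callable[[int], str],
    score_usability: Callable[[str], float],
    threshold: float = 0.5,
    max_attempts: int = 5,
) -> Tuple[Optional[str], int]:
    """Retry generation until the checker accepts an image or attempts run out.

    Returns (image, attempts_used); image is None if nothing was accepted.
    """
    for attempt in range(1, max_attempts + 1):
        # Hypothetical generator call; the attempt number doubles as a seed.
        image = generate(attempt)
        # Hypothetical checker returning a usability score in [0, 1].
        if score_usability(image) >= threshold:
            return image, attempt
    return None, max_attempts


# Toy stand-ins so the sketch runs without any model (assumptions, not real APIs).
def fake_generate(seed: int) -> str:
    return f"image-{seed}"


def fake_score(image: str) -> float:
    # Pretend only the third image passes review.
    return 0.9 if image.endswith("3") else 0.1


result = generate_until_usable(fake_generate, fake_score)
print(result)
```

In this picture, the fewer retries the loop needs, the less time and manual review each accepted image costs, which is the quantity the paper aims to improve.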

RFNet fuses multiple auxiliary modalities (e.g., product semantics, background context) to assess whether a generated image is suitable. Its output is then used to fine-tune a diffusion model through a novel Reliable Feedback Fine-Tuning (RFFT) method. RFFT employs a consistency-condition loss (L_CC) that keeps text-condition gradients stable while the feedback steers the model toward higher usability.
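To make the idea of pairing a feedback signal with a regularizer concrete, here is a rough sketch of how such an objective might be combined. This article does not give the form of L_CC, so everything below is an assumption: the negative log-likelihood feedback term, the mean-squared placeholder for the consistency term, the `weight` parameter, and all function names are hypothetical and should not be read as the paper's formulation.

```python
import math
from typing import Sequence


def feedback_loss(p_usable: float, eps: float = 1e-8) -> float:
    """Assumed feedback term: negative log of the checker's 'usable' probability."""
    return -math.log(max(p_usable, eps))


def consistency_penalty(pred: Sequence[float], ref: Sequence[float]) -> float:
    """Placeholder regularizer: mean squared gap between two predictions.

    `pred` and `ref` stand for a fine-tuned model's output and a reference
    output; the real L_CC may be defined quite differently.
    """
    if len(pred) != len(ref):
        raise ValueError("pred and ref must have the same length")
    return sum((a - b) ** 2 for a, b in zip(pred, ref)) / len(pred)


def total_loss(
    p_usable: float,
    pred: Sequence[float],
    ref: Sequence[float],
    weight: float = 1.0,
) -> float:
    """Feedback term plus a weighted regularizer (the weight is an assumed knob)."""
    return feedback_loss(p_usable) + weight * consistency_penalty(pred, ref)


print(total_loss(0.5, [1.0, 2.0], [1.0, 4.0], weight=0.5))
```

The design point this illustrates is the trade-off: the feedback term alone rewards whatever the checker accepts, while the regularizer penalizes drifting away from the reference behavior, which is one way to avoid trading visual quality for usability.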

To train RFNet, the authors constructed the RF1M dataset, containing over one million human‑annotated generated advertisement images, providing reliable feedback for model supervision.

Extensive experiments demonstrate that RFNet outperforms baselines on all evaluation metrics, and RFFT achieves higher usable‑image rates while maintaining aesthetic quality, reducing the number of generation attempts and overall production time.
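The link between usable rate and generation attempts can be made explicit with a short calculation. It assumes each attempt is independently usable with the same probability p, which is a simplification, and the numbers in the example are illustrative only, not results from the paper.

```latex
% N = number of attempts until the first usable image (geometric distribution)
\[
  \Pr(N = k) = (1 - p)^{k-1}\, p, \qquad
  \mathbb{E}[N] = \sum_{k=1}^{\infty} k\,(1 - p)^{k-1}\, p = \frac{1}{p}.
\]
% Illustrative values: p = 0.5 gives E[N] = 2; p = 0.8 gives E[N] = 1.25.
```

Under this assumption, raising the usable rate from 0.5 to 0.8 would cut the expected number of attempts per accepted image from 2 to 1.25.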

The fine-tuned ControlNet also generalizes well when combined with various LoRA and diffusion model weights, which further supports the robustness of the proposed approach.

Paper: https://arxiv.org/abs/2408.00418
Code: https://github.com/ZhenbangDu/Reliable_AD

Tags: AI, multimodal, diffusion models, advertisement generation, ECCV 2024, reliable feedback network