Do AI-Generated Images Invert Aesthetic Preferences? ICML 2026 Spotlight

The ICML 2026 spotlight paper argues that universal aesthetic alignment in image-generation models narrows artistic expression. It raises six interrelated concerns and shows, through crafted prompts and benchmark tests, that reward models and aligned generators favor homogenized, overly positive imagery and fail to honor anti-aesthetic or negative-emotion requests.

Machine Heart

Researchers from UBC and Weathon Software present a paper titled "Position: Universal Aesthetic Alignment Narrows Artistic Expression" (arXiv:2512.11883). It argues that pursuing a single, developer-defined aesthetic standard limits the expressive diversity of AI-generated images.

From Correctness to Aesthetic Alignment

Early image-generation models produced physically implausible results (e.g., eight fingers, distorted faces). Once correctness improved, developers shifted focus to making images "more beautiful" to humans. This led to evaluation models such as ImageReward, HPSv2, and HPSv3, which are now widely used to align generators toward popular aesthetic preferences.

Six Interrelated Concerns

Developer‑preset universal standards may erode users' personalized aesthetic rights. The paper questions whether a single standard truly serves diverse user tastes or merely satisfies developers' risk‑avoidance motives.

Implicit bias in standard design. Even without explicit intent, developers’ data selection, annotation practices, and model choices embed a narrow view of “good” images—e.g., HPSv3 annotators are mostly young, favoring youthful aesthetics, and must pass expert‑agreement tests that reinforce a limited framework.

Conflict between individual and group preferences. A universal standard imposed on all users can override explicit individual wishes, leading to homogenized outputs that push users’ tastes toward the model’s bias.

Over‑beautification hides reality. Models constrained to produce flawless, vibrant images may suppress “ugly” or gritty styles, ignoring real‑world nuance.

Excessive positivity bias. Reward models tend to give higher scores to bright, high‑contrast, positively‑emotional images while penalizing negative‑emotion or dark‑style pictures, distorting emotional expression.

A single aesthetic score reduces artistic diversity. Collapsing multifaceted aesthetics into one numeric reward pushes a rich artistic landscape in a single, uniform direction.

Testing Model Stubbornness

The authors crafted 300 prompts derived from COCO, injecting “anti‑aesthetic” dimensions (dim lighting, clashing colors, disproportion, negative emotion) using Qwen3 to generate corresponding images. These prompts were fed to mainstream image‑generation families, both with and without additional aesthetic alignment, as well as to reward models.

Reward Model Evaluation

For each prompt, a baseline COCO‑generated image and an anti‑aesthetic image were presented to reward models (HPSv3, HPSv2.1) alongside simple image‑text matchers (BLIP, CLIP). Results show that even the latest reward models almost never select the anti‑aesthetic image, whereas CLIP and BLIP correctly identify it, confirming that the failure is not due to prompt complexity.
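
The pairwise check described above can be sketched as follows. This is a minimal illustration, not the authors' evaluation code: `selection_accuracy` and the two toy scorers are hypothetical names, images are stand-in string IDs, and a real run would pass a reward model or a CLIP-style matcher as the `score` callable.

```python
from typing import Callable, Sequence, Tuple

# A scorer maps (prompt, image) to a scalar; higher means "preferred".
# Images are represented by string IDs here purely for illustration.
Scorer = Callable[[str, str], float]


def selection_accuracy(
    score: Scorer,
    triples: Sequence[Tuple[str, str, str]],
) -> float:
    """Fraction of (prompt, baseline_image, anti_image) triples in which the
    scorer rates the anti-aesthetic image above the baseline image."""
    if not triples:
        raise ValueError("need at least one triple")
    hits = sum(
        1
        for prompt, baseline, anti in triples
        if score(prompt, anti) > score(prompt, baseline)
    )
    return hits / len(triples)


# Toy scorers (assumptions standing in for real models).
def beauty_biased(prompt: str, image: str) -> float:
    # Ignores the prompt and always prefers the conventionally pretty image.
    return 1.0 if image.startswith("baseline") else 0.0


def prompt_matcher(prompt: str, image: str) -> float:
    # Rewards images whose tag agrees with what the prompt asks for.
    wants_anti = "dim lighting" in prompt
    is_anti = image.startswith("anti")
    return 1.0 if wants_anti == is_anti else 0.0


triples = [
    ("a dog on a beach, dim lighting", "baseline_0", "anti_0"),
    ("a kitchen table, dim lighting", "baseline_1", "anti_1"),
]
print(selection_accuracy(beauty_biased, triples))   # 0.0
print(selection_accuracy(prompt_matcher, triples))  # 1.0
```

Since the prompt explicitly asks for the anti-aesthetic property, a scorer that follows the prompt should approach 1.0 on this metric, while one that ignores the prompt in favor of conventional appeal stays near 0.0.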

Generation Capability Assessment

Using the same prompts, the authors scored generated images with reward models. All aligned models showed reduced anti‑aesthetic ability except Nano Banana, which retained a strong capacity (score gap of 9.351 on HPSv3) to produce the desired anti‑aesthetic outputs.
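
This summary does not define how the score gap is computed. One plausible reading, sketched here purely as an assumption, is the mean reward for images generated from the original prompts minus the mean reward for images generated from the anti-aesthetic prompts, so a larger gap means the generator moved further away from what the reward model favors. The function name and the numbers below are illustrative and are not taken from the paper.

```python
from statistics import mean
from typing import Sequence


def score_gap(
    baseline_scores: Sequence[float],
    anti_scores: Sequence[float],
) -> float:
    """Assumed definition: mean reward on baseline generations minus mean
    reward on anti-aesthetic generations."""
    if not baseline_scores or not anti_scores:
        raise ValueError("both score lists must be non-empty")
    return mean(baseline_scores) - mean(anti_scores)


# Made-up reward values for three prompts.
baseline = [10.0, 12.0, 11.0]  # images from the original prompts
anti = [2.0, 3.0, 1.0]         # images from the anti-aesthetic prompts
print(score_gap(baseline, anti))  # 9.0
```

Under this reading, a gap near zero would suggest the generator ignored the anti-aesthetic instruction and produced images the reward model likes just as much as the baseline.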

Real‑Image Benchmark

From the AVA dataset, authentic anti‑aesthetic photographs were selected. LLM‑generated “clean” versions of these photos were also created. HPSv3 consistently preferred the clean AI versions over the genuine anti‑aesthetic photos, highlighting a bias toward conventional aesthetics even on real images.

Emotion Bias in Aesthetic Alignment

When asked to generate images expressing happiness, anger, sadness, or fear, Nano Banana and HPSv3 overwhelmingly favored the positive‑emotion images; HPSv3’s accuracy for negative emotions fell below random chance. Generation tests showed similar trends: models aligned to aesthetic standards produced smiling, bright depictions even when prompts demanded grim war scenes, thereby muting critical or negative artistic expression.

Overall, the paper demonstrates that universal aesthetic alignment not only narrows stylistic variety but also embeds positive‑emotion bias, raising ethical concerns about the future of AI‑driven visual creativity.
