Tag

Image Generation


DevOps
Apr 13, 2025 · Artificial Intelligence

The Amazing Magic of GPT‑4o and a Speculative Technical Roadmap

This article reviews GPT‑4o's image‑generation capabilities, showcases diverse examples, and speculates in detail on the underlying autoregressive architecture, tokenization methods, VQ‑VAE/GAN advances, and training strategies that could explain its performance.

AI research · GPT-4o · Image Generation
0 likes · 16 min read

Ele.me Technology
Apr 10, 2025 · Artificial Intelligence

Ele.me Vertical Business AIGC Image Model: Architecture, Training Pipeline, and Evaluation

Ele.me created a domain-specific AIGC image model built from scratch on its own data using the DiT backbone, a three-stage training pipeline (transformer pre-training, prompt alignment, aesthetic fine-tuning), custom T5‑E‑CLIP text and visual encoders, ControlNet for layout control, and evaluated via FID, CLIP scores and a human rubric, enabling automated dish-image generation and UI asset creation for its vertical business.

AIGC · ControlNet · DiT
0 likes · 8 min read

Beijing SF i-TECH City Technology Team
Apr 7, 2025 · Artificial Intelligence

Common Applications, Tools, and Practical Scenarios of AIGC in Design and Business

This article outlines the rapid growth of AIGC technologies, describes key image‑generation and language models, demonstrates step‑by‑step design workflows, explores user‑experience research enhancements, and envisions future business uses while offering practical tips for mastering AI‑generated content.

AIGC · Artificial Intelligence · Design
0 likes · 8 min read

Nightwalker Tech
Mar 28, 2025 · Artificial Intelligence

Comprehensive Evaluation of GPT-4o Multimodal Image Generation Capabilities

This article presents a thorough assessment of GPT‑4o’s new image generation features across multiple test scenarios, from simple portrait creation and style transfer to UI design, product rendering, and educational illustrations. It compares the output with Claude‑3.7‑Sonnet and highlights strengths in realism and weaknesses in Chinese text handling.

AI evaluation · GPT-4o · Image Generation
0 likes · 16 min read

JD Retail Technology
Mar 25, 2025 · Artificial Intelligence

2024 Advances in Advertising Creative Generation and Selection

In 2024 the advertising team deployed an end‑to‑end AIGC pipeline that automatically creates high‑quality ad images, uses the multimodal Reliable Feedback Network and the million‑size RF1M dataset to filter outputs, builds rich offline and online multimodal representations with contrastive and list‑wise learning, and optimizes ranking architecture to deliver scalable, personalized creative selection.

AI · AIGC · Image Generation
0 likes · 10 min read

Rare Earth Juejin Tech Community
Mar 24, 2025 · Artificial Intelligence

AI SDK 4.2 Release: New Reasoning, MCP Client, useChat Message Components, Image Generation, URL Sources, and Provider Updates

The AI SDK 4.2 release introduces powerful new features such as step‑by‑step reasoning support, a Model Context Protocol (MCP) client for tool integration, useChat message components, multimodal image generation, standardized URL sources, OpenAI Responses API support, Svelte 5 compatibility, and numerous middleware and provider enhancements, all illustrated with practical JavaScript/TypeScript examples.

AI SDK · Image Generation · JavaScript
0 likes · 19 min read

JD Tech Talk
Mar 19, 2025 · Artificial Intelligence

Reliable Advertising Image Generation and Creative Selection Using Multimodal Feedback and MLLM Representations

In 2024, the advertising team introduced a suite of AI‑driven techniques (a trustworthy feedback network, a large‑scale human‑annotated dataset, multimodal large language model representations, and online ranking architecture upgrades) to improve the quality, coverage, and personalization of generated ad creatives.

AIGC · Image Generation · MLLM
0 likes · 10 min read

Code Mala Tang
Jan 30, 2025 · Artificial Intelligence

Is Janus-Pro the Open‑Source Rival to DALL·E 3? A Deep Dive Review

This article reviews DeepSeek's Janus‑Pro image model, explains its multimodal architecture, benchmarks it against DALL·E 3 and Stable Diffusion, provides usage instructions and inference code, and offers a critical assessment of its image quality and practical limitations.

AI model · Image Generation · Janus-Pro
0 likes · 12 min read

DaTaobao Tech
Dec 16, 2024 · Artificial Intelligence

Reference Image Generation for Subject‑Driven Diffusion

This work presents a subject‑driven diffusion pipeline that injects multi‑scale reference features (ReferenceNet‑style) into high‑fidelity backbones such as SD‑XL and Flux, enabling zero‑shot, fine‑grained product consistency across diverse scenes and outperforming current fine‑tuned and zero‑shot methods while noting limits in category coverage and human interactions.

AI · Dreambooth · IP-Adapter
0 likes · 9 min read

DataFunTalk
Dec 5, 2024 · Artificial Intelligence

VAR: Scalable Image Generation via Next‑Scale Prediction Wins NeurIPS 2024 Best Paper

The VAR model, a Visual AutoRegressive framework built on a multi‑scale “next‑scale prediction” paradigm, improves image generation efficiency and quality, is reported to outperform diffusion models, shows scaling‑law behavior in vision, and earned the Best Paper award at NeurIPS 2024.

Image Generation · NeurIPS2024 · autoregressive models
0 likes · 7 min read

Alimama Tech
Nov 27, 2024 · Artificial Intelligence

FlowDCN: Efficient Arbitrary-Resolution Image Generation via Groupwise Multi‑Scale Deformable Convolution

FlowDCN introduces Groupwise‑MSDCN, a sparse deformable convolution that replaces attention, enabling efficient arbitrary‑resolution image generation with linear complexity, fewer parameters and FLOPs, and achieving state‑of‑the‑art FID scores on ImageNet while requiring far fewer training steps.

Image Generation · arbitrary resolution · deformable convolution
0 likes · 12 min read

JD Tech
Nov 15, 2024 · Artificial Intelligence

Reliable Feedback Network (RFNet) for Improving Usable Advertising Image Generation

The paper proposes a multimodal Reliable Feedback Network (RFNet) and a consistency‑regularized fine‑tuning method (RFFT) that dramatically increase the proportion of usable advertising images generated by diffusion models while preserving visual appeal, and introduces the large‑scale RF1M dataset for training and evaluation.

Image Generation · RFNet · advertising images
0 likes · 9 min read

360 Tech Engineering
Oct 31, 2024 · Artificial Intelligence

HiCo: Hierarchical Controllable Diffusion Model for Layout-to-Image Generation

The paper introduces HiCo, a hierarchical controllable diffusion model that enables precise layout‑to‑image generation by decoupling object and background features through weight‑shared branches and a fusion module, achieving high‑quality results and efficient inference as demonstrated on the HiCo‑7K benchmark.

AI Painting · HiCo · Image Generation
0 likes · 9 min read

DataFunSummit
Oct 10, 2024 · Artificial Intelligence

AIGC‑Assisted Marketing Material Generation at Shujia Technology

This article describes Shujia Technology's use of artificial intelligence to generate marketing images and videos, outlining the background, challenges of high-volume content production, detailed solutions for image and video assets—including layout models, diffusion models, and digital human synthesis—and future research directions.

AIGC · Digital Human · Image Generation
0 likes · 12 min read

DaTaobao Tech
Sep 25, 2024 · Artificial Intelligence

Consistent Style Generation in AIGC: Style Aligned and Story Diffusion

The article reviews two AIGC techniques—Style Aligned, which shares self‑attention across a batch to keep style consistent, and Story Diffusion, which uses a training‑free Consistent Self‑Attention module followed by a transformer to generate coherent image sequences—showing promising results in home‑decoration scenarios while noting remaining challenges in fine‑grained spatial and detail alignment.

AI · AIGC · Consistent Self-Attention
0 likes · 5 min read

Xiaohongshu Tech REDtech
Sep 19, 2024 · Artificial Intelligence

Target-Driven Distillation (TDD): A Multi‑Goal Distillation Method for Accelerating Diffusion Models

Target‑Driven Distillation (TDD) is a multi‑goal distillation method that flexibly selects short‑range target steps and decouples guidance during training, enabling 4‑to‑8‑step diffusion generation that preserves high‑resolution detail, works with LoRA, ControlNet, InstantID, and outperforms existing consistency distillation techniques in speed and quality.

AI acceleration · Image Generation · diffusion models
0 likes · 9 min read

Qunar Tech Salon
Aug 8, 2024 · Backend Development

Using Satori and Resvg (or Sharp) for Efficient Backend Image Generation: Architecture, Implementation, and Optimizations

This article examines various image‑generation approaches, compares web‑frontend, client‑side, and backend methods, introduces a new Node‑backend solution based on Satori to convert HTML to SVG and then to PNG with Resvg (later Sharp), and details performance and memory optimizations that dramatically improve speed, resource usage, and stability for large‑scale image‑service deployments.

Image Generation · Node.js · Resvg
0 likes · 14 min read

JD Tech Talk
Jul 9, 2024 · Artificial Intelligence

Getting Started with AI Image Generation Using Stable Diffusion for Promotional Posters

This guide introduces the fundamentals of AI image generation with Stable Diffusion, covering three main usage methods, the Draw Things desktop app, model types, samplers, prompts, and post‑processing techniques to create high‑quality promotional graphics for events like the 618 sale.

AI art · DrawThings · Image Generation
0 likes · 11 min read

JD Tech
Jul 9, 2024 · Artificial Intelligence

A Beginner’s Guide to AI Image Generation with Stable Diffusion: Tools, Models, and Techniques

This article introduces the fundamentals of AI image generation using Stable Diffusion, covering model basics, three practical ways to access the technology, detailed explanations of model types, samplers, parameters, prompt engineering, and post‑processing techniques for creating high‑quality promotional graphics.

AI art · Image Generation · Stable Diffusion
0 likes · 11 min read

Rare Earth Juejin Tech Community
Jun 7, 2024 · Backend Development

Building a Fish Knowledge Quiz Bot and a Nostalgic Chess Game with Coze: Workflow, Database, and Node.js Image Generation

This article explains how to create two interactive Coze bots—a fish‑knowledge quiz and a nostalgic 147‑258‑369 chess game—by designing workflows, integrating large language models, using databases, and implementing Node.js code for canvas drawing and GIF generation, complete with deployment tips.

AI · Coze · Database
0 likes · 10 min read