Tagged articles
31 articles
SuanNi
May 7, 2026 · Artificial Intelligence

DreamLite: A 0.39B Mobile Model Matching Z‑Image for Real‑Time Text‑to‑Image Generation and Editing

DreamLite is a compact 0.39B unified diffusion model open‑sourced by ByteDance that runs on smartphones. It handles text‑to‑image generation and text‑guided editing of 1024×1024 pictures in about three seconds, performs comparably to Flux, Z‑Image and LongCat‑Image, and comes in two variants that trade fidelity against latency.

AI model · ByteDance · DreamLite
0 likes · 4 min read
Geek Labs
Apr 28, 2026 · Artificial Intelligence

ChatGPT and AI Tool Open-Source Projects: Multi-Account Scheduling, Image Editing API, AWS Auto-Registration

This article introduces four GitHub open‑source projects—gpt2api, chatgpt2api, kiro-auto, and hermes‑webui—that enable high‑concurrency multi‑account ChatGPT usage, DALL‑E image generation and editing, automated AWS Builder ID registration, and cross‑platform access to Hermes agents, each with usage instructions and target audiences.

AI tools · AWS automation · ChatGPT
0 likes · 7 min read
JD Cloud Developers
Apr 8, 2026 · Artificial Intelligence

How JoyAI-Image-Edit Brings Spatial Intelligence to Open‑Source Image Editing

JoyAI-Image-Edit, an open‑source multimodal foundation model from JD Research Institute, integrates text‑to‑image generation, image understanding, and instruction‑driven spatial editing, achieving world‑leading spatial perception and editing capabilities that unlock new applications across e‑commerce, robotics, 3D reconstruction, and design.

Computer Vision · Generative Models · Multimodal AI
0 likes · 7 min read
Machine Learning Algorithms & Natural Language Processing
Mar 15, 2026 · Artificial Intelligence

HY‑WU: Real‑Time Adaptive AI Model That Generates Parameters On‑The‑Fly

HY‑WU demonstrates that generating model parameters dynamically during inference enables a single foundation model to perform diverse image‑editing tasks, outperforming fixed‑parameter baselines in human and automatic evaluations, benchmark tests, and conflict‑task experiments, highlighting a practical real‑time adaptation approach for AI systems.

HY-WU · LoRA · Transformer
0 likes · 16 min read
AIWalker
Mar 8, 2026 · Artificial Intelligence

FireRed-Image-Edit v1.1 Boosts OOTD Element Fusion and Portrait Consistency

The Super Intelligence team at Xiaohongshu unveils FireRed-Image-Edit v1.1, an open‑source image‑editing model that dramatically improves ID‑consistent edits, multi‑element OOTD fusion, portrait makeup, and font style rendering while delivering end‑to‑end generation in 4.5 seconds on 30 GB VRAM, backed by a full training‑distillation pipeline and a technical report on arXiv.

AI model · FireRed-Image-Edit · LoRA
0 likes · 10 min read
SuanNi
Mar 7, 2026 · Artificial Intelligence

How HY‑WU Enables Real‑Time Dynamic Parameters for Large‑Scale AI Models

Tencent's HY‑WU architecture introduces functional memory that generates task‑specific parameters on the fly, overcoming catastrophic forgetting and static‑weight limitations, and demonstrates superior performance in image‑editing benchmarks compared to leading open‑source and closed‑source models.

AI Architecture · Tencent · dynamic parameters
0 likes · 12 min read
SuanNi
Feb 23, 2026 · Artificial Intelligence

How FireRed-Image-Edit Sets New Standards for AI-Powered Image Editing

FireRed-Image-Edit, an open‑source instruction‑driven diffusion model, combines massive high‑quality data, a dual‑stream multimodal architecture, progressive training, and a comprehensive multi‑dimensional benchmark to achieve unprecedented pixel‑level control and human‑like editing performance across diverse visual tasks.

AI · Training Strategies · data engineering
0 likes · 12 min read
AI Algorithm Path
Feb 8, 2026 · Artificial Intelligence

Qwen Multi-Angle: An Open‑Source AI Tool for Full‑Perspective Image Reconstruction

The open‑source Qwen‑Image‑Edit‑2511‑Multiple‑Angles‑LoRA model can reconstruct images from 96 preset camera poses, letting users adjust distance, pitch and yaw to generate realistic multi‑angle views, with step‑by‑step usage instructions, example results, practical applications, and noted limitations.

AI · Qwen · image editing
0 likes · 6 min read
HyperAI Super Neural
Dec 25, 2025 · Artificial Intelligence

How Qwen-Image-Layered Enables Precise, High‑Fidelity Image Layer Editing

The article introduces the Qwen‑Image‑Layered model, which solves the long‑standing AI image‑editing limitation of inseparable layers by decomposing images into independent RGBA layers that retain fidelity under scaling, repositioning and recoloring, and provides a step‑by‑step online tutorial to try the feature.

AI image generation · HyperAI tutorial · Qwen-Image-Layered
0 likes · 5 min read
DeWu Technology
Dec 25, 2025 · Frontend Development

Build a High‑Performance H5 PAG Player: SDK, Image Editing, Batch Synthesis

This guide details how to implement a full‑stack H5 PAG player for the “Use Basketball to Know Me” activity, covering SDK loading, canvas‑based image manipulation (drag, scale, rotate), dynamic layer and text replacement, real‑time preview synchronization, snapshot export, batch synthesis, performance tuning, and fallback strategies.

Batch Processing · Canvas · PAG
0 likes · 30 min read
Alimama Tech
Oct 15, 2025 · Artificial Intelligence

How Alibaba’s Taobao Starry Model Delivers Precise, Consistent E‑commerce Image Edits

Alibaba’s Taobao Starry Image Editing model tackles the e‑commerce challenge of maintaining visual consistency by introducing a high‑fidelity, plug‑in architecture, a million‑scale consistency dataset, and multi‑stage multilingual training, enabling precise, controllable edits without altering product layout or background.

Consistency · E-commerce AI · data engineering
0 likes · 10 min read
Code Mala Tang
Sep 27, 2025 · Artificial Intelligence

5 Creative Ways to Edit Images with Google Nano Banana (Gemini 2.5 Flash)

This guide showcases five practical examples—removing objects, colorizing photos, adding billboard text, maintaining character consistency, and applying brand assets—demonstrating how Google Nano Banana’s advanced AI image editing can streamline visual design tasks.

AI art · Gemini 2.5 · Google AI
0 likes · 7 min read
AI Algorithm Path
Sep 3, 2025 · Artificial Intelligence

15 Real-World Applications of Google’s Nano Banana AI Image Tool

Google’s Nano Banana, an advanced multimodal AI model integrated into Gemini, delivers unprecedented character consistency and multi‑step editing. This article walks through fifteen concrete use cases, from virtual try‑on and background swapping to style transfer, product visualisation, educational graphics, and 3D conversion, showing how the tool can streamline creative workflows across industries.

AI image generation · Gemini · Google
0 likes · 9 min read
21CTO
Aug 28, 2025 · Artificial Intelligence

What Is Nano Banana? The Mysterious AI Image Model Challenging Google’s Gemini

Nano Banana, an enigmatic AI image‑generation model that surfaced on forums and Discord without any official announcement, boasts unprecedented speed, consistency, and language‑driven editing, sparking speculation about Google’s involvement and reshaping workflows across e‑commerce, gaming, education, and design.

AI image generation · Google speculation · Nano Banana
0 likes · 10 min read
AI Algorithm Path
Aug 24, 2025 · Artificial Intelligence

Qwen-Image-Edit: Alibaba’s Open‑Source State‑of‑the‑Art Image Editing Model

Qwen-Image-Edit, built on the 20B‑parameter Qwen‑Image foundation, introduces a dual‑path architecture that simultaneously understands semantic intent and visual details, enabling precise semantic and appearance edits, robust text manipulation, and fine‑grained region control, with open‑source weights on HuggingFace and benchmark‑proven superiority over existing models.

AI image manipulation · Qwen-Image-Edit · diffusers
0 likes · 7 min read
AI Algorithm Path
Jul 2, 2025 · Artificial Intelligence

Exploring the Open‑Source Flux.1 Kontext Dev Model for Advanced Image Editing

Black Forest Labs releases the open‑source Flux.1 Kontext Dev model, a 12‑billion‑parameter image‑editing system whose weights are publicly available; the article details its core features, benchmark‑level performance comparable to leading commercial models, access via HuggingFace, and step‑by‑step usage through Fal AI and Replicate APIs.

AI model · Fal AI · Flux.1
0 likes · 9 min read
大转转FE
Jun 30, 2025 · Mobile Development

How a Custom Android Image Editor Boosts Warehouse Efficiency

This article details the design and implementation of a native Android image‑editing component built for warehouse quality‑inspection, covering business motivations, core features such as multi‑image batch editing, matrix‑based transformations, a command‑pattern undo/redo system, technical architecture, key challenges, and future extension plans.

Android · Command Pattern · Custom View
0 likes · 29 min read
AntTech
Jun 15, 2025 · Artificial Intelligence

21 Ant Research Papers Shaping CVPR 2025: AI Image & Video Generation Breakthroughs

The Interactive Intelligence Lab of Ant Technology Research Institute presented 21 accepted CVPR 2025 papers covering visual generation, editing, 3D vision, digital humans and multimodal AI, highlighting tools such as MagicQuill, Lumos, Aurora, FLARE, LeviTor, MangaNinja, AniDoc, Mimir, AvatarArtist, DiffListener, MotionStone, TensorialGaussianAvatars, DualTalk, CompreCap and Uni-AD.

CVPR2025 · Computer Vision · Video Generation
0 likes · 20 min read
Code Mala Tang
Jun 4, 2025 · Artificial Intelligence

Flux Kontext: How Open‑Weight AI Image Editing Beats GPT‑Image‑1

Flux Kontext, Black Forest Labs' new open‑weight AI image editing suite, enables fast, low‑cost contextual generation and editing with features such as character consistency, local edits, and style transfer, and it outperforms GPT‑Image‑1, Imagen 4, and other leading models on benchmarks.

AI image generation · Flux Kontext · benchmark performance
0 likes · 12 min read
AIWalker
May 29, 2025 · Artificial Intelligence

ImgEdit-Bench Exposes Weak Image Editing Models – A ‘Death Test’ Reveals Who’s Struggling

ImgEdit introduces a large‑scale, high‑quality editing dataset and the ImgEdit‑Bench benchmark, detailing a robust data‑generation pipeline, multi‑round editing tasks, and a specialized evaluation model, and demonstrates through extensive experiments that its ImgEdit‑E1 model outperforms existing open‑source editors and narrows the gap with closed‑source systems.

AI · Benchmark · Dataset
0 likes · 20 min read
AI Frontier Lectures
May 23, 2025 · Artificial Intelligence

How SuperEdit Boosts Instruction-Based Image Editing with Rectified Supervision

SuperEdit introduces rectified instruction generation and contrastive supervision to fix noisy supervision in instruction‑based image editing, achieving up to 9.19% performance gains on Real‑Edit benchmarks without extra model parameters or pre‑training, and releases all data and code publicly.

Visual-Language Models · diffusion models · image editing
0 likes · 15 min read
Amap Tech
Apr 21, 2025 · Artificial Intelligence

Lenna: Language‑Enhanced Reasoning Detection Assistant and a Chain‑of‑Thought Image Editing Framework Using Multimodal Large Language Models

At ICASSP 2025, Gaode (Amap) had two papers accepted. The first presents Lenna, a language‑enhanced reasoning detection assistant that adds a DET token to multimodal LLMs and achieves state‑of‑the‑art accuracy on RefCOCO benchmarks. The second presents a chain‑of‑thought image‑editing framework that converts complex prompts into segmentation masks and inpainting prompts for diffusion‑based inpainting, surpassing existing methods.

AI · Computer Vision · ICASSP
0 likes · 10 min read
AIWalker
Apr 10, 2025 · Artificial Intelligence

DCEdit: Precise Text-Guided Image Editing that Preserves Backgrounds

DCEdit introduces a precise semantic localization strategy and a dual-level control mechanism for text‑guided image editing, delivering superior background preservation and editing quality, as demonstrated on the new RW‑800 benchmark and extensive comparisons with state‑of‑the‑art diffusion models.

AI · Benchmark · diffusion models
0 likes · 16 min read
AIWalker
Mar 23, 2025 · Artificial Intelligence

One-Click Removal & Seamless Integration: CycleFlow + Diffusion Prior Power OmniPaint

OmniPaint introduces a unified diffusion‑based framework that achieves physically consistent object removal and insertion by leveraging a pre‑trained FLUX‑1 diffusion prior, a progressive CycleFlow training pipeline, and a novel reference‑free CFD metric for high‑fidelity image editing.

CFD Metric · CycleFlow · Object Insertion
0 likes · 17 min read
Alibaba Cloud Big Data AI Platform
Oct 16, 2024 · Artificial Intelligence

How VICTORIA Revolutionizes Multi‑Object Image Editing with Language‑Aware Diffusion

The VICTORIA algorithm, presented by Alibaba Cloud AI Platform PAI and South China University of Technology at ACM MM 2024, leverages linguistic dependency parsing to guide cross‑attention in Stable Diffusion, enabling accurate, training‑free multi‑object image editing while preserving spatial structure and achieving state‑of‑the‑art results on benchmark datasets.

AI research · Stable Diffusion · VICTORIA
0 likes · 10 min read
Alibaba Cloud Developer
Jun 20, 2024 · Artificial Intelligence

Build Your Own AI Image Editing Assistant with Alibaba Cloud PAI‑DSW

This guide walks you through using Alibaba Cloud's PAI‑DSW and the Free Prompt Editing algorithm to set up a personal AI‑generated content (AIGC) drawing assistant, covering environment setup, instance creation, WebUI parameter tuning, example edits, resource cleanup, and how to share your creations for rewards.

AIGC · Alibaba Cloud · PAI-DSW
0 likes · 6 min read
Alibaba Cloud Big Data AI Platform
Jun 18, 2024 · Artificial Intelligence

Free-Prompt-Editing: Efficient Text-Guided Image Editing with Stable Diffusion

The paper introduces Free-Prompt-Editing (FPE), a novel, efficient algorithm for text‑guided image editing that leverages probe analysis of cross‑ and self‑attention maps in Stable Diffusion, demonstrates its superiority over existing methods through extensive experiments, and provides open‑source implementation for both synthetic and real‑image editing.

AI research · Stable Diffusion · attention maps
0 likes · 12 min read