Author

AIWalker

Focused on computer vision, image processing, color science, and AI algorithms; shares in-depth technical content, engineering practice, and analysis from a working AI practitioner.

Latest from AIWalker

100 recent articles max

AIWalker

Apr 7, 2025 · Artificial Intelligence

Is CLIP Obsolete? LeCun and Xie's New Multimodal Model Beats Language Supervision

A recent study by LeCun, Xie, and collaborators shows that large‑scale visual self‑supervised learning (Web‑SSL) can match or surpass CLIP on diverse VQA tasks, even without any language supervision, by scaling model size and data volume.

CLIP · Model Scaling · VQA

0 likes · 13 min read

AIWalker

Apr 7, 2025 · Artificial Intelligence

TurboFill: High‑Quality Image Inpainting in Just 4 Steps

TurboFill introduces a fast image‑inpainting model that trains an inpainting adapter on a few‑step text‑to‑image diffusion backbone, achieving state‑of‑the‑art results with only four diffusion steps while dramatically reducing computational cost.

TurboFill · computer vision · diffusion models

0 likes · 17 min read

AIWalker

Apr 6, 2025 · Artificial Intelligence

NOVA: Redefining Autoregressive Visual Modeling Without Vector Quantization

NOVA introduces a highly efficient autoregressive video generation framework that eliminates vector quantization, combines frame‑by‑frame causal prediction with set‑by‑set spatial attention, and achieves state‑of‑the‑art quality on VBench and GenEval while offering strong zero‑shot generalization across text‑to‑image and text‑to‑video tasks.

Benchmark Results · Nova · autoregressive video generation

0 likes · 14 min read

AIWalker

Apr 2, 2025 · Artificial Intelligence

EasyControl: Plug‑and‑Play DiT Control with Arbitrary Aspect Ratios and Accelerated Inference

EasyControl introduces a lightweight condition‑injection LoRA module, a position‑aware training paradigm, and causal attention with KV‑cache to enable plug‑and‑play multi‑condition control for DiT models, supporting arbitrary image resolutions while cutting inference latency by up to 30% and preserving high‑quality generation.

Conditional Generation · DiT · EasyControl

0 likes · 17 min read

AIWalker

Mar 31, 2025 · Artificial Intelligence

VBench-2.0: A Next‑Generation Benchmark for Intrinsic Faithfulness in AI Video Generation

VBench-2.0 expands the original VBench suite with five dimensions (Human Fidelity, Controllability, Creativity, Physics, and Commonsense), each split into fine‑grained capabilities, to evaluate not only the visual quality of generated videos but also their intrinsic faithfulness to physical laws, common sense, and narrative coherence. It provides open‑source tools, prompts, and human‑aligned metrics for the research community.

AI evaluation · Intrinsic Faithfulness · VBench

0 likes · 12 min read

AIWalker

Mar 27, 2025 · Artificial Intelligence

MagicColor: First Multi‑Instance AI Sketch‑Coloring System for Professional‑Grade Comics

MagicColor introduces a novel multi‑instance sketch‑coloring framework that uses a two‑stage self‑play training strategy, instance guidance, and edge‑aware pixel‑level color matching to automatically produce high‑quality, consistent colors for multiple line‑art instances, outperforming prior GAN and diffusion‑based methods.

AI · Multi-Instance · Sketch Colorization

0 likes · 16 min read

AIWalker

Mar 25, 2025 · Artificial Intelligence

ContinuousSR: Reconstructing Continuous High-Resolution Signals from Discrete Low-Resolution Images

ContinuousSR introduces a Pixel-to-Gaussian paradigm that models images as continuous Gaussian fields, enabling arbitrary‑scale super‑resolution with 0.9 dB PSNR gains and up to 19.5× faster rendering compared to existing methods.

Arbitrary-Scale SR · ContinuousSR · Pixel-to-Gaussian

0 likes · 5 min read

AIWalker

Mar 23, 2025 · Artificial Intelligence

One-Click Removal & Seamless Integration: CycleFlow + Diffusion Prior Power OmniPaint

OmniPaint introduces a unified diffusion‑based framework that achieves physically consistent object removal and insertion by leveraging a pre‑trained FLUX‑1 diffusion prior, a progressive CycleFlow training pipeline, and a novel reference‑free CFD metric for high‑fidelity image editing.

CFD Metric · CycleFlow · Image editing

0 likes · 17 min read

AIWalker

Mar 19, 2025 · Artificial Intelligence

How a Trainable HVI Color Space Turns Dark Photos into Cinematic Images

The paper introduces HVI, the first trainable color space for low‑light image enhancement, and a lightweight dual‑branch network CIDNet that jointly models intensity and chromaticity, eliminating color bias and brightness artifacts, achieving state‑of‑the‑art results on ten benchmark datasets with only 1.88 M parameters and 7.57 GFLOPs.

0 likes · 13 min read

AIWalker

Mar 18, 2025 · Artificial Intelligence

How ImageRAG Boosts Text‑to‑Image Generation with Retrieval‑Augmented Generation

ImageRAG introduces a retrieval‑augmented generation framework that dynamically fetches relevant images to guide diffusion models, dramatically improving the synthesis of rare and fine‑grained concepts across multiple text‑to‑image systems, as demonstrated by extensive quantitative and user studies.

AI generation · ImageRAG · Retrieval Augmented Generation

0 likes · 17 min read
