AIWalker
Author


Focused on computer vision, image processing, color science, and AI algorithms. Shares in-depth technical write-ups, engineering practice, and analysis from the perspective of a working AI practitioner.

153 Articles · 0 Likes · 0 Views · 0 Comments
Recent Articles

Latest from AIWalker

100 recent articles max
AIWalker
Mar 17, 2025 · Artificial Intelligence

How UNIFIEDREWARD Breaks Task Boundaries to Boost Image and Video Performance

The paper introduces UNIFIEDREWARD, the first unified reward model for multimodal understanding and generation that supports pairwise ranking and pointwise scoring, builds a 236K human‑preference dataset across image and video tasks, and uses DPO to align VLMs and diffusion models, achieving significant performance gains on both image and video benchmarks.

Direct Preference Optimization · Image Generation · Preference Modeling
0 likes · 19 min read
AIWalker
Mar 16, 2025 · Industry Insights

Understanding HDR Vivid: Key Features, Standards, and Industry Adoption

HDR Vivid, a Chinese HDR standard introduced by the CUVA alliance, expands dynamic range, color gamut, and bit depth, and adds dynamic metadata, tone mapping, and saturation control. The article compares it with HDR10, HDR10+, Dolby Vision, and HLG, and covers its technical advantages and current industry adoption.

Dynamic Metadata · HDR · Video Standards
0 likes · 10 min read
AIWalker
Mar 16, 2025 · Artificial Intelligence

VideoPainter: Plug‑and‑Play Video Inpainting and Editing Sets 8 SOTA Benchmarks

VideoPainter introduces a plug‑and‑play dual‑branch framework for video inpainting and editing, featuring a lightweight context encoder, ID‑consistent resampling, and the large VPData/VPBench datasets, and achieves state‑of‑the‑art results across eight quantitative and qualitative metrics.

Diffusion Models · Dual-Branch Architecture · ID resampling
0 likes · 15 min read
AIWalker
Mar 15, 2025 · Artificial Intelligence

How SANA 1.5 Lets Small Models Reach New Text‑to‑Image SOTA

SANA 1.5 combines an efficient model‑growth pipeline, depth pruning, and inference‑time scaling. It reuses a 1.6B‑parameter foundation to train a 4.8B model with 8× lower memory and 60% less training time, and reaches GenEval scores that rival or surpass much larger diffusion models.

diffusion · efficient training · inference scaling
0 likes · 17 min read
AIWalker
Mar 14, 2025 · Artificial Intelligence

Dynamic Tanh Lets He Kaiming and LeCun Drop Transformer Normalization in 9 Lines

Researchers He Kaiming, Yann LeCun and colleagues propose a 9‑line Dynamic Tanh (DyT) layer that replaces LayerNorm/RMSNorm in Transformers, showing comparable or superior accuracy across vision, language, speech and DNA tasks while also reducing inference latency on modern GPUs.

AI research · Dynamic Tanh · Model Efficiency
0 likes · 18 min read
AIWalker
Mar 13, 2025 · Artificial Intelligence

VideoPainter: Plug‑and‑Play Video Inpainting and Editing Achieves 8 SOTA Benchmarks

VideoPainter introduces a plug‑and‑play dual‑branch framework with a lightweight context encoder and ID‑resampling adapter, built on the massive VPData/VPBench dataset, and demonstrates state‑of‑the‑art performance across eight video restoration and editing metrics, while supporting flexible model integration and long‑video consistency.

Dual-Branch Architecture · ID Consistency · Plug-and-Play
0 likes · 18 min read
AIWalker
Mar 13, 2025 · Artificial Intelligence

YOLOE: Real‑Time Open‑World Object Detection and Segmentation Unveiled

The paper introduces YOLOE, a new YOLO‑based model that supports text, visual, and no‑prompt open‑world detection and segmentation, detailing its lightweight RepRTA, SAVPE, and LRPC modules and showing benchmark gains in speed and zero‑shot performance on LVIS and COCO.

YOLOE · benchmark · computer vision
0 likes · 9 min read
AIWalker
Mar 11, 2025 · Artificial Intelligence

MobileMamba: Lightweight Multi‑Receptive‑Field Backbone Beats Existing Mamba Models

MobileMamba introduces a three‑stage, lightweight backbone with a multi‑receptive‑field feature‑interaction module that combines wavelet‑enhanced Mamba, multi‑kernel depthwise convolutions, and redundant‑mapping reduction, delivering up to 83.6% ImageNet Top‑1 accuracy while running 21× faster than LocalVim and 3.3× faster than EfficientVMamba.

CNN · Mamba · MobileMamba
0 likes · 10 min read
AIWalker
Mar 11, 2025 · Artificial Intelligence

Introducing FAR: A Frequency‑Progressive Autoregressive Paradigm for Image Generation

The paper presents FAR, a frequency‑aware autoregressive framework that predicts image tokens from low‑frequency to high‑frequency components using a continuous tokenizer, and demonstrates its efficiency and quality on ImageNet and text‑to‑image benchmarks compared with existing AR and VAR methods.

AI research · FAR · Image Generation
0 likes · 20 min read
AIWalker
Mar 10, 2025 · Artificial Intelligence

HSR-Mamba Solves Mamba’s HSISR Issue with Dual Strategies, Beats Prior Methods

HSR-Mamba introduces a contextual spatial‑spectral state‑space model that tackles Mamba's limitations in hyperspectral image super‑resolution (HSISR) through a local partition mechanism and a global spectral rearrangement strategy, achieving significantly better PSNR, SSIM, and SAM results than existing approaches while using fewer parameters and FLOPs.

Dual strategy · HSI super-resolution · Mamba
0 likes · 25 min read