AIWalker
Author

Focused on computer vision, image processing, color science, and AI algorithms; sharing hardcore tech, engineering practice, and deep insights as a diligent AI technology practitioner.

153 Articles · 0 Likes · 0 Views · 0 Comments
Recent Articles

Latest from AIWalker

100 recent articles max
Mar 8, 2026 · Artificial Intelligence

How VisionPangu’s 1.7B Model Beats Larger LLMs in Detailed Image Captioning

VisionPangu demonstrates that a compact 1.7B‑parameter multimodal model can generate richly detailed, coherent image descriptions that rival those of much larger models by leveraging high‑quality dense data, a three‑part architecture, and a two‑stage deep alignment training strategy.

AI research · Data quality · Image Captioning
0 likes · 13 min read
Mar 8, 2026 · Artificial Intelligence

FireRed-Image-Edit v1.1 Boosts OOTD Element Fusion and Portrait Consistency

The Super Intelligence team at Xiaohongshu unveils FireRed-Image-Edit v1.1, an open‑source image‑editing model that dramatically improves ID‑consistent edits, multi‑element OOTD fusion, portrait makeup, and font style rendering while delivering end‑to‑end generation in 4.5 seconds on 30 GB VRAM, backed by a full training‑distillation pipeline and a technical report on arXiv.

AI model · FireRed-Image-Edit · LoRA
0 likes · 10 min read
Mar 7, 2026 · Artificial Intelligence

YOLO-Master v2026.02 Unveils Four Innovations for SOTA Object Detection

Tencent’s YOLO-Master v2026.02 adds a Mixture‑of‑Experts architecture, zero‑overhead LoRA fine‑tuning, Sparse SAHI inference for large images, and Cluster‑Weighted NMS, delivering 3‑5× faster inference, up to 70% lower training resource usage, and markedly higher detection accuracy across diverse benchmarks.

LoRA · Mixture of Experts · Model Optimization
0 likes · 15 min read
Mar 6, 2026 · Artificial Intelligence

VA‑π: Pixel‑Level Alignment Achieves 50% FID Reduction with 25‑Minute Fine‑Tuning

The paper introduces VA‑π, a lightweight post‑training framework that aligns pixel‑level reconstruction with autoregressive generation using variational inference and reinforcement learning, achieving up to 50% FID reduction after just 25 minutes of fine‑tuning on LlamaGen‑XXL.

AR Models · Pixel Alignment · Variational Inference
0 likes · 14 min read
Mar 5, 2026 · Artificial Intelligence

How ViDA-UGC Leverages Large Multimodal Models for Fine-Grained Visual Quality Assessment

The article introduces ViDA-UGC, a large‑scale UGC visual‑quality dataset and its companion benchmark ViDA‑Bench, explains the MILP‑driven sampling, expert annotation pipeline, and CoT‑based evaluation framework, and shows how fine‑tuning popular multimodal LLMs on this data markedly improves low‑level quality perception, grounding, and description capabilities.

benchmark · chain of thought · dataset
0 likes · 12 min read
Mar 4, 2026 · Artificial Intelligence

Drifting Models Enable One‑Step Generation, Shattering Speed Records

The paper introduces Drifting Models, a new generative paradigm that moves the distribution evolution to the training phase, achieving true one‑step (1‑NFE) generation with state‑of‑the‑art ImageNet FID scores of 1.54 in latent space and 1.61 in pixel space, while eliminating the need for distillation or classifier‑free guidance.

Drifting Models · ImageNet · One-step Generation
0 likes · 24 min read
Mar 3, 2026 · Artificial Intelligence

How NanoSD Cuts 90% Parameters to Enable Real‑Time Photo Editing on Mobile

NanoSD distills Stable Diffusion 1.5 into a 130M‑parameter model that runs inference in 20 ms on a Qualcomm SM8750 NPU, using hardware‑aware module pruning, module‑level knowledge distillation, and Bayesian optimization to achieve Pareto‑optimal quality‑efficiency trade‑offs for on‑device image restoration.

Stable Diffusion · bayesian optimization · knowledge distillation
0 likes · 14 min read
Mar 3, 2026 · Artificial Intelligence

RetouchIQ’s Instruction‑Driven AI Editing Overcomes Traditional Retouching Limits

RetouchIQ introduces an instruction‑driven AI retouching system that uses a general reward model to interpret abstract user commands, delivering precise image adjustments with higher semantic consistency and visual naturalness than existing multimodal large language models, thereby lowering the technical barrier for cinematic‑style edits.

AI Image Editing · RetouchIQ · Reward Model
0 likes · 3 min read
Mar 1, 2026 · Artificial Intelligence

How X2HDR Enables AI to Achieve True Transparent HDR Imaging

X2HDR tackles the long‑standing HDR generation problem by converting color data into a perceptually uniform space and applying lightweight LoRA fine‑tuning, dramatically boosting visual fidelity while slashing data and compute demands for film, gaming, and VR.

AI · HDR imaging · LoRA
0 likes · 3 min read
Feb 27, 2026 · Artificial Intelligence

YOLO26 Review: End-to-End, NMS‑Free Edge AI Boosts CPU Inference by 43%

This article analyzes YOLO26’s architecture redesign that eliminates NMS, removes DFL, introduces progressive loss balancing, STAL, and the MuSGD optimizer, achieving up to 43% faster CPU inference and simplifying deployment for edge vision tasks across detection, segmentation, classification, pose estimation, and OBB.

CPU inference · NMS-free · YOLO26
0 likes · 13 min read