Tagged articles
2 articles
AIWalker
Aug 13, 2025 · Artificial Intelligence

Look-Back Triggers Visual Reflection in Qwen-2.5-VL, +6.3% Perception

Look-Back is an implicit training paradigm that lets the Qwen‑2.5‑VL‑7B multimodal LLM autonomously refocus on its visual input during reasoning. It improves perception-task performance by 6.3% and outperforms prior baselines, with no extra image tokens and no changes to the model architecture.

Tags: Look-Back, Multimodal LLM, Qwen-2.5-VL
26 min read
ByteDance Web Infra
Feb 25, 2025 · Artificial Intelligence

Midscene.js Integrates Qwen‑2.5‑VL Model: Cost‑Effective, High‑Resolution UI Automation

Midscene.js v0.12 adds support for the Qwen‑2.5‑VL model, which delivers GPT‑4o‑level accuracy while cutting token usage and cost by up to 80%. It also enables interaction with canvas and iframe elements, accepts high‑resolution input, and is easy to configure through environment variables and a browser plugin.

Tags: Artificial Intelligence, Midscene.js, Qwen-2.5-VL
10 min read