Tagged articles

LVLM


Network Intelligence Research Center (NIRC)

Jan 11, 2026 · Artificial Intelligence

Insights from NeurIPS 2025: Modeling Distributions and Venturing Beyond Them

The report summarizes NeurIPS 2025 in San Diego, highlighting four NIRC papers on noise‑robust 3D human pose estimation, LVLM video‑anomaly understanding, and hand‑object reconstruction, and discusses broader industry trends such as feed‑forward generation and large‑scale pre‑training showcased by leading AI companies.

3D human pose estimation · AI research · LVLM

5 min read

Tencent Advertising Technology

Dec 4, 2025 · Artificial Intelligence

How POPEN Boosts LVLM Reasoning Segmentation with Preference Optimization and Ensemble

The paper introduces POPEN, a framework that combines preference‑based optimization with ensemble methods to reduce hallucinations and improve reasoning‑segmentation accuracy in large vision‑language models (LVLMs), achieving state‑of‑the‑art results on multiple benchmarks.

LVLM · Multimodal Models · Preference Optimization

14 min read

Data Party THU

Aug 22, 2025 · Artificial Intelligence

TwigVLM: How Tiny Branches Accelerate Large Vision‑Language Models

TwigVLM introduces a lightweight “twig” module that prunes visual tokens early and enables self‑speculative decoding, achieving up to a 154% speedup on long‑text generation while preserving 96% of the original LVLM's accuracy, as demonstrated with LLaVA‑1.5‑7B on multiple benchmarks.

LVLM · Speculative Decoding · model acceleration

14 min read

Data Party THU

Aug 11, 2025 · Artificial Intelligence

Can Hidden Signals Reveal Multimodal Model Jailbreaks? Introducing HiddenDetect

This article presents HiddenDetect, a training‑free method that leverages refusal‑semantic vectors and layer‑wise activation analysis to detect jailbreak attempts in multimodal large language models, revealing distinct safety signals across text and image modalities and demonstrating strong performance on several LVLM benchmarks.

LVLM · activation analysis · jailbreak detection

7 min read

AIWalker

May 22, 2025 · Artificial Intelligence

VisionReasoner: RL‑Unified System Beats YOLO‑World on Detection, Segmentation, Counting

VisionReasoner introduces a reinforcement‑learning‑driven unified framework that handles detection, segmentation, and counting within a single model. It improves COCO detection AP by 29.1%, ReasonSeg segmentation by 22.1%, and CountBench counting by 15.3%, while requiring only 7,000 training samples. Multi‑target matching is made efficient through batch computation and the Hungarian algorithm.

LVLM · Object Counting · Reinforcement Learning

19 min read