Tagged articles

open-vocabulary

5 articles · Page 1 of 1

May 15, 2026 · Artificial Intelligence

FreeOcc: The First Training‑Free Open‑Vocabulary 3D Occupancy Mapping System (RSS‑2026)

FreeOcc is a training‑free, open‑vocabulary 3D occupancy prediction framework that combines SLAM‑based pose estimation, 3D Gaussian Splatting, and pretrained vision‑language models to build globally consistent semantic maps. It reports a more than twofold IoU improvement on EmbodiedOcc‑ScanNet and strong zero‑shot generalization on the new ReplicaOcc benchmark.

3D Gaussian · FreeOcc · SLAM

19 min read

Machine Heart

May 5, 2026 · Artificial Intelligence

Monocular Open‑Vocabulary Occupancy Prediction Sets New SOTA for Indoor 3D Scenes (CVPR 2026 Oral)

The paper introduces LegoOcc, a monocular open‑vocabulary occupancy framework that unifies geometry and semantics through language‑embedded Gaussians and uses Poisson‑based aggregation with progressive temperature decay. It more than doubles the previous best mIoU on Occ‑ScanNet while running at 22.47 FPS, which makes it well suited to embodied robots.

3D Vision · CVPR 2026 · Monocular

12 min read

AIWalker

May 18, 2025 · Artificial Intelligence

YOLOE: Open‑Source Real‑Time Anything Detector Beats YOLO‑World v2

YOLOE unifies object detection and segmentation in a single efficient model that supports text prompts, visual prompts, and prompt‑free inference. It introduces the RepRTA, SAVPE, and LRPC strategies and achieves higher AP with up to three‑fold lower training cost and 1.4× faster inference on GPUs and mobile devices, as shown in extensive LVIS and COCO experiments.

Prompt engineering · Real-time · YOLOE

29 min read

Huolala Tech

Jan 25, 2024 · Artificial Intelligence

How Open‑Vocabulary Detection and Segment‑Anything Are Revolutionizing Visual AI at Huolala

This article reviews traditional computer‑vision tasks—classification, detection, and segmentation—highlights their limitations, introduces open‑vocabulary detection and segment‑anything models such as GLIP, Grounding DINO, and SAM, and details how Huolala applies these advances to driver‑license, packing, and vehicle‑sticker inspections for safer, more efficient AI‑driven operations.

Segmentation · computer vision · object detection

20 min read

Xiaohongshu Tech REDtech

Jun 20, 2023 · Artificial Intelligence

Open-Vocabulary Object Attribute Recognition with OvarNet: A Unified Framework for Detection and Attribute Classification

At CVPR 2023, the Xiaohongshu team presented OvarNet, a unified one‑stage Faster‑RCNN model built on CLIP that uses prompt learning and knowledge distillation to jointly detect objects and recognize open‑vocabulary attributes, achieving state‑of‑the‑art results on the VAW, MS‑COCO, LSA, and OVAD datasets.

Knowledge Distillation · Multimodal Learning · attribute recognition