Tagged articles
5 articles
Machine Heart
May 15, 2026 · Artificial Intelligence

FreeOcc: The First Training‑Free Open‑Vocabulary 3D Occupancy Mapping System (RSS‑2026)

FreeOcc is a training‑free, open‑vocabulary 3D occupancy prediction framework that combines SLAM‑based pose estimation, 3D Gaussian Splatting, and pretrained vision‑language models to build globally consistent semantic maps. It more than doubles IoU on EmbodiedOcc‑ScanNet and shows strong zero‑shot generalization on the new ReplicaOcc benchmark.

3D Gaussian · FreeOcc · SLAM
19 min read
Machine Heart
May 5, 2026 · Artificial Intelligence

Monocular Open‑Vocabulary Occupancy Prediction Sets New SOTA for Indoor 3D Scenes (CVPR 2026 Oral)

The paper introduces LegoOcc, a monocular open‑vocabulary occupancy framework that unifies geometry and semantics via language‑embedded Gaussians, uses Poisson‑based aggregation and progressive temperature decay, and achieves over twice the previous mIoU on Occ‑ScanNet while running at 22.47 FPS, making it well suited for embodied robots.

3D vision · CVPR 2026 · Monocular
12 min read
AIWalker
May 18, 2025 · Artificial Intelligence

YOLOE: Open‑Source Real‑Time Anything Detector Beats YOLO‑World v2

YOLOE unifies object detection and segmentation in a single efficient model that supports text, visual, and prompt‑free inference, introduces RepRTA, SAVPE, and LRPC strategies, and achieves higher AP with up to three‑fold lower training cost and 1.4× faster inference on GPUs and mobile devices, as demonstrated by extensive LVIS and COCO experiments.

Computer Vision · Prompt engineering · Real-Time
29 min read
Huolala Tech
Jan 25, 2024 · Artificial Intelligence

How Open‑Vocabulary Detection and Segment‑Anything Are Revolutionizing Visual AI at Huolala

This article reviews traditional computer‑vision tasks—classification, detection, and segmentation—highlights their limitations, introduces open‑vocabulary detection and segment‑anything models such as GLIP, Grounding DINO, and SAM, and details how Huolala applies these advances to driver‑license, packing, and vehicle‑sticker inspections for safer, more efficient AI‑driven operations.

Computer Vision · Segmentation · object detection
20 min read
Xiaohongshu Tech REDtech
Jun 20, 2023 · Artificial Intelligence

Open-Vocabulary Object Attribute Recognition with OvarNet: A Unified Framework for Detection and Attribute Classification

At CVPR 2023, the Xiaohongshu team presented OvarNet, a unified one‑stage Faster‑RCNN model built on CLIP that uses prompt learning and knowledge distillation to jointly detect objects and recognize open‑vocabulary attributes, achieving state‑of‑the‑art results on the VAW, MS‑COCO, LSA, and OVAD datasets.

Computer Vision · Multimodal Learning · attribute recognition
12 min read