DataFunTalk
Sep 29, 2025 · Artificial Intelligence

How Glint-MVT Powers City‑Scale Multimodal AI: Insights from a Tech VP

In an interview ahead of the DACon conference, Dr. Feng Ziyong explains how Glint‑MVT and novel data‑synthesis techniques overcome distribution gaps, improve compositional understanding, and enable retrieval across billions of items within seconds for city‑scale surveillance, while balancing model efficiency and effectiveness.

city surveillance · data synthesis · embedding retrieval
0 likes · 11 min read
AI Algorithm Path
Aug 16, 2025 · Artificial Intelligence

Meta Unveils DINOv3: A Universal Self‑Supervised Visual AI for All Image Tasks

Meta's DINOv3 is a 7‑billion‑parameter self‑supervised visual foundation model trained without labels on 1.7 billion images curated from a pool of roughly 17 billion Instagram images. It introduces dense feature extraction, Gram Anchoring to prevent feature collapse, high‑resolution adaptation, and multi‑student distillation. Together these enable out‑of‑the‑box performance on segmentation, depth estimation, 3D matching, and tracking, surpassing prior models such as DINOv2, CLIP, and SAM.

DINOv3 · Gram Anchoring · Large-Scale Training
0 likes · 8 min read
AIWalker
May 12, 2025 · Artificial Intelligence

DefMamba: A Deformable Multi‑Scale Visual Foundation Model that Boosts Vision Tasks

DefMamba combines a multi‑scale backbone, deformable Mamba modules, and a dynamic scanning strategy that preserves the spatial structure of images, achieving state‑of‑the‑art performance on image classification, object detection, and semantic segmentation benchmarks.

DefMamba · computer vision · deformable state space
0 likes · 23 min read