Can Diffusion Models Turn Noisy GPS into Sub‑Meter Visual Localization?
The DiffVL framework recasts visual localization as a diffusion‑based GPS denoising task: it starts from a noisy GPS reading and refines it using visual cues in bird's‑eye view (BEV) together with standard‑definition (SD) maps, reaching sub‑meter accuracy without high‑definition (HD) maps. The authors report that it outperforms existing approaches in extensive autonomous‑driving experiments.
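To make "diffusion‑based denoising" concrete, here is a minimal toy sketch of the general idea on a 2D position, not DiffVL's actual implementation. It assumes a linear noise schedule and a deterministic DDIM‑style reverse update. The names `make_alpha_bars`, `add_noise`, `ddim_step`, `denoise`, and `oracle` are hypothetical. In particular, `oracle` stands in for a learned noise predictor, which in a real system would be a network conditioned on BEV features and SD‑map context and would not have access to the true pose.

```python
import math
import random

def make_alpha_bars(num_steps, beta_start=1e-4, beta_end=0.02):
    """Cumulative products of (1 - beta_t) for a linear beta schedule (assumed)."""
    alpha_bars, prod = [], 1.0
    for t in range(num_steps):
        beta = beta_start + (beta_end - beta_start) * t / max(num_steps - 1, 1)
        prod *= 1.0 - beta
        alpha_bars.append(prod)
    return alpha_bars

def add_noise(x0, alpha_bar, eps):
    """Forward process: x_t = sqrt(abar) * x0 + sqrt(1 - abar) * eps."""
    a, s = math.sqrt(alpha_bar), math.sqrt(1.0 - alpha_bar)
    return tuple(a * x + s * e for x, e in zip(x0, eps))

def ddim_step(xt, eps_hat, alpha_bar_t, alpha_bar_prev):
    """Deterministic DDIM update from step t to the previous, less noisy step."""
    a_t, s_t = math.sqrt(alpha_bar_t), math.sqrt(1.0 - alpha_bar_t)
    # Estimate the clean position implied by the predicted noise.
    x0_hat = tuple((x - s_t * e) / a_t for x, e in zip(xt, eps_hat))
    a_p, s_p = math.sqrt(alpha_bar_prev), math.sqrt(1.0 - alpha_bar_prev)
    return tuple(a_p * x0 + s_p * e for x0, e in zip(x0_hat, eps_hat))

def denoise(xt, alpha_bars, predict_noise):
    """Run the reverse process from the last step down to step 0."""
    for t in range(len(alpha_bars) - 1, -1, -1):
        eps_hat = predict_noise(xt, t)
        alpha_bar_prev = alpha_bars[t - 1] if t > 0 else 1.0
        xt = ddim_step(xt, eps_hat, alpha_bars[t], alpha_bar_prev)
    return xt

# Toy setup: a "true" position in a local metric frame (meters, illustrative values).
true_pose = (12.0, -3.5)
alpha_bars = make_alpha_bars(50)
rng = random.Random(0)
eps = (rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0))
noisy_pose = add_noise(true_pose, alpha_bars[-1], eps)

def oracle(xt, t):
    """Hypothetical perfect noise predictor; a trained model would replace this."""
    a, s = math.sqrt(alpha_bars[t]), math.sqrt(1.0 - alpha_bars[t])
    return tuple((x - a * x0) / s for x, x0 in zip(xt, true_pose))

recovered = denoise(noisy_pose, alpha_bars, oracle)
```

With the oracle predictor, the reverse process recovers `true_pose` up to floating‑point error. A learned predictor would only approximate the noise, so the quality of the recovered position would depend on how informative the conditioning signals are.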
