Xiaomi Auto Team Wins ICCV 2025 RealADSim Challenge with 3D Gaussian Splatting

The Xiaomi automotive research team secured first place in the ICCV 2025 RealADSim New View Synthesis challenge by presenting a four‑stage, 3D Gaussian Splatting‑based pipeline that tackles LiDAR‑free initialization, structured road modeling, generative pseudo‑ground‑truth refinement, and time‑invariant adaptation, achieving top PSNR, SSIM, and LPIPS scores.

Xiaomi Tech

The Xiaomi automotive team recently won the New View Synthesis track of the RealADSim challenge at ICCV 2025, an A‑class (top‑tier) computer‑vision conference, and presented its results at the RealADSim workshop.

Background and Motivation

RealADSim focuses on cutting‑edge problems such as autonomous driving simulation, visual navigation, and high‑fidelity scene synthesis. Validation of autonomous‑driving algorithms faces high cost and safety risks, making realistic simulation essential. Traditional game‑engine simulators suffer from noticeable visual‑style gaps between synthetic scenes and real road conditions, and building such scenes requires extensive manual effort. Pure video capture provides realistic visuals but cannot support closed‑loop interaction.

New View Synthesis (NVS) Challenge

NVS aims to create a 4D interactive driving simulation from real‑world video data. While current NVS methods render plausible interpolated views, they struggle with extrapolated viewpoints, leading to geometric inconsistencies and unrealistic appearances—issues critical for closed‑loop simulation and reinforcement learning.

Dataset and Evaluation

The organizers built a dataset from multiple real driving logs and defined quantitative metrics for extrapolated view synthesis. Test trajectories covered three challenging extrapolation types: single‑lane shift, double‑lane shift, and vertical lane movement at intersections. Among 19 valid teams, Xiaomi took first place with PSNR 18.228, SSIM 0.514, and LPIPS 0.288 (higher is better for PSNR and SSIM, lower is better for LPIPS).
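
For readers unfamiliar with the first metric, the following minimal sketch computes PSNR under its standard definition. It assumes pixel intensities normalized to [0, 1] and flat Python lists as images; the challenge's exact evaluation code is not described in the article, so the function name and conventions here are illustrative only.

```python
import math

def psnr(rendered, reference, max_value=1.0):
    """Peak signal-to-noise ratio (in dB) between two equally sized images.

    Images are flat lists of pixel intensities in [0, max_value].
    """
    if len(rendered) != len(reference) or not rendered:
        raise ValueError("images must be non-empty and the same size")
    mse = sum((a - b) ** 2 for a, b in zip(rendered, reference)) / len(rendered)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)

# A uniform absolute error of 0.1 gives MSE = 0.01, which is 20 dB.
reference = [0.5, 0.5, 0.5, 0.5]
rendered = [0.6, 0.4, 0.6, 0.4]
print(round(psnr(rendered, reference), 3))
```

If the challenge uses this definition on normalized images, a PSNR of 18.228 dB corresponds to a root-mean-square pixel error of roughly 0.12, which gives a sense of how hard extrapolated views remain.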

Four‑Stage Solution Architecture

The pipeline, built on 3D Gaussian Splatting (3DGS), targets geometric consistency and visual realism without LiDAR input. It consists of:

Unsupervised point‑cloud initialization (LiDAR‑free): generates a pseudo‑point cloud from visual data to avoid local minima.

Structured 3D scene modeling: incorporates a 2D‑SDF road surface model to embed strong geometric priors.

Generative extrapolation repair (Pseudo Ground Truth Optimization): introduces pseudo‑ground‑truth supervision to improve plausibility of extrapolated views.

Time‑invariance adaptation: learns to remove time‑dependent textures (lighting, shadows, clouds) for robust cross‑temporal generalization.

Key Challenges and Solutions

1. LiDAR‑free Model Initialization

Sparse SfM point clouds and random initialization both lead to unrealistic geometry in extrapolated views: 3DGS optimization lacks a good starting point, and most of the data is front‑view only. The team adopted VGGT to generate a pseudo‑point cloud from images alone, which significantly improves detail and prevents implausible geometry during extrapolation.
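
To illustrate how a point cloud can come from images alone, the sketch below back-projects a per-pixel depth map into 3D points with a pinhole camera model. It does not call VGGT; it assumes that some visual geometry model has already produced a depth map and intrinsics, and the function name `unproject_depth` and the toy values are hypothetical.

```python
def unproject_depth(depth, fx, fy, cx, cy, stride=1):
    """Back-project a depth map into camera-frame 3D points.

    depth: list of rows, depth[v][u] in metres (values <= 0 mark invalid pixels).
    fx, fy, cx, cy: pinhole intrinsics in pixels (assumed known).
    stride: keep every stride-th pixel to thin out the cloud.
    """
    points = []
    for v in range(0, len(depth), stride):
        for u in range(0, len(depth[v]), stride):
            z = depth[v][u]
            if z <= 0:
                continue  # skip pixels without a valid depth
            x = (u - cx) * z / fx
            y = (v - cy) * z / fy
            points.append((x, y, z))
    return points

# Toy 2x2 depth map with one invalid pixel.
depth = [[2.0, 2.0], [0.0, 4.0]]
points = unproject_depth(depth, fx=1.0, fy=1.0, cx=0.5, cy=0.5)
print(points)
```

Points produced this way would still need to be transformed into a common world frame with each camera's pose before they could seed the Gaussians.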

2. Road Surface Modeling for Geometric Consistency

Road surfaces experience large viewpoint changes in extrapolation, and texture scarcity makes direct 3DGS or NeRF representations prone to distortion. The authors introduced a 2D‑SDF representation that treats the road as a locally planar surface with smooth global slope variations, converting a 3D SDF field into two 2D fields. This constraint yields better learning efficiency and preserves geometry in extrapolated views, outperforming pure 3DGS.
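
The article does not spell out what the two 2D fields store, so the following sketch is only a stand-in for the idea: it represents the road with a single 2D height grid h(x, y) and derives a 3D signed distance from it. The class name `RoadHeightField`, the grid layout, and the first-order distance formula are assumptions for illustration, not the team's formulation.

```python
import math

class RoadHeightField:
    """Road surface stored as a 2D grid of heights z = h(x, y)."""

    def __init__(self, heights, cell_size):
        # heights[j][i] is the height at x = i * cell_size, y = j * cell_size.
        self.heights = heights
        self.cell_size = cell_size

    def height_and_gradient(self, x, y):
        """Bilinearly interpolate the height and its slope (dh/dx, dh/dy)."""
        gx, gy = x / self.cell_size, y / self.cell_size
        i = min(max(int(math.floor(gx)), 0), len(self.heights[0]) - 2)
        j = min(max(int(math.floor(gy)), 0), len(self.heights) - 2)
        tx, ty = gx - i, gy - j
        h00, h10 = self.heights[j][i], self.heights[j][i + 1]
        h01, h11 = self.heights[j + 1][i], self.heights[j + 1][i + 1]
        h = (h00 * (1 - tx) * (1 - ty) + h10 * tx * (1 - ty)
             + h01 * (1 - tx) * ty + h11 * tx * ty)
        dh_dx = ((h10 - h00) * (1 - ty) + (h11 - h01) * ty) / self.cell_size
        dh_dy = ((h01 - h00) * (1 - tx) + (h11 - h10) * tx) / self.cell_size
        return h, dh_dx, dh_dy

    def signed_distance(self, x, y, z):
        """First-order signed distance to the road; exact when the road is a plane."""
        h, dh_dx, dh_dy = self.height_and_gradient(x, y)
        return (z - h) / math.sqrt(1.0 + dh_dx ** 2 + dh_dy ** 2)

# A 3x3 grid with 1 m cells: a plane rising 0.75 m per metre along x.
road = RoadHeightField([[0.0, 0.75, 1.5]] * 3, cell_size=1.0)
print(road.signed_distance(1.0, 1.0, 2.0))
```

Because the surface is parameterized over the ground plane, every query touches only a few 2D cells, which hints at why a 2D representation can be cheaper to learn and harder to distort than a free-form 3D field.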

3. Generative Repair for Non‑Road Objects

Objects such as curbs and vegetation lack fixed geometric priors, which causes severe distortion and floating artifacts in extrapolated frames. Building on prior work (DriveDreamer4D, StreetCrafter, ReconDreamer), the team used a generative repair model (Difix3D+) to synthesize pseudo‑ground‑truth images along extrapolated trajectories. These images supply supervision for an LPIPS loss while mitigating the high‑frequency texture artifacts that would otherwise degrade PSNR and SSIM.
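
Synthesizing pseudo-ground-truth along an extrapolated trajectory first requires the camera poses of that trajectory. As a rough sketch, a lane-shifted pose can be derived from a recorded one by translating along the camera's right axis. The pose convention, the `LANE_WIDTH_M` value, and the function name are assumptions made for this example; rendering and the Difix3D+ repair step are not shown.

```python
def shift_pose_laterally(cam_to_world, offset_m):
    """Return a copy of a 4x4 camera-to-world pose moved along the camera's x axis.

    cam_to_world: row-major 4x4 nested list. Column 0 of the rotation is taken
    to be the camera's right direction in world coordinates (assumed convention).
    """
    shifted = [row[:] for row in cam_to_world]
    for r in range(3):
        shifted[r][3] += offset_m * cam_to_world[r][0]
    return shifted

LANE_WIDTH_M = 3.5  # assumed lane width, not a value from the article

# Camera at (10, 0, 1.5) looking along world +x; its right axis is world -y.
pose = [
    [0.0, 0.0, 1.0, 10.0],
    [-1.0, 0.0, 0.0, 0.0],
    [0.0, -1.0, 0.0, 1.5],
    [0.0, 0.0, 0.0, 1.0],
]

one_lane_right = shift_pose_laterally(pose, LANE_WIDTH_M)
two_lanes_left = shift_pose_laterally(pose, -2 * LANE_WIDTH_M)
print([row[3] for row in one_lane_right[:3]])
print([row[3] for row in two_lanes_left[:3]])
```

Only the translation changes, so the shifted camera keeps the original viewing direction, matching the single-lane and double-lane shift scenarios in spirit.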

4. Adapting to Varying Time and Lighting Conditions

The dataset includes recordings from different times of day, so lighting and scene details vary between logs. The solution trains an adaptation network across multiple driving logs: a 3DGS scene is reconstructed from one trajectory and rendered with the camera parameters of another trajectory, and the adaptation network is then supervised with a loss against that trajectory's real images.
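
The cross-log supervision idea can be illustrated with a far simpler model than a neural network. The sketch below swaps in a global gain-and-offset correction per channel, fitted by least squares so that pixels rendered from one log's reconstruction match the real pixels of another log. This is a deliberately simplified stand-in, and all names and values are hypothetical.

```python
def fit_gain_offset(rendered, real):
    """Least-squares fit of real ~= gain * rendered + offset for one channel."""
    n = len(rendered)
    mean_r = sum(rendered) / n
    mean_t = sum(real) / n
    var_r = sum((r - mean_r) ** 2 for r in rendered)
    if var_r == 0:
        return 1.0, mean_t - mean_r  # flat input: only an offset is identifiable
    cov = sum((r - mean_r) * (t - mean_t) for r, t in zip(rendered, real))
    gain = cov / var_r
    return gain, mean_t - gain * mean_r

def adapt(rendered, gain, offset):
    """Apply the fitted correction and clamp to the valid [0, 1] range."""
    return [min(max(gain * r + offset, 0.0), 1.0) for r in rendered]

# Pixels rendered from the log-A reconstruction at log-B camera poses,
# and the same pixels as actually observed in log B (a darker scene).
rendered = [0.2, 0.4, 0.6, 0.8]
real = [0.15, 0.25, 0.35, 0.45]

gain, offset = fit_gain_offset(rendered, real)
adapted = adapt(rendered, gain, offset)
print(round(gain, 3), round(offset, 3))
```

A learned network plays the same role but can model spatially varying effects such as shadows, which a single gain and offset cannot capture.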

Experiments and Ablation Study

Results (top‑5 leaderboard excerpt) show the impact of each stage:

Experiment 2: Adding 2D‑SDF improves LPIPS while maintaining PSNR and SSIM.

Experiment 3: Pseudo‑LiDAR initialization further boosts all metrics and enhances detail recovery for near objects.

Experiment 4: Generative pseudo‑ground‑truth supervision yields a large LPIPS gain and removes floating artifacts.

Experiment 5: The Time‑Invariance Adaptation network achieves optimal performance across spatio‑temporal extrapolation.

Visualization

Qualitative visualizations in the original article show realistic extrapolated views with consistent road geometry and fewer artifacts.

The presented solution provides a high‑fidelity, interactive autonomous‑driving simulation pipeline, confirming the potential of pure‑vision approaches to lower validation costs and safety risks while advancing the reliability of autonomous‑driving technologies.
