Tagged articles

3D reconstruction

33 articles

Jun 23, 2026 · Artificial Intelligence

How Three CVPR 2026 Performance‑Boosting Techniques Break Visual Task Bottlenecks

This article reviews three CVPR 2026 papers—AVGGT, MVP, and Online3R—detailing how re‑engineered global attention, multi‑view prediction, and online self‑supervised learning each dramatically improve efficiency, stability, or consistency of visual tasks such as multi‑view 3D reconstruction and GUI grounding.

3D reconstruction · CVPR 2026 · GUI grounding

0 likes · 8 min read

Machine Heart

Jun 20, 2026 · Artificial Intelligence

CameraSquad: Precise Camera Control and Multi‑View Consistency for Spatially Intelligent Video Models

CameraSquad introduces a parallel multi‑trajectory video generation framework that delivers precise camera control and cross‑view content consistency, enabling high‑quality 3D point‑cloud reconstruction and superior performance on benchmarks such as WebVid and HumanVid compared with prior camera‑controlled video methods.

3D reconstruction · CameraSquad · camera-controlled video

0 likes · 14 min read

Machine Heart

Jun 18, 2026 · Artificial Intelligence

Automating 3D Spatial Data: Holi‑Spatial’s 4M‑Scale Multimodal Dataset (ICML 2026 Oral)

Holi‑Spatial introduces a fully automatic pipeline that transforms raw video streams into high‑quality 3D geometry, depth, masks, 3D boxes, instance descriptions, grounding and spatial QA, producing the 4‑million‑item Holi‑Spatial‑4M dataset and substantially improving VLM spatial reasoning performance.

3D reconstruction · ICML 2026 · Large-Scale Data

0 likes · 14 min read

Huolala Tech

Jun 3, 2026 · Artificial Intelligence

Three Breakthroughs Driving the Rapid Rise of Computer Vision

The article reviews three major recent breakthroughs in computer vision—self‑supervised visual foundation models, feed‑forward 3D reconstruction, and unified multimodal models—detailing their underlying methods, key papers, performance characteristics, and practical implications for real‑world AI applications.

3D reconstruction · computer vision · multimodal models

0 likes · 22 min read

Xiaomi Tech

May 26, 2026 · Artificial Intelligence

Xiaomi Auto Unveils Integrated Reconstruction‑Generation World Model Framework Achieving SOTA on Major Benchmarks

Xiaomi Auto introduces a novel world‑model framework that tightly couples 3D reconstruction and generative prediction, delivering state‑of‑the‑art performance on Waymo and nuScenes benchmarks while enabling high‑fidelity, long‑duration video synthesis for autonomous‑driving scenarios.

3D reconstruction · Benchmark SOTA · Xiaomi Auto

0 likes · 10 min read

Machine Heart

May 14, 2026 · Artificial Intelligence

Breaking the 3D Perception Bottleneck: VGGT Series Enables Dynamic High‑Fidelity Reconstruction

The VGGT series from KOKONI 3D and collaborators tackles three core 3D perception limits—unbounded sequence memory, dynamic‑static entanglement, and compute‑precision trade‑offs—by introducing StreamCacheVGGT, progressive decoupling, and HD‑VGGT, achieving O(1) memory streaming, 15%+ accuracy gains on dynamic benchmarks, and record‑high AUC on RealEstate10K.

3D reconstruction · VGGT · computer vision

0 likes · 10 min read

Machine Heart

May 6, 2026 · Artificial Intelligence

Scal3R Enables Stable Kilometer-Scale 3D Reconstruction of Long Videos

Scal3R introduces test‑time training with a global‑context memory and synchronization mechanism that lets models train on and infer over ultra‑long video sequences, achieving accurate camera poses and dense point clouds for kilometer‑scale scenes while outperforming prior SLAM, SfM and streaming baselines on multiple benchmarks.

3D reconstruction · Scal3R · Test-Time Training

0 likes · 11 min read

Machine Heart

Apr 26, 2026 · Industry Insights

Is 3D Reconstruction the Spatial Foundation for Next‑Gen Models?

The article examines how 3D reconstruction is evolving from offline, single‑scene pipelines to continuous, streaming workflows that feed web distribution, robot simulation, visual positioning, spatial editing, and world‑generation systems, highlighting recent research, standards, and industry deployments.

3D reconstruction · Digital Twin · robotics

0 likes · 10 min read

Machine Heart

Apr 15, 2026 · Artificial Intelligence

From Clip Generation to Long‑Video Roaming: OmniRoam Enables Stable, Trajectory‑Controlled Video Synthesis

OmniRoam introduces a panoramic, coarse‑to‑fine framework that generates long, trajectory‑controlled videos with higher spatial consistency and temporal coherence, offering a stable and controllable alternative to short‑clip generation and supporting real‑time preview, high‑resolution refinement, and 3D reconstruction applications.

3D reconstruction · Generative AI · OmniRoam

0 likes · 8 min read

Machine Heart

Apr 2, 2026 · Artificial Intelligence

HSImul3R: Bridging Perception and Simulation for Physics‑Ready 3D Human‑Scene Interaction

HSImul3R introduces a physics‑in‑the‑loop reconstruction pipeline that closes the perception‑simulation gap by jointly optimizing human motion and scene geometry, leveraging reinforcement learning, direct simulation‑reward optimization, and a new HSIBench dataset to produce simulation‑ready 3D human‑scene interactions.

3D reconstruction · DSRO · HSIBench

0 likes · 12 min read

Data Party THU

Mar 29, 2026 · Artificial Intelligence

How LoGeR Enables Minute‑Long 3D Reconstruction with Hybrid Memory

The article presents LoGeR, a long‑context geometric reconstruction framework that combines test‑time‑training memory and sliding‑window attention to achieve minute‑scale, fully‑feedforward 3D reconstruction with superior accuracy on benchmarks such as KITTI and VBR.

3D reconstruction · Hybrid Memory · LoGeR

0 likes · 11 min read

HyperAI Super Neural

Mar 26, 2026 · Artificial Intelligence

MIT’s Wave‑Former Reconstructs Fully Occluded Objects with 85% Precision, Boosting Recall to 72%

MIT researchers introduce Wave‑Former, a physics‑aware, generative‑AI framework for mmWave sensing that achieves high‑precision 3D reconstruction of completely hidden objects, raising recall from 54% to 72% while maintaining 85% precision and outperforming existing baselines on real‑world datasets.

3D reconstruction · Generative AI · benchmark

0 likes · 15 min read

AI Frontier Lectures

Mar 16, 2026 · Artificial Intelligence

How LoGeR Extends 3D Reconstruction to Thousands of Frames with Hybrid Memory

LoGeR, a new long‑context geometric reconstruction framework from DeepMind and UC Berkeley, uses a hybrid memory module combining test‑time‑training (TTT) and sliding‑window attention (SWA) to enable feed‑forward 3D reconstruction over sequences of up to tens of thousands of frames, achieving state‑of‑the‑art accuracy on KITTI, VBR, 7‑Scenes, ScanNetV2 and TUM‑Dynamics benchmarks.

3D reconstruction · Deep Learning · Hybrid Memory

0 likes · 11 min read

Data Party THU

Feb 19, 2026 · Artificial Intelligence

How Data Priors and Scene Parameterization Boost 3D Indoor Reconstruction

This thesis investigates the two core challenges of data prior utilization and scene parameterization in multi‑view RGB‑based 3D indoor reconstruction, proposing novel representations and learning‑based methods to improve reconstruction quality, generalization, and applicability across AR, robotics, and autonomous navigation.

3D reconstruction · computer vision · data priors

0 likes · 8 min read

HyperAI Super Neural

Dec 6, 2025 · Artificial Intelligence

Quick Look at This Week’s Frontier AI Papers: DeepSeekMath‑V2, MedSAM‑3, SAM 3D, Qwen3‑VL, and M²

This roundup surveys five cutting‑edge AI papers—DeepSeekMath‑V2’s self‑verifiable mathematical reasoning, MedSAM‑3’s promptable medical image and video segmentation, SAM 3D’s single‑image 3D reconstruction, Qwen3‑VL’s high‑capacity vision‑language model, and the M² memory‑mesh transformer for image captioning—highlighting their key methods, benchmarks, and code links.

3D reconstruction · Image Captioning · Large Language Models

0 likes · 6 min read

AI Frontier Lectures

Nov 28, 2025 · Artificial Intelligence

How Meta’s SAM 3D Turns a Single Photo into Detailed 3D Models

Meta’s newly released SAM 3 and SAM 3D models enable single‑image 3D reconstruction and promptable segmentation, outperforming prior methods on benchmarks, introducing a shared perception encoder, a Presence Head to reduce hallucinations, and a two‑stage generation pipeline that produces high‑fidelity geometry and texture.

3D reconstruction · Meta · SAM 3

0 likes · 12 min read

Rare Earth Juejin Tech Community

Sep 26, 2025 · Artificial Intelligence

How AI is Revolutionizing 3D Content Creation for Immersive Experiences

The Volcano Engine Multimedia Lab showcases cutting‑edge AI‑driven 3D and VR technologies—including volume video, dual‑Gaussian modeling, topological‑aware representations, and the Beaver3D AIGC model—to lower creation barriers, enable real‑time immersive interaction, and bridge research breakthroughs with industry applications.

3D reconstruction · AI-generated 3D · AIGC3D

0 likes · 15 min read

Sohu Tech Products

Jul 30, 2025 · Artificial Intelligence

How 3D Gaussian Splatting Enables Low‑Cost 3D Reconstruction from Simple Videos

This article explains how 3D Gaussian Splatting transforms ordinary video footage into high‑quality 3D reconstructions with minimal equipment, outlines the low‑cost workflow using ffmpeg and COLMAP, and discusses practical challenges and future possibilities for the technology.

3D reconstruction · COLMAP · Gaussian splatting

0 likes · 5 min read

AI Frontier Lectures

Jun 19, 2025 · Industry Insights

What Made SIGGRAPH 2025’s Top Papers Stand Out? A Deep Dive into Award‑Winning Research

SIGGRAPH 2025 announced record‑breaking submissions and awarded five best papers, several honorable mentions, and a Test‑of‑Time prize, highlighting breakthroughs in 3D reconstruction, neural fields, Monte‑Carlo rendering, cloth simulation, and IMU calibration, with detailed author, institution, and technical insights provided.

3D reconstruction · AI · Monte Carlo rendering

0 likes · 13 min read

AI Frontier Lectures

Jun 16, 2025 · Artificial Intelligence

What Do the CVPR 2025 Awards Reveal About the Future of Computer Vision?

The CVPR 2025 awards spotlight groundbreaking work—from the VGGT transformer that predicts full 3D scenes in a single feed‑forward pass to neural inverse rendering that reconstructs geometry from time‑resolved light—offering a comprehensive view of emerging trends, novel architectures, and performance breakthroughs across computer‑vision research.

3D reconstruction · CVPR 2025 · Deep Learning

0 likes · 11 min read

AI Frontier Lectures

Apr 28, 2025 · Artificial Intelligence

How DP-Recon Uses Diffusion Models to Reconstruct 3D Scenes from Sparse Photos

DP-Recon leverages generative diffusion priors and a visibility‑guided SDS loss to achieve high‑fidelity, compositional 3D scene reconstruction from extremely sparse images, delivering superior geometry, texture, and text‑driven editing capabilities demonstrated on benchmark datasets and real‑world indoor scenarios.

3D reconstruction · AI · Diffusion Models

0 likes · 10 min read

Sohu Tech Products

Sep 11, 2024 · Artificial Intelligence

Low‑Cost 3D Reconstruction Using 3D Gaussian Splatting

This article explains how to create high‑quality 3D scenes from ordinary video footage by slicing frames with ffmpeg, extracting camera poses with COLMAP, and applying 3D Gaussian Splatting to replace traditional mesh‑texture pipelines, dramatically lowering equipment costs and data size.

3D reconstruction · COLMAP · FFmpeg

0 likes · 6 min read

AsiaInfo Technology: New Tech Exploration

Jan 12, 2024 · Artificial Intelligence

Exploring NeRF: From Theory to Real-World 3D Reconstruction Tools

This article introduces Neural Radiance Fields (NeRF) as a cutting‑edge AI technique for high‑quality 3D reconstruction, explains its core principles and advantages, outlines a step‑by‑step building workflow, reviews popular tools such as Luma AI, NVIDIA Instant NeRF and NeRFStudio, and closes with a forward‑looking summary of its potential and challenges.

3D reconstruction · AI · NeRF

0 likes · 12 min read

Bilibili Tech

Nov 1, 2023 · Artificial Intelligence

Neural Radiance Fields and Generative Intelligent Media: Recent Advances and Applications

Professor Hu Qiang presented recent progress in Neural Radiance Fields—covering implicit/explicit representations, hybrid models, and solutions for dynamic scenes, cloud‑based and edge‑cloud rendering—while also reviewing generative AI advances such as diffusion‑based text‑to‑image/video/3D, LoRA fine‑tuning, and large‑scale story‑book datasets, highlighting applications in virtual‑real content, smart‑city modeling, and 6‑DoF e‑commerce displays.

3D reconstruction · Diffusion Models · Generative AI

0 likes · 14 min read

DataFunTalk

Sep 28, 2023 · Artificial Intelligence

Panoramic Image Indoor Layout Estimation Using Vision Transformer (PanoViT)

This article introduces the PanoViT method for indoor layout estimation from panoramic images, covering research background, the transformer‑based architecture with backbone, vision transformer encoder, boundary‑enhancement and 3D loss modules, experimental results, and step‑by‑step usage in ModelScope.

3D reconstruction · indoor layout estimation · panoramic vision transformer

0 likes · 7 min read

DataFunSummit

Aug 24, 2023 · Artificial Intelligence

Panoramic Indoor Layout Estimation with Vision Transformer (PanoViT)

This article introduces the PanoViT model, a vision‑transformer‑based approach for indoor layout estimation from panoramic images, covering its research background, architectural components, experimental results on public datasets, and step‑by‑step usage within ModelScope.

3D reconstruction · Deep Learning · ModelScope

0 likes · 8 min read

DaTaobao Tech

Jun 14, 2023 · Artificial Intelligence

Optimizing NeRF for Real-Time Mobile 3D Rendering in Alibaba's Object Drawer

Alibaba’s Taobao engineers detail how they transformed slow, high‑quality NeRF reconstruction into a real‑time mobile solution by combining an Octree‑Tiny‑MLP architecture, SNeRG optimizations, and a high‑frequency voxel reduction that shrank models to ~5 MB and achieved ~6 FPS on low‑end Android phones, with sub‑1 MB models and 50 FPS as the stated targets.

3D reconstruction · Mobile Optimization · NeRF

0 likes · 10 min read

DaTaobao Tech

Jun 10, 2022 · Artificial Intelligence

NeRF-Editing: Geometry Editing of Neural Radiance Fields

NeRF‑Editing introduces an interactive framework that lets users freely deform the geometry of neural radiance fields by coupling an explicit mesh with implicit NeRF representations, propagating mesh vertex changes through tetrahedral ARAP optimization to bend rays during rendering, and enabling realistic edits and animations on synthetic and real‑world scenes. The work was reported at CVPR 2022 as a first of its kind.

3D reconstruction · ARAP deformation · NeRF

0 likes · 6 min read

Alibaba Terminal Technology

Mar 16, 2022 · Cloud Computing

Transforming Immersive Streaming with Free-Viewpoint Video: Capture to Cloud

This article explains the end‑to‑end workflow of free‑viewpoint video technology—from multi‑camera on‑site capture and hardware setup, through cloud‑based 3D reconstruction, depth estimation and encoding, to mobile SDK rendering—highlighting the technical challenges and optimizations that enable real‑time immersive streaming.

3D reconstruction · cloud rendering · free-viewpoint

0 likes · 10 min read

Kuaishou Tech

Sep 17, 2021 · Artificial Intelligence

SnowflakeNet: Point Cloud Completion by Snowflake Point Deconvolution with Skip-Transformer

SnowflakeNet introduces a novel Snowflake Point Deconvolution architecture combined with a Skip-Transformer to progressively split seed points, enabling high‑quality point‑cloud completion that preserves fine‑grained geometric details such as smooth surfaces, sharp edges, and corners across dense and sparse datasets.

3D reconstruction · Deep Learning · SnowflakeNet

0 likes · 10 min read

Kuaishou Tech

Apr 16, 2021 · Artificial Intelligence

Camera-Space Hand Mesh Recovery via Semantic Aggregation and Adaptive 2D-1D Registration

Camera-space hand mesh recovery (CMR) leverages semantic aggregation of 2D cues and adaptive 2D‑1D registration to predict absolute 3D hand pose and shape directly in camera coordinates, improving accuracy on benchmarks such as FreiHAND, RHD, and Human3.6M.

2D-1D registration · 3D reconstruction · camera space

0 likes · 17 min read

Suning Technology

Sep 17, 2020 · Artificial Intelligence

Unlocking Retail Innovation: 3D Digital Storebuilding with Multi‑Camera Vision

This article explores how 3D digital storebuilding integrates multiple visual sensors, GPU acceleration, and advanced camera calibration to create high‑precision, real‑time digital twins of retail spaces, enabling fine‑grained lifecycle management and immersive customer experiences.

3D reconstruction · GPU Acceleration · camera calibration

0 likes · 15 min read

Didi Tech

Sep 10, 2020 · Artificial Intelligence

Technical Overview of DiDi's AR Indoor Navigation System

DiDi's AR indoor navigation system addresses GPS unreliability in large indoor venues by using SfM-based 3D reconstruction, robust visual localization with magnetometer/GNSS priors, and sensor fusion with pedestrian dead‑reckoning and deep‑learning heading estimation, cutting passenger pick‑up time by up to 25% across dozens of airports and malls.

3D reconstruction · AR navigation · Indoor positioning

0 likes · 19 min read
