Tagged articles

Depth Estimation

13 articles · Page 1 of 1

Jun 9, 2026 · Artificial Intelligence

Why Standard Vision‑Language Models + Scale Data Beat Specialized 3D Vision Designs (VLM³)

Meta’s VLM³ demonstrates that a plain vision‑language model, when trained on large‑scale data with simple camera‑focal‑length and pixel‑space normalization, matches or surpasses expert 3D vision models across monocular depth estimation, object‑level understanding, pixel‑matching and camera‑pose tasks, eliminating the need for task‑specific architectures, loss functions, data augmentations or regression formulations.

3D Vision · Depth Estimation · Meta

0 likes · 6 min read

HyperAI Super Neural

Jun 8, 2026 · Artificial Intelligence

Meta’s VLM³ Boosts Depth Accuracy to 0.9 Using Qwen3‑VL‑4B for Unified 3D Tasks

Meta and Princeton introduce VLM³, a unified vision‑language framework built on Qwen3‑VL‑4B that models depth estimation, object‑level 3D understanding, pixel matching and camera pose estimation without extra encoders, achieving up to 0.90 depth accuracy and outperforming larger specialist models on multiple benchmarks.

3D Perception · Depth Estimation · Multi-Task Learning

0 likes · 15 min read

HyperAI Super Neural

Dec 22, 2025 · Artificial Intelligence

DA3 Enables Arbitrary‑View 3D Reconstruction with a Single Transformer

The ByteDance‑Seed team introduces Depth Anything 3 (DA3), a minimalist visual‑geometry model that uses a vanilla Transformer backbone and depth‑ray representation to jointly predict depth and camera pose from any number of images, achieving state‑of‑the‑art performance with a 35.7% gain in pose accuracy and a 23.6% improvement in geometric precision over prior methods.

3D Vision · DA3 · Depth Estimation

0 likes · 6 min read

JD Cloud Developers

Apr 22, 2025 · Artificial Intelligence

How AI Turns 2D Videos into Immersive 3D Spatial Content at Scale

Leveraging 3D vision and AIGC, JD Retail’s R&D team converts abundant 2D video assets into high‑quality stereoscopic 3D space videos through a pipeline that includes monocular depth estimation, novel view synthesis, multi‑branch inpainting, and MV‑HEVC encoding, validated by ICME 2025 and a new StereoV1K dataset.

3D video · AIGC · Depth Estimation

0 likes · 26 min read

JD Tech

Apr 21, 2025 · Artificial Intelligence

End-to-End 3D Spatial Video Generation via Monocular Depth Estimation, Novel View Synthesis, and MV‑HEVC Encoding

This article presents a comprehensive AI‑driven pipeline that converts 2D video into immersive 3D spatial video by leveraging monocular depth estimation, depth‑warping novel view synthesis, a multi‑branch inpainting module, a large‑scale StereoV1K dataset, and efficient MV‑HEVC compression, with results validated at ICME 2025 and deployed in JD Vision services.

3D video · AI · AIGC

0 likes · 20 min read
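
The depth‑warping step named in the summary above can be illustrated with a generic forward warp. This is a minimal sketch of the textbook idea, not the article's pipeline: the function name `warp_row`, the parameters `focal_px` and `baseline`, and the leftward shift direction are all illustrative assumptions. Each pixel moves by a disparity inversely proportional to its depth, nearer pixels win when two land on the same target, and unfilled targets are the holes that an inpainting stage would later fill.

```python
def warp_row(row, depth, focal_px, baseline):
    """Forward-warp one image row to a horizontally shifted viewpoint.

    row:      pixel values for a single scanline
    depth:    positive depth per pixel, same length as row
    focal_px: focal length in pixels (assumed parameter)
    baseline: distance between the two virtual cameras (assumed parameter)

    Returns the warped row; None marks disoccluded holes.
    """
    width = len(row)
    out = [None] * width
    out_depth = [float("inf")] * width  # z-buffer: nearest depth seen per target
    for x in range(width):
        disparity = focal_px * baseline / depth[x]  # closer pixels shift more
        tx = x - int(round(disparity))              # assumed shift direction
        if 0 <= tx < width and depth[x] < out_depth[tx]:
            out[tx] = row[x]
            out_depth[tx] = depth[x]
    return out


warped = warp_row(["a", "b", "c", "d"], [2, 2, 1, 1], focal_px=2, baseline=1)
print(warped)  # ['c', 'd', None, None]
```

In the usage line, the near pixel `c` overwrites the far pixel `b` at the same target, and the two trailing `None` entries are the disoccluded region left for inpainting.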

JD Retail Technology

Apr 16, 2025 · Artificial Intelligence

AI‑Driven 3D Spatial Video Generation from Monocular 2D Content with MV‑HEVC Encoding

This work presents an end‑to‑end AI pipeline that transforms existing monocular 2D videos into immersive 3D spatial streams by combining DINO‑v2‑based depth estimation, multi‑branch view synthesis, and MV‑HEVC encoding. It reports up to 33% BD‑Rate reduction, 31% speed gains, state‑of‑the‑art visual quality, and suitability for real‑time production; it is validated on the new StereoV1K benchmark and deployed in JD.Vision’s e‑commerce catalog.

3D video · AI generation · AIGC

0 likes · 21 min read

Baidu Tech Salon

Apr 14, 2023 · Artificial Intelligence

How PaddleDepth and Paddle3D Enable Low‑Cost 3D Vision Development

This article examines the challenges of 3D vision data acquisition and explains how Baidu's PaddleDepth and Paddle3D toolkits provide low‑cost depth collection, super‑resolution, and end‑to‑end perception pipelines, showcasing performance on KITTI and Middlebury datasets with code examples.

3D Vision · Depth Estimation · Paddle3D

0 likes · 12 min read

Kuaishou Audio & Video Technology

Dec 30, 2022 · Artificial Intelligence

Unlocking Realistic Bokeh: Depth‑Aware Algorithms Behind Holiday Video Effects

This article explains the optical principles of bokeh (scatter blur), describes a depth‑aware variable‑focus algorithm developed by Kuaishou’s audio‑video team, and details practical optimizations such as saliency detection, edge‑preserving weighting, and adaptive spot‑light effects that enable realistic, customizable holiday video filters.

Bokeh · Depth Estimation · Image processing

0 likes · 11 min read

DataFunTalk

Jun 30, 2022 · Artificial Intelligence

Self‑Augmented Unpaired Image Dehazing via Density and Depth Decomposition (D4)

The paper introduces D4, a self‑augmented unpaired image dehazing framework that decomposes the transmission map into fog density and scene depth, enabling realistic fog synthesis for data augmentation and achieving superior dehazing performance with fewer parameters and FLOPs on multiple benchmarks.

CVPR2022 · Depth Estimation · computer vision

0 likes · 14 min read
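
The density and depth decomposition mentioned in the summary above can be illustrated with the standard atmospheric scattering model, in which transmission is t = exp(-β·d) for fog density β and scene depth d. This is a minimal sketch of that textbook model, not the D4 authors' code; the function name `synthesize_haze` and the `airlight` default are illustrative assumptions.

```python
import math


def synthesize_haze(clear, depth, beta, airlight=1.0):
    """Apply the atmospheric scattering model to a list of pixels.

    clear:    clean pixel intensities in [0, 1]
    depth:    scene depth per pixel (arbitrary units)
    beta:     fog density (scattering coefficient); larger means thicker fog
    airlight: global atmospheric light (assumed constant here)
    """
    hazy = []
    for j, d in zip(clear, depth):
        t = math.exp(-beta * d)                    # transmission from density and depth
        hazy.append(j * t + airlight * (1.0 - t))  # I = J*t + A*(1 - t)
    return hazy


# A pixel at depth 0 is unchanged; a very distant pixel fades to the airlight.
print(synthesize_haze([0.2, 0.2], [0.0, 100.0], beta=1.0))
```

Varying `beta` for a fixed depth map yields fog of different thickness over the same scene, which is the kind of augmentation the summary describes.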

Kuaishou Tech

Feb 9, 2022 · Mobile Development

Kuaishou Mobile Mixed Reality System: Architecture, Algorithms, and Applications

This article presents Kuaishou's mobile mixed reality (MR) system, detailing its integration of deep learning, SLAM, and scene reconstruction for real‑time spatial computing, the design of a monocular depth‑estimation model, a lightweight 3D rendering engine, and its deployment across iOS and Android devices with various user‑facing effects.

Depth Estimation · Kuaishou · Mobile AR

0 likes · 16 min read

JD Retail Technology

Aug 2, 2021 · Artificial Intelligence

Real-time Monocular Human Depth Estimation and Segmentation on Embedded Systems (HDES-Net)

The paper presents HDES‑Net, a lightweight real‑time monocular human depth estimation and segmentation network designed for embedded platforms. It uses a MobileNetV1 backbone with ASPP and depth‑wise separable convolutions, achieving high accuracy on the CAD‑60 and EPFL‑RGBD datasets while running at up to 199.93 FPS on a Tesla P40 and 17.23 FPS on a Jetson Nano after TensorRT optimization.

Depth Estimation · Embedded AI · HDES-Net

0 likes · 8 min read

TAL Education Technology

Jun 18, 2020 · Artificial Intelligence

An Overview of Virtual Reality, Augmented Reality, and Vision‑Based Techniques

This article explains the fundamentals of virtual reality and its distinction from augmented reality, describes VR hardware, outlines depth‑estimation and eye‑tracking methods such as projection, Hough transform, AdaBoost and sample matching, discusses Sobel edge detection, and explores the importance of audio, haptic feedback, and immersive VR applications in education.

AR · Depth Estimation · Immersive Education

0 likes · 11 min read

iQIYI Technical Product Team

May 8, 2020 · Artificial Intelligence

Deep Learning‑Based 2D‑to‑3D Conversion for VR Content

iQIYI’s deep‑learning pipeline converts single‑view images into high‑quality stereo pairs for VR by training on side‑by‑side 3D movies, employing a Monodepth‑based encoder‑decoder, a CVAE to encode camera parameters, ConvLSTM for temporal consistency, and disparity‑guided inpainting to fill occlusion holes, achieving stable, continuous depth maps validated through extensive human 3‑D effect assessments.

2D-to-3D · Depth Estimation · VR

0 likes · 12 min read
