How Laplacian Pyramid Networks Revolutionize Intrinsic Image Decomposition
This paper presents a scale‑space‑based deep neural network that treats intrinsic image decomposition as an image‑to‑image translation problem. It introduces a multi‑channel Laplacian pyramid architecture with a new loss and a data‑augmentation strategy, and reports better results than prior methods on the MPI‑Sintel and MIT Intrinsic Images datasets.
Abstract
We introduce a novel network architecture that decomposes an image into its intrinsic reflectance and illumination components. By treating the problem as an image‑to‑image translation and performing scale‑space decomposition of inputs and outputs, we design a multi‑channel network that learns a separate mapping for each Laplacian pyramid level. The network uses learnable up/down‑sampling operators to build Gaussian and Laplacian pyramids, and each sub‑network (residual block) consists of six Conv‑ELU layers. A new loss combines data, perceptual, and variational terms, and a data‑augmentation scheme inspired by breeder learning generates additional training pairs from unlabeled images. Experiments on MPI‑Sintel and MIT Intrinsic Images datasets show significant quantitative and qualitative improvements over previous state‑of‑the‑art methods.
1 Introduction
Intrinsic image decomposition separates an image into reflectance and shading, enabling material editing, depth cue extraction, and illumination analysis. Existing approaches such as Retinex rely on gradient‑domain thresholds and cannot handle complex materials or sharp edges. We propose a deep neural network that learns the mapping in a scale‑space framework, extending the usual single function‑approximation pipeline to parallel transformations, one per frequency sub‑band.
2 Our Method
2.1 Network Evolution
Starting from a ResNet‑style sequential architecture, we restructure the network into parallel low‑frequency (L) and high‑frequency (H) branches, applying a Laplacian pyramid expansion to the output. This yields a multi‑branch design where each branch predicts a specific pyramid component.
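To make the pyramid terminology concrete, the following is a minimal NumPy sketch of a Laplacian pyramid split and its inverse. It is not the network itself: the paper uses learnable up/down‑sampling operators, whereas this sketch assumes fixed 2×2 averaging and nearest‑neighbour repetition, and all function names are hypothetical.

```python
import numpy as np

def downsample(img):
    """Halve resolution by averaging 2x2 blocks (fixed stand-in for a learned operator)."""
    h, w = img.shape[:2]
    return img.reshape(h // 2, 2, w // 2, 2, -1).mean(axis=(1, 3))

def upsample(img):
    """Double resolution by nearest-neighbour repetition (fixed stand-in for a learned operator)."""
    return img.repeat(2, axis=0).repeat(2, axis=1)

def build_laplacian_pyramid(img, levels):
    """Return [H_0, ..., H_{levels-2}, L]: high-frequency residuals, then the coarsest low-frequency image.

    Assumes img has shape (H, W, C) with H and W divisible by 2 ** (levels - 1).
    """
    pyramid = []
    current = img
    for _ in range(levels - 1):
        low = downsample(current)
        pyramid.append(current - upsample(low))  # detail lost by going one level coarser
        current = low
    pyramid.append(current)
    return pyramid

def collapse_laplacian_pyramid(pyramid):
    """Invert build_laplacian_pyramid by upsampling and adding residuals back, coarse to fine."""
    current = pyramid[-1]
    for high in reversed(pyramid[:-1]):
        current = upsample(current) + high
    return current
```

In a multi‑branch design of the kind described above, each branch would predict one element of such a list for the output image, and the collapse step would assemble the final prediction.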
2.2 Residual Block
Each residual block is a six‑layer Conv(3×3)‑ELU stack without fully‑connected layers. Skip connections add the output of the final Conv to the input of the last ELU. The block uses 32 feature channels and outputs either a 3‑channel image or a residual map. ELU replaces ReLU and batch normalization, improving robustness to noise and training speed.
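The forward pass of such a block can be sketched in plain NumPy. This is an illustration under stated assumptions, not the authors' implementation: it assumes the block's input already has 32 channels, reads the skip connection as adding the block input to the final Conv output just before the last ELU, and omits the layers that map to and from 3‑channel images. The names `conv3x3`, `init_block`, and `residual_block` are hypothetical.

```python
import numpy as np

def elu(x, alpha=1.0):
    """Exponential linear unit: identity for x > 0, alpha * (exp(x) - 1) otherwise."""
    return np.where(x > 0, x, alpha * np.expm1(np.minimum(x, 0)))

def conv3x3(x, weight, bias):
    """Zero-padded 3x3 convolution (cross-correlation, as in deep learning frameworks).

    x: (H, W, C_in), weight: (3, 3, C_in, C_out), bias: (C_out,).
    """
    h, w, _ = x.shape
    padded = np.pad(x, ((1, 1), (1, 1), (0, 0)))
    out = np.zeros((h, w, weight.shape[-1]))
    for dy in range(3):
        for dx in range(3):
            out += padded[dy:dy + h, dx:dx + w] @ weight[dy, dx]
    return out + bias

def init_block(rng, channels=32, layers=6):
    """Random parameters for `layers` Conv(3x3) layers; the init scale is an assumption."""
    scale = np.sqrt(2.0 / (9 * channels))
    return [(rng.normal(0.0, scale, (3, 3, channels, channels)), np.zeros(channels))
            for _ in range(layers)]

def residual_block(x, params):
    """Six Conv-ELU layers; the block input is added before the last ELU (assumed wiring)."""
    h = x
    for weight, bias in params[:-1]:
        h = elu(conv3x3(h, weight, bias))
    weight, bias = params[-1]
    return elu(conv3x3(h, weight, bias) + x)
```

Because there are no fully‑connected layers, the same parameters apply to feature maps of any spatial size, which is what lets one block design serve every pyramid level.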
2.3 Loss Function
Our loss comprises three terms:
Data loss: pixel‑level similarity measured through a joint bilateral filter, plus the constraint that the product of predicted reflectance and shading reconstructs the input.
Perceptual loss: VGG‑19 features preserve high‑level semantic structure.
Variational loss: smoothness regularization on the output.
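The three terms above can be made concrete with a simplified NumPy sketch. Several things here are assumptions rather than the paper's formulation: plain mean squared error replaces the joint bilateral filter in the data term, total variation is used as the smoothness regularizer, the perceptual term is left out because it needs a pretrained VGG‑19, and the weights `w_recon` and `w_tv` are placeholder values.

```python
import numpy as np

def mse(a, b):
    """Mean squared error between two arrays of the same shape."""
    return float(np.mean((a - b) ** 2))

def total_variation(x):
    """Mean absolute difference between vertically and horizontally adjacent pixels."""
    dy = np.abs(x[1:, :] - x[:-1, :]).mean()
    dx = np.abs(x[:, 1:] - x[:, :-1]).mean()
    return float(dx + dy)

def decomposition_loss(image, refl_pred, shad_pred, refl_gt, shad_gt,
                       w_recon=1.0, w_tv=0.1):
    """Simplified data + reconstruction + smoothness loss (perceptual term omitted)."""
    # Pixel-level similarity to the ground-truth components (MSE stand-in).
    data = mse(refl_pred, refl_gt) + mse(shad_pred, shad_gt)
    # Reconstruction constraint: reflectance * shading should give back the input.
    recon = mse(refl_pred * shad_pred, image)
    # Smoothness regularization on both predicted outputs.
    smooth = total_variation(refl_pred) + total_variation(shad_pred)
    return data + w_recon * recon + w_tv * smooth
```

The reconstruction term is the part that couples the two outputs: a prediction can match each ground‑truth component fairly well on its own and still be penalized if the product drifts from the input image.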
2.4 Data‑Augmentation Training
We generate additional training pairs by first predicting reflectance and shading for unlabeled images, then synthesizing new images from these estimates. Adaptive manifold filtering further perturbs the predictions to increase diversity while preserving realism.
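The augmentation loop described above might look roughly like the following NumPy sketch. It rests on clearly labeled assumptions: `predict` stands for any trained decomposition model, `toy_predict` is a hypothetical placeholder used only to make the example runnable, and a simple box blur stands in for adaptive manifold filtering.

```python
import numpy as np

def box_blur(x, radius=1):
    """Mean filter with edge padding; only a stand-in for adaptive manifold filtering."""
    h, w = x.shape[:2]
    padded = np.pad(x, ((radius, radius), (radius, radius), (0, 0)), mode="edge")
    size = 2 * radius + 1
    out = np.zeros_like(x, dtype=float)
    for dy in range(size):
        for dx in range(size):
            out += padded[dy:dy + h, dx:dx + w]
    return out / size ** 2

def make_augmented_pair(image, predict):
    """Turn one unlabeled image into a synthetic (image, reflectance, shading) training triple."""
    refl, shad = predict(image)                   # step 1: estimate both components
    refl, shad = box_blur(refl), box_blur(shad)   # step 2: perturb the estimates
    new_image = refl * shad                       # step 3: synthesize a consistent input
    return new_image, refl, shad

def toy_predict(image):
    """Hypothetical placeholder model: grey-level shading, reflectance as the ratio."""
    shad = image.mean(axis=2, keepdims=True) * np.ones_like(image)
    refl = image / np.maximum(shad, 1e-6)
    return refl, shad
```

The key property is that the synthesized image is exactly the product of the perturbed components, so the new triple has a consistent ground truth even though the original image had no labels.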
3 Experiments
3.1 Datasets
We evaluate on the MPI‑Sintel (ResynthSintel version) and MIT Intrinsic Images datasets, using both scene‑level and image‑level train/test splits.
3.2 Results on MPI‑Sintel
Our model outperforms prior methods on all three metrics, especially under the more challenging scene‑split protocol, demonstrating robustness to over‑fitting.
3.3 Results on MIT Intrinsic Images
Both standard data augmentation (Ours + DA) and an enhanced version (Ours + DA+) improve performance, confirming the effectiveness of our augmentation strategy.
4 Conclusion
We introduced a Laplacian‑pyramid‑inspired neural network for intrinsic image decomposition, modeling the task as a multi‑scale image‑to‑image translation. Experiments on two benchmark datasets demonstrate superior quantitative results and visual quality, and the architecture is applicable to other image‑to‑image tasks such as semantic segmentation or depth regression.
Alibaba Cloud Developer
Alibaba's official tech channel, featuring all of its technology innovations.
