DRDD: Turning Diffusion Noise into a Domain Harmonizer for Image Translation
The paper introduces Decoupled Residual Denoising Diffusion (DRDD), which reinterprets Gaussian noise as a domain harmonizer and separates residual removal from denoising, enabling more data‑efficient, multi‑task image‑to‑image translation and achieving state‑of‑the‑art results on benchmarks such as All‑in‑One‑5 with limited paired data.
Background: Diffusion Models for Image‑to‑Image Translation
Recent image‑to‑image (I2I) translation work adopts diffusion models that first mix the input image with Gaussian noise and then iteratively denoise to recover the target image. This paradigm has become the default for tasks such as super‑resolution, de‑rain, de‑haze, low‑light enhancement, and style transfer because of its high generation quality and diversity.
Problem: Coupled Denoising and Residual Removal
Earlier methods (e.g., SR3, WeatherDiff) start from pure noise, while later works (e.g., RDDM, IR‑SDE) begin from a noisy input to preserve structure. All of these approaches share a common design: the translation is compressed into a single, coupled reverse diffusion process in which, at each step, the model simultaneously removes noise, removes the residual (the difference between source and target), and performs the source‑to‑target domain conversion.
When a single model must handle many heterogeneous tasks (low‑light, de‑rain, de‑haze, de‑blur, etc.), this coupling becomes problematic because each task resides in a different domain with a large domain gap. The model is forced to learn a unified mapping across disparate distributions within one tightly coupled process.
Key Insight: Noise as a Domain Harmonizer
The authors ask: if adding noise brings different domain features closer, why remove the noise before the core translation is finished? They propose that Gaussian noise can act as a "Domain Harmonizer" that aligns feature distributions across domains.
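One way to see why noise can pull domains together is a toy calculation that is not taken from the paper. Model two domains as 1-D Gaussians with the same variance s² and different means. Their KL divergence is (mean_a - mean_b)² / (2 s²), and adding independent Gaussian noise of standard deviation sigma raises the variance to s² + sigma² without moving the means, so the divergence shrinks. The means, the variance, and the function names below are illustrative assumptions.

```python
def kl_equal_variance_gaussians(mean_a, mean_b, variance):
    """KL divergence between N(mean_a, variance) and N(mean_b, variance)."""
    return (mean_a - mean_b) ** 2 / (2.0 * variance)


def kl_after_noise(mean_a, mean_b, variance, noise_std):
    """KL divergence after adding independent N(0, noise_std^2) noise to both.

    The noise leaves the means unchanged and adds noise_std^2 to the variance.
    """
    return kl_equal_variance_gaussians(mean_a, mean_b, variance + noise_std ** 2)


# Two hypothetical "domains": same spread, means one unit apart.
clean_gap = kl_after_noise(0.0, 1.0, variance=0.05, noise_std=0.0)
noisy_gap = kl_after_noise(0.0, 1.0, variance=0.05, noise_std=1.0)

print(f"KL without noise: {clean_gap:.3f}")          # 10.000
print(f"KL with noise_std = 1.0: {noisy_gap:.3f}")   # 0.476
```

Real image domains are not 1-D Gaussians, so this only illustrates the direction of the effect, not its size.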
DRDD Design: Decoupled Residual Denoising Diffusion
DRDD restructures the diffusion pipeline into two independent stages:
Noise Diffusion (Stage 1): Inject Gaussian noise into the target image, moving the target domain into a "noisy but more aligned" space. This stage harmonizes the domains.
Residual Diffusion (Stage 2): With the noise level fixed, learn the residual mapping from source to target within the noisy domain, completing the semantic translation.
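The two stages can be sketched on toy data. The summary gives no equations, so the additive form used here (state = target + weight * (source - target) + noise_std * noise) is an assumption borrowed from the general shape of residual diffusion models, and the function names are hypothetical. Flat lists of floats stand in for images.

```python
import random


def noise_diffusion(target, noise_std, rng):
    """Stage 1 (assumed form): add Gaussian noise of strength noise_std to the target."""
    noise = [rng.gauss(0.0, 1.0) for _ in target]
    noisy_target = [t + noise_std * e for t, e in zip(target, noise)]
    return noisy_target, noise


def residual_diffusion(noisy_target, source, target, residual_weight):
    """Stage 2 (assumed form): keep the noise fixed and add a fraction of the residual.

    residual_weight = 0 gives the noisy target; residual_weight = 1 gives the
    noisy source, because target + (source - target) = source.
    """
    return [
        x + residual_weight * (s - t)
        for x, s, t in zip(noisy_target, source, target)
    ]


rng = random.Random(0)
target = [0.2, 0.4, 0.6]   # clean target pixels (toy values)
source = [0.9, 0.1, 0.5]   # degraded source pixels (toy values)
noise_std = 1.0

noisy_target, noise = noise_diffusion(target, noise_std, rng)
halfway = residual_diffusion(noisy_target, source, target, 0.5)
noisy_source = residual_diffusion(noisy_target, source, target, 1.0)
```

Under this assumed form, the end of Stage 2 is the source image carrying the same noise sample as the noisy target, which is where a reverse process would start.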
During generation, the reverse process mirrors this decoupling:
First remove the residual in the noisy domain, accomplishing the core source‑to‑target conversion.
Then denoise the result to obtain a clean target image.
This contrasts with traditional coupled diffusion, which performs "domain change + denoising" simultaneously. By postponing denoising until after the residual has been removed, DRDD preserves the domain‑harmonizing effect of the noise throughout the critical translation phase.
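The decoupled reverse order can be sketched as a loop that removes the residual at a fixed noise level and only then calls a denoiser. This is a minimal sketch, not the authors' sampler: the linear step schedule is an assumption, and `predict_residual` and `denoise` are hypothetical stand-ins for trained networks, replaced here by oracles that know the answer so the control flow can be checked.

```python
import random


def reverse_process(noisy_source, predict_residual, denoise, num_steps):
    """Remove the residual step by step in the noisy domain, then denoise once."""
    state = list(noisy_source)
    for step in range(num_steps):
        remaining = 1.0 - step / num_steps          # fraction of residual still present
        residual = predict_residual(state, remaining)
        # Subtract an equal share of the predicted residual; the noise is untouched.
        state = [x - r / num_steps for x, r in zip(state, residual)]
    return denoise(state)                            # denoising happens last


rng = random.Random(0)
source = [0.9, 0.1, 0.5]   # degraded input (toy values)
target = [0.2, 0.4, 0.6]   # clean ground truth (toy values)
noise = [1.0 * rng.gauss(0.0, 1.0) for _ in source]
noisy_source = [s + n for s, n in zip(source, noise)]


def oracle_residual(state, remaining):
    """Stand-in for a trained residual network: returns the true residual."""
    return [s - t for s, t in zip(source, target)]


def oracle_denoise(state):
    """Stand-in for a trained denoiser: subtracts the known noise sample."""
    return [x - n for x, n in zip(state, noise)]


restored = reverse_process(noisy_source, oracle_residual, oracle_denoise, num_steps=10)
```

With the oracles, `restored` matches `target` up to floating-point error. In a coupled sampler the noise term would also shrink inside the loop; here it stays constant until the final call.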
Advantages
1. Easier Unified Mapping: By first aligning domains with noise, the model learns a shared intermediate space, reducing the gap between tasks and making a single parameter set capable of handling multiple degradations.
2. Data Efficiency: The denoising stage only requires clean target‑domain images, not paired source‑target samples. Consequently, large collections of unpaired target images can be leveraged, dramatically lowering the need for expensive paired datasets.
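The data-efficiency point can be made concrete with a sketch of how the two training sets might be assembled. It assumes the residual stage trains on paired samples while the denoising stage trains on every clean target image available. The function name and file names are hypothetical.

```python
def build_training_sets(paired_samples, unpaired_targets):
    """Split the available data by stage (assumed protocol, not from the paper).

    paired_samples:   list of (source, target) tuples
    unpaired_targets: list of clean target images with no matching source
    """
    # The residual stage needs both sides of each pair.
    residual_set = list(paired_samples)
    # The denoising stage needs only clean targets, so it can reuse the
    # paired targets and add every unpaired target on top.
    denoising_set = [target for _source, target in paired_samples]
    denoising_set.extend(unpaired_targets)
    return residual_set, denoising_set


paired_samples = [("rainy_001.png", "clean_001.png"), ("hazy_002.png", "clean_002.png")]
unpaired_targets = ["clean_101.png", "clean_102.png", "clean_103.png"]

residual_set, denoising_set = build_training_sets(paired_samples, unpaired_targets)
print(len(residual_set), len(denoising_set))  # 2 5
```

The denoising set grows with every clean image collected, while the residual set is limited by the number of pairs.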
Experimental Validation
DRDD was evaluated on several benchmarks:
All‑in‑One‑5 (five‑task unified restoration): DRDD achieved 0.916 SSIM, 0.073 LPIPS, and 18.3 FID, outperforming DA‑CLIP, DiffUIR, AdaIR, VLUNet, and DFPIR, especially on perceptual quality.
Data‑Efficient I2I: Training data were randomly down‑sampled to 75%, 50%, and 25% of the original size on Low‑Light and All‑in‑One‑3. DRDD’s performance degraded far less than that of DiffUIR and VLUNet, with SSIM and LPIPS curves remaining stable as data decreased.
Noise‑Strength Analysis: Theory and experiments together place the optimal noise intensity at roughly 1.0–1.2. The theoretical estimate is 1.1–1.2, while empirically the model performed best at noise level 1.0 and remained stable between 0.8 and 1.3.
These results demonstrate that DRDD’s gains stem from the decoupled design rather than simply using more paired data.
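The down-sampling protocol of the Data‑Efficient I2I experiment can be sketched as below. The summary only states that the data were randomly down-sampled to 75%, 50%, and 25%, so the seed, the rounding rule, and the helper name `subsample_pairs` are assumptions.

```python
import random


def subsample_pairs(pairs, fraction, seed=0):
    """Return a random subset containing about `fraction` of the paired samples."""
    if not 0.0 < fraction <= 1.0:
        raise ValueError("fraction must be in (0, 1]")
    keep = round(len(pairs) * fraction)
    # A dedicated Random instance keeps the draw reproducible.
    return random.Random(seed).sample(pairs, keep)


# Hypothetical dataset of 100 (source, target) file-name pairs.
pairs = [(f"src_{i:03d}.png", f"tgt_{i:03d}.png") for i in range(100)]
subsets = {fraction: subsample_pairs(pairs, fraction) for fraction in (0.75, 0.50, 0.25)}

for fraction, subset in subsets.items():
    print(f"{fraction:.0%}: {len(subset)} pairs")
```

Each subset is drawn independently, so the smaller ones are not guaranteed to be contained in the larger ones. Whether the original experiment used nested subsets is not stated.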
Implications
By treating noise as a useful intermediate domain rather than a nuisance, DRDD offers a new perspective on diffusion‑based I2I translation. The recipe has three steps: (1) use noise to shrink domain gaps, (2) perform the core semantic mapping in the noisy space, and (3) denoise last for high‑fidelity output. This provides a more natural framework for real‑world multi‑task translation where paired data are scarce.
