
Deep Generative Projection for High‑Fidelity Virtual Try‑On

The paper presents Deep Generative Projection (DGP), a virtual‑try‑on system that learns a realistic dressing distribution from unpaired images with StyleGAN, projects coarse garment‑human alignments into its latent space, refines details, and achieves higher fidelity and robustness than supervised SOTA methods without needing paired data.

DaTaobao Tech

This work introduces a novel virtual try‑on pipeline that first fits the distribution of real‑world dressing results using a deep generative adversarial network (GAN) and then projects coarse garment‑human alignments onto this distribution to obtain high‑quality predictions.

Traditional virtual try‑on methods rely on costly paired data and complex physical modeling of garment deformation, which limits scalability and robustness. Human perception, however, can infer plausible dressing outcomes from coarse alignments and prior experience, suggesting a probabilistic approach.

The proposed method leverages large amounts of inexpensive, unpaired human images to train a StyleGAN that captures the dressing distribution. A three‑layer projection operator maps a coarse garment‑human alignment to the latent space of the GAN, retrieving the nearest realistic sample. Subsequent semantic search refines garment type, style, and local details, while a final pattern search optimizes generator parameters to reconstruct high‑frequency patterns such as text and logos.
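The projection idea can be illustrated with a toy sketch: a fixed generator maps a latent code to an image vector, and projection searches the latent space for the code whose output best reconstructs the coarse alignment. The linear generator, dimensions, and squared loss below are illustrative assumptions, not the paper's StyleGAN or its actual objective.

```python
import numpy as np

# Toy illustration of latent-space projection (not the paper's code):
# a fixed "generator" G maps a latent code w to an image vector, and we
# search for the w whose output best matches a coarse garment-human
# alignment, by gradient descent on a squared reconstruction loss.

rng = np.random.default_rng(0)
LATENT_DIM, IMAGE_DIM = 8, 32
G = rng.normal(size=(IMAGE_DIM, LATENT_DIM))  # stand-in linear generator

def generate(w):
    """Map a latent code to 'image' space."""
    return G @ w

def project(x_coarse, steps=2000, lr=0.1):
    """Find w minimising ||G(w) - x_coarse||^2 by gradient descent."""
    w = np.zeros(LATENT_DIM)
    for _ in range(steps):
        residual = generate(w) - x_coarse
        w -= lr * (2.0 * G.T @ residual) / IMAGE_DIM  # gradient step
    return w

# A coarse, noisy input is "snapped" onto the generator's output
# manifold: the projected result is the nearest sample the generator
# can actually produce.
x_clean = generate(rng.normal(size=LATENT_DIM))
x_noisy = x_clean + 0.1 * rng.normal(size=IMAGE_DIM)
x_projected = generate(project(x_noisy))
```

Because the projected output must lie on the generator's manifold, errors in the coarse input (here, additive noise) are partially corrected automatically, which mirrors the robustness the paper attributes to projection-based inference.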

Experiments compare DGP against three supervised SOTA algorithms (ACGPN, VITON‑HD, PF‑AFN) on both the newly collected CMI dataset and the public MPV dataset. Qualitative results show clearer, more realistic garments, and quantitative metrics (FID, SWD, user satisfaction) demonstrate consistent superiority, even though DGP is trained without any paired supervision.
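For intuition about the FID metric used in the comparison: it is the Fréchet distance between Gaussian fits of two feature distributions, FID = ||μ₁ − μ₂||² + Tr(Σ₁ + Σ₂ − 2(Σ₁Σ₂)^½). The sketch below is a simplified variant that assumes diagonal covariances (so the matrix square root reduces to an elementwise sqrt) and synthetic features; the real metric uses Inception‑v3 activations and full covariance matrices.

```python
import numpy as np

# Simplified Fréchet distance between two feature sets, assuming
# *diagonal* covariances so the matrix square root in the full FID
# formula reduces to an elementwise sqrt. Illustration only: real FID
# is computed on Inception-v3 features with full covariances.

def frechet_distance_diag(feats_a, feats_b):
    mu_a, mu_b = feats_a.mean(axis=0), feats_b.mean(axis=0)
    var_a, var_b = feats_a.var(axis=0), feats_b.var(axis=0)
    mean_term = np.sum((mu_a - mu_b) ** 2)
    cov_term = np.sum(var_a + var_b - 2.0 * np.sqrt(var_a * var_b))
    return mean_term + cov_term

rng = np.random.default_rng(1)
real = rng.normal(0.0, 1.0, size=(2000, 16))   # "real image" features
close = rng.normal(0.05, 1.0, size=(2000, 16)) # nearly the same distribution
far = rng.normal(1.0, 2.0, size=(2000, 16))    # clearly different

d_close = frechet_distance_diag(real, close)   # small
d_far = frechet_distance_diag(real, far)       # much larger
```

Lower is better: a generator whose outputs match the real dressing distribution yields features statistically close to real ones, hence a small distance.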

Additional analyses reveal strong robustness to preprocessing errors, where the projection module automatically corrects flawed inputs, outperforming other GAN inversion techniques (e.g., pSp).

In conclusion, the approach eliminates the need for expensive paired data, achieves high fidelity and robustness, and opens the door for large‑scale commercial virtual try‑on applications.

Tags: computer vision, virtual try-on, generative adversarial network, image projection, unsupervised learning
Written by

DaTaobao Tech

Official account of DaTaobao Technology
