
Recent ACM MM and ECCV Papers on Intelligent Creative Technologies by Alibaba

Alibaba’s Creative & Video Platform showcases six newly accepted ACM MM and ECCV papers: self‑supervised text erasing, a confidence‑driven action‑proposal module, a geometry‑aligned variational transformer for image‑conditioned layouts, a high‑resolution virtual try‑on system, a motion transformer for unsupervised animation, and a cross‑domain motion‑transfer framework. Together they apply current AI research to automated creative design, video editing, and e‑commerce.

Alimama Tech

Jul 6, 2022


Alibaba's Creative & Video Platform presents six recent research papers accepted at ACM MM and ECCV, leading conferences in multimedia and computer vision. The papers describe new algorithms for intelligent creative generation.

Self‑Supervised Text Erasing with Controllable Image Synthesis (ACM MM): Proposes a self‑supervised framework that automatically synthesizes training data for removing text from images. It combines a controllable text‑synthesis module, a strategy network for style selection, and a triplet erasing loss, and it achieves state‑of‑the‑art results on the high‑resolution PosterErase dataset without any manual annotations.
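
The summary names a triplet erasing loss but does not define it. As a rough illustration only, the sketch below shows the generic triplet margin loss that such objectives usually build on. The assignment of roles (erased output as anchor, text‑free image as positive, text‑bearing input as negative), the feature vectors, and the margin value are all assumptions for this example, not details taken from the paper.

```python
import math


def l2_distance(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def triplet_margin_loss(anchor, positive, negative, margin=1.0):
    """Generic triplet margin loss: max(0, d(a, p) - d(a, n) + margin).

    The loss is zero once the anchor is closer to the positive than to
    the negative by at least `margin`.
    """
    gap = l2_distance(anchor, positive) - l2_distance(anchor, negative)
    return max(0.0, gap + margin)


# Hypothetical 2-D features (assumed roles, not from the paper).
erased_output = [0.0, 0.0]   # anchor: model output after erasing
text_free = [0.0, 1.0]       # positive: image without text
with_text = [3.0, 4.0]       # negative: input that still contains text

loss_good = triplet_margin_loss(erased_output, text_free, with_text)
loss_bad = triplet_margin_loss(erased_output, with_text, text_free)
```

Here `loss_good` vanishes because the erased output already sits much closer to the text‑free image, while swapping positive and negative in `loss_bad` produces a large penalty.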

Estimation of Reliable Proposal Quality for Temporal Action Detection (ACM MM): Introduces BREM, a plug‑and‑play module that predicts reliable action‑localization confidence by jointly modeling boundary confidence (BEM) and region confidence (REM). The approach improves proposal quality on ActivityNet and THUMOS14 and is deployed in Alibaba’s “smart mixing” video editing tool.
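
For context, proposal quality in temporal action detection is conventionally measured by the temporal IoU (tIoU) between a proposed segment and a ground‑truth action segment. The sketch below shows only that overlap measure. It does not reproduce how BREM predicts its confidence, which this summary does not describe, and the function name and example segments are made up for illustration.

```python
def temporal_iou(proposal, ground_truth):
    """Temporal IoU between two (start, end) segments given in seconds."""
    start_p, end_p = proposal
    start_g, end_g = ground_truth
    # Overlap is clipped at zero for disjoint segments.
    intersection = max(0.0, min(end_p, end_g) - max(start_p, start_g))
    union = (end_p - start_p) + (end_g - start_g) - intersection
    return intersection / union if union > 0 else 0.0


# Hypothetical segments: the proposal covers half of the true action.
overlap = temporal_iou((2.0, 6.0), (4.0, 8.0))
disjoint = temporal_iou((0.0, 1.0), (5.0, 6.0))
```

A confidence score is useful for ranking proposals to the extent that it tracks this overlap, which is why a poorly calibrated score hurts detection even when the boundaries themselves are good.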

Geometry Aligned Variational Transformer for Image‑Conditioned Layout Generation (ACM MM): Presents a variational transformer that generates aesthetically pleasing layouts conditioned on background images. It uses cross‑attention to fuse visual features and a geometry‑alignment module to bridge the visual and layout feature distributions, and it demonstrates strong performance on a large advertising‑poster dataset.
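
To make the cross‑attention step concrete, here is a minimal single‑head sketch in plain Python in which layout tokens act as queries and image tokens act as keys and values. It omits the learned projections, multiple heads, and everything specific to the paper's architecture. The token values and the choice of which side supplies the queries are assumptions for this example.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, keys, values):
    """Scaled dot-product attention: each query attends over all keys.

    queries, keys, values are lists of equal-dimension float vectors;
    keys and values must have the same number of entries.
    """
    dim = len(keys[0])
    outputs = []
    for q in queries:
        scores = [
            sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(dim)
            for k in keys
        ]
        weights = softmax(scores)
        # Weighted sum of value vectors, one output vector per query.
        fused = [
            sum(w * v[j] for w, v in zip(weights, values))
            for j in range(len(values[0]))
        ]
        outputs.append(fused)
    return outputs


# Hypothetical tokens: two layout elements attend over three image patches.
layout_tokens = [[1.0, 0.0], [0.0, 1.0]]
image_tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
fused_tokens = cross_attention(layout_tokens, image_tokens, image_tokens)
```

Each output row is a weighted average of the image tokens, so every layout element ends up carrying information about the background regions it attends to most.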

High‑Resolution Image‑Based Virtual Try‑On System for Taobao (ECCV Demo): Builds a large‑scale, high‑resolution virtual try‑on dataset for e‑commerce and proposes a knowledge‑distillation‑based system that produces high‑quality virtual fitting images without costly preprocessing, boosting click‑through rates for thousands of clothing items.
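
One common form of knowledge distillation for image generation trains a lightweight student to reproduce a teacher's output, so that the student no longer needs the teacher's expensive inputs at inference time. The sketch below shows such an output‑matching loss as a mean absolute error over pixels. Whether the Taobao system uses this particular loss is not stated in this summary, so the function name and the pixel values are purely illustrative.

```python
def l1_distillation_loss(student_pixels, teacher_pixels):
    """Mean absolute error between student and teacher outputs.

    Both arguments are flat lists of pixel intensities in [0, 1].
    """
    if not student_pixels or len(student_pixels) != len(teacher_pixels):
        raise ValueError("inputs must be non-empty and equally sized")
    total = sum(abs(s - t) for s, t in zip(student_pixels, teacher_pixels))
    return total / len(student_pixels)


# Hypothetical 2-pixel outputs: the student is off by 0.5 on each pixel.
distill_loss = l1_distillation_loss([0.5, 0.5], [1.0, 0.0])
```

Minimizing this value over a training set pushes the student's try‑on images toward the teacher's, which is the general mechanism by which distillation can remove a preprocessing stage.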

Motion Transformer for Unsupervised Image Animation (ECCV): Proposes a vision‑transformer‑based motion estimator that introduces image tokens and motion tokens, enabling better modeling of motion interactions. The method outperforms CNN‑based baselines on several benchmarks and can generate product‑showcase videos from a single image.

Motion and Appearance Adaptation for Cross‑Domain Motion Transfer (ECCV): Introduces the MAA framework, which regularizes synthesized objects for cross‑domain motion transfer. It features a shape‑invariant motion adaptation module and a structure‑guided appearance consistency module, and it achieves superior results on the Mixamo‑to‑Fashion and Vox‑Celeb‑to‑CUFS benchmarks.

These works illustrate Alibaba’s advances in computer‑vision techniques that empower automated creative design, video editing, and virtual try‑on applications.
