
Recent ACM MM and ECCV Papers on Intelligent Creative Technologies by Alibaba

Alibaba’s Creative & Video Platform showcases six newly accepted ACM MM and ECCV papers that introduce self‑supervised text‑erasing, a confidence‑driven action‑proposal module, a geometry‑aligned variational transformer for image‑conditioned layouts, a high‑resolution virtual‑try‑on system, a motion‑transformer for unsupervised animation, and a cross‑domain motion‑transfer framework, highlighting cutting‑edge AI for automated creative design, video editing, and e‑commerce applications.

Alimama Tech

Alibaba's Creative & Video Platform presents six recent research papers accepted at ACM MM and ECCV, leading conferences in multimedia and computer vision. Together they showcase new algorithms for intelligent creative generation.

Self‑Supervised Text Erasing with Controllable Image Synthesis (ACM MM): Proposes a self‑supervised framework that automatically synthesizes training data for removing text from images. It combines a controllable text synthesis module, a strategy network for style selection, and a triplet erasing loss, achieving state‑of‑the‑art results on the high‑resolution PosterErase dataset without any manual annotations.
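To make the idea of a triplet-style objective concrete, here is a minimal sketch of the generic triplet margin loss in plain Python. The summary above does not spell out how the paper's triplet erasing loss chooses its anchor, positive, and negative samples, so the function names, the L2 distance, and the margin value are illustrative assumptions rather than the authors' formulation.

```python
import math


def l2_distance(u, v):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def triplet_margin_loss(anchor, positive, negative, margin=1.0):
    """Generic triplet margin loss (an assumption, not the paper's exact loss).

    Pulls the anchor toward the positive and pushes it away from the
    negative until the two distances differ by at least `margin`.
    """
    d_pos = l2_distance(anchor, positive)
    d_neg = l2_distance(anchor, negative)
    return max(0.0, d_pos - d_neg + margin)


# Anchor coincides with the positive and is far from the negative: no penalty.
print(triplet_margin_loss([0.0, 0.0], [0.0, 0.0], [3.0, 4.0]))  # 0.0
# Anchor coincides with the negative and is far from the positive: large penalty.
print(triplet_margin_loss([0.0, 0.0], [3.0, 4.0], [0.0, 0.0]))  # 6.0
```

In a text-erasing setting, one plausible reading is that the erased output should sit closer to a clean image than to the text-bearing input, but that mapping is a guess from the loss name alone.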

Estimation of Reliable Proposal Quality for Temporal Action Detection (ACM MM): Introduces BREM, a plug‑and‑play module that predicts reliable action‑localization confidence by jointly modeling boundary confidence (BEM) and region confidence (REM). The approach improves proposal quality on ActivityNet and THUMOS14 and is deployed in Alibaba’s “smart mixing” video editing tool.
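As a rough illustration of what scoring a temporal proposal involves, the sketch below computes temporal IoU, the standard overlap measure for action proposals, and then fuses boundary and region confidences into one score. The fusion rule (geometric mean of the two boundary scores times the region score) and all function names are assumptions made for this example; the summary does not state how BREM combines its two outputs.

```python
def temporal_iou(a, b):
    """Intersection over union of two (start, end) time segments in seconds."""
    inter = max(0.0, min(a[1], b[1]) - max(a[0], b[0]))
    union = (a[1] - a[0]) + (b[1] - b[0]) - inter
    return inter / union if union > 0 else 0.0


def fused_confidence(start_conf, end_conf, region_conf):
    """Hypothetical fusion of boundary and region confidence.

    A proposal only scores highly when both of its boundaries and its
    interior region are judged reliable.
    """
    boundary_conf = (start_conf * end_conf) ** 0.5
    return boundary_conf * region_conf


proposal, ground_truth = (0.0, 10.0), (5.0, 15.0)
print(temporal_iou(proposal, ground_truth))   # 5 s overlap over a 15 s union
print(fused_confidence(0.81, 1.0, 0.5))       # about 0.45
```

Multiplying the terms means a weak boundary or a weak region estimate is enough to lower the ranking of a proposal, which is one simple way to express joint modeling.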

Geometry Aligned Variational Transformer for Image‑Conditioned Layout Generation (ACM MM): Presents a variational transformer that generates aesthetically pleasing layouts conditioned on background images. It uses cross‑attention to fuse visual features and a geometry‑alignment module to bridge the visual and layout feature distributions, and it demonstrates strong performance on a large advertising‑poster dataset.
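The cross‑attention step can be sketched as single-head scaled dot-product attention in which layout tokens act as queries and visual features act as keys and values. This is a minimal sketch under stated assumptions: it omits the learned projection matrices, multiple heads, and the geometry‑alignment module, and the function and variable names are hypothetical.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, keys, values):
    """Scaled dot-product cross-attention without learned projections.

    queries: layout token vectors; keys/values: visual feature vectors.
    Returns one fused vector per query.
    """
    d = len(keys[0])
    fused = []
    for q in queries:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in keys]
        weights = softmax(scores)
        fused.append([
            sum(w * v[c] for w, v in zip(weights, values))
            for c in range(len(values[0]))
        ])
    return fused


layout_tokens = [[1.0, 0.0]]
visual_keys = [[0.0, 0.0], [0.0, 0.0]]      # identical keys give equal weights
visual_values = [[1.0, 2.0], [3.0, 4.0]]
print(cross_attention(layout_tokens, visual_keys, visual_values))  # [[2.0, 3.0]]
```

Each layout token ends up as a weighted average of the visual features, so the generated layout can depend on what the background image actually contains.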

High‑Resolution Image‑Based Virtual Try‑On System for Taobao (ECCV Demo): Builds a large‑scale, high‑resolution virtual try‑on dataset for e‑commerce and proposes a knowledge‑distillation based system that produces high‑quality virtual fitting images without costly preprocessing, boosting click‑through rates for thousands of clothing items.
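Knowledge distillation in general trains a lightweight student to imitate a stronger teacher while still matching the ground truth. The sketch below shows that generic objective on flattened pixel values. It is an assumption for illustration only: the summary does not give the system's actual losses, and the L1 distance, the `alpha` weighting, and the function names are hypothetical.

```python
def l1_loss(pred, target):
    """Mean absolute error between two equal-length lists of pixel values."""
    return sum(abs(p - t) for p, t in zip(pred, target)) / len(pred)


def distillation_loss(student_out, teacher_out, ground_truth, alpha=0.5):
    """Generic distillation objective (not the paper's stated loss).

    alpha weights imitation of the teacher; (1 - alpha) weights
    supervision from the real target image.
    """
    imitation = l1_loss(student_out, teacher_out)
    supervised = l1_loss(student_out, ground_truth)
    return alpha * imitation + (1.0 - alpha) * supervised


student = [0.5, 0.5]
teacher = [0.5, 0.5]   # the student already matches the teacher
target = [1.0, 0.0]    # but is still off from the real image
print(distillation_loss(student, teacher, target))  # 0.25
```

One reading of "without costly preprocessing" is that only the teacher relies on the expensive inputs during training, so the student can run without them at inference time; the summary itself does not confirm this.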

Motion Transformer for Unsupervised Image Animation (ECCV): Proposes a vision‑transformer‑based motion estimator that introduces image tokens and motion tokens, enabling better modeling of motion interactions. The method outperforms CNN‑based baselines on several benchmarks and can generate product‑showcase videos from a single image.
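To show how image tokens and motion tokens might share one transformer input, the sketch below splits an image into patches, flattens each patch into an image token, and prepends a few motion tokens. It is an assumption-laden illustration: the zero-initialized motion tokens stand in for learned embeddings, the image is assumed to be single-channel with sides divisible by the patch size, and all names are hypothetical.

```python
def patchify(image, patch):
    """Split a 2D image (list of rows) into flattened patch x patch tokens."""
    height, width = len(image), len(image[0])
    tokens = []
    for top in range(0, height, patch):
        for left in range(0, width, patch):
            tokens.append([
                image[r][c]
                for r in range(top, top + patch)
                for c in range(left, left + patch)
            ])
    return tokens


def build_token_sequence(image, patch, num_motion_tokens):
    """Prepend placeholder motion tokens to the image tokens."""
    image_tokens = patchify(image, patch)
    dim = len(image_tokens[0])
    # Zeros stand in for learned motion-token embeddings.
    motion_tokens = [[0.0] * dim for _ in range(num_motion_tokens)]
    return motion_tokens + image_tokens


image = [[float(r * 4 + c) for c in range(4)] for r in range(4)]
sequence = build_token_sequence(image, patch=2, num_motion_tokens=3)
print(len(sequence))   # 7: three motion tokens plus four image tokens
print(sequence[3])     # [0.0, 1.0, 4.0, 5.0], the top-left patch
```

Because both token types sit in the same sequence, self-attention lets every motion token attend to every image patch and to the other motion tokens, which is the kind of interaction the summary credits for the improvement.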

Motion and Appearance Adaptation for Cross‑Domain Motion Transfer (ECCV): Introduces MAA, a framework that regularizes synthesized objects for cross‑domain motion transfer. It features a shape‑invariant motion adaptation module and a structure‑guided appearance consistency module, and achieves superior results on the Mixamo‑to‑Fashion and Vox‑Celeb‑to‑Cufs settings.

These works illustrate Alibaba’s advances in computer‑vision techniques that empower automated creative design, video editing, and virtual try‑on applications.

computer vision, image synthesis, virtual try-on, self-supervised learning, layout generation, motion transformer, video action detection