
Enhance the Visual Representation via Discrete Adversarial Training

The Alibaba AAIG team proposes Discrete Adversarial Training (DAT), which leverages VQGAN‑based discretization to generate natural‑looking adversarial samples that improve visual representation robustness and transferability across classification, self‑supervised learning, and object detection tasks without sacrificing accuracy, achieving new state‑of‑the‑art results on multiple benchmarks.

DataFunTalk

01 Background

Current computer‑vision pipelines consist of visual representation extraction followed by downstream task fine‑tuning; the quality of the extracted representation determines the upper bound of downstream performance. Transferability and robustness are two major limitations: transferability measures how well a representation trained on one task works on others, while robustness measures performance under distribution shifts. Both have been active research topics.

Improving visual representations typically involves stronger models or richer data augmentations, but conventional random augmentations are inefficient. Adversarial training, which generates hard examples via adversarial attacks, can systematically expose model weaknesses and thus significantly boost both transferability and robustness.
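To make "generates hard examples via adversarial attacks" concrete, here is a minimal FGSM-style sketch in NumPy on a toy binary logistic-regression model. The model, its weights, and the `fgsm_attack` helper are illustrative stand-ins, not part of DAT itself:

```python
import numpy as np

def fgsm_attack(x, y, w, b, eps):
    """One-step FGSM for binary logistic regression.

    Loss: L = -log sigmoid(y * (w @ x + b)) with label y in {-1, +1}.
    Analytic input gradient: dL/dx = -y * sigmoid(-y * (w @ x + b)) * w
    """
    margin = y * (x @ w + b)
    grad_x = -y * (1.0 / (1.0 + np.exp(margin))) * w  # sigmoid(-m) = 1/(1+e^m)
    # step in the direction that increases the loss, bounded by eps per pixel
    return x + eps * np.sign(grad_x)

# toy example: a point the classifier currently gets right
w = np.array([1.0, -2.0])
b = 0.0
x = np.array([0.5, -0.5])   # w @ x = 1.5 > 0, correctly classified
y = 1
x_adv = fgsm_attack(x, y, w, b, eps=0.1)
# the adversarial point stays within an eps-ball of x but has a smaller margin
```

The same principle scales to deep networks, where the input gradient comes from backpropagation instead of a closed form.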

02 Challenges

In industrial settings, adversarial training faces two major obstacles: (1) it multiplies training cost, and (2) it often creates a trade‑off between accuracy on clean data and robustness, known as the "Accuracy vs. Robustness Trade‑Off". While the first issue can be tolerated, the second is a fatal barrier to deployment.

03 Our Method

We aim to resolve the accuracy‑robustness trade‑off and make adversarial training practical for real‑world models. We first analyze the discrepancy between "natural adversarial" samples encountered in practice (e.g., object‑level transformations, lighting changes) and pixel‑level adversarial perturbations generated by standard methods. To bridge this gap, we propose Discrete Adversarial Training (DAT), which treats images as discrete token sequences using a VQGAN codebook.

DAT replaces vulnerable codebook indices with adversarial ones, using a Straight‑Through Estimator to back‑propagate gradients through the non‑differentiable quantization step without costly VQGAN gradient computation. The perturbed discrete codes are decoded back to images, yielding natural‑looking adversarial samples that serve as data augmentation.
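The quantize-then-copy-gradients idea can be sketched as follows. The `quantize` and `ste_grad` helpers are hypothetical and only illustrate nearest-codebook lookup plus the Straight-Through Estimator's identity backward pass; they are not the actual DAT/VQGAN implementation:

```python
import numpy as np

def quantize(z, codebook):
    """Vector-quantization step: snap each row of z to its nearest codebook entry."""
    # squared distances between each latent and each codebook vector: (N, K)
    d = ((z[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=-1)
    idx = d.argmin(axis=1)          # discrete code indices
    return codebook[idx], idx

def ste_grad(grad_wrt_quantized):
    """Straight-Through Estimator: the quantizer is treated as the identity
    in the backward pass, so upstream gradients flow to continuous z unchanged."""
    return grad_wrt_quantized

# tiny 3-entry codebook of 2-D latents
codebook = np.array([[0.0, 0.0], [1.0, 1.0], [0.0, 1.0]])
z = np.array([[0.9, 1.2], [0.1, -0.2]])
zq, idx = quantize(z, codebook)     # forward: discrete codes
g = np.array([[0.5, -0.5], [1.0, 2.0]])
gz = ste_grad(g)                    # backward: gradient copied straight through
```

In DAT these copied-through gradients are what make it cheap to search for adversarial code indices without differentiating through the full VQGAN.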

Visualizations (Fig. 2) show that DAT samples have higher Pearson correlation with normal images in BatchNorm statistics, indicating greater similarity to natural data.
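A rough illustration of the kind of comparison behind Fig. 2: correlate per-channel batch statistics of two sets of activations. The data and the `bn_stat_correlation` helper are synthetic stand-ins for illustration, not the paper's actual measurement:

```python
import numpy as np

def bn_stat_correlation(feats_a, feats_b):
    """Pearson correlation between per-channel means of two activation batches
    (a rough proxy for comparing BatchNorm statistics)."""
    mu_a = feats_a.mean(axis=0)     # per-channel means, shape (C,)
    mu_b = feats_b.mean(axis=0)
    return np.corrcoef(mu_a, mu_b)[0, 1]

rng = np.random.default_rng(0)
clean = rng.normal(size=(64, 256))                            # 64 samples, 256 channels
dat_like = clean + rng.normal(scale=0.1, size=clean.shape)    # mild, natural-looking shift
pixel_adv = clean + rng.normal(scale=2.0, size=clean.shape)   # heavy pixel-level noise
r_dat = bn_stat_correlation(clean, dat_like)
r_pix = bn_stat_correlation(clean, pixel_adv)
# the mildly shifted batch tracks the clean statistics far more closely
```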

Fig. 3 illustrates the overall DAT algorithm flow.

04 Experimental Results

We evaluate DAT on three tasks: image classification, self‑supervised learning, and object detection.

Classification

Using distribution‑shift benchmarks (ImageNet‑A, ImageNet‑C, ImageNet‑R, ImageNet‑Sketch) and adversarial test sets (FGSM, DamageNet), DAT improves robustness for both ResNet‑50 (CNN) and ViT‑B (Transformer) backbones. Combined with MAE pre‑training, the ViT‑H+DAT model sets new state‑of‑the‑art results, ranking first on ImageNet‑C and Stylized‑ImageNet.

Self‑Supervised Learning

We integrate DAT into MoCo‑v3, SimCLR, and SimSiam pre‑training pipelines. By maximizing contrastive error during adversarial perturbation, DAT enhances the downstream transferability of the learned representations, consistent with prior findings that adversarial training benefits transfer learning.
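A hedged sketch of "maximizing contrastive error": compute an InfoNCE loss in NumPy and keep the candidate view that makes it largest. In DAT the maximization is driven by gradients over discrete codes; here, for illustration only, we simply pick the worst of a few random candidate views:

```python
import numpy as np

def info_nce(z1, z2, tau=0.2):
    """InfoNCE loss where row i of z1 and row i of z2 are positive pairs."""
    z1 = z1 / np.linalg.norm(z1, axis=1, keepdims=True)
    z2 = z2 / np.linalg.norm(z2, axis=1, keepdims=True)
    logits = z1 @ z2.T / tau                         # (N, N) similarity matrix
    logits -= logits.max(axis=1, keepdims=True)      # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_prob))               # positives on the diagonal

rng = np.random.default_rng(1)
z = rng.normal(size=(8, 32))                             # anchor embeddings
aligned = z + 0.01 * rng.normal(size=z.shape)            # near-identical view: easy
candidates = [z + s * rng.normal(size=z.shape) for s in (0.1, 0.5, 1.0)]
# adversarial step: keep the candidate view with the LARGEST contrastive loss
losses = [info_nce(z, c) for c in candidates]
worst = candidates[int(np.argmax(losses))]
```

Training on such hardest views is the contrastive analogue of classification-time adversarial training.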

Object Detection

On COCO, we train two lightweight detectors (EfficientDet and a small detector) with 512‑pixel inputs. DAT improves both mAP on COCO and robustness mAP on COCO‑C, outperforming the recent Det‑AdvProp method, especially for low‑resolution models.

Ablation Studies

DAT samples exhibit higher visual quality: lower FID (14.65 vs. 65.18) and fewer noisy pixels.

Frequency analysis shows DAT perturbations concentrate on low‑frequency components, unlike pixel‑level attacks that add high‑frequency noise.
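The low- versus high-frequency distinction can be made concrete with a 2-D FFT energy split. This is a generic sketch; the cutoff value and the test images are illustrative, not taken from the paper:

```python
import numpy as np

def band_energy(img, cutoff=0.25):
    """Fraction of spectral energy within `cutoff` of the Nyquist radius."""
    F = np.fft.fftshift(np.fft.fft2(img))            # center the zero frequency
    h, w = img.shape
    yy, xx = np.mgrid[0:h, 0:w]
    r = np.hypot(yy - h / 2, xx - w / 2) / (min(h, w) / 2)  # normalized radius
    E = np.abs(F) ** 2
    return E[r <= cutoff].sum() / E.sum()

rng = np.random.default_rng(2)
x = np.linspace(0, 2 * np.pi, 64)
smooth = np.sin(x)[None, :] * np.sin(x)[:, None]     # low-frequency pattern
noise = rng.normal(size=(64, 64))                    # broadband pixel noise
# the smooth pattern concentrates its energy in the low-frequency band,
# while white noise spreads energy across the whole spectrum
```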

DAT perturbations are more structured, focusing on semantically important regions rather than random noise.

05 About Us – Research Intern Recruitment

The AAIG Security Lab focuses on AI safety, robustness, interpretability, fairness, transferability, privacy, and causal inference, publishing in top venues such as NeurIPS, CVPR, and IEEE S&P. The lab collaborates with leading Chinese universities and is currently hiring research interns interested in AI security. Interested candidates can email [email protected].

If you appreciate Alibaba Security's innovative ideas, feel free to star the GitHub project.

Tags: machine learning, computer vision, robustness, adversarial training, discrete adversarial training, visual representation
Written by

DataFunTalk

Dedicated to sharing and discussing big data and AI technology applications, aiming to empower a million data scientists. Regularly hosts live tech talks and curates articles on big data, recommendation/search algorithms, advertising algorithms, NLP, intelligent risk control, autonomous driving, and machine learning/deep learning.
