Why Early DI Attacks Outperform Modern Methods: A Systematic Study of Transferable Adversarial Images

This paper systematically evaluates 23 transferable adversarial attacks and 11 defenses on ImageNet, revealing that early DI attacks surpass newer methods when hyper‑parameters are fairly set, diffusion defenses offer false security, and higher transferability often reduces stealthiness, urging fair benchmarking and comprehensive metrics.

AI Frontier Lectures

Background

Adversarial examples that transfer across models threaten black‑box deep learning systems. An image crafted to fool one model can mislead many unknown models, making transferability a key risk factor.

Problem

Prior work suffers from (1) inconsistent hyper‑parameter settings when evaluating transferability, which makes results incomparable across papers, and (2) limited assessment of stealthiness, usually reporting only Lp‑norm constraints while ignoring perceptual quality and traceability.

Methodology

We adopt a full machine‑learning lifecycle view and categorize transfer attacks into five groups. We benchmark 23 representative attacks and 11 defenses (including transfer‑focused and real‑world vision‑API defenses) on ImageNet. Transferability is measured under equal Lp budget, iteration budget, and step size. Stealthiness is evaluated with PSNR, SSIM, LPIPS and an “attack traceback” analysis that visualizes where perturbations concentrate.
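The equal‑budget convention above can be made concrete with a toy sketch. Nothing below is from the paper's released code: the logistic "model", the function name, and all parameter values are illustrative. The loop follows the standard momentum‑iterative (MI‑FGSM style) recipe, with the fair‑comparison rule that every attack shares the same eps, the same iteration count, and step size alpha = eps / steps.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def mi_fgsm_toy(x, y, w, eps=8 / 255, steps=10, mu=1.0):
    """Momentum-iterative attack on a toy logistic model (illustrative only).

    Fair-budget convention from the text: fixed L_inf budget eps, fixed
    iteration count, and step size alpha = eps / steps for every attack
    being compared.
    """
    alpha = eps / steps            # equal step-size rule
    x_adv = x.copy()
    g = np.zeros_like(x)           # accumulated momentum
    for _ in range(steps):
        # Analytic input gradient of cross-entropy for logits w . x.
        grad = (sigmoid(x_adv @ w) - y) * w
        # MI-FGSM: L1-normalize the gradient before accumulating momentum.
        g = mu * g + grad / (np.abs(grad).sum() + 1e-12)
        x_adv = x_adv + alpha * np.sign(g)
        # Project back into the L_inf ball and the valid pixel range.
        x_adv = np.clip(x_adv, x - eps, x + eps)
        x_adv = np.clip(x_adv, 0.0, 1.0)
    return x_adv
```

The projection step is what keeps the comparison fair: every method is confined to the same Lp ball regardless of how aggressively it steps.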

Key Findings

Early DI‑based attacks outperform many later methods when hyper‑parameters are equalized, indicating that apparent improvements often stem from more favorable settings rather than intrinsic superiority.
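For readers unfamiliar with the DI family: its core idea is an input‑diversity transform applied before each gradient step. The sketch below is an assumption‑laden illustration of that transform (random resize plus random zero‑padding), not the paper's implementation; sizes, probability, and the nearest‑neighbor resize are simplifications.

```python
import numpy as np

def diverse_input(x, out_size=224, low=200, p=0.5, rng=None):
    """Input-diversity transform in the spirit of the DI attack family.

    With probability p, randomly resize an HxWxC image to r x r
    (low <= r < out_size) and zero-pad it back to out_size x out_size
    at a random offset; the attack gradient is then taken through this
    transformed input. All defaults here are illustrative.
    """
    rng = np.random.default_rng() if rng is None else rng
    if rng.random() >= p:
        return x                       # keep the clean input this iteration
    r = int(rng.integers(low, out_size))
    h, w, c = x.shape
    # Nearest-neighbor resize to r x r (a simplification).
    rows = (np.arange(r) * h / r).astype(int)
    cols = (np.arange(r) * w / r).astype(int)
    small = x[rows][:, cols]
    # Random zero-padding back to the model's input size.
    top = int(rng.integers(0, out_size - r + 1))
    left = int(rng.integers(0, out_size - r + 1))
    out = np.zeros((out_size, out_size, c), dtype=x.dtype)
    out[top:top + r, left:left + r] = small
    return out
```

Because the transform is resampled at every iteration, the resulting perturbation cannot overfit to one fixed input geometry, which is the usual explanation for DI's strong transferability.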

Diffusion‑based defenses give a false sense of security; they resist white‑box or adaptive attacks but are easily bypassed by black‑box transfer attacks.

Stealthiness varies widely under the same Lp constraint, and higher transferability typically reduces perceptual quality, showing a negative correlation between transferability and stealthiness.
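To see why equal Lp budgets do not imply equal stealthiness, it helps to have one of the reported metrics in hand. Below is a minimal PSNR implementation as a stand‑in; SSIM and LPIPS require fuller implementations (e.g., windowed statistics or a learned network) and are omitted here.

```python
import numpy as np

def psnr(x, y, peak=1.0):
    """Peak signal-to-noise ratio between two images with values in [0, peak].

    Reported in dB; higher means the two images are more similar. Two
    perturbations with the same L_inf norm can still yield very different
    PSNR, since PSNR depends on mean squared error, not the max error.
    """
    mse = np.mean((x.astype(np.float64) - y.astype(np.float64)) ** 2)
    if mse == 0:
        return float("inf")
    return 10.0 * np.log10(peak ** 2 / mse)
```

For example, a sparse perturbation that touches a few pixels at the full budget and a dense one that touches every pixel have the same L_inf norm but very different MSE, hence very different PSNR.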

Evaluation Framework

The framework classifies attacks, provides detailed tables of methods, and visualizes results. Representative attacks include DI, DI‑2‑Step, TI, SI, and MI‑FGSM, among others. Defenses include adversarial training, randomized smoothing, and diffusion‑based purification (DiffPure). All code and evaluation scripts are released at https://github.com/ZhengyuZhao/TransferAttackEval. The pre‑print is available at https://arxiv.org/abs/2310.11850.

Future Directions

We recommend one‑to‑one, hyper‑parameter‑fair comparisons; reporting both transferability and multiple perceptual/stealthiness metrics; evaluating defenses against transferable black‑box attacks; and open‑sourcing code, hyper‑parameters, and scripts for reproducibility.

Tags: evaluation benchmark, adversarial attacks, ImageNet, transferability, deep learning robustness, stealthiness metrics
Written by AI Frontier Lectures, a leading AI knowledge platform.