Random Parameter Pruning Boosts Transferable Targeted Attacks Across Model Architectures
The RaPA method introduces random parameter pruning during adversarial example generation, creating diverse surrogate-model variants that markedly raise the success rate of targeted transfer attacks across CNN and Transformer architectures. On ImageNet-compatible benchmarks, the gains hold even against defended models and grow with larger computational budgets.
Problem Context
Transfer‑based adversarial attacks generate perturbations on a surrogate (proxy) model and reuse them against unknown black‑box targets. Existing transfer attacks often depend on a small set of critical parameters in the surrogate model, which limits cross‑model generalisation.
Proposed Method: Random Parameter Pruning Attack (RaPA)
RaPA mitigates this parameter dependency by randomly pruning a subset of the surrogate model’s parameters (primarily fully‑connected and normalization layers) at each attack iteration. The workflow for each iteration is:
1. Start from the current adversarial image.
2. Instantiate several independently pruned copies of the surrogate model by randomly masking a portion of its parameters in each copy.
3. Compute the gradient of the targeted loss w.r.t. the image on each pruned copy.
4. Average the gradients across all copies.
5. Update the image with the averaged gradient (e.g., an FGSM‑style signed step).
The process repeats for a fixed number of iterations, with the perturbation clipped to the budget (e.g., ε = 16/255) at every step, yielding the final targeted adversarial example.
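The iteration above can be sketched as a minimal numpy toy. This assumes a linear‑softmax surrogate standing in for a deep network; names such as `prune_rate`, `n_copies`, and the step size `alpha` are illustrative, not the paper’s settings:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy linear surrogate: logits = W @ x + b (stands in for a deep network).
D, C = 64, 10                      # input dim, number of classes
W = rng.normal(size=(C, D))
b = rng.normal(size=C)

def softmax(z):
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def targeted_grad(x, target, mask):
    """Gradient of cross-entropy toward `target` w.r.t. x on a pruned copy."""
    Wp = W * mask                  # randomly pruned weights
    p = softmax(Wp @ x + b)
    g_logits = p.copy()
    g_logits[target] -= 1.0        # dL/dlogits = p - one_hot(target)
    return Wp.T @ g_logits

def rapa_step(x_adv, x_clean, target, eps=16/255, alpha=2/255,
              n_copies=4, prune_rate=0.3):
    grads = []
    for _ in range(n_copies):      # several independently pruned variants
        mask = (rng.random(W.shape) > prune_rate).astype(W.dtype)
        grads.append(targeted_grad(x_adv, target, mask))
    g = np.mean(grads, axis=0)     # average gradients across pruned copies
    x_adv = x_adv - alpha * np.sign(g)                    # FGSM-style step
    x_adv = np.clip(x_adv, x_clean - eps, x_clean + eps)  # stay in budget
    return np.clip(x_adv, 0.0, 1.0)

x = rng.random(D)                  # "clean image"
x_adv = x.copy()
for _ in range(10):                # fixed iteration count
    x_adv = rapa_step(x_adv, x, target=3)
```

Because each step draws fresh masks, the gradient the image follows never comes from the same model twice, which is what pushes the perturbation away from parameter-specific artifacts.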
Experimental Setup
Dataset: ImageNet‑compatible dataset from the NIPS 2017 adversarial competition, containing true labels and target attack labels.
Victim models (both CNNs and visual Transformers): VGG16, ResNet18, ResNet50, DenseNet121, MobileNetV2, EfficientNetB0, Inception‑V1/V3/V4, Xception, ViT, LeViT, ConViT, Twins, PiT, CLIP.
Baseline attacks (grouped by technique):
Input‑transformation: DI, RDI, SIA, BSR.
Gradient‑optimisation: SI, MI‑FGSM.
Feature‑mixing: Admix, CFM, FTM.
Model‑ensemble: MUP, SE‑ViT.
All methods share identical hyper‑parameters: maximum ε, step size, number of iterations, and per‑iteration compute budget, ensuring a fair comparison.
Results
CNN‑to‑Transformer transfer: prior best average success rate ≈ 33 %; RaPA raises it to ≈ 45 %.
With ResNet50 as the source model, RaPA adds ≈ 11.7 percentage points of absolute success rate; with DenseNet121 as source, the gain reaches ≈ 17.5 points.
Transformer‑to‑CNN transfer: RaPA achieves ≈ 51 % average success, surpassing all baselines.
Robustness against defenses: under adversarial training, JPEG compression, randomisation, denoising, and diffusion‑model defenses, RaPA maintains the highest success rates (e.g., ≈ 88 % on adversarially trained models).
Effect of additional compute: increasing the iteration count or per‑iteration compute benefits RaPA more than other attacks. For ResNet50, extra compute yields a further ≈ 15.9‑point success gain.
Compatibility and Extensibility
Random parameter pruning is orthogonal to existing techniques. Combining RaPA with Admix, CFM, or input‑transformation methods further improves transferability without requiring extra data or model retraining.
Key Observations
Empirical analysis shows that removing the most important parameters from a surrogate model sharply reduces attack success, confirming the over‑reliance of prior attacks on a narrow parameter subset. Randomly varying the parameter set forces the adversarial example to be optimised against a diverse ensemble, enhancing its ability to generalise across architectures.
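This implicit-ensemble effect can be illustrated with a toy mask draw (numpy sketch; `prune_rate` and the weight shape are illustrative, not values from the paper):

```python
import numpy as np

rng = np.random.default_rng(1)
W = rng.normal(size=(10, 64))   # illustrative surrogate weight matrix

# Each attack iteration draws a fresh random mask, so the adversarial
# example is optimised against an ever-changing implicit ensemble rather
# than one fixed set of "critical" parameters.
prune_rate = 0.3
mask_a = rng.random(W.shape) > prune_rate
mask_b = rng.random(W.shape) > prune_rate

pruned_a, pruned_b = W * mask_a, W * mask_b
assert not np.array_equal(pruned_a, pruned_b)  # two distinct model variants
```

Since the masks almost never coincide, gradients computed on the pruned copies disagree exactly where the surrogate’s parameter-specific quirks lie, and averaging them cancels those quirks out.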
Reference
Paper: https://arxiv.org/pdf/2504.18594