Generative Adversarial Perturbations

Abstract

We propose novel generative models for creating adversarial examples: slightly perturbed images that resemble natural images but are maliciously crafted to fool pre-trained models. Our approach produces both image-agnostic and image-dependent perturbations for targeted and non-targeted attacks. We also demonstrate that similar architectures achieve impressive results in fooling both classification and semantic segmentation models, obviating the need to hand-craft attack methods for each task. We improve on the state-of-the-art performance for universal perturbations by leveraging generative models in lieu of current iterative methods. Our attacks are considerably faster than iterative and optimization-based methods at inference time. Moreover, we are the first to present effective targeted universal perturbations.
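To make the idea concrete, below is a minimal sketch (not the authors' code) of the core setup the abstract describes: a small generator is trained to emit a single image-agnostic perturbation that, when added to any input, degrades a frozen pre-trained classifier. The generator architecture, the ResNet-18 victim, the L-infinity budget, the omitted input normalization, and the non-targeted fooling loss are all illustrative assumptions rather than the paper's exact configuration.

```python
# Sketch: universal (image-agnostic), non-targeted perturbation via a generator.
# Assumptions: inputs are [B, 3, 224, 224] tensors in [0, 1]; ImageNet
# normalization is omitted for brevity; hyperparameters are placeholders.
import torch
import torch.nn as nn
import torchvision

eps = 10.0 / 255.0  # assumed L-infinity perturbation budget

# Frozen victim model (ImageNet-pretrained ResNet-18 as a stand-in).
victim = torchvision.models.resnet18(weights="IMAGENET1K_V1").eval()
for p in victim.parameters():
    p.requires_grad_(False)

# Small convolutional generator: fixed noise input -> image-sized perturbation.
generator = nn.Sequential(
    nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
    nn.Conv2d(16, 16, 3, padding=1), nn.ReLU(),
    nn.Conv2d(16, 3, 3, padding=1), nn.Tanh(),  # output in [-1, 1]
)
z = torch.randn(1, 3, 224, 224)  # fixed input => one universal perturbation

opt = torch.optim.Adam(generator.parameters(), lr=1e-4)
ce = nn.CrossEntropyLoss()

def training_step(images):
    """One generator update on a batch of natural images."""
    delta = eps * generator(z)             # scale output into the norm ball
    adv = (images + delta).clamp(0, 1)     # same perturbation added to every image
    with torch.no_grad():
        clean_pred = victim(images).argmax(dim=1)
    loss = -ce(victim(adv), clean_pred)    # push predictions away from clean labels
    opt.zero_grad()
    loss.backward()
    opt.step()
    return loss.item()
```

A targeted variant would instead minimize the cross-entropy toward a chosen target label, and an image-dependent variant would feed the images themselves (rather than a fixed noise tensor) into the generator; inference then requires only a single forward pass, which is why such generative attacks are much faster than iterative or optimization-based methods.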

Publication
CVPR
Date