Adversarial Robustness Toolbox v1.17 releases: crafting and analysis of attacks and defense methods for machine learning models

by do son · Published July 13, 2019 · Updated December 27, 2023

Adversarial Robustness Toolbox

Adversarial Robustness 360 Toolbox (ART) is a Python library that supports developers and researchers in defending Machine Learning models (Deep Neural Networks, Gradient Boosted Decision Trees, Support Vector Machines, Random Forests, Logistic Regression, Gaussian Processes, Decision Trees, Scikit-learn Pipelines, etc.) against adversarial threats, helping to make AI systems more secure and trustworthy. Machine Learning models are vulnerable to adversarial examples: inputs (images, texts, tabular data, etc.) that have been deliberately modified to make the model produce a response chosen by the attacker. ART provides the tools to build and deploy defenses and to test them with adversarial attacks.

Defending Machine Learning models involves certifying and verifying model robustness, and hardening models with approaches such as pre-processing inputs, augmenting training data with adversarial samples, and using runtime detection methods to flag inputs that might have been modified by an adversary. The attacks implemented in ART make it possible to craft adversarial examples against Machine Learning models, which is required to test defenses against state-of-the-art threat models.

Supported attack and defense methods

The Adversarial Robustness Toolbox contains implementations of the following attacks:

Deep Fool (Moosavi-Dezfooli et al., 2015)
Fast Gradient Method (Goodfellow et al., 2014)
Jacobian Saliency Map (Papernot et al., 2016)
Universal Perturbation (Moosavi-Dezfooli et al., 2016)
Virtual Adversarial Method (Miyato et al., 2015)
C&W Attack (Carlini and Wagner, 2016)
NewtonFool (Jang et al., 2017)
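To make the idea behind the simplest attack in this list concrete, here is a minimal sketch of the Fast Gradient Method on a tiny logistic regression model. It uses only the Python standard library and does not call ART; the weights, the input, and the function names are hypothetical and chosen purely for illustration.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict_proba(weights, bias, x):
    """Probability of class 1 under a logistic regression model."""
    z = sum(w * xi for w, xi in zip(weights, x)) + bias
    return sigmoid(z)

def loss_gradient(weights, bias, x, y):
    """Gradient of the cross-entropy loss with respect to the input x."""
    p = predict_proba(weights, bias, x)
    return [(p - y) * w for w in weights]

def sign(v):
    return (v > 0) - (v < 0)

def fast_gradient_method(weights, bias, x, y, eps):
    """One-step L-infinity attack: x_adv = x + eps * sign(grad_x loss)."""
    grad = loss_gradient(weights, bias, x, y)
    return [xi + eps * sign(g) for xi, g in zip(x, grad)]

# Hypothetical toy model and input (assumptions, not taken from ART).
WEIGHTS = [2.0, -1.0]
BIAS = 0.0
x_clean = [0.5, 0.2]  # correctly classified as class 1

x_adv = fast_gradient_method(WEIGHTS, BIAS, x_clean, y=1, eps=0.4)

print("clean p(class 1):", predict_proba(WEIGHTS, BIAS, x_clean))
print("adversarial p(class 1):", predict_proba(WEIGHTS, BIAS, x_adv))
```

Each feature moves by at most eps, yet the predicted class changes. In ART itself the same attack is applied to a wrapped estimator rather than to hand-written gradients.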

The following defense methods are also supported:

Feature squeezing (Xu et al., 2017)
Spatial smoothing (Xu et al., 2017)
Label smoothing (Warde-Farley and Goodfellow, 2016)
Adversarial training (Szegedy et al., 2013)
Virtual adversarial training (Miyato et al., 2017)
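Two of the defenses above are simple enough to sketch in a few lines. The following is an illustrative standard-library version of bit-depth feature squeezing and label smoothing, assuming features scaled to [0, 1] and one-hot labels. The function names and parameters are hypothetical and are not ART's API.

```python
def squeeze_bit_depth(x, bit_depth):
    """Quantize features in [0, 1] to 2**bit_depth evenly spaced levels."""
    levels = 2 ** bit_depth - 1
    return [round(v * levels) / levels for v in x]

def smooth_labels(one_hot, max_value=0.9):
    """Give max_value to the true class and share the rest among the others."""
    off_value = (1.0 - max_value) / (len(one_hot) - 1)
    return [max_value if v == 1 else off_value for v in one_hot]

# Hypothetical inputs: a clean sample and a slightly perturbed copy.
clean = [0.10, 0.80]
perturbed = [0.13, 0.77]

# After squeezing to 2 bits, the small perturbation no longer survives.
print(squeeze_bit_depth(clean, 2) == squeeze_bit_depth(perturbed, 2))

# Smoothed training target for a 3-class problem with true class 1.
print(smooth_labels([0, 1, 0], max_value=0.9))
```

Squeezing trades input precision for robustness to small perturbations, while label smoothing discourages the model from producing over-confident outputs.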

Details of the work from IBM Research can be found in the research paper. ART is developed with the goal of helping developers better understand:

Measuring model robustness
Model hardening
Runtime detection
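As a rough illustration of the first item, measuring robustness, one simple empirical measure is the smallest perturbation budget at which an attack changes a model's prediction. The sketch below does this for a linear classifier using a one-step sign attack. It is a toy under stated assumptions (hypothetical weights, input, and function names; the input is assumed to be correctly classified before the attack) and not ART's own robustness metrics.

```python
def predict(weights, bias, x):
    """Hard label (0 or 1) of a linear classifier."""
    score = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1 if score > 0 else 0

def sign(v):
    return (v > 0) - (v < 0)

def worst_case_shift(weights, x, y, eps):
    """Move each feature by eps in the direction that hurts class y most."""
    direction = -1 if y == 1 else 1
    return [xi + direction * eps * sign(w) for xi, w in zip(x, weights)]

def smallest_flipping_eps(weights, bias, x, y, eps_grid):
    """Smallest eps in eps_grid that changes the prediction, or None."""
    for eps in sorted(eps_grid):
        if predict(weights, bias, worst_case_shift(weights, x, y, eps)) != y:
            return eps
    return None

# Hypothetical toy model and input (assumptions for illustration only).
WEIGHTS = [2.0, -1.0]
BIAS = 0.0

flip_eps = smallest_flipping_eps(WEIGHTS, BIAS, [0.5, 0.2], 1, [0.1, 0.2, 0.3, 0.4])
print("smallest eps that flips the prediction:", flip_eps)
```

A larger flipping budget means the sample sits further from the decision boundary. Averaging this value over a test set gives a crude robustness score that can be compared before and after hardening a model.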

Changelog v1.17

ART 1.17.0 introduces new adversarial training protocols, membership inference attacks, composite adversarial attacks for evasion, and more.

Added

Added Composite Adversarial Attack as evasion attack in PyTorch (#2287)
Added support for black-box membership inference attacks without true labels (#2293)
Added verbose option for progress bars in methods fit and predict of all classification estimators (#2334)
Added Oracle Aligned Adversarial Training (OAAT) in PyTorch (#2348)

Fixed

Fixed bug in ActivationDefence and SpectralSignatures poisoning defences by flattening the outputs when calling get_activations() (#2327)
Fixed bug in Hugging Face classification estimator to correctly infer device if provided model is already on GPU (#2300)
