Adversarial Robustness Toolbox v1.5.2 releases: crafting and analysis of attacks and defense methods for machine learning models
Adversarial Robustness Toolbox
Adversarial Robustness 360 Toolbox (ART) is a Python library supporting developers and researchers in defending Machine Learning models (Deep Neural Networks, Gradient Boosted Decision Trees, Support Vector Machines, Random Forests, Logistic Regression, Gaussian Processes, Decision Trees, Scikit-learn Pipelines, etc.) against adversarial threats and helps to make AI systems more secure and trustworthy. Machine Learning models are vulnerable to adversarial examples, which are inputs (images, texts, tabular data, etc.) deliberately modified to produce a desired response by the Machine Learning model. ART provides the tools to build and deploy defenses and test them with adversarial attacks.
Defending Machine Learning models involves certifying and verifying model robustness and hardening models with approaches such as pre-processing inputs, augmenting training data with adversarial samples, and leveraging runtime detection methods to flag any inputs that might have been modified by an adversary. The attacks implemented in ART allow crafting adversarial examples against Machine Learning models, which is required to test defenses under state-of-the-art threat models.
Supported attack and defense methods
The Adversarial Robustness Toolbox contains implementations of the following attacks:
- DeepFool (Moosavi-Dezfooli et al., 2015)
- Fast Gradient Method (Goodfellow et al., 2014)
- Jacobian Saliency Map (Papernot et al., 2016)
- Universal Perturbation (Moosavi-Dezfooli et al., 2016)
- Virtual Adversarial Method (Miyato et al., 2015)
- C&W Attack (Carlini and Wagner, 2016)
- NewtonFool (Jang et al., 2017)
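As a quick illustration of how these attacks are used in practice, the sketch below crafts adversarial examples with the Fast Gradient Method against a scikit-learn model. The dataset, model choice, and `eps` budget are illustrative assumptions, not part of this release.

```python
# Minimal sketch: craft FGM adversarial examples against a scikit-learn model.
import numpy as np
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression

from art.estimators.classification import SklearnClassifier
from art.attacks.evasion import FastGradientMethod

x, y = load_iris(return_X_y=True)
model = LogisticRegression(max_iter=1000).fit(x, y)

# Wrap the fitted model so ART attacks can query predictions and gradients
classifier = SklearnClassifier(model=model, clip_values=(x.min(), x.max()))

# Craft adversarial examples with an (assumed) L_inf perturbation budget of 0.2
attack = FastGradientMethod(estimator=classifier, eps=0.2)
x_adv = attack.generate(x=x)

clean_acc = np.mean(model.predict(x) == y)
adv_acc = np.mean(model.predict(x_adv) == y)
print(f"accuracy clean: {clean_acc:.2f}, adversarial: {adv_acc:.2f}")
```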
The following defense methods are also supported:
- Feature squeezing (Xu et al., 2017)
- Spatial smoothing (Xu et al., 2017)
- Label smoothing (Warde-Farley and Goodfellow, 2016)
- Adversarial training (Szegedy et al., 2013)
- Virtual adversarial training (Miyato et al., 2017)
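To show how a defense such as adversarial training plugs in, here is a minimal sketch using ART's `AdversarialTrainer`, which mixes adversarial samples generated on the fly into the training data. The toy PyTorch model, random data, and all hyperparameters are assumptions made for illustration.

```python
# Minimal sketch: adversarial training with a toy PyTorch classifier.
import numpy as np
import torch

from art.estimators.classification import PyTorchClassifier
from art.attacks.evasion import ProjectedGradientDescent
from art.defences.trainer import AdversarialTrainer

# Toy fully-connected model for 4-dimensional inputs and 3 classes (assumption)
model = torch.nn.Sequential(torch.nn.Linear(4, 16), torch.nn.ReLU(), torch.nn.Linear(16, 3))
classifier = PyTorchClassifier(
    model=model,
    loss=torch.nn.CrossEntropyLoss(),
    optimizer=torch.optim.Adam(model.parameters(), lr=1e-3),
    input_shape=(4,),
    nb_classes=3,
    clip_values=(0.0, 1.0),
)

# Random stand-in data; labels are one-hot encoded
x_train = np.random.rand(256, 4).astype(np.float32)
y_train = np.eye(3)[np.random.randint(0, 3, size=256)].astype(np.float32)

# Train on a 50/50 mix of clean and PGD-perturbed samples (illustrative settings)
attack = ProjectedGradientDescent(estimator=classifier, eps=0.1, eps_step=0.02, max_iter=10)
trainer = AdversarialTrainer(classifier, attacks=attack, ratio=0.5)
trainer.fit(x_train, y_train, nb_epochs=5, batch_size=64)
```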
The details of the work from IBM Research can be found in the research paper. The ART toolbox is developed with the goal of helping developers better understand:
- Measuring model robustness
- Model hardening
- Runtime detection
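For the first of these goals, measuring model robustness, ART ships metrics such as `empirical_robustness`, which estimates the average minimal perturbation an attacker needs in order to change a model's predictions. The sketch below uses a scikit-learn model and illustrative attack parameters; both are assumptions for demonstration.

```python
# Minimal sketch: measure empirical robustness of a scikit-learn model.
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression

from art.estimators.classification import SklearnClassifier
from art.metrics import empirical_robustness

x, y = load_iris(return_X_y=True)
classifier = SklearnClassifier(
    model=LogisticRegression(max_iter=1000).fit(x, y),
    clip_values=(x.min(), x.max()),
)

# Average minimal FGSM perturbation needed to flip predictions (illustrative step size)
score = empirical_robustness(classifier, x, attack_name="fgsm", attack_params={"eps_step": 0.1})
print(f"empirical robustness (FGSM): {score:.3f}")
```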
Changelog v1.5.2
Added
- Added new method `reset_patch` to `art.attacks.evasion.adversarial_patch.*` to reset the patch (#863)
- Added passing of `kwargs` to the internal attacks of `art.attacks.evasion.AutoAttack` (#850)
- Added `art.estimators.classification.BlackBoxClassifierNeuralNetwork` as a black-box classifier for neural network models (#849); see the sketch after this list
- Added support for `channels_first=False` for `art.attacks.evasion.ShadowAttack` in PyTorch (#848)
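To put the black-box entry above (#849) in context, the following sketch shows ART's black-box pattern using the long-standing `BlackBoxClassifier`: an arbitrary prediction function is wrapped so attacks can query it without access to gradients or internals. The new `BlackBoxClassifierNeuralNetwork` targets the same pattern for neural network models; the prediction function below is an illustrative assumption, not code from this release.

```python
# Minimal sketch: wrap an arbitrary prediction function as an ART black-box classifier.
import numpy as np
from art.estimators.classification import BlackBoxClassifier

def predict(x: np.ndarray) -> np.ndarray:
    # Stand-in for a remote model or API: return per-class probabilities (assumption)
    logits = np.stack([x.sum(axis=1), -x.sum(axis=1)], axis=1)
    exp = np.exp(logits - logits.max(axis=1, keepdims=True))
    return exp / exp.sum(axis=1, keepdims=True)

classifier = BlackBoxClassifier(predict, input_shape=(4,), nb_classes=2, clip_values=(0.0, 1.0))
print(classifier.predict(np.random.rand(8, 4).astype(np.float32)).shape)  # (8, 2)
```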
Changed
- Changed Numpy requirements to be less strict to resolve conflicts in dependencies (#879)
- Changed estimator requirements for `art.attacks.evasion.SquareAttack` and `art.attacks.evasion.SimBA` to include `NeuralNetworkMixin`, requiring neural network models (#849)
Fixed
- Fixed `BaseEstimator.set_params` to set `preprocessing` and `preprocessing_defences` correctly by accounting for `art.preprocessing.standardisation_mean_std` (#901)
- Fixed support for CUDA in `art.attacks.inference.membership_inference.MembershipInferenceBlackBox.infer` (#899)
- Fixed return in `art.preprocessing.standardisation_mean_std.StandardisationMeanStdPyTorch` to maintain correct dtype (#890)
- Fixed type conversion in `art.evaluations.security_curve.SecurityCurve` to be explicit (#886)
- Fixed dtype in `art.attacks.evasion.SquareAttack` for `norm=2` to maintain correct type (#877)
- Fixed missing `CarliniWagnerASR` in `art.attacks.evasion` namespace (#873)
- Fixed support for CUDA in `art.estimators.classification.PyTorchClassifier.loss` (#862)
- Fixed bug in `art.attacks.evasion.AutoProjectedGradientDescent` for targeted attacks to correctly detect successful iteration steps, and added a robust stopping criterion if the loss becomes zero (#860)
- Fixed bug in initialisation of the search space in `art.attacks.evasion.SaliencyMapMethod` (#843)
- Fixed bug in support for video data in `art.attacks.evasion.adversarial_patch.AdversarialPatchNumpy` (#838)
- Fixed bug in the logged success rate of `art.attacks.evasion.ProjectedGradientDescentPyTorch` and `art.attacks.evasion.ProjectedGradientDescentTensorFlowV2` to use correct labels (#833)
Copyright (C) IBM Corporation 2018