04.2021: We released the Web interface called "GyoiBoard" for ATD. More Information
03.2021: We'll present Adversarial Threat Detector at Black Hat ASIA 2021 Arsenal.
More information
Adversarial Threat Detector makes AI development secure.
In recent years, deep learning technology has been developing, and various systems using deep learning are spreading in our society, such as face recognition, security cameras (anomaly detection), and ADAS (Advanced Driver-Assistance Systems).
On the other hand, many attacks exploit vulnerabilities in deep learning algorithms. For example, an Evasion Attack causes the target classifier to misclassify Adversarial Examples into a class intended by the adversary, and an Exfiltration Attack steals the parameters and training data of a target classifier. If your system is vulnerable to these attacks, serious incidents can follow, such as unauthorized intrusion through a breached face recognition system or information leakage through inference of the training data.
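To make the Evasion Attack idea concrete: the simplest such attack, FGSM (one of the methods ATD runs, via ART), nudges every pixel by a small amount epsilon in the direction that increases the classifier's loss. The sketch below illustrates only that perturbation step with a toy array and a dummy gradient; the function name and values are illustrative and not part of ATD.

```python
import numpy as np

def fgsm_perturb(x, grad, epsilon):
    """One Fast Gradient Sign Method step: shift each pixel by +/- epsilon
    in the sign of the loss gradient, then clip back to the valid range."""
    x_adv = x + epsilon * np.sign(grad)
    return np.clip(x_adv, 0.0, 1.0)

# Toy example: a 2x2 "image" and a dummy loss gradient.
x = np.array([[0.2, 0.8], [0.5, 0.5]])
grad = np.array([[1.0, -3.0], [0.0, 2.0]])
x_adv = fgsm_perturb(x, grad, epsilon=0.05)
print(x_adv)  # each pixel moved by at most 0.05
```

The `--fgsm_epsilon` argument of ATD controls exactly this perturbation budget: larger values make the attack stronger but more visible.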
So we released a vulnerability scanner called "Adversarial Threat Detector" (a.k.a. ATD), which automatically detects vulnerabilities in deep-learning-based classifiers.
ATD contributes to the security of your classifier by executing a cycle of four steps: "Detecting vulnerabilities (Scanning & Detection)", "Understanding vulnerabilities (Understanding)", "Fixing vulnerabilities (Fix)", and "Checking the fixes (Re-Scanning)".
ATD follows the Adversarial Threat Matrix, which summarizes threats to machine learning systems, and currently uses the Adversarial Robustness Toolbox (ART), a security library for machine learning, as its core engine. ATD is currently a beta version, but we will release new functions once a month, so please check our release information.
ATD automatically executes a variety of attacks against the classifier and detects vulnerabilities.
When a vulnerability is detected, ATD generates a countermeasure report (HTML) and a replay environment (ipynb) for the vulnerabilities. Developers can understand the vulnerabilities by referring to both.
- Countermeasure report (HTML style)
Developers can fix the vulnerabilities by referring to the vulnerability overview and countermeasures.
Sample report is here.
- Vulnerabilities Replay environment (ipynb style)
By opening the ipynb automatically generated by ATD in Jupyter Notebook or a similar environment, developers can replay the attack against the classifier and understand the vulnerabilities.
Sample notebook is here.
ATD automatically fixes detected vulnerabilities.
The current version of ATD supports Adversarial Training, one of the defense methods against Evasion Attacks.
* Other defense methods will be supported in the future.
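Adversarial Training hardens a classifier by training it on a mix of clean and adversarial samples. The sketch below illustrates only the mixing step, assuming (as an interpretation of ATD's `--adversarial_training_ratio` argument) that the ratio is the fraction of training samples replaced by their adversarial counterparts; the function name is hypothetical.

```python
import numpy as np

def mix_adversarial(x_clean, x_adv, ratio, seed=0):
    """Replace a fraction `ratio` of the clean training samples with their
    adversarial counterparts, as in basic adversarial training."""
    rng = np.random.default_rng(seed)
    n = len(x_clean)
    n_adv = int(n * ratio)
    idx = rng.choice(n, size=n_adv, replace=False)
    x_mixed = x_clean.copy()
    x_mixed[idx] = x_adv[idx]
    return x_mixed

# Toy data: 10 "clean" samples (zeros) and 10 "adversarial" ones (ones).
x_clean = np.zeros((10, 2))
x_adv = np.ones((10, 2))
x_mixed = mix_adversarial(x_clean, x_adv, ratio=0.5)
print(int(x_mixed.sum() / 2))  # 5 of the 10 samples were replaced
```

The model is then retrained on `x_mixed` (with the original labels), which is what ATD's `--adversarial_training_epochs` and `--adversarial_training_batch_size` arguments parameterize.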
ATD re-scans the fixed classifier to check that the detected vulnerabilities have been fixed.
The current version of ATD supports only image classifiers built with tf.keras.
Other classifiers will be supported in the future.
Estimators | Image classifier | Text classifier | Other classifier |
---|---|---|---|
Keras | supported | - | - |
TensorFlow | - | - | - |
TensorFlow v2 | - | - | - |
PyTorch | - | - | - |
Scikit-learn | - | - | - |
The current version of ATD supports only Evasion Attacks.
Other attacks will be supported in the future.
Attack type | Image classifier | Text classifier | Other classifier |
---|---|---|---|
Data Poisoning | in progress | - | - |
Model Poisoning | - | - | - |
Evasion | supported | - | - |
Exfiltration | in progress | - | - |
We will be releasing new features of ATD once a month.
The roadmap for ATD is as follows.
- 01.2021: Implementation of Evasion Attacks (completed).
- 02.2021: Implementation of "Fix" and "Re-Scanning" functions (completed).
- 05.2021: Implementation of Exfiltration Attacks.
- 06.2021: Implementation of Detecting Data Poisoning.
- 07.2021: Implementation of Detecting Model Poisoning.
- 08.2021~: Support for other than Keras / non-image classifiers.
- Clone the ATD repository.
root@kali:~# git clone https://github.com/gyoisamurai/Adversarial-Threat-Detector
- Install python3-pip.
root@kali:~# apt-get update
root@kali:~# apt-get install python3-pip
- Install the Python packages required by ATD.
root@kali:~# cd Adversarial-Threat-Detector
root@kali:~/Adversarial-Threat-Detector# pip3 install -r requirements.txt
You can execute various vulnerability scans by changing the arguments to the ATD.
Note |
---|
The current version of ATD only supports Evasion Attacks. Other attack methods will be supported in the future. |
usage: atd.py [-h] [--target_id TARGET_ID] [--scan_id SCAN_ID] [--model_name MODEL_NAME]
[--train_data_name TRAIN_DATA_NAME] [--test_data_name TEST_DATA_NAME]
[--use_x_train_num USE_X_TRAIN_NUM] [--use_x_test_num USE_X_TEST_NUM]
[--train_label_name TRAIN_LABEL_NAME] [--test_label_name TEST_LABEL_NAME]
[--op_type {attack,defence,test}]
[--attack_type {data_poisoning,model_poisoning,evasion,exfiltration}]
[--attack_data_poisoning {feature_collision,convex_polytope,bullseye_polytope}]
[--attack_model_poisoning {node_injection,layer_injection}]
[--attack_evasion {fgsm,cnw,jsma}] [--fgsm_epsilon {0.01,0.05,0.1,0.15,0.2,0.25,0.3}]
[--fgsm_eps_step {0.1,0.2,0.3,0.4,0.5}] [--fgsm_targeted]
[--fgsm_batch_size FGSM_BATCH_SIZE]
[--cnw_confidence {0.0,0.1,0.2,0.3,0.4,0.5,0.6,0.7,0.8,0.9,1.0}]
[--cnw_batch_size CNW_BATCH_SIZE]
[--jsma_theta {0.1,0.2,0.3,0.4,0.5,0.6,0.7,0.8,0.9,1.0}]
[--jsma_gamma {0.1,0.2,0.3,0.4,0.5,0.6,0.7,0.8,0.9,1.0}]
[--jsma_batch_size JSMA_BATCH_SIZE]
[--attack_exfiltration {membership_inference,label_only,inversion}]
[--defence_type {data_poisoning,model_poisoning,evasion,exfiltration}]
[--defence_evasion {adversarial_training,feature_squeezing,jpeg_compression}]
[--adversarial_training_attack {fgsm,cnw,jsma}]
[--adversarial_training_ratio {0.1,0.2,0.3,0.4,0.5,0.6,0.7}]
[--adversarial_training_batch_size {32,64,128,256,512}]
[--adversarial_training_epochs {10,20,30,40,50}] [--adversarial_training_shuffle]
[--lang {en,ja}]
Adversarial Threat Detector.
optional arguments:
-h, --help show this help message and exit
--target_id TARGET_ID
Target's identifier for GyoiBoard.
--scan_id SCAN_ID Scan's identifier for GyoiBoard.
--model_name MODEL_NAME
Target model name.
--train_data_name TRAIN_DATA_NAME
Training dataset name.
--test_data_name TEST_DATA_NAME
Test dataset name.
--use_x_train_num USE_X_TRAIN_NUM
Dataset number for X_train.
--use_x_test_num USE_X_TEST_NUM
Dataset number for X_test.
--train_label_name TRAIN_LABEL_NAME
Train label name.
--test_label_name TEST_LABEL_NAME
Test label name.
--op_type {attack,defence,test}
operation type.
--attack_type {data_poisoning,model_poisoning,evasion,exfiltration}
Specify attack type.
--attack_data_poisoning {feature_collision,convex_polytope,bullseye_polytope}
Specify method of Data Poisoning Attack.
--attack_model_poisoning {node_injection,layer_injection}
Specify method of Poisoning Attack.
--attack_evasion {fgsm,cnw,jsma}
Specify method of Evasion Attack.
--fgsm_epsilon {0.01,0.05,0.1,0.15,0.2,0.25,0.3}
Specify Epsilon for FGSM.
--fgsm_eps_step {0.1,0.2,0.3,0.4,0.5}
Specify Epsilon step for FGSM.
--fgsm_targeted Specify targeted evasion for FGSM.
--fgsm_batch_size FGSM_BATCH_SIZE
Specify batch size for FGSM.
--cnw_confidence {0.0,0.1,0.2,0.3,0.4,0.5,0.6,0.7,0.8,0.9,1.0}
Specify Confidence for C&W.
--cnw_batch_size CNW_BATCH_SIZE
Specify batch size for CnW.
--jsma_theta {0.1,0.2,0.3,0.4,0.5,0.6,0.7,0.8,0.9,1.0}
Specify Theta for JSMA.
--jsma_gamma {0.1,0.2,0.3,0.4,0.5,0.6,0.7,0.8,0.9,1.0}
Specify Gamma for JSMA.
--jsma_batch_size JSMA_BATCH_SIZE
Specify batch size for JSMA.
--attack_exfiltration {membership_inference,label_only,inversion}
Specify method of Exfiltration Attack.
--defence_type {data_poisoning,model_poisoning,evasion,exfiltration}
Specify defence type.
--defence_evasion {adversarial_training,feature_squeezing,jpeg_compression}
Specify defence method against Evasion Attack.
--adversarial_training_attack {fgsm,cnw,jsma}
Specify attack method for Adversarial Training.
--adversarial_training_ratio {0.1,0.2,0.3,0.4,0.5,0.6,0.7}
Specify ratio for Adversarial Training.
--adversarial_training_batch_size {32,64,128,256,512}
Specify batch size for Adversarial Training.
--adversarial_training_epochs {10,20,30,40,50}
Specify epochs for Adversarial Training.
--adversarial_training_shuffle
Specify shuffle for Adversarial Training.
--lang {en,ja} Specify language of report.
root@kali:~/Adversarial-Threat-Detector# python3 atd.py --op_type attack --model_name "[path]"/model.h5 --test_data_name "[path]"/X_test.npz --test_label_name "[path]"/y_test.npz --use_x_test_num 100 --attack_type evasion --attack_evasion fgsm --fgsm_epsilon 0.05
root@kali:~/Adversarial-Threat-Detector# python3 atd.py --op_type attack --model_name "[path]"/model.h5 --test_data_name "[path]"/X_test.npz --test_label_name "[path]"/y_test.npz --use_x_test_num 100 --attack_type evasion --attack_evasion cnw
root@kali:~/Adversarial-Threat-Detector# python3 atd.py --op_type attack --model_name "[path]"/model.h5 --test_data_name "[path]"/X_test.npz --test_label_name "[path]"/y_test.npz --use_x_test_num 100 --attack_type evasion --attack_evasion jsma
root@kali:~/Adversarial-Threat-Detector# python3 atd.py --op_type defence --model_name "[path]"/model.h5 --train_data_name "[path]"/X_train.npz --train_label_name "[path]"/y_train.npz --test_data_name "[path]"/X_test.npz --test_label_name "[path]"/y_test.npz --use_x_train_num 50000 --use_x_test_num 10000 --defence_type evasion --defence_evasion adversarial_training --adversarial_training_attack fgsm --adversarial_training_ratio 0.5 --adversarial_training_batch_size 128 --adversarial_training_epochs 10 --adversarial_training_shuffle
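The commands above pass the model as an .h5 file and the datasets and labels as .npz files. If you want to scan your own classifier, you can produce such files with NumPy; the sketch below shows one way to do it for a CIFAR10-shaped test set. The exact array key that ATD expects inside the .npz archives is not documented here, so the use of NumPy's default key (`arr_0`) is an assumption, and the random data is a stand-in for your real test set.

```python
import numpy as np

# Illustrative CIFAR10-like shapes: 100 RGB images of 32x32 pixels,
# plus integer labels for 10 classes. Replace with your real data.
X_test = np.random.rand(100, 32, 32, 3).astype(np.float32)
y_test = np.random.randint(0, 10, size=100)

# np.savez stores an unnamed array under the default key 'arr_0'.
np.savez('X_test.npz', X_test)
np.savez('y_test.npz', y_test)

loaded = np.load('X_test.npz')
print(loaded['arr_0'].shape)  # (100, 32, 32, 3)
```

The resulting `X_test.npz` and `y_test.npz` are then passed to ATD via `--test_data_name` and `--test_label_name`, with `--use_x_test_num` limiting how many samples are attacked.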
You can try ATD by using the trained model and dataset we have prepared.
- Download the trained image classifier built with tf.keras.
root@kali:~/Adversarial-Threat-Detector# wget "https://drive.google.com/uc?export=download&id=1zFNn8EBHR_xewFW3-IhXkfdEop0gYUbu" -O demo_model.h5
- Download the CIFAR10 test dataset.
root@kali:~/Adversarial-Threat-Detector# wget "https://drive.google.com/uc?export=download&id=1zJyB4zUDK22oU55rwTdbKMy0p3rOfuBw" -O X_test.npz
root@kali:~/Adversarial-Threat-Detector# wget "https://drive.google.com/uc?export=download&id=1SUuXdebgMjUOMT8I-e5vFC8oMIVcDz5P" -O y_test.npz
- Move demo_model.h5, X_test.npz, and y_test.npz to the targets directory.
root@kali:~/Adversarial-Threat-Detector# mv demo_model.h5 X_test.npz y_test.npz ./targets/
- Run ATD.
root@kali:~/Adversarial-Threat-Detector# python3 atd.py --op_type attack --model_name ./targets/demo_model.h5 --test_data_name ./targets/X_test.npz --test_label_name ./targets/y_test.npz --use_x_test_num 100 --attack_type evasion --attack_evasion fgsm --fgsm_epsilon 0.05
..snip..
[!] Created report: ~/Adversarial-Threat-Detector/reports/../reports/20210217151416_scan/scan_report.html
atd.py Done!!
- Check the scan report (HTML) and the replay notebook (ipynb).
root@kali:~/Adversarial-Threat-Detector# firefox reports/20210217151416_scan/scan_report.html
root@kali:~/Adversarial-Threat-Detector# jupyter notebook reports/20210217151416_scan/evasion_fgsm.ipynb
- Email: gyoiler3@gmail.com