Adversarial-Threat-Detector

Japanese page

Topics

04.2021: We released "GyoiBoard", a web interface for ATD. More information
03.2021: We'll present Adversarial Threat Detector at Black Hat ASIA 2021 Arsenal. More information

Adversarial Threat Detector makes AI development Secure.

Demo movie on YouTube

In recent years, deep learning has advanced rapidly, and systems built on it have spread throughout society, including face recognition, security cameras (anomaly detection), and ADAS (Advanced Driver-Assistance Systems).

On the other hand, many attacks exploit vulnerabilities in deep learning algorithms. For example, an Evasion Attack causes the target classifier to misclassify Adversarial Examples into the class intended by the adversary, and an Exfiltration Attack steals the parameters and training data of a target classifier. If your system is vulnerable to these attacks, serious incidents can follow, such as a breached face recognition system that allows unauthorized intrusion, or information leakage through inference of the training data.
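To make the idea of an Evasion Attack concrete, here is a minimal sketch of the Fast Gradient Sign Method (FGSM) against a toy logistic-regression classifier. It is not ATD's implementation: the model weights, the input, and the helper names (`predict`, `fgsm`) are all made up for illustration, and a real scan targets a tf.keras image classifier instead.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict(x, w, b):
    """Probability of class 1 under a logistic-regression model."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)

def sign(v):
    return (v > 0) - (v < 0)

def fgsm(x, y, w, b, epsilon):
    """Shift every feature by epsilon in the direction that increases the loss."""
    p = predict(x, w, b)
    # Gradient of the binary cross-entropy loss with respect to each input feature.
    grad = [(p - y) * wi for wi in w]
    x_adv = [xi + epsilon * sign(g) for xi, g in zip(x, grad)]
    # Keep features in the valid [0, 1] range, as for normalized pixel values.
    return [min(1.0, max(0.0, v)) for v in x_adv]

# Hypothetical toy model and input (true label: class 1).
w, b = [2.0, -3.0, 1.5, 0.5], -0.2
x, y = [0.6, 0.2, 0.5, 0.4], 1

x_adv = fgsm(x, y, w, b, epsilon=0.2)
p_clean = predict(x, w, b)
p_adv = predict(x_adv, w, b)
print(f"clean: {p_clean:.3f}  adversarial: {p_adv:.3f}")
```

With these toy values the clean input is classified as class 1, while the perturbed input, which differs by at most 0.2 per feature, falls below the 0.5 decision threshold.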

To address this, we released a vulnerability scanner called "Adversarial Threat Detector" (a.k.a. ATD), which automatically detects vulnerabilities in deep-learning-based classifiers.

ATD helps secure your classifier by running a four-step cycle: "Detecting vulnerabilities (Scanning & Detection)", "Understanding vulnerabilities (Understanding)", "Fixing vulnerabilities (Fix)", and "Check fixed vulnerabilities (Re-Scanning)".

ATD follows the Adversarial Threat Matrix, which summarizes threats to machine learning systems. ATD currently uses the Adversarial Robustness Toolbox (ART), a security library for machine learning, as its core engine. ATD is currently in beta, and we plan to release new functions once a month, so please check our release information.

ATD's secure cycle.

1. Detecting vulnerabilities (Scanning & Detection)

ATD automatically executes a variety of attacks against the classifier and detects vulnerabilities.

Automatic vulnerability detection.

2. Understanding vulnerabilities (Understanding)

When a vulnerability is detected, ATD will generate a countermeasure report (HTML style) and a replay environment (ipynb style) of the vulnerabilities. Developers can understand the vulnerabilities by referring to the countermeasure report and the replay environment.

  • Countermeasure report (HTML style)
Report of HTML style.

Developers can fix the vulnerabilities by referring to the vulnerability overview and countermeasures.
Sample report is here.

  • Vulnerabilities Replay environment (ipynb style)
Report of Jupyter Notebook style.

By opening the ipynb file that ATD generates automatically in Jupyter Notebook or a similar tool, developers can replay the attack against the classifier and understand the vulnerabilities.
Sample notebook is here.

3. Fixing vulnerabilities (Fix)

ATD automatically fixes detected vulnerabilities.
ATD currently supports Adversarial Training, which is one of the defense methods against Evasion Attacks.
* Other defense methods will be supported in the future.
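As a rough illustration of Adversarial Training, the sketch below replaces a fraction of the training samples with adversarial versions before retraining. It assumes that a ratio such as `--adversarial_training_ratio 0.5` means "half of the samples are replaced by adversarial counterparts"; that interpretation, the helper `mix_adversarial`, and the stand-in `toy_attack` are assumptions for illustration, not ATD's actual code.

```python
import random

def mix_adversarial(samples, perturb, ratio, seed=0):
    """Replace a `ratio` fraction of the samples with adversarial versions."""
    if not 0.0 <= ratio <= 1.0:
        raise ValueError("ratio must be between 0 and 1")
    rng = random.Random(seed)
    n_adv = round(ratio * len(samples))
    # Pick which positions get perturbed; the rest stay clean.
    chosen = set(rng.sample(range(len(samples)), n_adv))
    return [perturb(s) if i in chosen else s for i, s in enumerate(samples)]

# Hypothetical stand-in for a real attack such as FGSM: shift every feature by 0.05.
def toy_attack(sample):
    return [v + 0.05 for v in sample]

clean = [[0.1 * i, 0.2 * i] for i in range(10)]
mixed = mix_adversarial(clean, toy_attack, ratio=0.5)
n_changed = sum(1 for a, c in zip(mixed, clean) if a != c)
print(f"{n_changed} of {len(clean)} samples replaced")
```

The classifier would then be retrained on `mixed` with the original labels, so that it learns to classify the perturbed inputs correctly as well.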

4. Check fixed vulnerabilities (Re-Scanning)

ATD re-scans the fixed classifier to check that the detected vulnerabilities have been fixed.

Support

Classifier type.

The current version of ATD supports only image classifiers built with tf.keras.
Other classifiers will be supported in the future.

Estimators    | Image classifier | Text classifier | Other classifier
Keras         | supported        | -               | -
TensorFlow    | -                | -               | -
TensorFlow v2 | -                | -               | -
PyTorch       | -                | -               | -
Scikit-learn  | -                | -               | -

Attack type.

ATD currently supports only Evasion Attacks.
Other attacks will be supported in the future.

Attack type     | Image classifier | Text classifier | Other classifier
Data Poisoning  | on going         | -               | -
Model Poisoning | -                | -               | -
Evasion         | supported        | -               | -
Exfiltration    | on going         | -               | -

Road Map

We will be releasing new features of ATD every other month.
The roadmap for ATD is as follows.

  • 01.2021: Implementation of Evasion Attacks (completed).
  • 02.2021: Implementation of "Fix" and "Re-Scanning" functions (completed).
  • 05.2021: Implementation of Exfiltration Attacks.
  • 06.2021: Implementation of Detecting Data Poisoning.
  • 07.2021: Implementation of Detecting Model Poisoning.
  • 08.2021~: Support for frameworks other than Keras and for non-image classifiers.

Installation

  1. git clone ATD's repository.
root@kali:~# git clone https://github.com/gyoisamurai/Adversarial-Threat-Detector
  2. Get python3-pip.
root@kali:~# apt-get update
root@kali:~# apt-get install python3-pip
  3. Install required python packages for ATD.
root@kali:~# cd Adversarial-Threat-Detector
root@kali:~/Adversarial-Threat-Detector# pip3 install -r requirements.txt

Usage

You can run different vulnerability scans by changing the arguments passed to ATD.

Note
The current version of ATD only supports Evasion Attacks. Other attack methods will be supported in the future.
usage: atd.py [-h] [--target_id TARGET_ID] [--scan_id SCAN_ID] [--model_name MODEL_NAME]
              [--train_data_name TRAIN_DATA_NAME] [--test_data_name TEST_DATA_NAME]
              [--use_x_train_num USE_X_TRAIN_NUM] [--use_x_test_num USE_X_TEST_NUM]
              [--train_label_name TRAIN_LABEL_NAME] [--test_label_name TEST_LABEL_NAME]
              [--op_type {attack,defence,test}]
              [--attack_type {data_poisoning,model_poisoning,evasion,exfiltration}]
              [--attack_data_poisoning {feature_collision,convex_polytope,bullseye_polytope}]
              [--attack_model_poisoning {node_injection,layer_injection}]
              [--attack_evasion {fgsm,cnw,jsma}] [--fgsm_epsilon {0.01,0.05,0.1,0.15,0.2,0.25,0.3}]
              [--fgsm_eps_step {0.1,0.2,0.3,0.4,0.5}] [--fgsm_targeted]
              [--fgsm_batch_size FGSM_BATCH_SIZE]
              [--cnw_confidence {0.0,0.1,0.2,0.3,0.4,0.5,0.6,0.7,0.8,0.9,1.0}]
              [--cnw_batch_size CNW_BATCH_SIZE]
              [--jsma_theta {0.1,0.2,0.3,0.4,0.5,0.6,0.7,0.8,0.9,1.0}]
              [--jsma_gamma {0.1,0.2,0.3,0.4,0.5,0.6,0.7,0.8,0.9,1.0}]
              [--jsma_batch_size JSMA_BATCH_SIZE]
              [--attack_exfiltration {membership_inference,label_only,inversion}]
              [--defence_type {data_poisoning,model_poisoning,evasion,exfiltration}]
              [--defence_evasion {adversarial_training,feature_squeezing,jpeg_compression}]
              [--adversarial_training_attack {fgsm,cnw,jsma}]
              [--adversarial_training_ratio {0.1,0.2,0.3,0.4,0.5,0.6,0.7}]
              [--adversarial_training_batch_size {32,64,128,256,512}]
              [--adversarial_training_epochs {10,20,30,40,50}] [--adversarial_training_shuffle]
              [--lang {en,ja}]

Adversarial Threat Detector.

optional arguments:
  -h, --help            show this help message and exit
  --target_id TARGET_ID
                        Target's identifier for GyoiBoard.
  --scan_id SCAN_ID     Scan's identifier for GyoiBoard.
  --model_name MODEL_NAME
                        Target model name.
  --train_data_name TRAIN_DATA_NAME
                        Training dataset name.
  --test_data_name TEST_DATA_NAME
                        Test dataset name.
  --use_x_train_num USE_X_TRAIN_NUM
                        Dataset number for X_train.
  --use_x_test_num USE_X_TEST_NUM
                        Dataset number for X_test.
  --train_label_name TRAIN_LABEL_NAME
                        Train label name.
  --test_label_name TEST_LABEL_NAME
                        Test label name.
  --op_type {attack,defence,test}
                        operation type.
  --attack_type {data_poisoning,model_poisoning,evasion,exfiltration}
                        Specify attack type.
  --attack_data_poisoning {feature_collision,convex_polytope,bullseye_polytope}
                        Specify method of Data Poisoning Attack.
  --attack_model_poisoning {node_injection,layer_injection}
                        Specify method of Poisoning Attack.
  --attack_evasion {fgsm,cnw,jsma}
                        Specify method of Evasion Attack.
  --fgsm_epsilon {0.01,0.05,0.1,0.15,0.2,0.25,0.3}
                        Specify Epsilon for FGSM.
  --fgsm_eps_step {0.1,0.2,0.3,0.4,0.5}
                        Specify Epsilon step for FGSM.
  --fgsm_targeted       Specify targeted evasion for FGSM.
  --fgsm_batch_size FGSM_BATCH_SIZE
                        Specify batch size for FGSM.
  --cnw_confidence {0.0,0.1,0.2,0.3,0.4,0.5,0.6,0.7,0.8,0.9,1.0}
                        Specify Confidence for C&W.
  --cnw_batch_size CNW_BATCH_SIZE
                        Specify batch size for CnW.
  --jsma_theta {0.1,0.2,0.3,0.4,0.5,0.6,0.7,0.8,0.9,1.0}
                        Specify Theta for JSMA.
  --jsma_gamma {0.1,0.2,0.3,0.4,0.5,0.6,0.7,0.8,0.9,1.0}
                        Specify Gamma for JSMA.
  --jsma_batch_size JSMA_BATCH_SIZE
                        Specify batch size for JSMA.
  --attack_exfiltration {membership_inference,label_only,inversion}
                        Specify method of Exfiltration Attack.
  --defence_type {data_poisoning,model_poisoning,evasion,exfiltration}
                        Specify defence type.
  --defence_evasion {adversarial_training,feature_squeezing,jpeg_compression}
                        Specify defence method against Evasion Attack.
  --adversarial_training_attack {fgsm,cnw,jsma}
                        Specify attack method for Adversarial Training.
  --adversarial_training_ratio {0.1,0.2,0.3,0.4,0.5,0.6,0.7}
                        Specify ratio for Adversarial Training.
  --adversarial_training_batch_size {32,64,128,256,512}
                        Specify batch size for Adversarial Training.
  --adversarial_training_epochs {10,20,30,40,50}
                        Specify epochs for Adversarial Training.
  --adversarial_training_shuffle
                        Specify shuffle for Adversarial Training.
  --lang {en,ja}        Specify language of report.
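
Since every option above is either a `--name value` pair or a bare switch, a scan can also be launched from a script. The following sketch shows one possible way to assemble the argument list. The helper `build_atd_command` is hypothetical and not part of ATD, and the file paths are placeholders; only the flag names come from the usage text.

```python
import shlex

def build_atd_command(script="atd.py", **options):
    """Turn keyword options into an argument list for atd.py."""
    cmd = ["python3", script]
    for name, value in options.items():
        flag = f"--{name}"
        if value is True:
            cmd.append(flag)  # bare switch, e.g. --adversarial_training_shuffle
        elif value is None or value is False:
            continue  # option left out entirely
        else:
            cmd.extend([flag, str(value)])
    return cmd

# Placeholder paths; replace them with your own model and dataset files.
cmd = build_atd_command(
    op_type="defence",
    model_name="./targets/demo_model.h5",
    train_data_name="./targets/X_train.npz",
    train_label_name="./targets/y_train.npz",
    defence_type="evasion",
    defence_evasion="adversarial_training",
    adversarial_training_attack="fgsm",
    adversarial_training_ratio=0.5,
    adversarial_training_shuffle=True,
)
print(shlex.join(cmd))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` from the repository directory. Passing a list rather than a single string avoids shell quoting problems with paths that contain spaces.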

Tutorial

Attacks

Evasion Attack (FGSM).
root@kali:~/Adversarial-Threat-Detector# python3 atd.py --op_type attack --model_name "[path]"/model.h5 --test_data_name "[path]"/X_test.npz --test_label_name "[path]"/y_test.npz --use_x_test_num 100 --attack_type evasion --attack_evasion fgsm --fgsm_epsilon 0.05
Evasion Attack (C&W).
root@kali:~/Adversarial-Threat-Detector# python3 atd.py --op_type attack --model_name "[path]"/model.h5 --test_data_name "[path]"/X_test.npz --test_label_name "[path]"/y_test.npz --use_x_test_num 100 --attack_type evasion --attack_evasion cnw
Evasion Attack (JSMA).
root@kali:~/Adversarial-Threat-Detector# python3 atd.py --op_type attack --model_name "[path]"/model.h5 --test_data_name "[path]"/X_test.npz --test_label_name "[path]"/y_test.npz --use_x_test_num 100 --attack_type evasion --attack_evasion jsma
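
To see how the result changes with the perturbation size, the FGSM scan can be repeated for each value that `--fgsm_epsilon` accepts. The sketch below only generates the command lines. The function `fgsm_sweep` and the paths under `./targets/` are illustrative assumptions, while the flags and epsilon choices are taken from the usage text.

```python
import shlex

# Choices for --fgsm_epsilon as listed in the usage text.
FGSM_EPSILONS = (0.01, 0.05, 0.1, 0.15, 0.2, 0.25, 0.3)

def fgsm_sweep(model, x_test, y_test, epsilons=FGSM_EPSILONS, num_samples=100):
    """Yield one atd.py argument list per epsilon value."""
    for eps in epsilons:
        if eps not in FGSM_EPSILONS:
            raise ValueError(f"unsupported --fgsm_epsilon value: {eps}")
        yield [
            "python3", "atd.py",
            "--op_type", "attack",
            "--model_name", model,
            "--test_data_name", x_test,
            "--test_label_name", y_test,
            "--use_x_test_num", str(num_samples),
            "--attack_type", "evasion",
            "--attack_evasion", "fgsm",
            "--fgsm_epsilon", str(eps),
        ]

# Placeholder paths; point them at your own model and test data.
commands = list(fgsm_sweep("./targets/demo_model.h5",
                           "./targets/X_test.npz",
                           "./targets/y_test.npz"))
for cmd in commands:
    print(shlex.join(cmd))
```

Each printed line can be pasted into a shell, or each list can be handed to `subprocess.run` to execute the scans one after another.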

Defense

Adversarial Training
root@kali:~/Adversarial-Threat-Detector# python3 atd.py --op_type defence --model_name "[path]"/model.h5 --train_data_name "[path]"/X_train.npz --train_label_name "[path]"/y_train.npz --test_data_name "[path]"/X_test.npz --test_label_name "[path]"/y_test.npz --use_x_train_num 50000 --use_x_test_num 10000 --defence_type evasion --defence_evasion adversarial_training --adversarial_training_attack fgsm --adversarial_training_ratio 0.5 --adversarial_training_batch_size 128 --adversarial_training_epochs 10 --adversarial_training_shuffle

Demo

You can try ATD with the trained model and dataset we have prepared.

  1. Download the trained image classifier built in tf.keras.
root@kali:~/Adversarial-Threat-Detector# wget "https://drive.google.com/uc?export=download&id=1zFNn8EBHR_xewFW3-IhXkfdEop0gYUbu" -O demo_model.h5
  2. Download the CIFAR-10 test data and labels.
root@kali:~/Adversarial-Threat-Detector# wget "https://drive.google.com/uc?export=download&id=1zJyB4zUDK22oU55rwTdbKMy0p3rOfuBw" -O X_test.npz
root@kali:~/Adversarial-Threat-Detector# wget "https://drive.google.com/uc?export=download&id=1SUuXdebgMjUOMT8I-e5vFC8oMIVcDz5P" -O y_test.npz
  3. Move demo_model.h5, X_test.npz, y_test.npz to targets directory.
root@kali:~/Adversarial-Threat-Detector# mv demo_model.h5 X_test.npz y_test.npz ./targets/
  4. Run ATD.
root@kali:~/Adversarial-Threat-Detector# python3 atd.py --op_type attack --model_name ./targets/demo_model.h5 --test_data_name ./targets/X_test.npz --test_label_name ./targets/y_test.npz --use_x_test_num 100 --attack_type evasion --attack_evasion fgsm --fgsm_epsilon 0.05
..snip..
[!] Created report: ~/Adversarial-Threat-Detector/reports/../reports/20210217151416_scan/scan_report.html
atd.py Done!!
  5. Check scan report (html and ipynb).
root@kali:~/Adversarial-Threat-Detector# firefox reports/20210217151416_scan/scan_report.html
root@kali:~/Adversarial-Threat-Detector# jupyter notebook reports/20210217151416_scan/evasion_fgsm.ipynb
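
The report directory name changes with every scan, so it can be convenient to look up the newest one instead of copying the path by hand. The sketch below assumes that every scan writes to `reports/<14-digit timestamp>_scan/`, as the directory `20210217151416_scan` in the output above suggests. That layout is inferred from this single example, and `latest_scan_dir` is a hypothetical helper, not part of ATD.

```python
import re
from pathlib import Path

# Directory names such as 20210217151416_scan (YYYYmmddHHMMSS + "_scan").
SCAN_DIR = re.compile(r"^\d{14}_scan$")

def latest_scan_dir(reports_root):
    """Return the newest <timestamp>_scan directory under reports_root, or None."""
    root = Path(reports_root)
    if not root.is_dir():
        return None
    scans = [p for p in root.iterdir() if p.is_dir() and SCAN_DIR.match(p.name)]
    # Fixed-width timestamps sort chronologically as plain strings.
    return max(scans, key=lambda p: p.name, default=None)

latest = latest_scan_dir("reports")
if latest is None:
    print("no scan reports found")
else:
    print(latest / "scan_report.html")
```

Run it from the repository directory; the printed path can then be opened in a browser in the same way as in step 5.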

Licence

MIT License

Contact us