YouTube Face DataSet Backdoor Detection

This project is a solution to the challenge described in CSAW-HackML-2020. It is part of the final project for the course ECE-GY 9163 at NYU.

Running the prediction on test data.

python eval.py data/clean_test_data.h5 models/sunglasses_bd_net.h5 models/VAE.h5

Background for the VAE Method

An autoencoder is a type of neural network that is trained to copy its input to its output. For example, given an image of a handwritten digit, an autoencoder first encodes the image into a lower-dimensional latent representation, then decodes the latent representation back into an image. In doing so it learns to compress the data, keeping only the features needed to reconstruct a valid output. A variational autoencoder (VAE) encodes images belonging to the same class to a common encoding, say E1, whereas a plain autoencoder may encode different images of the same class to separate encodings E1 and E2. The difference arises from the mean and variance dense layers that we add as the innermost layers of the network; they keep the encodings of all 1283 classes centered around 0.
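The role of the mean and variance layers can be sketched with the standard VAE formulation. The NumPy snippet below is a minimal illustration, not this project's code: `sample_latent` and `kl_to_standard_normal` are hypothetical names, and the log-variance parameterization and the KL penalty are the usual textbook VAE assumptions rather than details stated in this README.

```python
import numpy as np

def sample_latent(z_mean, z_log_var, rng):
    """Reparameterization trick: z = mean + sigma * eps, with eps ~ N(0, I)."""
    eps = rng.standard_normal(z_mean.shape)
    return z_mean + np.exp(0.5 * z_log_var) * eps

def kl_to_standard_normal(z_mean, z_log_var):
    """KL(N(mean, sigma^2) || N(0, I)) per sample, summed over latent dimensions."""
    return -0.5 * np.sum(1.0 + z_log_var - z_mean**2 - np.exp(z_log_var), axis=-1)

rng = np.random.default_rng(0)
z_mean = np.zeros((4, 1000))     # 4 images, 1000 latent dimensions
z_log_var = np.zeros((4, 1000))  # log-variance 0 means unit variance
z = sample_latent(z_mean, z_log_var, rng)
kl = kl_to_standard_normal(z_mean, z_log_var)  # zero when the encoding is exactly N(0, I)
```

The KL term grows as the predicted mean drifts away from 0 or the variance away from 1, which is the mechanism that pulls encodings toward the origin.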

We leverage this property by training the variational autoencoder only on the clean validation dataset, so that it learns only the features of clean images. We then compare the reconstruction loss of an input image against the reconstruction loss observed on the clean data the model was trained on. For poisoned data the reconstruction loss is higher than for clean images, because the model has only learned the features of clean data. This lets us distinguish a poisoned image from a clean one. The following images show two sample poisoned images, i.e., sunglasses and eyebrows, alongside their corresponding reconstructions from the VAE model.
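A per-image reconstruction loss of this kind can be sketched as follows. This is an illustration under assumptions: `reconstruction_loss` is a hypothetical helper, the arrays are toy stand-ins, and in practice the reconstructions would come from the trained VAE. Using mean absolute error as the score is also an assumption, chosen to match the loss the model is trained with.

```python
import numpy as np

def reconstruction_loss(images, reconstructions):
    """Mean absolute error per image, averaged over all pixel/channel axes."""
    diff = np.abs(images.astype(np.float64) - reconstructions.astype(np.float64))
    return diff.reshape(len(images), -1).mean(axis=1)

# Toy stand-ins for real data: two 2x2 single-channel "images" scaled to [0, 1].
images = np.array([[[0.2, 0.4], [0.6, 0.8]],
                   [[0.0, 0.0], [1.0, 1.0]]])
# Pretend the model reproduces the first image exactly and misses every pixel
# of the second image by 0.5.
reconstructions = np.array([[[0.2, 0.4], [0.6, 0.8]],
                            [[0.5, 0.5], [0.5, 0.5]]])
losses = reconstruction_loss(images, reconstructions)  # one loss value per image
```

Here `losses` holds one value per image, so each input can be scored individually rather than as a batch average.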

The autoencoder used here is a deep network with 1000 latent dimensions (chosen through experimentation), trained with the Mean Absolute Error loss function and the Adam optimizer. An input is classified as poisoned (an outlier) when its reconstruction loss exceeds a threshold chosen on the clean validation data: we take the mean of the loss and set the threshold one standard deviation above it.

Reconstruction_Loss_Threshold_X = Mean_X + S.T.D_X
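The threshold rule above can be sketched in a few lines of NumPy. The function names and the sample loss values are made up for illustration, and `np.std` computes the population standard deviation by default; whether the project uses the population or the sample estimate is not stated here.

```python
import numpy as np

def fit_threshold(clean_losses):
    """Threshold = mean + one standard deviation of the clean validation losses."""
    clean_losses = np.asarray(clean_losses, dtype=np.float64)
    return clean_losses.mean() + clean_losses.std()

def is_poisoned(losses, threshold):
    """Flag every input whose reconstruction loss exceeds the threshold."""
    return np.asarray(losses) > threshold

clean_losses = [0.07, 0.08, 0.09, 0.08]  # made-up values for illustration only
threshold = fit_threshold(clean_losses)
flags = is_poisoned([0.075, 0.20], threshold)  # one boolean per input
```

One consequence of this rule is that some clean images will also land above the threshold, since it sits only one standard deviation above the clean mean.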

This threshold comes out roughly in the range 0.08 to 0.10. It is lower than the reconstruction loss of every data set listed in the table below, most clearly for the sunglasses triggers.

Clean Data Threshold - 0.097 
Poisoned Sunglasses - 0.19903521 
Anonymous Data - 0.10617878  
Multi Trigger Sunglasses - 0.19900116 
Multi Trigger Eyebrows - 0.11117971  
Multi Trigger Lipstick - 0.106178075
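To make the comparison concrete, the short script below computes how far each reported loss sits above the clean-data threshold. It only uses the numbers listed above; the variable names are illustrative.

```python
# Values copied from the list above.
threshold = 0.097
reported_losses = {
    "Poisoned Sunglasses": 0.19903521,
    "Anonymous Data": 0.10617878,
    "Multi Trigger Sunglasses": 0.19900116,
    "Multi Trigger Eyebrows": 0.11117971,
    "Multi Trigger Lipstick": 0.106178075,
}

# Margin by which each reported loss exceeds the clean-data threshold.
margins = {name: loss - threshold for name, loss in reported_losses.items()}
for name, margin in sorted(margins.items(), key=lambda kv: kv[1]):
    print(f"{name:26s} loss - threshold = {margin:+.4f}")
```

Every margin is positive, but the two sunglasses sets exceed the threshold by about 0.10 while the other three exceed it by roughly 0.01, so the separation is much narrower for those triggers.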
