faceswap-GAN

A denoising autoencoder with adversarial losses and attention mechanisms for face swapping. It adds adversarial loss and perceptual loss (VGGFace) to the auto-encoder architecture from the reddit user deepfakes' project.

Updates

Date    Update
2018-07-04     GAN training: Add the relativistic discriminator as an alternative option to the default mixup training method. Set loss_config["gan_training"]="relativistic_avg_LSGAN" in config cells to enable it.
2018-06-29     Model architecture: faceswap-GAN v2.2 now supports different output resolutions: 64x64, 128x128, and 256x256. Default RESOLUTION = 64 can be changed in the config cell of v2.2 notebook.
2018-06-25     New version: faceswap-GAN v2.2 has been released. The main improvements of v2.2 model are its capability of generating realistic and consistent eye movements (results are shown below, or Ctrl+F for eyes), as well as higher video quality with face alignment.
2018-06-06     Model architecture: Add a self-attention mechanism proposed in SAGAN into the V2 GAN model. (Note: There is still no official code release for SAGAN, so the implementation in this repo could be wrong. We'll keep an eye on it.)
2018-03-17     Training: V2 model now provides a 40000-iter training schedule which automatically switches to proper loss functions at predefined iterations. (Cage/Trump dataset results)
2018-03-13     Model architecture: V2.1 model now provides 3 base architectures: (i) XGAN, (ii) VAE-GAN, and (iii) a variant of v2 GAN. See "4. Training Phase Configuration" in v2.1 notebook for detail.
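
For reference, here is a minimal sketch of how the options mentioned above might look in the notebook config cell. Only RESOLUTION and loss_config["gan_training"]="relativistic_avg_LSGAN" come from the updates; the dictionary layout and the assumed default value "mixup_LSGAN" are illustrative guesses, not the notebook's exact code.

    # Illustrative config-cell sketch (variable layout may differ in the actual notebook)
    RESOLUTION = 64  # v2.2 supports 64, 128, or 256

    loss_config = {}
    loss_config["gan_training"] = "mixup_LSGAN"  # assumed name of the default mixup option
    # loss_config["gan_training"] = "relativistic_avg_LSGAN"  # enables the relativistic discriminator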

Descriptions

faceswap-GAN v2.2 (Recommended model)

  • FaceSwap_GAN_v2.2_train_test.ipynb

    • Notebook for model training and video conversion of GAN model version 2.2.
    • Additional training images generated from prep_binary_masks.ipynb are required.
    • Supported output resolutions: 64x64, 128x128, and 256x256.
    • Face alignment using 5-point landmarks is introduced into video conversion. The quality of output videos should be greatly improved.
    • Not compatible with the _test_video_MTCNN notebook (compatibility will be added in future updates).
  • FaceSwap_GAN_v2.2_video_conversion.ipynb

    • Notebook for video conversion of GAN model version 2.2.
  • prep_binary_masks.ipynb

    • Notebook for training data preprocessing. It generates binary masks for each training image.
    • Requires the face_alignment package (a minimal masking sketch is shown after this list).
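
    The snippet below is a hedged sketch of the kind of preprocessing this notebook performs: detect 68 facial landmarks with face_alignment and fill their convex hull as a binary mask. The helper name and the convex-hull approach are illustrative assumptions, not the notebook's actual code.

    import cv2
    import numpy as np
    import face_alignment  # pip install face-alignment

    # 2D landmark detector; in newer face_alignment versions the enum is LandmarksType.TWO_D
    fa = face_alignment.FaceAlignment(face_alignment.LandmarksType._2D,
                                      flip_input=False, device="cuda")  # use "cpu" if no GPU

    def binary_face_mask(image_bgr):
        """Return a uint8 mask: 255 inside the face's landmark hull, 0 elsewhere."""
        landmarks = fa.get_landmarks(image_bgr[..., ::-1])  # detector expects RGB
        mask = np.zeros(image_bgr.shape[:2], dtype=np.uint8)
        if landmarks:  # None or empty list when no face is found
            hull = cv2.convexHull(landmarks[0].astype(np.int32))
            cv2.fillConvexPoly(mask, hull, 255)
        return mask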

faceswap-GAN v2

  • FaceSwap_GAN_v2_train.ipynb

    • Notebook for training the version 2 GAN model.
    • Video conversion functions are also included.
  • FaceSwap_GAN_v2_test_video_MTCNN.ipynb

    • Notebook for generating videos. Uses MTCNN for face detection.
  • faceswap_WGAN-GP_keras_github.ipynb

    • This notebook is an independent training script for a WGAN-GP model.
    • Perceptual loss is discarded for simplicity.
    • Not compatible with _test_video and _test_video_MTCNN notebooks above.
    • The WGAN-GP model gives results similar to the LSGAN model after a comparable number (~18k) of generator updates.
    • Training can be started as follows:
    gan = FaceSwapGAN() # instantiate the class
    gan.train(max_iters=10e4, save_interval=500) # start training
  • FaceSwap_GAN_v2_sz128_train.ipynb

    • This notebook is an independent script for a model with larger input/output resolution.
    • Not compatible with _test_video and _test_video_MTCNN notebooks above.
    • Input and output images have larger shape (128, 128, 3).
    • Introduces minor updates to the architecture (a toy sketch is shown after this list):
      1. Add instance normalization to the generators and discriminators.
      2. Add an additional regression loss (MAE loss) on the 64x64 branch output.
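
    A minimal toy sketch of these two updates in Keras: instance normalization (from keras-contrib; the import path can vary with the keras-contrib version) inside each conv block, plus an MAE loss on an auxiliary 64x64 output. Layer counts, the name toy_generator, and the loss weighting are assumptions for illustration, not the notebook's actual architecture.

    from keras.layers import Input, Conv2D, LeakyReLU, UpSampling2D
    from keras.models import Model
    from keras_contrib.layers import InstanceNormalization  # path may differ by keras-contrib version

    def conv_in_relu(x, filters):
        # Conv -> instance normalization -> LeakyReLU, per update (1) above
        x = Conv2D(filters, 3, strides=2, padding="same")(x)
        x = InstanceNormalization(axis=-1)(x)
        return LeakyReLU(0.2)(x)

    inp = Input(shape=(128, 128, 3))
    h = conv_in_relu(inp, 64)                      # 64x64 feature map
    h = conv_in_relu(h, 128)                       # 32x32 feature map
    out64 = Conv2D(3, 3, padding="same", activation="tanh",
                   name="branch_64")(UpSampling2D()(h))                      # auxiliary 64x64 output
    out128 = Conv2D(3, 3, padding="same", activation="tanh",
                    name="branch_128")(UpSampling2D()(UpSampling2D()(h)))    # main 128x128 output

    toy_generator = Model(inp, [out128, out64])
    toy_generator.compile(optimizer="adam",
                          loss=["mae", "mae"],      # MAE regression loss on both branches, per update (2)
                          loss_weights=[1.0, 0.5])  # weighting is an assumption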

Miscellaneous

Training data format

  • Face images should be placed in the ./faceA/ or ./faceB/ folder for each target, respectively.
  • Face images can be of any size.
  • For better generalization, source faces can also be from multiple people.
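
For example, a directory layout and loading snippet consistent with the format above (the file names and the glob-based loading are illustrative; the notebooks have their own data-loading cells):

    from glob import glob

    # ./faceA/ and ./faceB/ each hold face crops of one target; any image size is fine
    train_A = glob("./faceA/*.*")   # e.g. ./faceA/0001.jpg, ./faceA/0002.png, ...
    train_B = glob("./faceB/*.*")
    assert len(train_A) and len(train_B), "Put face images into ./faceA/ and ./faceB/ first."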

Generative adversarial networks for face swapping

1. Architecture

(Architecture diagrams: encoder, decoder, and discriminator networks.)

2. Results

  • Improved output quality: Adversarial loss improves reconstruction quality of generated images. (Figure: Trump/Cage comparison.)

  • Additional results: This image shows 160 random results generated by v2 GAN with self-attention mechanism (image format: source -> mask -> transformed).

  • Consistent eye movements (v2.2 model): Results of the v2.2 model, which specializes in eye directions, are presented below. The v2.2 model generates more realistic eyes within fewer iterations. (Input GIFs are created using DeepWarp.)

    • Top row: v2 model; bottom row: v2.2 model
    • (Figures: v2 vs. v2.2 eye-movement results.)
The Trump/Cage images are obtained from the reddit user deepfakes' project on pastebin.com.

3. Features

  • VGGFace perceptual loss: Perceptual loss makes the direction of the eyeballs more realistic and consistent with the input face. It also smooths out artifacts in the segmentation mask, resulting in higher output quality.

  • Attention mask: The model predicts an attention mask that helps with handling occlusion, eliminating artifacts around edges, and producing a natural skin tone. Below are results of transforming Hinako Sano (佐野ひなこ) to Emi Takei (武井咲).

    (Figures: mask1, mask2.)

    • From left to right: source face, swapped face (before masking), swapped face (after masking).

    (Figure: mask_vis.)

    • From left to right: source face, swapped face (after masking), mask heatmap.
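
    A hedged one-line sketch of how such a predicted attention mask is typically composited with the input face (the function and variable names are illustrative, not the repo's code):

    import numpy as np

    def apply_attention_mask(alpha, generated_face, input_face):
        # alpha: predicted mask in [0, 1], shape (H, W, 1); broadcasts over the color channels
        return alpha * generated_face + (1.0 - alpha) * input_face

    # toy usage with random stand-ins for the real tensors
    alpha = np.random.rand(64, 64, 1)
    out = apply_attention_mask(alpha, np.random.rand(64, 64, 3), np.random.rand(64, 64, 3))
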
  • Configurable input/output resolution: The model supports 64x64, 128x128, and 256x256 output resolutions.

  • Face tracking/alignment using MTCNN and Kalman filter during video conversion:

    • MTCNN provides more stable detections.
    • A Kalman filter is introduced to smooth face bounding box positions over frames and eliminate jitter on the swapped face (a minimal smoothing sketch is shown below).
    • Face alignment (FA) further stabilises the output results.

    (Figures: dlib vs. MTCNN detections; conversion with vs. without face alignment.)
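
    Below is a hedged sketch of one way to smooth a jittery point (e.g. a bounding-box corner) with OpenCV's Kalman filter; the constant-velocity model and the noise levels are assumptions, not the repo's actual tracking code.

    import cv2
    import numpy as np

    def make_point_smoother():
        """Constant-velocity Kalman filter for one 2D point (e.g. a bbox corner)."""
        kf = cv2.KalmanFilter(4, 2)  # state: x, y, vx, vy; measurement: x, y
        kf.transitionMatrix = np.array([[1, 0, 1, 0],
                                        [0, 1, 0, 1],
                                        [0, 0, 1, 0],
                                        [0, 0, 0, 1]], dtype=np.float32)
        kf.measurementMatrix = np.array([[1, 0, 0, 0],
                                         [0, 1, 0, 0]], dtype=np.float32)
        kf.processNoiseCov = np.eye(4, dtype=np.float32) * 1e-3   # noise levels are assumptions
        kf.measurementNoiseCov = np.eye(2, dtype=np.float32) * 1e-1
        kf.errorCovPost = np.eye(4, dtype=np.float32)
        return kf

    kf = make_point_smoother()
    for x, y in [(100, 120), (103, 119), (130, 150), (104, 121)]:  # jittery detections
        kf.predict()
        state = kf.correct(np.array([[x], [y]], dtype=np.float32))
        print(state[:2].ravel())  # smoothed corner position for this frame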

  • Training schedule: The V2 model provides a predefined training schedule. The Trump/Cage results above are generated by a model trained for 21k iterations using the TOTAL_ITERS = 30000 predefined training schedule.

    • Training trick: Swapping the decoders in the late stage of training reduces artifacts caused by extreme facial expressions. E.g., some of the failure cases above in which the mouth is opened wide are transformed better with this trick.

    (Figure: self-attention and decoder-swapping results.)

  • Eyes-aware training: Introduces a reconstruction loss and an edge loss around the eye area, which guide the model to generate realistic eyes (a hedged loss sketch is shown below).
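
    A minimal sketch of what such a loss could look like with the Keras backend: an L1 reconstruction term up-weighted inside a binary eye mask, plus an L1 term on finite-difference image gradients in the same region. The function name, the eye_weight value, and the gradient-based edge formulation are assumptions; the repo defines its own loss functions.

    from keras import backend as K

    def eyes_aware_loss(y_true, y_pred, eye_mask, eye_weight=10.0):
        """eye_mask: (batch, H, W, 1) binary mask covering the eye region."""
        weights = 1.0 + (eye_weight - 1.0) * eye_mask
        recon = K.mean(weights * K.abs(y_true - y_pred))  # up-weighted L1 reconstruction

        def grads(t):  # simple finite-difference "edges"
            return t[:, 1:, :, :] - t[:, :-1, :, :], t[:, :, 1:, :] - t[:, :, :-1, :]

        gy_t, gx_t = grads(y_true)
        gy_p, gx_p = grads(y_pred)
        edge = K.mean(eye_mask[:, 1:, :, :] * K.abs(gy_t - gy_p)) \
             + K.mean(eye_mask[:, :, 1:, :] * K.abs(gx_t - gx_p))
        return recon + edge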

4. Experimental models

  • V2.1 model: An improved architecture is introduced in order to stabilize training. The architecture is greatly inspired by XGAN and the MS-D neural network. (Note: the V2.1 script is experimental and not well-maintained.)
    • V2.1 model provides three base architectures: (i) XGAN, (ii) VAE-GAN, and (iii) a variant of v2 GAN. (default base_model="GAN")
    • FCN8s for face segmentation is introduced to improve masking in video conversion (default use_FCN_mask = True).
      • To enable this feature, the Keras weights file should be generated with the Jupyter notebook provided in this repo.

Frequently asked questions and troubleshooting

1. How does it work?

  • The following illustration shows a very high-level and abstract (but not exact) flowchart of the denoising autoencoder algorithm. The objective functions look like this. (Figure: flow_chart.)
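
    To make the flowchart concrete, here is a toy Keras sketch of the shared-encoder / two-decoder denoising autoencoder idea: each autoencoder learns to reconstruct clean faces of one person from distorted inputs, and swapping happens by decoding person A's code with person B's decoder. Layer sizes, the additive-noise stand-in for random warping, and all names are illustrative assumptions; the adversarial, perceptual, and mask losses described above are omitted.

    import numpy as np
    from keras.layers import Input, Conv2D, Dense, Flatten, Reshape, UpSampling2D, LeakyReLU
    from keras.models import Model
    from keras.optimizers import Adam

    IMG = (64, 64, 3)

    def build_encoder():
        inp = Input(shape=IMG)
        x = LeakyReLU(0.1)(Conv2D(32, 5, strides=2, padding="same")(inp))  # 32x32
        x = LeakyReLU(0.1)(Conv2D(64, 5, strides=2, padding="same")(x))    # 16x16
        code = Dense(256)(Flatten()(x))
        return Model(inp, code, name="shared_encoder")

    def build_decoder(name):
        code = Input(shape=(256,))
        x = Reshape((16, 16, 1))(Dense(16 * 16)(code))
        x = LeakyReLU(0.1)(Conv2D(64, 3, padding="same")(UpSampling2D()(x)))         # 32x32
        out = Conv2D(3, 3, padding="same", activation="sigmoid")(UpSampling2D()(x))  # 64x64
        return Model(code, out, name=name)

    encoder = build_encoder()
    decoder_A, decoder_B = build_decoder("decoder_A"), build_decoder("decoder_B")

    face = Input(shape=IMG)
    autoencoder_A = Model(face, decoder_A(encoder(face)))  # reconstructs person A
    autoencoder_B = Model(face, decoder_B(encoder(face)))  # reconstructs person B
    autoencoder_A.compile(Adam(1e-4), loss="mae")
    autoencoder_B.compile(Adam(1e-4), loss="mae")

    # One "denoising" step: inputs are corrupted faces, targets are the originals
    # (the real notebooks use random warping instead of additive noise).
    faces_A = np.random.rand(8, *IMG).astype("float32")  # stand-in for a real batch of A
    warped_A = np.clip(faces_A + 0.05 * np.random.randn(*faces_A.shape), 0, 1).astype("float32")
    autoencoder_A.train_on_batch(warped_A, faces_A)

    # Face swap at test time: encode a face of A, decode it with decoder_B.
    swapped = decoder_B.predict(encoder.predict(faces_A))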

2. No audio in output clips?

  • Set audio=True in the video making cell.
    from moviepy.editor import VideoFileClip  # moviepy is required for video read/write

    output = 'OUTPUT_VIDEO.mp4'
    clip1 = VideoFileClip("INPUT_VIDEO.mp4")
    clip = clip1.fl_image(process_video)  # process_video is the conversion function defined in the notebook
    %time clip.write_videofile(output, audio=True) # Set audio=True

3. Previews look good, but faces are not transformed in the output videos?

  • The model performs at its full potential when the input images contain less background.
    • Input images should preferably be preprocessed with face alignment methods.
    • (Figure: readme_note001.)

Requirements

Acknowledgments

Code borrows from tjwei, eriklindernoren, fchollet, keras-contrib, and reddit user deepfakes' project. The generative network is adapted from CycleGAN. Weights and scripts of MTCNN are from FaceNet. Illustrations are from irasutoya.
