Arbitrary-Style-Per-Model fast neural style transfer has shown great potential in the academic field. Although state-of-the-art algorithms achieve impressive visual quality and efficiency, they cannot handle the blank-leaving (or void) regions found in certain artworks (e.g. traditional Chinese paintings). Existing algorithms try to preserve the details of the image before and after transformation, yet in these artworks some regions are deliberately left blank.

This is my final year project, which uses a style attention map to learn the voidness information during the style transfer process. The main contributions are a novel self-attention algorithm that extracts the voidness information from the content and style images, and a novel style transfer module, guided by the attention mask, that swaps the style.
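To make the second contribution concrete, here is a minimal PyTorch sketch of one way an attention mask can guide an AdaIN-style statistic swap: the style statistics are matched separately for the salient and the void regions. This is an illustration under my own assumptions, not the exact module implemented in this repository; `f_c`/`f_s` stand for encoder feature maps and `m_c`/`m_s` for binary masks thresholded from the self-attention maps.

```python
import torch

def masked_mean_std(feat, mask, eps=1e-5):
    """Channel-wise mean/std of `feat` over the spatial positions where mask == 1.
    feat: (N, C, H, W); mask: (N, 1, H, W) with values in {0, 1}."""
    count = mask.sum(dim=(2, 3), keepdim=True).clamp(min=1.0)
    mean = (feat * mask).sum(dim=(2, 3), keepdim=True) / count
    var = ((feat - mean) ** 2 * mask).sum(dim=(2, 3), keepdim=True) / count
    return mean, (var + eps).sqrt()

def masked_adain(f_c, f_s, m_c, m_s):
    """AdaIN restricted to one region: re-normalise the content features inside
    m_c with the style statistics measured inside m_s."""
    c_mean, c_std = masked_mean_std(f_c, m_c)
    s_mean, s_std = masked_mean_std(f_s, m_s)
    return ((f_c - c_mean) / c_std * s_std + s_mean) * m_c

def attention_guided_transfer(f_c, f_s, m_c, m_s):
    """Salient content regions borrow statistics from salient style regions,
    void regions from void style regions; the two results are recombined."""
    salient = masked_adain(f_c, f_s, m_c, m_s)
    void = masked_adain(f_c, f_s, 1.0 - m_c, 1.0 - m_s)
    return salient + void
```

In the actual module the masks come from the learned self-attention maps; here they are simply assumed to be given.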
- Environment: Ubuntu 20.04, NVIDIA GeForce GTX 1080 Ti

  ```bash
  conda env create -f env.yml
  conda activate Sava
  ```
- Download the datasets
- Clone this repository

  ```bash
  git clone https://github.com/dehezhang2/Final_Year_Project.git
  cd Final_Year_Project
  ```
- Prepare your content image and style image: save the content image to `./testing_data/content` and the style image to `./testing_data/style`. Some example images are already provided in these two directories.
- Open the graphical user interface
- Choose the content and style images
- Click the `Start Transfer` button. The attention maps, attention masks, and the relative frequency maps of the content and style images will be visualised, and the output will be shown.
- You can find the transfer output and attention maps in `./testing_data/result`.
- Feel free to add more images to the `./testing_data/content/` and `./testing_data/style/` folders to explore the results!
- Clone this repository

  ```bash
  git clone https://github.com/dehezhang2/Final_Year_Project.git
  cd Final_Year_Project
  ```
- Download the training datasets and arrange the file structure as follows:
  - All content images should be in the directory `./training_data/content_set/val2014`
  - All style images should be in the directory `./training_data/style_set/val2014`
- Filter the images using the two preprocessing scripts

  ```bash
  cd ./codes/data_preprocess/
  python filter.py
  python filter_percentage.py
  ```
- We have two training phases:
  - Phase I training: train the self-attention module

    ```bash
    cd ./codes/transfer/
    python train_attn.py --dataset_dir ../../training_data/content_set/val2014
    ```

  - Phase II training: train the style transfer module

    ```bash
    python train_sava.py --content_dir ../../training_data/content_set/val2014 --style_dir ../../training_data/style_set/val2014 --save_dir ../../models/sava_training_hard
    ```
Here is a comparison of the self-attention map used in AAMS (a) and our result (b).

Some results on content-style pairs are shown below: (a) is our algorithm with attention masks, (b) is SA-Net.
Although this project makes two contributions to style transfer theory, it has several limitations:

- The rationale behind some settings cannot be well explained by theory:
  - the feature map projection method (ZCA for the attention map, AdaIN for the style transfer; see the generic sketch after this list), and
  - the method used to train the self-attention module (similar to AAMS).
- Computational resources were limited:
  - the VGG decoder may not be properly trained,
  - it is difficult to add an attention loss to match the statistics of the style and output attention maps, and
  - it is difficult to divide the attention map into more clusters.
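For context on the two projection methods mentioned above, the sketch below shows a generic ZCA whitening-colouring transform, the full-covariance counterpart of AdaIN (which matches only per-channel mean and variance). It is a self-contained illustration, not the code used in this repository; the function name and the per-image `(C, H, W)` layout are my own assumptions.

```python
import torch

def zca_colour(content_feat, style_feat, eps=1e-5):
    """ZCA whitening-colouring: remove the channel covariance of the content
    features, then re-colour them with the covariance of the style features.
    Both inputs are (C, H, W) feature maps of a single image."""
    C, H, W = content_feat.shape
    fc = content_feat.reshape(C, -1)
    fs = style_feat.reshape(C, -1)

    fc_mean, fs_mean = fc.mean(dim=1, keepdim=True), fs.mean(dim=1, keepdim=True)
    fc, fs = fc - fc_mean, fs - fs_mean
    eye = torch.eye(C, dtype=fc.dtype, device=fc.device)

    # Whitening: E diag(d^{-1/2}) E^T applied to the centred content features.
    d_c, E_c = torch.linalg.eigh(fc @ fc.t() / (fc.shape[1] - 1) + eps * eye)
    whitened = E_c @ torch.diag(d_c.clamp(min=eps).rsqrt()) @ E_c.t() @ fc

    # Colouring: E diag(d^{1/2}) E^T built from the style covariance.
    d_s, E_s = torch.linalg.eigh(fs @ fs.t() / (fs.shape[1] - 1) + eps * eye)
    coloured = E_s @ torch.diag(d_s.clamp(min=0).sqrt()) @ E_s.t() @ whitened

    return (coloured + fs_mean).reshape(C, H, W)
```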
- I am grateful to the authors of AAMS and SA-Net; this project benefits a lot from both their papers and their code.
- Thanks to Dr. Jing LIAO, who provided many insightful suggestions, such as the use of style attention, the soft correlation mask, and an attention loss to match the voidness statistics. I would also like to express my sincere appreciation to Kaiwen Xue, who contributed many intelligent ideas to this project and helped with part of the implementation.
If you have any questions or suggestions about this project, feel free to contact me by email at dehezhang2@gmail.com.
The code is released under the GPL-3.0 license.