This project trains a variational autoencoder (VAE) on two datasets: MNIST and the Anime Faces dataset. For MNIST, the images are first converted from grayscale to binary, and the model is trained with binary cross-entropy (BCE) as the reconstruction loss. The Anime Faces images are normalized to the range 0~1 and trained with mean squared error (MSE) as the reconstruction loss.
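The two losses differ only in the reconstruction term; both add the closed-form KL divergence between the encoder's Gaussian and the standard normal prior. A minimal sketch of that combined loss (`vae_loss` is a hypothetical helper, not a function from the training scripts; the `LAMBDA` weight is passed as `lam`):

```python
import torch
import torch.nn.functional as F

def vae_loss(recon, target, mu, logvar, lam=1.0, recon_loss="bce"):
    """Reconstruction term plus lam * KL divergence (hypothetical helper)."""
    if recon_loss == "bce":
        # MNIST path: pixels binarized to {0, 1}, so BCE is a natural fit
        rec = F.binary_cross_entropy(recon, target, reduction="sum")
    else:
        # Anime path: pixels normalized to [0, 1], reconstructed with MSE
        rec = F.mse_loss(recon, target, reduction="sum")
    # Closed-form KL divergence between N(mu, sigma^2) and N(0, I)
    kl = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())
    return rec + lam * kl
```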
Dataset | Fake Images | Interpolation between 4 latent codes |
---|---|---|
MNIST | | |
Anime | | |
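The interpolation column can be produced by bilinearly blending four latent codes across a grid and decoding each blended code. A sketch under that assumption (`interpolate_grid` is a hypothetical helper; the training scripts may build the grid differently):

```python
import torch

def interpolate_grid(corners, steps=8):
    """Bilinearly interpolate a steps x steps grid of latent codes from
    four corner codes of shape (4, lat_dim). Hypothetical helper."""
    tl, tr, bl, br = corners  # top-left, top-right, bottom-left, bottom-right
    ts = torch.linspace(0, 1, steps)
    rows = []
    for a in ts:  # vertical blend between top and bottom corners
        left = (1 - a) * tl + a * bl
        right = (1 - a) * tr + a * br
        # horizontal blend between the two column endpoints
        rows.append(torch.stack([(1 - b) * left + b * right for b in ts]))
    return torch.stack(rows)  # shape: (steps, steps, lat_dim)
```

Each code in the returned grid would then be passed through the decoder to render one tile of the interpolation image.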
MNIST dataset: https://storage.googleapis.com/tensorflow/tf-keras-datasets/mnist.npz
Anime Faces Dataset: https://www.kaggle.com/soumikrakshit/anime-faces
The root folder should be structured as follows:

```
📁 root/
├─ 📁 dataset/
│  ├─ 📚 mnist.npz
│  └─ 📚 archive.zip
├─ 📄 train_anime.py
└─ 📄 train_mnist.py
```
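With that layout, the two archives can be loaded roughly as below. This is a sketch, not the scripts' actual loading code: it assumes the Keras-format `mnist.npz` keys (`x_train` etc.) and that the Kaggle zip contains `.png` files.

```python
import io
import zipfile

import numpy as np
from PIL import Image

def load_mnist(path="dataset/mnist.npz", threshold=0.5):
    """Load the Keras-format MNIST archive and binarize it for BCE training."""
    with np.load(path) as data:
        x = data["x_train"].astype(np.float32) / 255.0
    return (x > threshold).astype(np.float32)  # grayscale -> binary {0, 1}

def load_anime(path="dataset/archive.zip"):
    """Yield anime-face images from the zip, normalized to [0, 1].
    Assumes .png entries; adjust if the archive layout differs."""
    with zipfile.ZipFile(path) as zf:
        for name in zf.namelist():
            if name.lower().endswith(".png"):
                img = Image.open(io.BytesIO(zf.read(name))).convert("RGB")
                yield np.asarray(img, dtype=np.float32) / 255.0
```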
Original Anime Dataset Source: https://github.com/bchao1/Anime-Face-Dataset
```
matplotlib==3.5.1
numpy==1.22.2
Pillow==9.0.1
torch==1.10.2+cu102
torchvision==0.11.3+cu102
tqdm==4.62.3
zipp==3.7.0
```
Run the following command to train with the MNIST dataset:

```
python train_mnist.py
```
Run the following command to train with the anime dataset:

```
python train_anime.py
```
By default, the scripts output training results and synthesized images to a `results` folder.
Global parameters can be adjusted at the top of each script:

```python
PATH_ZIP = "path/to/dataset.zip"
DIR_OUT = "output/image/directory"
EPOCHS         # number of training epochs
LR             # learning rate
BATCH_SIZE     # batch size
SPLIT_PERCENT  # percentage of the dataset (0~1) used for the train/test split
LOG_INT        # interval for outputting test images
LAMBDA         # Kullback-Leibler (KL) divergence multiplier λ
LAT_DIM        # latent space dimension size
```
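As one example of how `SPLIT_PERCENT` could drive the train/test split, a sketch using `torch.utils.data.random_split` (`split_dataset` is a hypothetical helper; the scripts may split differently):

```python
import torch
from torch.utils.data import TensorDataset, random_split

SPLIT_PERCENT = 0.9  # fraction of the dataset used for training

def split_dataset(dataset, split=SPLIT_PERCENT, seed=0):
    """Split a dataset into train/test subsets (hypothetical helper)."""
    n_train = int(len(dataset) * split)
    return random_split(
        dataset,
        [n_train, len(dataset) - n_train],
        # fixed seed so the split is reproducible across runs
        generator=torch.Generator().manual_seed(seed),
    )
```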