CelebA 64 #5

Closed
AlexZhurkevich opened this issue Sep 23, 2020 · 3 comments

@AlexZhurkevich

First of all I would like to say thank you for the great VAE implementation. Looking forward to Celeb256 training instructions!
I tried preprocessing CelebA 64 but got an error when executing create_celeba64_lmdb.py. The way the script obtains the dataset (dset.celeba.CelebA) downloads a corrupted img_align_celeba.zip. I was able to fix the problem by replacing 'dset.celeba.CelebA' with 'dset.CelebA', which, according to the official API docs, is the correct way to do it (https://pytorch.org/docs/stable/torchvision/datasets.html#celeba).

@arash-vahdat
Contributor

Did the error happen when you were running the actual training script or when you were running create_celeba64_lmdb.py?

Several users reported issues with the celeba_64 download as well. They noticed the issue during training, though:
#2

@AlexZhurkevich
Author

Yes, the error happened when I ran 'python create_celeba64_lmdb.py --split train --img_path $DATA_DIR/celeba_org --lmdb_path $DATA_DIR/celeba64_lmdb'. I simply did not find '.celeba.CelebA' in the PyTorch API, hence the replacement. I've checked issue #2; I don't get any NaNs, and training is going as expected for me. For training I used your default command and did not change anything, since I have 8 V100 32 GB GPUs, so there was no need to reduce the batch size.
Arash, thanks for the great work you've already done. I would like to ask you to consider explaining on the main page how the different train.py parameters affect each other, training, and results. For example, I would like to run NVAE on a 512x512 custom dataset, and I guess scaling the resolution alone is not enough. I understand that many of these are hyperparameters, so it's up to us to decide and test, but hints for at least some of the obvious ones would be nice. This is one of the reasons I am waiting for the CelebA HQ 256 training parameters; it would be very beneficial to see the differences from the CelebA 64 training.
Thanks!

@Lukelluke

> Yes, the error happened when I ran 'python create_celeba64_lmdb.py --split train --img_path $DATA_DIR/celeba_org --lmdb_path $DATA_DIR/celeba64_lmdb'. I simply did not find '.celeba.CelebA' in the PyTorch API, hence the replacement. I've checked issue #2; I don't get any NaNs, and training is going as expected for me. For training I used your default command and did not change anything, since I have 8 V100 32 GB GPUs, so there was no need to reduce the batch size.
> Arash, thanks for the great work you've already done. I would like to ask you to consider explaining on the main page how the different train.py parameters affect each other, training, and results. For example, I would like to run NVAE on a 512x512 custom dataset, and I guess scaling the resolution alone is not enough. I understand that many of these are hyperparameters, so it's up to us to decide and test, but hints for at least some of the obvious ones would be nice. This is one of the reasons I am waiting for the CelebA HQ 256 training parameters; it would be very beneficial to see the differences from the CelebA 64 training.
> Thanks!

Good question!

We are wondering about this too. Although I tried the HQ 256x256 dataset with almost the same hyperparameters and got good results within 5 epochs, I was confused about why Dr. @arash-vahdat didn't explain the suggested hyperparameters. :) lol

Hoping Dr. @arash-vahdat will have free time to chat with us about this issue. :) 👍
