CelebA 64 #5

Closed
AlexZhurkevich opened this issue Sep 23, 2020 · 3 comments

@AlexZhurkevich

First of all I would like to say thank you for the great VAE implementation. Looking forward to Celeb256 training instructions!
I tried preprocessing CelebA 64 but got an error when executing create_celeba64_lmdb.py. The way the script obtains the dataset (dset.celeba.CelebA) downloads a corrupted img_align_celeba.zip. I was able to fix the problem by replacing 'dset.celeba.CelebA' with 'dset.CelebA', which, according to the official API docs, is the correct way to do it (https://pytorch.org/docs/stable/torchvision/datasets.html#celeba).

@arash-vahdat
Contributor

Did the error happen when you were running the actual training script or when you were running create_celeba64_lmdb.py?

Several users reported issues with the celeba_64 download as well. They noticed the issue during training, though:
#2

@AlexZhurkevich
Author

Yes, the error happened when I ran 'python create_celeba64_lmdb.py --split train --img_path $DATA_DIR/celeba_org --lmdb_path $DATA_DIR/celeba64_lmdb'. I simply did not find '.celeba.CelebA' in the PyTorch API, hence the replacement. I've checked issue #2; I don't get any NaNs, and training is going as expected for me. For training I used your default command and did not change anything, since I have 8 V100 32 GB GPUs, so there was no need to reduce the batch size.
Arash, thanks for the great work you've already done. I would like to ask you to consider explaining on the main page how the different train.py parameters affect each other, training, and results. For example, I would like to run NVAE on a 512x512 custom dataset, and I guess scaling the resolution alone is not enough. I understand that many of these are hyperparameters, so it's up to us to decide and test, but hints for at least some of the obvious ones would be nice. This is one of the reasons I am waiting for the CelebA HQ 256 training parameters; it would be very beneficial to see the differences from the CelebA 64 training.
Thanks!

@Lukelluke

> Yes, the error happened when I ran 'python create_celeba64_lmdb.py --split train --img_path $DATA_DIR/celeba_org --lmdb_path $DATA_DIR/celeba64_lmdb'. I simply did not find '.celeba.CelebA' in the PyTorch API, hence the replacement. I've checked issue #2; I don't get any NaNs, and training is going as expected for me. For training I used your default command and did not change anything, since I have 8 V100 32 GB GPUs, so there was no need to reduce the batch size.
> Arash, thanks for the great work you've already done. I would like to ask you to consider explaining on the main page how the different train.py parameters affect each other, training, and results. For example, I would like to run NVAE on a 512x512 custom dataset, and I guess scaling the resolution alone is not enough. I understand that many of these are hyperparameters, so it's up to us to decide and test, but hints for at least some of the obvious ones would be nice. This is one of the reasons I am waiting for the CelebA HQ 256 training parameters; it would be very beneficial to see the differences from the CelebA 64 training.
> Thanks!

Good question!

We are wondering about this too. Although I tried the HQ 256x256 dataset with almost the same hyperparameters and got good results within 5 epochs, I was confused about why Dr. @arash-vahdat didn't explain the suggested hyperparameters. :) lol

Hoping Dr. @arash-vahdat will have free time to chat with us about this issue. :) 👍
