Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

FFHQ Training #40

Open
KomputerMaster64 opened this issue Sep 19, 2022 · 0 comments
Open

FFHQ Training #40

KomputerMaster64 opened this issue Sep 19, 2022 · 0 comments

Comments

@KomputerMaster64
Copy link

Thank you for sharing the implementation of the DDGAN model. I am trying to train the model on FFHQ 256x256 dataset. I used the NVLabs/NVAE repository for the dataset preparation. I have the file structure as follows:

image

To use another dataset similar to the CelebA-HQ 256x256, I modified the train function given in the line 190 of the train_ddgan.py file.

    elif args.dataset == 'ffhq_256':
        train_transform = transforms.Compose([
                transforms.Resize(args.image_size),
                transforms.RandomHorizontalFlip(),
                transforms.ToTensor(),
                transforms.Normalize((0.5,0.5,0.5), (0.5,0.5,0.5))
            ])
        dataset = LMDBDataset(root='/datasets/ffhq-lmdb/', name='ffhq', train=True, transform=train_transform)


My implementation for the DDGAN uses 4 NVIDIA GTX 1080ti GPUs with a total batch size of 32 for training the CelebA-HQ 256x256 dataset

(--batch_size 8 and --num_process_per_node 4)

I use the following command for training!python3 train_ddgan.py --dataset ffhq_256 --image_size 256 --exp ddgan_celebahq_exp1 --num_channels 3 --num_channels_dae 64 --ch_mult 1 1 2 2 4 4 --num_timesteps 2 --num_res_blocks 2 --batch_size 8 --num_epoch 800 --ngf 64 --embedding_type positional --use_ema --r1_gamma 2. --z_emb_dim 256 --lr_d 1e-4 --lr_g 2e-4 --lazy_reg 10 --num_process_per_node 4 --save_content



I am getting the following output message:

Node rank 0, local proc 0, global proc 0
Node rank 0, local proc 1, global proc 1
Node rank 0, local proc 2, global proc 2
Node rank 0, local proc 3, global proc 3
Process Process-4:
Traceback (most recent call last):
  File "/usr/local/apps/python-3.8.3/lib/python3.8/multiprocessing/process.py", line 315, in _bootstrap
    self.run()
  File "/usr/local/apps/python-3.8.3/lib/python3.8/multiprocessing/process.py", line 108, in run
    self._target(*self._args, **self._kwargs)
  File "train_ddgan.py", line 482, in init_processes
    fn(rank, gpu, args)
  File "train_ddgan.py", line 248, in train
    dataset = LMDBDataset(root='/datasets/ffhq-lmdb/', name='ffhq', train=True, transform=train_transform)
  File "/home/manisha.padala/gan/denoising-diffusion-gan/datasets_prep/lmdb_datasets.py", line 33, in __init__
    self.data_lmdb = lmdb.open(lmdb_path, readonly=True, max_readers=1,
lmdb.Error: /datasets/ffhq-lmdb/train.lmdb: No such file or directory
Process Process-2:
Traceback (most recent call last):
  File "/usr/local/apps/python-3.8.3/lib/python3.8/multiprocessing/process.py", line 315, in _bootstrap
    self.run()
  File "/usr/local/apps/python-3.8.3/lib/python3.8/multiprocessing/process.py", line 108, in run
    self._target(*self._args, **self._kwargs)
  File "train_ddgan.py", line 482, in init_processes
    fn(rank, gpu, args)
  File "train_ddgan.py", line 248, in train
    dataset = LMDBDataset(root='/datasets/ffhq-lmdb/', name='ffhq', train=True, transform=train_transform)
  File "/home/manisha.padala/gan/denoising-diffusion-gan/datasets_prep/lmdb_datasets.py", line 33, in __init__
    self.data_lmdb = lmdb.open(lmdb_path, readonly=True, max_readers=1,
lmdb.Error: /datasets/ffhq-lmdb/train.lmdb: No such file or directory
Process Process-1:
Traceback (most recent call last):
  File "/usr/local/apps/python-3.8.3/lib/python3.8/multiprocessing/process.py", line 315, in _bootstrap
    self.run()
  File "/usr/local/apps/python-3.8.3/lib/python3.8/multiprocessing/process.py", line 108, in run
    self._target(*self._args, **self._kwargs)
  File "train_ddgan.py", line 482, in init_processes
    fn(rank, gpu, args)
  File "train_ddgan.py", line 248, in train
    dataset = LMDBDataset(root='/datasets/ffhq-lmdb/', name='ffhq', train=True, transform=train_transform)
  File "/home/manisha.padala/gan/denoising-diffusion-gan/datasets_prep/lmdb_datasets.py", line 33, in __init__
    self.data_lmdb = lmdb.open(lmdb_path, readonly=True, max_readers=1,
lmdb.Error: /datasets/ffhq-lmdb/train.lmdb: No such file or directory
Process Process-3:
Traceback (most recent call last):
  File "/usr/local/apps/python-3.8.3/lib/python3.8/multiprocessing/process.py", line 315, in _bootstrap
    self.run()
  File "/usr/local/apps/python-3.8.3/lib/python3.8/multiprocessing/process.py", line 108, in run
    self._target(*self._args, **self._kwargs)
  File "train_ddgan.py", line 482, in init_processes
    fn(rank, gpu, args)
  File "train_ddgan.py", line 248, in train
    dataset = LMDBDataset(root='/datasets/ffhq-lmdb/', name='ffhq', train=True, transform=train_transform)
  File "/home/manisha.padala/gan/denoising-diffusion-gan/datasets_prep/lmdb_datasets.py", line 33, in __init__
    self.data_lmdb = lmdb.open(lmdb_path, readonly=True, max_readers=1,
lmdb.Error: /datasets/ffhq-lmdb/train.lmdb: No such file or directory
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant