About training grayscale images #635

rememberBr · 2024-02-02T04:56:50Z

Describe the bug
RuntimeError: output with shape [64, 1, 1, 1] doesn't match the broadcast shape [64, 3, 1, 1]

To Reproduce
Steps to reproduce the behavior:

1. In the 'stylegan3' directory, run the command 'python train.py --resume=https://api.ngc.nvidia.com/v2/models/nvidia/research/stylegan3/versions/1/files/stylegan3-r-ffhqu-256x256.pkl ....'
2. See the error.
Training options:
{
"G_kwargs": {
"class_name": "training.networks_stylegan3.Generator",
"z_dim": 512,
"w_dim": 512,
"mapping_kwargs": {
"num_layers": 2
},
"channel_base": 32768,
"channel_max": 1024,
"magnitude_ema_beta": 0.9977843871238888,
"conv_kernel": 1,
"use_radial_filters": true
},
"D_kwargs": {
"class_name": "training.networks_stylegan2.Discriminator",
"block_kwargs": {
"freeze_layers": 0
},
"mapping_kwargs": {},
"epilogue_kwargs": {
"mbstd_group_size": 4
},
"channel_base": 16384,
"channel_max": 512
},
"G_opt_kwargs": {
"class_name": "torch.optim.Adam",
"betas": [
0,
0.99
],
"eps": 1e-08,
"lr": 0.0025
},
"D_opt_kwargs": {
"class_name": "torch.optim.Adam",
"betas": [
0,
0.99
],
"eps": 1e-08,
"lr": 0.002
},
"loss_kwargs": {
"class_name": "training.loss.StyleGAN2Loss",
"r1_gamma": 2.0,
"blur_init_sigma": 0,
"blur_fade_kimg": 400.0
},
"data_loader_kwargs": {
"pin_memory": true,
"prefetch_factor": 2,
"num_workers": 3
},
"training_set_kwargs": {
"class_name": "training.dataset.ImageFolderDataset",
"path": "../Gan/data/data/256L",
"use_labels": false,
"max_size": 4813,
"xflip": true,
"resolution": 256,
"random_seed": 0
},
"num_gpus": 1,
"batch_size": 64,
"batch_gpu": 64,
"metrics": [],
"total_kimg": 25000,
"kimg_per_tick": 4,
"image_snapshot_ticks": 5,
"network_snapshot_ticks": 5,
"random_seed": 0,
"ema_kimg": 20.0,
"augment_kwargs": {
"class_name": "training.augment.AugmentPipe",
"xflip": 1,
"rotate90": 1,
"xint": 1,
"scale": 1,
"rotate": 1,
"aniso": 1,
"xfrac": 1,
"brightness": 1,
"contrast": 1,
"lumaflip": 1,
"hue": 1,
"saturation": 1
},
"ada_target": 0.6,
"resume_pkl": "https://api.ngc.nvidia.com/v2/models/nvidia/research/stylegan3/versions/1/files/stylegan3-r-ffhqu-256x256.pkl",
"ada_kimg": 100,
"ema_rampup": null,
"run_dir": "~/training-runs-rr/00000-stylegan3-r-256L-gpus1-batch64-gamma2"
}

Output directory: ~/training-runs-rr/00000-stylegan3-r-256L-gpus1-batch64-gamma2
Number of GPUs: 1
Batch size: 64 images
Training duration: 25000 kimg
Dataset path: ../Gan/data/data/256L
Dataset size: 4813 images
Dataset resolution: 256
Dataset labels: False
Dataset x-flips: True

Creating output directory...
Launching processes...
Loading training set...

Num images: 9626
Image shape: [1, 256, 256]
Label shape: [0]

Constructing networks...
Resuming from "https://api.ngc.nvidia.com/v2/models/nvidia/research/stylegan3/versions/1/files/stylegan3-r-ffhqu-256x256.pkl"
Traceback (most recent call last):
File "/home/bairu/workspace/DatasetGEN/styleGan3/train.py", line 288, in <module>
main() # pylint: disable=no-value-for-parameter
File "/home/bairu/miniconda3/envs/stylegan3/lib/python3.9/site-packages/click/core.py", line 1157, in __call__
return self.main(*args, **kwargs)
File "/home/bairu/miniconda3/envs/stylegan3/lib/python3.9/site-packages/click/core.py", line 1078, in main
rv = self.invoke(ctx)
File "/home/bairu/miniconda3/envs/stylegan3/lib/python3.9/site-packages/click/core.py", line 1434, in invoke
return ctx.invoke(self.callback, **ctx.params)
File "/home/bairu/miniconda3/envs/stylegan3/lib/python3.9/site-packages/click/core.py", line 783, in invoke
return callback(*args, **kwargs)
File "/home/bairu/workspace/DatasetGEN/styleGan3/train.py", line 283, in main
launch_training(c=c, desc=desc, outdir=opts.outdir, dry_run=opts.dry_run)
File "/home/bairu/workspace/DatasetGEN/styleGan3/train.py", line 98, in launch_training
subprocess_fn(rank=0, c=c, temp_dir=temp_dir)
File "/home/bairu/workspace/DatasetGEN/styleGan3/train.py", line 49, in subprocess_fn
training_loop.training_loop(rank=rank, **c)
File "/home/bairu/workspace/DatasetGEN/styleGan3/training/training_loop.py", line 164, in training_loop
misc.copy_params_and_buffers(resume_data[name], module, require_all=False)
File "/home/bairu/workspace/DatasetGEN/styleGan3/torch_utils/misc.py", line 162, in copy_params_and_buffers
tensor.copy_(src_tensors[name].detach()).requires_grad_(tensor.requires_grad)
RuntimeError: output with shape [64, 1, 1, 1] doesn't match the broadcast shape [64, 3, 1, 1]

Expected behavior
The target data I want to generate are single-channel grayscale images. When I use grayscale images for training, this error is raised.

Desktop (please complete the following information):

OS: Linux Ubuntu 20.04
PyTorch version: 1.9.0
CUDA toolkit version: 11.1

Additional context
Training works if I do not use a pre-trained model. Is this because the pre-trained model expects three-channel images? What should I do if I want to use a pre-trained model to train on single-channel images?

Neilstid · 2024-04-02T09:26:26Z

This cannot work because you are loading a model that was trained to generate RGB images (3 channels), so some of its weights do not have the same shape as the weights of a network built for 1-channel images.
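
To see which tensors are affected, the shapes in the checkpoint can be compared with the shapes in the freshly built network before anything is copied. The snippet below is only a minimal sketch: `find_shape_mismatches` is a hypothetical helper, and the tensor names and shapes are made up for illustration. With real PyTorch modules, the two dictionaries could be built with something like `{name: tuple(p.shape) for name, p in module.named_parameters()}`.

```python
def find_shape_mismatches(src_shapes, dst_shapes):
    """Return {name: (src_shape, dst_shape)} for tensors that exist in
    both dictionaries but have different shapes."""
    mismatches = {}
    for name, dst_shape in dst_shapes.items():
        src_shape = src_shapes.get(name)
        if src_shape is not None and tuple(src_shape) != tuple(dst_shape):
            mismatches[name] = (tuple(src_shape), tuple(dst_shape))
    return mismatches


# Hypothetical names and shapes, chosen to mirror the shapes in the error message.
checkpoint_shapes = {
    "first_conv.weight": (64, 3, 1, 1),  # checkpoint trained on RGB images
    "first_conv.bias": (64,),
}
network_shapes = {
    "first_conv.weight": (64, 1, 1, 1),  # network built for grayscale images
    "first_conv.bias": (64,),
}

mismatches = find_shape_mismatches(checkpoint_shapes, network_shapes)
for name, (src, dst) in mismatches.items():
    print(f"{name}: checkpoint {src} vs network {dst}")
```

Any tensor reported here cannot be copied as-is, which is what the RuntimeError in the traceback is complaining about.
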
In my opinion there are three solutions:
- Modify training_loop.py so that, after loading, the Generator outputs only one channel (either R, G, or B). This is not the easiest option, but I think it may work.
- Train your network from scratch (if your data has 1 channel, the generated images will have 1 channel too).
- Convert the input images to 3 channels (the same channel repeated 3 times after loading the image in dataset.py). You can then select one of the 3 channels as your grayscale image.
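
The third option can be sketched with NumPy as below. This is only an illustration, assuming the dataset yields channel-first uint8 arrays (the log above reports "Image shape: [1, 256, 256]"). `gray_to_rgb` and `rgb_to_gray` are hypothetical helper names, and where exactly to call them in dataset.py is left open.

```python
import numpy as np


def gray_to_rgb(image_chw):
    """Repeat a [1, H, W] image along the channel axis to get [3, H, W]."""
    if image_chw.shape[0] != 1:
        raise ValueError(f"expected 1 channel, got {image_chw.shape[0]}")
    return np.repeat(image_chw, 3, axis=0)


def rgb_to_gray(image_chw):
    """Keep only the first channel of a [3, H, W] image, as [1, H, W]."""
    return image_chw[:1]


# Tiny stand-in for a grayscale training image (channel-first layout).
gray = np.arange(16, dtype=np.uint8).reshape(1, 4, 4)
rgb = gray_to_rgb(gray)      # what the 3-channel network would be trained on
restored = rgb_to_gray(rgb)  # selecting one channel gives the original back

print(rgb.shape, restored.shape)
```

Note that the three channels of a generated image are not guaranteed to be exactly identical, so picking one channel (or averaging the three) is an approximation on the output side.
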

I hope it will help you :)
