
Error in custom dataset. #9

Open
nightandweather opened this issue Jun 7, 2023 · 1 comment

nightandweather commented Jun 7, 2023

Thank you for sharing the code!

After building a custom dataset, training the vqgan3d model, and then trying to run the ddpm model, I get the error below (a minimal reproduction of the failure mode follows the traceback):

[2023-06-07 15:09:43,745][torch.distributed.nn.jit.instantiator][INFO] - Created a temporary directory at /tmp/tmp_69yy41e
[2023-06-07 15:09:43,746][torch.distributed.nn.jit.instantiator][INFO] - Writing /tmp/tmp_69yy41e/_remote_module_non_scriptable.py
/home/airfmt/anaconda3/envs/dosepaint/lib/python3.9/site-packages/torchvision/models/_utils.py:208: UserWarning: The parameter 'pretrained' is deprecated since 0.13 and will be removed in 0.15, please use 'weights' instead.
  warnings.warn(
/home/airfmt/anaconda3/envs/dosepaint/lib/python3.9/site-packages/torchvision/models/_utils.py:223: UserWarning: Arguments other than a weight enum or `None` for 'weights' are deprecated since 0.13 and will be removed in 0.15. The current behavior is equivalent to passing `weights=VGG16_Weights.IMAGENET1K_V1`. You can also use `weights=VGG16_Weights.DEFAULT` to get the most up-to-date weights.
  warnings.warn(msg)
loaded pretrained LPIPS loss from /home/airfmt/DosePainting/src/vq_gan_3d/model/cache/vgg.pth
Error executing job with overrides: ['model=ddpm', 'dataset=Brain_TR_GammaKnife', 'model.results_folder_postfix=Brain_TR_GammaKnife_ddpm', 'model.vqgan_ckpt=/home/airfmt/DosePainting/src/checkpoints/vq_gan/Brain_TR_GammaKnife/lightning_logs/version_1/checkpoints/epoch\\=656-step\\=42000-train/recon_loss\\=0.11.ckpt', 'model.diffusion_img_size=32', 'model.diffusion_depth_size=32', 'model.diffusion_num_channels=8', 'model.dim_mults=[1,2,4,8]', 'model.batch_size=10', 'model.gpus=1']
Traceback (most recent call last):
  File "/home/airfmt/DosePainting/train/train_ddpm.py", line 54, in run
    trainer = Trainer(
  File "/home/airfmt/DosePainting/src/ddpm/diffusion.py", line 984, in __init__
    self.ema_model = copy.deepcopy(self.model)
  File "/home/airfmt/anaconda3/envs/dosepaint/lib/python3.9/copy.py", line 172, in deepcopy
    y = _reconstruct(x, memo, *rv)
  File "/home/airfmt/anaconda3/envs/dosepaint/lib/python3.9/copy.py", line 270, in _reconstruct
    state = deepcopy(state, memo)
  File "/home/airfmt/anaconda3/envs/dosepaint/lib/python3.9/copy.py", line 146, in deepcopy
    y = copier(x, memo)
  File "/home/airfmt/anaconda3/envs/dosepaint/lib/python3.9/copy.py", line 230, in _deepcopy_dict
    y[deepcopy(key, memo)] = deepcopy(value, memo)
  File "/home/airfmt/anaconda3/envs/dosepaint/lib/python3.9/copy.py", line 172, in deepcopy
    y = _reconstruct(x, memo, *rv)
  File "/home/airfmt/anaconda3/envs/dosepaint/lib/python3.9/copy.py", line 296, in _reconstruct
    value = deepcopy(value, memo)
...
    rv = reductor(4)
TypeError: cannot pickle '_thread.lock' object
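For context: copy.deepcopy raises exactly this TypeError whenever anything reachable from the copied object holds a raw _thread.lock, because locks support neither pickling nor copying. Here the lock presumably rides in on the loaded VQ-GAN checkpoint (e.g. a logger or trainer handle Lightning attached to it). A minimal reproduction with a hypothetical toy module, not the actual DosePainting classes:

import copy
import threading

import torch.nn as nn

class ToyModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Linear(4, 4)
        # stands in for whatever handle (logger, trainer, file lock)
        # the loaded VQ-GAN checkpoint keeps alive
        self._lock = threading.Lock()

try:
    copy.deepcopy(ToyModel())
except TypeError as e:
    print(e)  # -> cannot pickle '_thread.lock' object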
For reference, these are the exact commands I ran:

# vqgan
!PL_TORCH_DISTRIBUTED_BACKEND=gloo CUDA_VISIBLE_DEVICES=0 python train/train_vqgan.py dataset="Brain_TR_GammaKnife" model=vq_gan_3d model.gpus=1 model.precision=16 model.embedding_dim=8 model.n_hiddens=16 model.downsample=[2,2,2] model.num_workers=32 model.gradient_clip_val=1.0 model.lr=3e-4 model.discriminator_iter_start=10000 model.perceptual_weight=4 model.image_gan_weight=1 model.video_gan_weight=1 model.gan_feat_weight=4 model.batch_size=2 model.n_codes=16384 model.accumulate_grad_batches=1


#diffusion
!python train/train_ddpm.py model=ddpm dataset="Brain_TR_GammaKnife" model.results_folder_postfix='Brain_TR_GammaKnife_ddpm' model.vqgan_ckpt='/home/airfmt/DosePainting/src/checkpoints/vq_gan/Brain_TR_GammaKnife/lightning_logs/version_1/checkpoints/epoch\=656-step\=42000-train/recon_loss\=0.11.ckpt' model.diffusion_img_size=32 model.diffusion_depth_size=32 model.diffusion_num_channels=8 model.dim_mults=[1,2,4,8] model.batch_size=10 model.gpus=1
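A workaround that avoids the deepcopy entirely (a sketch on the same toy module, not a tested patch for diffusion.py): build the EMA copy as a fresh instance and transfer only the weights with load_state_dict, so the copy never touches the lock.

import copy
import threading

import torch
import torch.nn as nn

class ToyModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Linear(4, 4)
        self._lock = threading.Lock()  # the unpicklable attribute

model = ToyModel()

# instead of: ema_model = copy.deepcopy(model)   <- raises TypeError
ema_model = ToyModel()                           # fresh instance, fresh lock
ema_model.load_state_dict(model.state_dict())    # copies only the tensors

# sanity check: weights match, lock objects stay independent
assert torch.equal(ema_model.net.weight, model.net.weight)
assert ema_model._lock is not model._lock

Applied to src/ddpm/diffusion.py, that would mean re-constructing the model at line 984 and loading self.model.state_dict() into it instead of calling copy.deepcopy(self.model). Alternatively, setting whatever attribute holds the lock (often a Lightning trainer/logger handle on the loaded VQ-GAN) to None right before the deepcopy should also work.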
yichuan1998 commented


Hi, what do your dataset files look like? I can't figure out the right format.
