RuntimeError: Error(s) in loading state_dict for Mamba2DModel: size mismatch for additional_embed: copying a param with shape torch.Size([1, 1026, 1536]) from checkpoint, the shape in current model is torch.Size([1, 258, 1536]). #9
It seems that you loaded the weights of a model trained at 512 resolution. Since the message says `additional_embed` is `[1, 1026, 1536]` in the checkpoint but `[1, 258, 1536]` in the current model, the checkpoint's embedding covers 1026 tokens (32×32 patch tokens + 2) while the 256 config builds one for 258 tokens (16×16 patch tokens + 2), so the checkpoint comes from the 512 model.
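A quick way to confirm which resolution a checkpoint belongs to is to inspect the shape of `additional_embed` before building the model. A minimal sketch, assuming the checkpoint is a plain PyTorch state dict at the path used later in this thread:

```python
import torch

# Load only the weights, on CPU, without instantiating Mamba2DModel.
state_dict = torch.load(
    "workdir/imagenet256_H_DiM/default/ckpts/425000.ckpt/nnet.pth",
    map_location="cpu",
)
print(state_dict["additional_embed"].shape)
# torch.Size([1, 1026, 1536]) -> 512-resolution checkpoint (32*32 + 2 tokens)
# torch.Size([1, 258, 1536])  -> 256-resolution checkpoint (16*16 + 2 tokens)
```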
Here is my config, imagenet256_H_DiM.py (the paste is truncated here; only the `def d(**kwargs):` helper and the `def get_config():` header survive). I downloaded the checkpoint from https://drive.google.com/drive/folders/1TTEXKKhnJcEV9jeZbZYlXjiPyV87ZhE0?usp=sharing
The README says: "ImageNet 64x64: Put the standard ImageNet dataset (which contains the train and val directories) to assets/datasets/ImageNet." I have downloaded the ImageNet dataset and placed it at the prescribed path, but I have not processed it further. Is it necessary to preprocess the dataset into a 256x256 format, or does the program handle the formatting automatically?
There is no need to preprocess datasets whose images are smaller than …
The path of the image samples in our ImageNet dataset looks like …
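For reference, here is a quick sanity check that the folder matches the expected layout (assets/datasets/ImageNet/train/&lt;wnid&gt;/&lt;image&gt;.JPEG, as in the path quoted later in this thread); this snippet is a hypothetical helper, not part of the repo:

```python
import os

# Expected layout: assets/datasets/ImageNet/train/<wnid>/<image>.JPEG
root = "assets/datasets/ImageNet/train"
classes = sorted(os.listdir(root))
print(len(classes), "class folders")  # a full ImageNet train split has 1000

first = os.path.join(root, classes[0])
print(classes[0], "->", sorted(os.listdir(first))[:3])
```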
I'm sorry, I accidentally hit the edit button on your reply with the uploaded config. 😂 After reading it, I think the config you provided is correct. However, your checkpoint should not contain … Have you loaded the checkpoint correctly? Since others have succeeded (#8 (comment)), it is probably not a problem on my side.
CUDA_VISIBLE_DEVICES="0" python ./eval_ldm_discrete.py --config=configs/imagenet256_H_DiM.py --nnet_path='workdir/imagenet256_H_DiM/default/ckpts/425000.ckpt/nnet.pth'
Loading nnet.pth still fails. Are you sure the model you uploaded is correct? I've noticed that the filenames for the 256-resolution and 512-resolution models are identical. The config provided by the other user suggests they may have been using a model they trained themselves; I need to load the model that you trained.
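One way to see exactly which parameters disagree before `load_state_dict` raises is to diff the shapes by hand. A minimal sketch, assuming `nnet` is the Mamba2DModel instance built from the 256 config (as in eval_ldm_discrete.py):

```python
import torch

ckpt = torch.load(
    "workdir/imagenet256_H_DiM/default/ckpts/425000.ckpt/nnet.pth",
    map_location="cpu",
)
model_sd = nnet.state_dict()  # `nnet`: the model built from imagenet256_H_DiM.py

for key, value in ckpt.items():
    if key not in model_sd:
        print("unexpected key:", key)
    elif value.shape != model_sd[key].shape:
        print(f"size mismatch for {key}: "
              f"checkpoint {tuple(value.shape)} vs model {tuple(model_sd[key].shape)}")
```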
Could you please send me the trained model for 256 resolution? |
OK, I will upload my best 256 model later |
Once you have uploaded it, could you please provide me with a link, or privately send a copy to my email address HaiLi086@163.com? I am highly interested in your work and would greatly appreciate it!
Thank you for your appreciation! I will upload it to this repo and update this entry:

| Model | FID | Training Iters | Batch Size |
| -- | -- | -- | -- |
| ImageNet 256x256 (Huge/2) | 2.21 | 625K | 768 |
I previously downloaded it from here: ImageNet 256x256 (Huge/2), FID 2.40, 425K iterations, batch size 768.
If you have not prepared your dataset yet, you can modify this line of your config to …
This setting requires NO prepared dataset for evaluation.
I know. I will give you a new link.
How do I prepare the dataset? I'm unable to run the script scripts/extract_imagenet_feature.py properly: python scripts/extract_imagenet_feature.py. My dataset path is /home/lihao/DiM-DiffusionMamba/assets/datasets/ImageNet/train/n01440764/n01440764_18.JPEG; the images have been downloaded but not processed further. Where is the imagenet256_features folder supposed to come from?
Here is the best 256 model: https://drive.google.com/drive/folders/1ETllUm8Dpd8-vDHefQEXEWF9whdbyhL5?usp=sharing You can place the new checkpoint at the path … Then execute this (I just tested it, and it works well):
First, I do not use latent extraction for the 256 features in the configs of this open-source code.
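For context, in U-ViT-style codebases a script like scripts/extract_imagenet_feature.py typically pre-encodes every image with the Stable Diffusion autoencoder and writes the latents to an imagenet256_features folder; that is where the folder name comes from. The sketch below is a hypothetical illustration of that pattern, not this repo's exact code:

```python
import torch
from diffusers import AutoencoderKL

# Hypothetical illustration: encode 256x256 images into 32x32 latents,
# the kind of tensors an imagenet256_features folder would hold.
vae = AutoencoderKL.from_pretrained("stabilityai/sd-vae-ft-ema").eval()

@torch.no_grad()
def encode(images):
    # images: float tensor in [-1, 1], shape (B, 3, 256, 256)
    posterior = vae.encode(images).latent_dist
    return posterior.sample() * 0.18215  # (B, 4, 32, 32); SD scaling factor
```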
Thank you for your meticulous guidance; I have resolved all of my issues. |
OK, I will close the issue. |
[2024-06-27 08:23:37,448] [INFO] [real_accelerator.py:203:get_accelerator] Setting ds_accelerator to cuda (auto detect)
[WARNING] async_io requires the dev libaio .so object and headers but these were not found.
[WARNING] async_io: please install the libaio-dev package with apt
[WARNING] If libaio is already installed (perhaps from source), try setting the CFLAGS and LDFLAGS environment variables to where it can be found.
[WARNING] Please specify the CUTLASS repo directory as environment variable $CUTLASS_PATH
[WARNING] NVIDIA Inference is only supported on Ampere and newer architectures
[WARNING] sparse_attn requires a torch version >= 1.5 and < 2.0 but detected 2.1
[WARNING] using untested triton version (2.1.0), only 1.0.0 is known to be compatible
I0627 08:23:38.489170 136925989832512 eval_ldm_discrete.py:140] Process 0 using device: cuda
Counting ImageNet files from assets/datasets/ImageNet
Finish counting ImageNet files
Missing train samples: 1280444 < 1281167
1000 classes
cnt[:10]: tensor([1300., 1300., 1300., 1300., 1300., 1300., 1300., 1300., 1300., 1300.])
frac[:10]: [tensor(0.0010), tensor(0.0010), tensor(0.0010), tensor(0.0010), tensor(0.0010), tensor(0.0010), tensor(0.0010), tensor(0.0010), tensor(0.0010), tensor(0.0010)]
prepare the dataset for classifier free guidance with p_uncond=0.1
2024-06-27 08:23:41,511 - _cpp_lib.py - WARNING[XFORMERS]: xFormers can't load C++/CUDA extensions. xFormers was built for:
PyTorch 2.3.0+cu121 with CUDA 1201 (you have 2.1.1)
Python 3.9.19 (you have 3.9.19)
Please reinstall xformers (see https://github.com/facebookresearch/xformers#installing-xformers)
Memory-efficient attention, SwiGLU, sparse and more won't be available.
Set XFORMERS_MORE_DETAILS=1 for more details
2024-06-27 08:23:56,201 - eval_ldm_discrete.py - load nnet from workdir/imagenet256_H_DiM/default/ckpts/425000.ckpt/nnet_ema.pth
╭─────────────────────────────── Traceback (most recent call last) ────────────────────────────────╮
│ /home/lihao/DiM-DiffusionMamba/./eval_ldm_discrete.py:341 in <module> │
│ │
│ 338 │
│ 339 │
│ 340 if __name__ == "__main__": │
│ ❱ 341 │ app.run(main) │
│ 342 │
│ │
│ /home/lihao/anaconda3/envs/mamba-attn/lib/python3.9/site-packages/absl/app.py:308 in run │
│ │
│ 305 │ callback = _init_callbacks.popleft() │
│ 306 │ callback() │
│ 307 │ try: │
│ ❱ 308 │ _run_main(main, args) │
│ 309 │ except UsageError as error: │
│ 310 │ usage(shorthelp=True, detailed_error=error, exitcode=error.exitcode) │
│ 311 │ except: │
│ │
│ /home/lihao/anaconda3/envs/mamba-attn/lib/python3.9/site-packages/absl/app.py:254 in _run_main │
│ │
│ 251 │ atexit.register(profiler.print_stats) │
│ 252 │ sys.exit(profiler.runcall(main, argv)) │
│ 253 else: │
│ ❱ 254 │ sys.exit(main(argv)) │
│ 255 │
│ 256 │
│ 257 def call_exception_handlers(exception): │
│ │
│ /home/lihao/DiM-DiffusionMamba/./eval_ldm_discrete.py:337 in main │
│ │
│ 334 │ config = FLAGS.config │
│ 335 │ config.nnet_path = FLAGS.nnet_path │
│ 336 │ config.output_path = FLAGS.output_path │
│ ❱ 337 │ evaluate(config) │
│ 338 │
│ 339 │
│ 340 if __name__ == "__main__": │
│ │
│ /home/lihao/DiM-DiffusionMamba/./eval_ldm_discrete.py:156 in evaluate │
│ │
│ 153 │ nnet = accelerator.prepare(nnet) │
│ 154 │ logging.info(f'load nnet from {config.nnet_path}') │
│ 155 │ if (config.nnet_path is not None) and (config.sample.algorithm != 'dpm_solver_upsamp │
│ ❱ 156 │ │ accelerator.unwrap_model(nnet).load_state_dict(torch.load(config.nnet_path, map │
│ 157 │ else: │
│ 158 │ │ accelerator.unwrap_model(nnet) │
│ 159 │
│ │
│ /home/lihao/anaconda3/envs/mamba-attn/lib/python3.9/site-packages/torch/nn/modules/module.py:215 │
│ 2 in load_state_dict │
│ │
│ 2149 │ │ │ │ │ │ ', '.join(f'"{k}"' for k in missing_keys))) │
│ 2150 │ │ │
│ 2151 │ │ if len(error_msgs) > 0: │
│ ❱ 2152 │ │ │ raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format( │
│ 2153 │ │ │ │ │ │ │ self.__class__.__name__, "\n\t".join(error_msgs))) │
│ 2154 │ │ return _IncompatibleKeys(missing_keys, unexpected_keys) │
│ 2155 │
╰──────────────────────────────────────────────────────────────────────────────────────────────────╯
RuntimeError: Error(s) in loading state_dict for Mamba2DModel:
size mismatch for additional_embed: copying a param with shape torch.Size([1, 1026, 1536]) from checkpoint, the shape in current model is torch.Size([1, 258, 1536]).