
ImportError: cannot import name 'convert_splitbn_model' from 'timm.models' #23

Open
watertianyi opened this issue Aug 23, 2024 · 7 comments
Assignees
Labels
question Further information is requested

Comments

@watertianyi

Package Version


certifi 2024.7.4
charset-normalizer 3.3.2
filelock 3.15.4
fsspec 2024.6.1
fvcore 0.1.5.post20221221
huggingface-hub 0.24.6
idna 3.7
iopath 0.1.10
Jinja2 3.1.4
MarkupSafe 2.1.5
mpmath 1.3.0
networkx 3.1
numpy 1.24.4
nvidia-cublas-cu12 12.1.3.1
nvidia-cuda-cupti-cu12 12.1.105
nvidia-cuda-nvrtc-cu12 12.1.105
nvidia-cuda-runtime-cu12 12.1.105
nvidia-cudnn-cu12 9.1.0.70
nvidia-cufft-cu12 11.0.2.54
nvidia-curand-cu12 10.3.2.106
nvidia-cusolver-cu12 11.4.5.107
nvidia-cusparse-cu12 12.1.0.106
nvidia-nccl-cu12 2.20.5
nvidia-nvjitlink-cu12 12.6.20
nvidia-nvtx-cu12 12.1.105
packaging 24.1
pillow 10.4.0
pip 24.2
portalocker 2.10.1
PyYAML 6.0.2
requests 2.32.3
safetensors 0.4.4
setuptools 72.1.0
sympy 1.13.2
tabulate 0.9.0
termcolor 2.4.0
timm 1.0.8
torch 2.4.0
torchvision 0.19.0
tqdm 4.66.5
triton 3.0.0
typing_extensions 4.12.2
urllib3 2.2.2
wheel 0.43.0
yacs 0.1.8

@Lupin1998
Member

Hi, @goldwater668. Thanks for using MogaNet. Unfortunately, this error is likely caused by a newer version of timm itself. I suggest you search for (or open) this issue in the timm repo, or try an earlier version of timm (e.g., 0.6.11).
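One way to apply that suggestion (a sketch: 0.6.11 is the version the maintainer names; the `timm.layers` location in newer releases is an assumption worth verifying against your installed version):

```shell
# Option 1: pin an earlier timm that still re-exports convert_splitbn_model
# from timm.models (0.6.11, as suggested above).
pip install "timm==0.6.11"

# Option 2 (assumption): in newer timm releases the helper is believed to have
# moved under timm.layers, so patching MogaNet's import may also work:
#   from timm.layers import convert_splitbn_model
```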

@Lupin1998 Lupin1998 self-assigned this Aug 23, 2024
@Lupin1998 Lupin1998 added the question Further information is requested label Aug 23, 2024
@watertianyi
Author

watertianyi commented Aug 23, 2024

CUDA_VISIBLE_DEVICES=0 python validate.py \
    --model moganet_xtiny \
    --img_size 224 \
    --amp \
    --batch_size 4 \
    --crop_pct 0.9 \
    --data_dir /val_NG_single_result \
    --dataset val_NG_Qiepian_all \
    --checkpoint MogaNet/save_results/model_best.pth.tar \
    --num_classes 2 \
    --results_file MogaNet/0823_val.csv

Test: [ 0/3761] Time: 15.820s (15.820s, 0.25/s) Loss: 0.1131 (0.1131) Acc@1: 100.000 (100.000) Acc@5: 100.000 (100.000)
Test: [ 10/3761] Time: 0.036s (1.471s, 2.72/s) Loss: 0.1017 (0.1100) Acc@1: 100.000 (100.000) Acc@5: 100.000 (100.000)
Test: [ 20/3761] Time: 0.037s (0.787s, 5.08/s) Loss: 0.1102 (0.1100) Acc@1: 100.000 (100.000) Acc@5: 100.000 (100.000)
Test: [ 30/3761] Time: 0.036s (0.545s, 7.34/s) Loss: 0.1129 (0.1105) Acc@1: 100.000 (100.000) Acc@5: 100.000 (100.000)
Test: [ 40/3761] Time: 0.037s (0.421s, 9.50/s) Loss: 0.1063 (0.1100) Acc@1: 100.000 (100.000) Acc@5: 100.000 (100.000)
Test: [ 50/3761] Time: 0.035s (0.346s, 11.57/s) Loss: 0.1151 (0.1103) Acc@1: 100.000 (100.000) Acc@5: 100.000 (100.000)
Test: [ 60/3761] Time: 0.036s (0.295s, 13.56/s) Loss: 0.1061 (0.1098) Acc@1: 100.000 (100.000) Acc@5: 100.000 (100.000)
Test: [ 70/3761] Time: 0.036s (0.259s, 15.47/s) Loss: 0.1130 (0.1099) Acc@1: 100.000 (100.000) Acc@5: 100.000 (100.000)
Test: [ 80/3761] Time: 0.036s (0.231s, 17.31/s) Loss: 0.1202 (0.1104) Acc@1: 100.000 (100.000) Acc@5: 100.000 (100.000)
Test: [ 90/3761] Time: 0.037s (0.210s, 19.09/s) Loss: 0.1126 (0.1107) Acc@1: 100.000 (100.000) Acc@5: 100.000 (100.000)
Test: [ 100/3761] Time: 0.036s (0.192s, 20.80/s) Loss: 0.1031 (0.1106) Acc@1: 100.000 (100.000) Acc@5: 100.000 (100.000)
Test: [ 110/3761] Time: 0.038s (0.178s, 22.45/s) Loss: 0.1022 (0.1105) Acc@1: 100.000 (100.000) Acc@5: 100.000 (100.000)
Test: [ 120/3761] Time: 0.035s (0.167s, 24.02/s) Loss: 0.1063 (0.1103) Acc@1: 100.000 (100.000) Acc@5: 100.000 (100.000)
Test: [ 130/3761] Time: 0.035s (0.157s, 25.54/s) Loss: 0.1071 (0.1102) Acc@1: 100.000 (100.000) Acc@5: 100.000 (100.000)
Test: [ 140/3761] Time: 0.037s (0.148s, 27.02/s) Loss: 0.1011 (0.1101) Acc@1: 100.000 (100.000) Acc@5: 100.000 (100.000)
Test: [ 150/3761] Time: 0.036s (0.141s, 28.45/s) Loss: 0.1165 (0.1105) Acc@1: 100.000 (100.000) Acc@5: 100.000 (100.000)
Test: [ 160/3761] Time: 0.036s (0.134s, 29.83/s) Loss: 0.1087 (0.1106) Acc@1: 100.000 (100.000) Acc@5: 100.000 (100.000)
Test: [ 170/3761] Time: 0.034s (0.128s, 31.17/s) Loss: 0.1178 (0.1111) Acc@1: 100.000 (100.000) Acc@5: 100.000 (100.000)
Test: [ 180/3761] Time: 0.036s (0.123s, 32.48/s) Loss: 0.1040 (0.1109) Acc@1: 100.000 (100.000) Acc@5: 100.000 (100.000)
Test: [ 190/3761] Time: 0.035s (0.119s, 33.73/s) Loss: 0.1123 (0.1105) Acc@1: 100.000 (100.000) Acc@5: 100.000 (100.000)
Test: [ 200/3761] Time: 0.036s (0.114s, 34.94/s) Loss: 0.1131 (0.1103) Acc@1: 100.000 (100.000) Acc@5: 100.000 (100.000)
Test: [ 210/3761] Time: 0.036s (0.111s, 36.10/s) Loss: 2.2324 (0.1930) Acc@1: 0.000 ( 96.090) Acc@5: 100.000 (100.000)
Test: [ 220/3761] Time: 0.034s (0.107s, 37.26/s) Loss: 2.1895 (0.2869) Acc@1: 0.000 ( 91.742) Acc@5: 100.000 (100.000)
Test: [ 230/3761] Time: 0.036s (0.104s, 38.37/s) Loss: 2.2363 (0.3717) Acc@1: 0.000 ( 87.771) Acc@5: 100.000 (100.000)
Test: [ 240/3761] Time: 0.035s (0.101s, 39.45/s) Loss: 2.2461 (0.4487) Acc@1: 0.000 ( 84.129) Acc@5: 100.000 (100.000)
Test: [ 250/3761] Time: 0.035s (0.099s, 40.50/s) Loss: 2.4180 (0.5219) Acc@1: 0.000 ( 80.777) Acc@5: 100.000 (100.000)
Test: [ 260/3761] Time: 0.035s (0.096s, 41.53/s) Loss: 2.2832 (0.5877) Acc@1: 0.000 ( 77.682) Acc@5: 100.000 (100.000)
Test: [ 270/3761] Time: 0.034s (0.094s, 42.53/s) Loss: 2.2188 (0.6490) Acc@1: 0.000 ( 74.815) Acc@5: 100.000 (100.000)
Test: [ 280/3761] Time: 0.036s (0.092s, 43.49/s) Loss: 2.2812 (0.7058) Acc@1: 0.000 ( 72.153) Acc@5: 100.000 (100.000)
Test: [ 290/3761] Time: 0.034s (0.090s, 44.44/s) Loss: 2.1133 (0.7576) Acc@1: 0.000 ( 69.674) Acc@5: 100.000 (100.000)
Test: [ 300/3761] Time: 0.037s (0.088s, 45.35/s) Loss: 2.2461 (0.8075) Acc@1: 0.000 ( 67.359) Acc@5: 100.000 (100.000)
../aten/src/ATen/native/cuda/Loss.cu:250: nll_loss_forward_reduce_cuda_kernel_2d: block: [0,0,0], thread: [0,0,0] Assertion t >= 0 && t < n_classes failed.
../aten/src/ATen/native/cuda/Loss.cu:250: nll_loss_forward_reduce_cuda_kernel_2d: block: [0,0,0], thread: [1,0,0] Assertion t >= 0 && t < n_classes failed.
../aten/src/ATen/native/cuda/Loss.cu:250: nll_loss_forward_reduce_cuda_kernel_2d: block: [0,0,0], thread: [2,0,0] Assertion t >= 0 && t < n_classes failed.
../aten/src/ATen/native/cuda/Loss.cu:250: nll_loss_forward_reduce_cuda_kernel_2d: block: [0,0,0], thread: [3,0,0] Assertion t >= 0 && t < n_classes failed.
return torch._C._nn.cross_entropy_loss(input, target, weight, _Reduction.get_enum(reduction), ignore_index, label_smoothing)
RuntimeError: CUDA error: device-side assert triggered
Compile with TORCH_USE_CUDA_DSA to enable device-side assertions.
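Because CUDA kernels launch asynchronously, the Python traceback above points at the loss call rather than the real failure site. Re-running with synchronous launches (a standard PyTorch debugging flag, not something specific to MogaNet) surfaces the exact offending operation:

```shell
# CUDA_LAUNCH_BLOCKING=1 forces each kernel to finish before the next Python
# line runs, so the device-side assert is raised at the line that caused it.
CUDA_LAUNCH_BLOCKING=1 CUDA_VISIBLE_DEVICES=0 python validate.py \
    --model moganet_xtiny --num_classes 2 \
    --data_dir /val_NG_single_result --dataset val_NG_Qiepian_all \
    --checkpoint MogaNet/save_results/model_best.pth.tar
```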

@Lupin1998
Member

Please check your dataset. The class count you provided (--num_classes) doesn't match the labels in your data: the nll_loss assertion `t >= 0 && t < n_classes` means some target label is outside the range [0, num_classes).

@watertianyi
Author

Could you be more specific? Do the dataset directory structure and naming need to be annotated separately?

@Lupin1998
Member

If you are using a custom dataset, please check these issues yourself. If you are using ImageNet, please set --num_classes 1000.

@watertianyi
Author

@Lupin1998 If I have already trained on 2 classes and now want to add a third class, can I add data for the third class and fine-tune?
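This question gets no reply in the thread, but a common approach is to keep the trained backbone and enlarge only the classification head before fine-tuning on the 3-class data. A minimal sketch (`expand_classifier` is a hypothetical helper; MogaNet's actual head attribute may differ, so adapt it to your model):

```python
import torch
import torch.nn as nn

def expand_classifier(head: nn.Linear, new_num_classes: int) -> nn.Linear:
    # Copy the trained 2-class weights into a larger head so fine-tuning
    # starts from the old classes' parameters; the new class row stays
    # randomly initialized.
    old_out, in_features = head.weight.shape
    new_head = nn.Linear(in_features, new_num_classes)
    with torch.no_grad():
        new_head.weight[:old_out] = head.weight
        new_head.bias[:old_out] = head.bias
    return new_head

# Usage sketch (attribute name assumed):
#   model.head = expand_classifier(model.head, 3)
#   ...then fine-tune with --num_classes 3
```

Fine-tuning with a lower learning rate on the backbone than on the new head is a common choice here, since only the new class row is untrained.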

@watertianyi
Author

@Lupin1998
Can this network be combined with a feature pyramid network (FPN) to handle images at different scales?
