-
Notifications
You must be signed in to change notification settings - Fork 71
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
training problem #40
Comments
Thanks for your interest in our work! I'm not entirely sure what's causing this problem. I think it might be due to the mismatch of PyTorch and CUDA versions. If you have any further questions, please feel free to add comments to this issue. |
Thanks for your reply.I will check these versions of my NVIDIA driver and
CUDA.
Yaoyao Liu ***@***.***> 于2022年3月18日周五 16:10写道:
… Thanks for your interest in our work!
I'm not entirely sure what's causing this problem. I think it might be due
to the mismatch of PyTorch and CUDA versions.
My NVIDIA driver version is 460.84, and my CUDA version is 11.2. I hope
this information might help you.
If you have any further questions, please feel free to add comments to
this issue.
—
Reply to this email directly, view it on GitHub
<#40 (comment)>,
or unsubscribe
<https://github.com/notifications/unsubscribe-auth/ANGANBH5BIEMWC4D7YX74ULVAQ267ANCNFSM5RBB6MPA>
.
Triage notifications on the go with GitHub Mobile for iOS
<https://apps.apple.com/app/apple-store/id1477376905?ct=notification-email&mt=8&pt=524675>
or Android
<https://play.google.com/store/apps/details?id=com.github.android&referrer=utm_campaign%3Dnotification-email%26utm_medium%3Demail%26utm_source%3Dgithub>.
You are receiving this because you authored the thread.Message ID:
***@***.***>
|
Hi, I got the same problem here. Do you find out how to fix it? Very appreciated for your help. |
I don't have this issue when I running the code. So, could you please provide your GPU info, PyTorch version, and CUDA version? Thanks a lot! |
Hi, thank you for your reply. My ubuntu version is 20.04, driver version 470.103.01, cuda version: 11.4. Very appreciated that. |
Are you using PyTorch 1.2.0 and Python 3.6? |
Yes, I follow the exact version of them. |
Thanks for providing this information. Currently, I cannot reproduce this issue on my system. Thus, I don't have a solution to this issue. I am very sorry about it. I will keep you posted if I find some new solution. If you find some solutions, you may also post them here. I am truly grateful for it. |
Sure, Thank you for your help. |
Hello, Professor! I have the following problem when running the code on win11. Can you explain what they mean and how to solve the problems? (my graph memory is 8GB) Thank you very much!
python main.py --nb_cl_fg=50 --nb_cl=10 --gpu=0 --random_seed=1993 --baseline=lucir --branch_mode=dual --branch_1=ss --branch_2=free --dataset=cifar100
Namespace(K=2, base_lr1=0.1, base_lr2=0.1, baseline='lucir', branch_1='ss', branch_2='free', branch_mode='dual', ckpt_dir_fg='-', ckpt_label='exp01', custom_momentum=0.9, custom_weight_decay=0.0005, data_dir=
'data/seed_1993_subset_100_imagenet/data', dataset='cifar100', disable_gpu_occupancy=True, dist=0.5, dynamic_budget=False, epochs=160, eval_batch_size=128, fusion_lr=1e-08, gpu='0', icarl_T=2, icarl_beta=0.25
, imgnet_backbone='resnet18', lr_factor=0.1, lw_mr=1, nb_cl=10, nb_cl_fg=50, nb_protos=20, num_classes=100, num_workers=1, random_seed=1993, resume=False, resume_fg=False, test_batch_size=100, the_lambda=5, train_batch_size=128)
Using gpu: 0
Total memory: 8192, used memory: 829
Occupy GPU memory in advance.
Files already downloaded and verified
Files already downloaded and verified
Order name:./logs/cifar100_nfg50_ncls10_nproto20_lucir_dual_b1ss_b2free_fixed_exp01\seed_1993_cifar100_order.pkl
Loading the saved class order
[68, 56, 78, 8, 23, 84, 90, 65, 74, 76, 40, 89, 3, 92, 55, 9, 26, 80, 43, 38, 58, 70, 77, 1, 85, 19, 17, 50, 28, 53, 13, 81, 45, 82, 6, 59, 83, 16, 15, 44, 91, 41, 72, 60, 79, 52, 20, 10, 31, 54, 37, 95, 14, 71, 96, 98, 97, 2, 64, 66, 42, 22, 35, 86, 24, 34, 87, 21, 99, 0, 88, 27, 18, 94, 11, 12, 47, 25, 30, 46, 62, 69, 36, 61, 7, 63, 75, 5, 32, 4, 51, 48, 73, 93, 39, 67, 29, 49, 57, 33]
Feature: 64 Class: 50
Setting the dataloaders ...
Check point name: ./logs/cifar100_nfg50_ncls10_nproto20_lucir_dual_b1ss_b2free_fixed_exp01\iter_4_b1.pth
Epoch: 0, learning rate: 0.1
Traceback (most recent call last):
File "main.py", line 88, in
trainer.train()
File "E:\AlgSpace\pycharm\AANets\trainer\trainer.py", line 171, in train
cur_lambda, self.args.dist, self.args.K, self.args.lw_mr)
File "E:\AlgSpace\pycharm\AANets\trainer\zeroth_phase.py", line 63, in incremental_train_and_eval_zeroth_phase
outputs = b1_model(inputs)
File "E:\Anaconda\envs\aanets\lib\site-packages\torch\nn\modules\module.py", line 547, in call
result = self.forward(*input, **kwargs)
File "E:\AlgSpace\pycharm\AANets\models\modified_resnet_cifar.py", line 109, in forward
x = self.fc(x)
File "E:\Anaconda\envs\aanets\lib\site-packages\torch\nn\modules\module.py", line 547, in call
result = self.forward(*input, **kwargs)
File "E:\AlgSpace\pycharm\AANets\models\modified_linear.py", line 37, in forward
F.normalize(self.weight, p=2, dim=1))
File "E:\Anaconda\envs\aanets\lib\site-packages\torch\nn\functional.py", line 1371, in linear
output = input.matmul(weight.t())
RuntimeError: CUDA error: CUBLAS_STATUS_EXECUTION_FAILED when calling
cublasSgemm( handle, opa, opb, m, n, k, &alpha, a, lda, b, ldb, &beta, c, ldc)
The text was updated successfully, but these errors were encountered: