
RuntimeError: CUDA error: CUBLAS_STATUS_NOT_INITIALIZED when calling cublasCreate(handle) #6263

Closed
mt324010 opened this issue Aug 5, 2020 · 20 comments


@mt324010

mt324010 commented Aug 5, 2020

Hi, I tried to add some other embeddings in your BertEmbedding source code and then load the pretrained weights 'bert-base-chinese'.
When I run the forward method, I get:
'RuntimeError: CUDA error: CUBLAS_STATUS_NOT_INITIALIZED when calling cublasCreate(handle)'
Can someone help, please? Thanks a lot.

mt324010 closed this as completed Aug 5, 2020
@vdabravolski

I'm getting the same issue. Curious, how did you solve yours? @mt324010

@mt324010
Author

> I'm getting the same issue. Curious, how did you solve yours? @mt324010

I was using another embedding, and an index was out of the valid range. I think it's better to double-check your code for that.

@vdabravolski

Yeah, you're right. I had a problem with the tokenizer length in my case.

@hudaAlamri

hudaAlamri commented Aug 25, 2020

I am getting the same error. I am trying to update the token_type_embeddings to have 4 types instead of 2:

model.config.type_vocab_size = 4
token_embed = nn.Embedding(model.config.type_vocab_size, model.config.hidden_size)
token_embed.weight.data.uniform_(-1, 1)
model.bert.embeddings.token_type_embeddings = token_embed

@vdabravolski as for the tokenizer, I added special tokens, updated the tokenizer length, and resized the model's token embeddings:

tokenizer.add_special_tokens(SPECIAL_TOKENS_DICT)
model.resize_token_embeddings(len(tokenizer))

@danyaljj
Contributor

> I had a problem with the tokenizer length in my case.

Could you elaborate on this?

@manalabssas

manalabssas commented Oct 27, 2020

Try removing/deleting the cached .lock files and run again
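
If you want to script that, here is a minimal sketch; the ~/.cache/huggingface path is just the usual default (adjust it if you set HF_HOME or TRANSFORMERS_CACHE), not something taken from this thread.

```python
# Delete stale .lock files under the (assumed) default Hugging Face cache directory.
from pathlib import Path

cache_dir = Path.home() / ".cache" / "huggingface"
for lock_file in cache_dir.rglob("*.lock"):
    print(f"removing {lock_file}")
    lock_file.unlink()
```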

@tonytan48

I think one possible reason is that the token_type_id for your padding token is out of range. Say you have four extra token_type_ids; then 'pad', 'cls', and 'unk' may follow your tokenizer's settings. BERT uses a large id for pad (100-something), so if your token_type_embedding is initialized with only 4 classes, it will result in a similar error. So you might increase your token type vocabulary to cover the special tokens and manually set them to 0, 1, 2, etc. Hope it helps.
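
To make that concrete, here is a minimal sketch (the model name, sizes, and ids are placeholders, not taken from this thread): it resizes the token type embedding and then checks that every token_type_id stays inside the new range.

```python
# A minimal sketch of the suggestion above, assuming a Hugging Face BertModel.
import torch
from torch import nn
from transformers import BertModel

model = BertModel.from_pretrained("bert-base-chinese")

num_token_types = 4  # segment types you actually use, including any special ones
model.config.type_vocab_size = num_token_types
new_embed = nn.Embedding(num_token_types, model.config.hidden_size)
new_embed.weight.data.normal_(mean=0.0, std=model.config.initializer_range)
model.embeddings.token_type_embeddings = new_embed

# Every value in token_type_ids must now fall inside [0, num_token_types - 1];
# an id outside that range is what surfaces as the cuBLAS error on the GPU.
token_type_ids = torch.tensor([[0, 1, 2, 3, 0, 0]])
assert int(token_type_ids.max()) < num_token_types
```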

@joawar
Contributor

joawar commented Jan 14, 2021

I had not given my model the vocab size of my tokenizer when I initialized it, which gave me this error. Running the model on the CPU (as suggested in #3090) gave me a better error message that let me figure this out, so that's a more general tip if you get this error.
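
A minimal sketch of that tip (the checkpoint name is just a placeholder): on the CPU, an out-of-range embedding index raises a readable IndexError instead of the opaque cuBLAS error.

```python
# Run the same forward pass on the CPU to get a clearer error message.
from transformers import BertModel, BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertModel.from_pretrained("bert-base-uncased").to("cpu")

inputs = tokenizer("a quick sanity check", return_tensors="pt")
outputs = model(**inputs)  # a bad index fails here with a readable message

# Also worth checking that the model's vocab covers the tokenizer's:
assert model.config.vocab_size >= len(tokenizer)
```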

@Kevinweisl

> Try removing/deleting the cached .lock files and run again

Thanks @manalabssas, I was getting the same issue. I deleted all the cache files, and it worked. Thanks for sharing.

@keloemma

keloemma commented Oct 1, 2021

> Try removing/deleting the cached .lock files and run again

> Thanks @manalabssas, I was getting the same issue. I deleted all the cache files, and it worked. Thanks for sharing.

Hello, how did you delete all the cache files? I am getting the same problem.

@minji-o-j

I changed return_token_type_ids from True to False in the tokenizer:

return_token_type_ids=False
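
A minimal sketch of that workaround (the checkpoint and text are placeholders): without token_type_ids in the inputs, the model falls back to all-zero segment ids, so the token type embedding is never indexed out of range.

```python
from transformers import BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-chinese")
inputs = tokenizer("随便一句话", return_tensors="pt", return_token_type_ids=False)
print(inputs.keys())  # no 'token_type_ids' key in the encoded inputs
```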

@jhillhouse92

> I had not given my model the vocab size of my tokenizer when I initialized it, which gave me this error. Running the model on the CPU (as suggested in #3090) gave me a better error message that let me figure this out, so that's a more general tip if you get this error.

This helped me solve my issue. I had initialized the tokenizer and the model from different pretrained checkpoints (e.g. from_pretrained('bert-large-uncased') for one and from_pretrained('bert-large-cased') for the other).
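
A minimal sketch of that fix ("bert-large-uncased" is just an example name): load the tokenizer and the model from the same checkpoint so their vocabularies agree.

```python
from transformers import BertModel, BertTokenizer

checkpoint = "bert-large-uncased"  # use one checkpoint name for both objects
tokenizer = BertTokenizer.from_pretrained(checkpoint)
model = BertModel.from_pretrained(checkpoint)
```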

@tonywenuon

> I think one possible reason is that the token_type_id for your padding token is out of range. Say you have four extra token_type_ids; then 'pad', 'cls', and 'unk' may follow your tokenizer's settings. BERT uses a large id for pad (100-something), so if your token_type_embedding is initialized with only 4 classes, it will result in a similar error. So you might increase your token type vocabulary to cover the special tokens and manually set them to 0, 1, 2, etc. Hope it helps.

Yes, this was my case. I got it solved.

@wei-ann-Github

> I think one possible reason is that the token_type_id for your padding token is out of range. Say you have four extra token_type_ids; then 'pad', 'cls', and 'unk' may follow your tokenizer's settings. BERT uses a large id for pad (100-something), so if your token_type_embedding is initialized with only 4 classes, it will result in a similar error. So you might increase your token type vocabulary to cover the special tokens and manually set them to 0, 1, 2, etc. Hope it helps.

> Yes, this was my case. I got it solved.

Hi @tonywenuon, may I know how you increased your token type vocabulary?

@Charles-Zhong

> Try removing/deleting the cached .lock files and run again

Very useful!

@jsjy1

jsjy1 commented Sep 27, 2022

I solved it by reducing the batch_size.

@stribizhev

In my case, I had to use device="cuda:8" to select a GPU other than the default device 0.
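
A minimal sketch of that workaround (the device index and checkpoint are placeholders): place the model and the inputs on an explicitly chosen GPU instead of the default cuda:0.

```python
import torch
from transformers import BertModel, BertTokenizer

device = torch.device("cuda:1")  # any index other than the problematic default
tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertModel.from_pretrained("bert-base-uncased").to(device)

inputs = tokenizer("hello world", return_tensors="pt").to(device)
outputs = model(**inputs)
```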

@charlesxu90

I had the same error, but later I found it was because the CUDA driver hadn't loaded as expected. Restarting the OS resolved the problem.

@tornikeo

In my case, a simple notebook restart helped for some odd reason.

@Snail1502

> I solved it by reducing the batch_size.

Thanks!
