CUDA Error during CharacterEmbeddings #421
Thanks for reporting :) I think this could be fixed as follows.

Current code:

```python
# chars for rnn processing
chars = torch.LongTensor(tokens_mask)
chars = chars.to(flair.device)
character_embeddings = self.char_embedding(chars).transpose(0, 1)
```

Fix:

```python
# chars for rnn processing
chars = torch.LongTensor(tokens_mask)
chars = chars.to(flair.device)
chars = chars.detach().cpu()  # <-- added
character_embeddings = self.char_embedding(chars).transpose(0, 1)
```
Thank you. It solves the problem.
Could you also try this fix?

```python
# chars for rnn processing
chars = torch.LongTensor(tokens_mask)
chars = chars.cpu()  # <-- added (don't detach!)
character_embeddings = self.char_embedding(chars).transpose(0, 1)
```

I think that if you detach the tensor, the gradients cannot flow into the character model during training, so the character features are never trained, i.e. they would stay random. If you do not detach, as in the code above, training will be slower, but this way the character features are always trained on the downstream task, as proposed by Lample et al., 2016.
It seemed to be fixed, but now I get the same error while executing the real program. (I tested it both with and without `.detach()`: same error.)
I think the same error occurred when I tried the BERT tutorial example.
Hello @lisette-garciamoya, what do you mean by "executing the real program"?
Hi @lisette-garciamoya - I was able to track down where the error comes from. The original code of the CharacterEmbeddings class is in fact correct. However, when you instantiate CharacterEmbeddings, by default it lives on CPU. The ModelTrainer then moves it to GPU, which is why training works. But if you instantiate CharacterEmbeddings yourself, it stays on CPU even on a GPU machine, which causes the error.

For now, the simplest fix is this:

```python
from flair.data import Sentence
from flair.embeddings import CharacterEmbeddings

sentence = Sentence('La casa es muy bonita.', use_tokenizer=True)

embedding = CharacterEmbeddings()
embedding = embedding.cuda()  # add this line to put the embeddings on CUDA

embedding.embed(sentence)

for token in sentence:
    print(token)
    print(token.embedding)
```

Could you test if this works for you? We will also set up a PR that fixes this behavior: the default should be that embeddings are instantiated on cuda if available. Thanks for finding this error and reporting it!
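The "instantiate on cuda if available" default mentioned above can be sketched in plain PyTorch (the names below are illustrative, not flair's actual internals):

```python
import torch

# Pick cuda when available, fall back to cpu otherwise, and move the
# module there immediately, so the caller never has to do it manually.
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
char_embedding = torch.nn.Embedding(100, 25).to(device)

# Inputs must live on the same device as the weights; a mismatch here
# is exactly the kind of CUDA error reported in this issue.
ids = torch.tensor([1, 2, 3], device=device)
vectors = char_embedding(ids)
print(vectors.shape)  # torch.Size([3, 25])
```

Applying `.to(device)` at construction time means embedding lookups work on both CPU-only and GPU machines without any extra user code.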
Should be fixed by the latest PR. Feel free to reopen if there are still issues! Thanks again for reporting the error!
For me, I used:

```python
# init embedding
flair_embedding_forward = FlairEmbeddings('news-forward')

# create a sentence
sentence = Sentence('The grass is green .')

# embed words in sentence
x = bert_embedding.embed(sentence)

for token in sentence:
```
My code
Errors
Environment: