Can I use BERT / gpt-2 for text generation #2311
Comments
You could do something like this when using gpt2
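For instance, a minimal sketch along these lines, using the example sentence from the question as the prompt (the model name, prompt, and value of k are just illustrative):

```python
# Minimal sketch (illustrative): top-k next-token probabilities with GPT-2
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

input_ids = tokenizer.encode("I put the glass on the", return_tensors="pt")

with torch.no_grad():
    lm_logits = model(input_ids)[0]        # shape: (1, seq_len, 50257)

next_token_logits = lm_logits[0, -1, :]    # scores for the next token
probs = torch.softmax(next_token_logits, dim=-1)
top_values, top_indices = torch.topk(probs, k=10)

for value, index in zip(top_values.tolist(), top_indices.tolist()):
    print(f"{tokenizer.decode([index])!r}: {value:.4f}")
```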
Yes, it is possible. You need to take the top-k of lm_logits (it will be output[0] in the case of GPT), which essentially gives you scores over the 50257-entry vocabulary. Taking the top k gives you indices and values: the values are your scores (0.8, 0.1, ...), and the indices correspond to the 50257 vocabulary entries, which you can decode using tokenizer.decode.
@patrickvonplaten Amazing, thanks!
Since GPT-2's output is based on byte-pair-encoding tokens and not on words, you would have to define your own vocabulary. Having defined your vocabulary, I would simply calculate the probability for each word using the above procedure and then sort the tensor.
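As a rough sketch of that procedure, assuming each candidate word is scored as the product of its BPE-token probabilities given the context (the word list and helper name are illustrative, not from the thread):

```python
# Sketch (illustrative): score an explicit word vocabulary with GPT-2 by
# multiplying the probabilities of each word's BPE tokens given the context
import math
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

context = "I put the glass on the"
vocabulary = ["desk", "table", "car", "shirt"]   # your own word list

def word_probability(context, word):
    context_ids = tokenizer.encode(context)
    word_ids = tokenizer.encode(" " + word)      # leading space matters for GPT-2's BPE
    input_ids = torch.tensor([context_ids + word_ids])
    with torch.no_grad():
        lm_logits = model(input_ids)[0]          # (1, seq_len, 50257)
    log_probs = torch.log_softmax(lm_logits, dim=-1)
    # The logits at position t predict the token at position t + 1.
    total = 0.0
    for i, token_id in enumerate(word_ids):
        total += log_probs[0, len(context_ids) + i - 1, token_id].item()
    return math.exp(total)

scores = {w: word_probability(context, w) for w in vocabulary}
normalizer = sum(scores.values())
for word, p in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(f"{word}: {p / normalizer:.4f}")       # normalized over the defined vocabulary
```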
@patrickvonplaten Thanks, do you think it will be possible to do it for all (or at least most) of the words in English on my personal Mac?
Yeah, I think that should definitely be feasible. So if you have a vocabulary of, say, 300,000 words, I'd estimate that you would have to compute around 200,000 forward passes. You can estimate how long a forward pass takes by averaging the computation time over 100 runs of calculating the probability for the word 'desk'. Concerning memory, there should not be a problem.
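If it helps, a quick way to get that estimate, reusing the illustrative word_probability helper from the sketch above:

```python
# Rough timing estimate (illustrative), reusing the word_probability sketch above
import time

runs = 100
start = time.time()
for _ in range(runs):
    word_probability("I put the glass on the", "desk")
per_word = (time.time() - start) / runs
print(f"~{per_word:.3f} s per word, ~{per_word * 300_000 / 3600:.1f} h for 300,000 words")
```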
And the final vector giving the probabilities over your defined vocabulary should be normalized to make a probability distribution.
@patrickvonplaten You mean using softmax?
I was thinking of just normalizing the scores directly, but you could also use softmax again - depends on what you want and what works better for you!
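For instance, assuming the scores are already non-negative per-word values (the numbers below are just illustrative), plain normalization divides by the sum, while softmax exponentiates first:

```python
# Illustrative: two ways to turn raw per-word scores into a distribution
import torch

scores = torch.tensor([0.8, 0.1, 0.05, 0.001])   # e.g. unnormalized word probabilities

normalized = scores / scores.sum()               # plain normalization: divide by the sum
softmaxed = torch.softmax(scores, dim=-1)        # softmax alternative mentioned above
```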
@patrickvonplaten Is it possible with a pre-trained BERT model?
You might take a look at masked language modeling :-) https://huggingface.co/transformers/usage.html#masked-language-modeling
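For example, roughly along the lines of the linked usage docs (the model name and sentence are illustrative):

```python
# Sketch (illustrative): top completions with a masked language model (BERT)
from transformers import pipeline

fill_mask = pipeline("fill-mask", model="bert-base-uncased")
results = fill_mask("I put the glass on the [MASK].")

for r in results:
    print(f"{r['token_str']}: {r['score']:.3f}")
```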
@patrickvonplaten Nice! Thanks for the pointer!
❓ Questions & Help
I want to get a list of possible completions and their probabilities.
For example,
For the sentence "I put the glass on the _"
I want to get a vector of words with their probabilities from a pre-trained model, such as:
desk = 0.1
table = 0.2
car = 0.05
shirt = 0.001
Is that possible?