how to output sentence's probability? #96
Same question here! @tsungruihon did you find a solution? |
@wolflo no I haven't, it's still a work in progress. |
@tsungruihon I calculate the likelihood of an input sentence by summing the log-probabilities output by the model for each word of the input sentence. It looks like this:
|
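[Editor's note: the code block from this comment was not preserved. Below is a minimal sketch of the idea, with a toy bigram lookup table standing in for the trained LSTM; `bigram` and `sentence_logprob` are illustrative names, not the repo's API.]

```python
import math

# Toy stand-in for the trained language model:
# P(next word | previous word) as a lookup table.
bigram = {
    ("the", "cat"): 0.2,
    ("cat", "sat"): 0.5,
}

def sentence_logprob(words):
    """Sum the log-probabilities the model assigns to each word,
    conditioned on the word that precedes it."""
    total = 0.0
    for prev, w in zip(words, words[1:]):
        total += math.log(bigram[(prev, w)])
    return total

score = sentence_logprob(["the", "cat", "sat"])  # log(0.2) + log(0.5)
```

Note that, as written, this scores every word given its predecessors but assigns nothing to the first word of the sentence, which is exactly what the discussion below turns on.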
@wolflo thanks my friend! Nice work! |
Hi, thanks @wolflo ! One thing is confusing me - does this also take into account the probability of the first token in the sentence? (i.e., the probability the model assigns to the first token when in the state given by model.initHidden?) |
Hi @gailweiss , an approximation of the log probability of the first token in the sentence should be given by |
Hi @wolflo, thanks for the quick response! But isn't it available more directly: isn't the distribution the model outputs after reading an `<eos>` token exactly the distribution over the first token of the next sentence? |
Oh, I see! That's an excellent remark. Then, I think you could rewrite the above scoring function as:
What do you think? |
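[Editor's note: the rewritten scoring function was also lost from this comment. A sketch under the same toy stand-in as above, now also scoring the first word from the state after `<eos>`; all names are illustrative, not the repo's API.]

```python
import math

# Toy stand-in for the trained language model, now including
# transitions out of the <eos> tag that separates sentences.
bigram = {
    ("<eos>", "the"): 0.4,
    ("the", "cat"): 0.2,
    ("cat", "sat"): 0.5,
}

def sentence_logprob(words):
    """Score every word, including the first one: the context starts
    at <eos>, so P(first word | <eos>) is counted too."""
    total = 0.0
    prev = "<eos>"
    for w in words:
        total += math.log(bigram[(prev, w)])
        prev = w
    return total

score = sentence_logprob(["the", "cat", "sat"])
```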
This seems to make sense :) thank you for taking the time to get into this! I assume/hope the way the models here are trained, one sequence begins after the `<eos>` of the previous, i.e. I hope that the training in this repository also trains the distribution after `<eos>`. But at any rate this is a consistent solution and it's just a question of whether the model optimises appropriately, which is something else. Thank you! |
Indeed, training in this repo is performed over a long tensor representing the concatenation of all the sentences of the corpus, separated by the `<eos>` tag. Thank you for pointing out this issue! |
Hi @wolflo, thanks for the code. I have one issue related to next-word prediction: given a word and the previous hidden state, we could try to predict the next most probable word according to the softmax probability distribution. Did you try to do this with your function?
Maybe you have run into this issue before? Thanks. |
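[Editor's note: a minimal sketch of the greedy next-word prediction described in this question, with a toy distribution standing in for the model's softmax output; `next_word_probs` and `predict_next` are illustrative names, not part of the repo.]

```python
# Toy softmax output: P(next word | current word), standing in for
# the distribution the model produces from a word and hidden state.
next_word_probs = {
    "the": {"cat": 0.6, "dog": 0.3, "sat": 0.1},
}

def predict_next(word):
    """Greedy decoding: pick the word with the highest probability."""
    dist = next_word_probs[word]
    return max(dist, key=dist.get)

prediction = predict_next("the")  # "cat"
```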
I tried sentence generation some time ago with the awd-lstm model trained on wikitext-2. Results were pretty poor for me too. You might improve generation quality by adjusting the temperature, by using some tricks like beam search or by training the model on bigger datasets. Unfortunately, I do not have time to dig further into this right now. Should I work on this in the future, I will let you know ! Have a good day :) |
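[Editor's note: a sketch of the temperature adjustment mentioned above; `softmax_with_temperature` is an illustrative helper applied to raw model scores (logits), not a function from the repo.]

```python
import math

def softmax_with_temperature(logits, temperature=1.0):
    """Temperature < 1 sharpens the distribution (closer to greedy);
    temperature > 1 flattens it (more diverse samples)."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in scaled]
    z = sum(exps)
    return [e / z for e in exps]

logits = [2.0, 1.0, 0.5]
sharp = softmax_with_temperature(logits, temperature=0.5)
flat = softmax_with_temperature(logits, temperature=2.0)
```

Sampling from the sharpened distribution concentrates mass on the model's top choices, which often improves fluency at the cost of diversity.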
May I ask how to use awd-lstm-lm to output a sentence's probability?