I was trying to reproduce the results by running your code, but couldn't get exactly the same precision on SQuAD.
Here is what I got for bert_large model on SQuAD:
all_samples: 303
list_of_results: 303
global MRR: 0.3018861233236291
global Precision at 10: 0.5676567656765676
global Precision at 1: 0.16831683168316833
However, in the paper, the table shows that there should be 305 samples and the precision should be 17.4%.
At first, I guessed that 2 samples were excluded because their object labels are outside the common vocabulary, but even after testing without the common vocabulary, I got global Precision at 1: 0.1704918, which still differs from the result in the paper.
Is there a way to reproduce the same results as in the paper?
Please correct me if I made any mistakes! Thanks!
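As a quick sanity check on the numbers above, the reported fractions can be converted back into counts of correct predictions. This is only a minimal sketch: it assumes Precision at 1 is simply correct / total, and that the run without the common vocabulary used all 305 samples (inferred from the fraction, not from the logs). The helper name `implied_correct` is made up for illustration and is not part of the repository.

```python
def implied_correct(precision, n_samples):
    """Nearest whole number of correct top-1 predictions implied by a P@1 value.

    Assumes P@1 = correct / n_samples (an assumption about the metric).
    """
    return round(precision * n_samples)


# (reported P@1, assumed number of samples) for each setting mentioned above
runs = {
    "with common vocab": (0.16831683168316833, 303),
    "without common vocab": (0.1704918, 305),   # 305 is an assumption
    "paper (17.4%)": (0.174, 305),
}

for name, (p, n) in runs.items():
    k = implied_correct(p, n)
    print(f"{name}: about {k} correct out of {n} -> {k / n:.7f}")
```

Under these assumptions the counts come out to 51, 52, and 53, so the run without the common vocabulary would be a single correct prediction short of the paper's figure.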
Strange.
I just re-executed the run_experiments scripts and I get P@1: 0.1737704918032787 for the BERT-large model. Are you using BERT-large?
Also, the script should use all the 305 examples.
This is what your output should look like: