Hello,
This is really great work, and the latency is noticeably better than that of other models.
I do have a few questions, though, and was hoping you could give me some insight.
When I tokenize with the GPT-2 tokenizer, the mention positions in the AIDA data you are using are usually my positions + 1 (except when a mention starts at the first word of the text, in which case the start is unchanged).
[Image: example_tokenizer]
Example from the aida_test_dataset:
Notice that the JAPAN position in the list should be [4,6] instead of [5,7], and that the Rugby Union mention should be [0,5] instead of [0,6]. Is that expected?
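To make the pattern concrete, here is a minimal sketch of the check I have in mind. The helper names are hypothetical and not part of the repo, and the only thing it encodes is the +1 rule I observed on these two examples; it does not explain why the offset exists.

```python
from typing import List, Sequence


def expected_dataset_span(own_span: Sequence[int]) -> List[int]:
    """Apply the observed +1 pattern to a span from my own tokenization.

    Assumption (from the two examples above only): the start stays 0 when
    the mention opens the text; every other position is shifted by one.
    """
    start, end = own_span
    shifted_start = 0 if start == 0 else start + 1
    return [shifted_start, end + 1]


def follows_plus_one_pattern(own_span: Sequence[int],
                             dataset_span: Sequence[int]) -> bool:
    """Return True if the dataset span equals my span under the +1 pattern."""
    return expected_dataset_span(own_span) == list(dataset_span)


# The two mentions quoted above: my positions vs. the positions in the data.
print(follows_plus_one_pattern([4, 6], [5, 7]))  # JAPAN
print(follows_plus_one_pattern([0, 5], [0, 6]))  # Rugby Union
```

Running a check like this over every mention would show whether the offset is systematic or limited to a few samples.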
Also, if it's not too much to ask, could you explain what happens with entities.json if the training datasets don't have a "candidates" key?
Finally, mentions.json contains a lot of '!' entries. Should I replicate that in my own mentions.json?
Thank you for your help and for this repo!