fix roberta retokenization error #982
Hello @HaokunLiu! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found:
There are currently no PEP 8 issues detected in this Pull Request. Cheers! 🍻
You can repair most issues by installing black and running:
Comment last updated at 2020-01-08 01:46:27 UTC
[[0], [1], [2], [3, 4], [5], [6, 7], [8], [9, 10, 11]],
[[0], [1, 2, 3], [4], [5], [6], [7], [8], [9, 10, 11], [12], [13], [14, 15]],
[[0], [1], [2], [3, 4], [5, 6], [7], [8], [9, 10, 11]],
[[0, 1], [2, 3], [4], [5], [6], [7], [8], [9, 10, 11], [12], [13], [14, 15]],
I didn't compute these by hand. I just printed the output and skimmed over it.
I know this isn't how test cases are supposed to be made.
By trying to save 5 minutes in the wrong place, I have cost everyone else hours. I can't apologize enough for this.
LGTM
@pruksmhc: Is there a reason this hasn't been merged?
* fix roberta tokenization error
* format

Co-authored-by: Yada Pruksachatkun <yp913@nyu.edu>
#934 #903 #904 #954
I'm very sorry about this bug. I thought I had fixed retokenization last time, but a typo was left in. I would have found it if I had checked the test cases carefully.
This bug makes the RoBERTa retokenization less accurate and may hurt performance on tasks that use the realigned index.
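
For anyone writing expected alignments for tests like these, one option is to derive them independently from character spans instead of copying printed output. The following is only a minimal sketch, not the retokenization code in this repository: `char_spans` and `align_by_chars` are hypothetical helpers, and it assumes both tokenizations spell exactly the same text once whitespace markers (such as the `Ġ` prefix in RoBERTa's vocabulary) have been stripped.

```python
from typing import List, Tuple


def char_spans(tokens: List[str]) -> List[Tuple[int, int]]:
    """Half-open character span of each token in the concatenated text."""
    spans = []
    start = 0
    for tok in tokens:
        spans.append((start, start + len(tok)))
        start += len(tok)
    return spans


def align_by_chars(source: List[str], target: List[str]) -> List[List[int]]:
    """For each source token, list the target token indices that overlap it."""
    if "".join(source) != "".join(target):
        raise ValueError("both tokenizations must spell the same text")
    target_spans = char_spans(target)
    alignment = []
    for s_start, s_end in char_spans(source):
        # Two half-open spans overlap when each starts before the other ends.
        overlapping = [
            j
            for j, (t_start, t_end) in enumerate(target_spans)
            if t_start < s_end and s_start < t_end
        ]
        alignment.append(overlapping)
    return alignment


# Hypothetical example: word-level tokens vs. a made-up subword split.
source = ["Members", "of", "the", "House"]
target = ["Mem", "bers", "of", "the", "Ho", "use"]
alignment = align_by_chars(source, target)
print(alignment)  # [[0, 1], [2], [3], [4, 5]]
```

Comparing output like this against the tokenizer's alignment gives a second opinion on the expected values, although it only works when the two token sequences really do concatenate to the same string.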