Hello everyone, I am trying to use luke-large for question answering.
I ran into several issues when fine-tuning the model on SQuAD-like data, and most of them come from the lack of a fast tokenizer.
So I am wondering whether LUKE will support a fast tokenizer in the future, or whether there is any way to work around these issues.
Thank you so much!
Hi!
If you refer to the following blog, it seems that offset_mapping can be used with LUKE. I have not confirmed that misalignment never occurs, sorry.
I thought the same as @TrickyyH. Apart from offset_mapping, the behaviour of return_overflowing_tokens, for instance, differs between slow and fast tokenisers. As a result, it becomes difficult to handle long texts in tasks like NER and QA, at which LUKE excels. I would be pleased if you could support the fast tokeniser.
One possible workaround is to use the fast version of the base tokenizer that LukeTokenizer is built on, i.e. the fast version of RobertaTokenizer (they share the same subword vocabulary).
However, this approach does not produce the entity-related inputs, so additional code would be needed for those.
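For reference, here is a minimal sketch of that workaround for SQuAD-style preprocessing. The checkpoint name `roberta-large` is just an example; adjust it to match the base model of your LUKE checkpoint. It only covers the word-level inputs, and entity inputs (entity_ids, entity_position_ids, etc.) would still have to be built separately:

```python
from transformers import RobertaTokenizerFast

# LukeTokenizer shares its subword vocabulary with RoBERTa, so the fast
# RoBERTa tokenizer can provide offset mappings and overflow handling.
tokenizer = RobertaTokenizerFast.from_pretrained("roberta-large")

question = "Who developed LUKE?"
context = "LUKE is an entity-aware language model ..."

encoding = tokenizer(
    question,
    context,
    truncation="only_second",        # truncate only the context
    max_length=384,
    stride=128,
    return_overflowing_tokens=True,  # split long contexts into windows
    return_offsets_mapping=True,     # map tokens back to character spans
    padding="max_length",
)

# offset_mapping lets you recover answer character spans from token indices.
# The resulting input_ids can be fed to the LUKE model as usual, but any
# entity-related inputs must be constructed separately.
print(encoding["offset_mapping"][0][:10])
```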