The documentation of LongT5 conflicts with its example code regarding the prefix #18502
Comments
@stancld @patil-suraj Could you please help to solve this issue and tell me how to set up and use LongT5 for specific downstream tasks? Thanks.
Hi @GabrielLin, with the LongT5 model no prefix should be added to the input sentence. The doc example seems to be inaccurate.
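For readers landing on this issue, here is a minimal sketch of prefix-free inference with LongT5, in line with the comment above. The checkpoint name google/long-t5-tglobal-base and the example document are assumptions for illustration only (not taken from this thread); substitute any LongT5 checkpoint fine-tuned for your task.

```python
from transformers import AutoTokenizer, LongT5ForConditionalGeneration

# Assumed checkpoint; replace with your own fine-tuned LongT5 model.
tokenizer = AutoTokenizer.from_pretrained("google/long-t5-tglobal-base")
model = LongT5ForConditionalGeneration.from_pretrained("google/long-t5-tglobal-base")

# The raw document is passed directly: no "summarize: " prefix, unlike T5.
document = "A very long article to be summarized goes here ..."
inputs = tokenizer(document, return_tensors="pt", truncation=True)

output_ids = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```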
Hi @stancld. Thank you for your reply. Could you please indicate how to use LongT5 for different downstream tasks?
Hi, the example should already have been fixed by @patrickvonplaten. Fine-tuning on different downstream tasks should be pretty standard. There's no prefix, so you can use the same techniques as for models like BART, GPT-2, etc. :] However, the final performance is questionable as, AFAIK, only summarization and Q&A have been investigated so far.
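To make the "same techniques as BART/GPT-2" point concrete, here is a hedged sketch of one standard seq2seq fine-tuning step with no task prefix. The checkpoint name, the example fields (document, summary), and the learning rate are illustrative assumptions; the text_target argument assumes a reasonably recent transformers version.

```python
import torch
from transformers import AutoTokenizer, LongT5ForConditionalGeneration

# Assumed checkpoint and hyperparameters, for illustration only.
tokenizer = AutoTokenizer.from_pretrained("google/long-t5-tglobal-base")
model = LongT5ForConditionalGeneration.from_pretrained("google/long-t5-tglobal-base")
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)

# Hypothetical training example; in practice this comes from your dataset.
example = {"document": "Long source text ...", "summary": "Target summary ..."}

# No task prefix: the source text is tokenized as-is, the target becomes `labels`.
batch = tokenizer(
    example["document"],
    text_target=example["summary"],
    return_tensors="pt",
    truncation=True,
)

model.train()
loss = model(**batch).loss  # cross-entropy over the target tokens
loss.backward()
optimizer.step()
optimizer.zero_grad()
```

The same loop works for other sequence-to-sequence tasks (e.g. Q&A over long contexts) simply by changing what goes into the source and target fields, which is the sense in which no prefix is needed.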
Thank you @stancld, and thank you @patrickvonplaten. I have one more question. With a prefix, I understand that different downstream tasks can be fine-tuned in the same model. Now, without the prefix, should we use separate models for different downstream tasks? Thanks.
Hey @GabrielLin, that depends on how different the use cases are and what your limitations are exactly. In general, I'd say yes, you should use different fine-tuned models for different tasks.
@patrickvonplaten Got it. Thanks. This issue has been fixed and closed.
System Info
All.
Who can help?
@patrickvonplaten
Reproduction
See https://huggingface.co/docs/transformers/main/en/model_doc/longt5
Expected behavior
In the above document, it says:
"Unlike the T5 model, LongT5 does not use a task prefix. Furthermore, it uses a different pre-training objective inspired by the pre-training of [PegasusForConditionalGeneration]."
But in the example code of LongT5ForConditionalGeneration, there is a prefix of "summarize:". I am confused about how to use LongT5 for different downstream tasks. Could you please help? Thanks.