
The documentation of LongT5 conflicts with its example code regarding the task prefix #18502

Closed
GabrielLin opened this issue Aug 6, 2022 · 7 comments
GabrielLin commented Aug 6, 2022

System Info

All.

Who can help?

@patrickvonplaten

Reproduction

See https://huggingface.co/docs/transformers/main/en/model_doc/longt5

Expected behavior

The document linked above states: "Unlike the T5 model, LongT5 does not use a task prefix. Furthermore, it uses a different pre-training objective inspired by the pre-training of [PegasusForConditionalGeneration]." However, the example code for LongT5ForConditionalGeneration prepends a "summarize: " prefix to the input. I am confused about how to use LongT5 for different downstream tasks. Could you please help? Thanks.

GabrielLin added the bug label Aug 6, 2022
GabrielLin (Author) commented

@stancld @patil-suraj Could you please help resolve this issue and tell me how to set up and use LongT5 for specific downstream tasks? Thanks.

stancld (Contributor) commented Aug 15, 2022

Hi @GabrielLin, with the LongT5 model no prefix should be added to the input sentence. The doc example seems to be inaccurate.
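
For illustration, a minimal sketch of prefix-free usage (the checkpoint name and input text are placeholders, not the official doc example; a checkpoint fine-tuned for your task is needed for meaningful output):

```python
# Minimal LongT5 inference sketch: note that no "summarize: " prefix is added.
from transformers import AutoTokenizer, LongT5ForConditionalGeneration

# Placeholder checkpoint; swap in a checkpoint fine-tuned for your task.
model_name = "google/long-t5-tglobal-base"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = LongT5ForConditionalGeneration.from_pretrained(model_name)

# The raw document is passed directly, without any task prefix.
document = "studies have shown that owning a dog is good for you ..."
inputs = tokenizer(document, return_tensors="pt")

summary_ids = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))
```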

GabrielLin (Author) commented

Hi @stancld, thank you for your reply. Could you please indicate how to use [PegasusForConditionalGeneration] for different downstream tasks, and help fix the example code? I have no idea.

stancld (Contributor) commented Aug 27, 2022

> Hi @stancld, thank you for your reply. Could you please indicate how to use [PegasusForConditionalGeneration] for different downstream tasks, and help fix the example code? I have no idea.

Hi, the example should already have been fixed by @patrickvonplaten. Fine-tuning on different downstream tasks should be pretty standard: there is no prefix, so you can use the same techniques as for models like BART, GPT-2, etc. :] However, the final performance is questionable since, AFAIK, only summarization and Q&A have been investigated so far.
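
As a rough sketch (the checkpoint name and toy data are placeholders; in practice you would use a real dataset, a DataLoader, label padding with -100, or the Seq2SeqTrainer), fine-tuning looks like any other seq2seq model, just without a prefix:

```python
# Bare-bones LongT5 fine-tuning sketch for a generic seq2seq task, no task prefix.
import torch
from transformers import AutoTokenizer, LongT5ForConditionalGeneration

model_name = "google/long-t5-tglobal-base"  # placeholder checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = LongT5ForConditionalGeneration.from_pretrained(model_name)

# Toy (source, target) pairs standing in for a real dataset.
pairs = [
    ("a very long source document ...", "its target summary"),
    ("another long source document ...", "another target summary"),
]

optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)
model.train()
for source, target in pairs:
    inputs = tokenizer(source, return_tensors="pt", truncation=True, max_length=4096)
    labels = tokenizer(target, return_tensors="pt", truncation=True).input_ids

    loss = model(**inputs, labels=labels).loss  # standard teacher-forced seq2seq loss
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()
```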

GabrielLin (Author) commented

> Hi @stancld, thank you for your reply. Could you please indicate how to use [PegasusForConditionalGeneration] for different downstream tasks, and help fix the example code? I have no idea.

> Hi, the example should already have been fixed by @patrickvonplaten. Fine-tuning on different downstream tasks should be pretty standard: there is no prefix, so you can use the same techniques as for models like BART, GPT-2, etc. :] However, the final performance is questionable since, AFAIK, only summarization and Q&A have been investigated so far.

Thank you @stancld and @patrickvonplaten. I have one more question. With a prefix, I assume different downstream tasks can be fine-tuned into the same model. Without a prefix, should we use a separate model for each downstream task? Thanks.

patrickvonplaten (Contributor) commented

Hey @GabrielLin

That depends on how different the use cases are and what your constraints are exactly. In general, I'd say yes, you should use different fine-tuned models for different tasks.

GabrielLin (Author) commented

@patrickvonplaten Got it, thanks. This issue has been resolved, so I am closing it.
