Prompt tuning: Allow to pass additional args to AutoTokenizer.from_pretrained #1053
Fixes #1032
Description
Currently, when using prompt tuning with `TEXT`, we call `AutoTokenizer.from_pretrained` with only the model id. However, it may be necessary to pass additional arguments, e.g. `trust_remote_code=True`. This fix allows passing more arguments by setting the argument `tokenizer_kwargs` in the `PromptTuningConfig`.

I also added a check that when `tokenizer_kwargs` is set, the `TEXT` option is actually being used.

Moreover, I noticed that we have no tests for prompt tuning with `TEXT`, so I added those tests for decoder models.

Additional changes

There was a bug in `PromptEmbedding` where the device of the `init_token_ids` was not set, which resulted in errors when using CUDA.

Finally, I removed an unused constant `CONFIG_CLASSES` from a test.
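A minimal, self-contained sketch of how the new argument and its validation check could fit together (`PromptTuningConfigSketch` and its fields are illustrative stand-ins for this example, not the actual PEFT implementation):

```python
from dataclasses import dataclass, field
from enum import Enum


class PromptTuningInit(str, Enum):
    TEXT = "TEXT"
    RANDOM = "RANDOM"


@dataclass
class PromptTuningConfigSketch:
    prompt_tuning_init: PromptTuningInit = PromptTuningInit.RANDOM
    tokenizer_name_or_path: str = None
    tokenizer_kwargs: dict = field(default_factory=dict)

    def __post_init__(self):
        # The check described above: tokenizer_kwargs is only meaningful when
        # a tokenizer is actually loaded, i.e. with TEXT initialization.
        if self.tokenizer_kwargs and self.prompt_tuning_init != PromptTuningInit.TEXT:
            raise ValueError(
                "tokenizer_kwargs is only valid when prompt_tuning_init is 'TEXT'"
            )
```

With a config like this, the extra kwargs would then be forwarded when loading the tokenizer, along the lines of `AutoTokenizer.from_pretrained(tokenizer_name_or_path, **tokenizer_kwargs)`, instead of passing only the model id.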