
Prompt tuning: Allow to pass additional args to AutoTokenizer.from_pretrained #1053

Conversation

BenjaminBossan
Member

Fixes #1032

Description

Currently, when using prompt tuning with TEXT, we call AutoTokenizer.from_pretrained with only the model id. However, it may be necessary to pass additional arguments, e.g. trust_remote_code=True. This fix allows passing additional arguments by setting tokenizer_kwargs in the PromptTuningConfig.

I also added a check that tokenizer_kwargs is only set when the TEXT option is actually being used.
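The check described above can be sketched as follows. This is a simplified, hypothetical stand-in, not the actual PEFT implementation; the field names mirror PromptTuningConfig but the class is illustrative only:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class PromptTuningConfigSketch:
    # "TEXT" initializes the prompt from a tokenized text; "RANDOM" does not
    prompt_tuning_init: str = "RANDOM"
    tokenizer_name_or_path: Optional[str] = None
    # extra kwargs forwarded to AutoTokenizer.from_pretrained,
    # e.g. {"trust_remote_code": True}
    tokenizer_kwargs: Optional[dict] = None

    def __post_init__(self):
        # tokenizer_kwargs only makes sense when a tokenizer is used,
        # i.e. with the TEXT init option
        if self.tokenizer_kwargs and self.prompt_tuning_init != "TEXT":
            raise ValueError(
                "tokenizer_kwargs can only be used when prompt_tuning_init is 'TEXT'"
            )
```

With this in place, `PromptTuningConfigSketch(prompt_tuning_init="TEXT", tokenizer_kwargs={"trust_remote_code": True})` is accepted, while passing tokenizer_kwargs together with RANDOM init raises a ValueError.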

Moreover, I noticed that we had no tests for prompt tuning with TEXT, so I added such tests for decoder models.

Additional changes

There was a bug in PromptEmbedding where the device of the init_token_ids was not set, which resulted in errors when using CUDA.
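The device issue can be illustrated with a minimal sketch (assumed shapes and names, for illustration only): when the embedding weights live on CUDA, the token ids must be moved to the same device before the lookup, otherwise PyTorch raises a device-mismatch error.

```python
import torch

# a stand-in for the word embedding used to initialize the prompt
word_embeddings = torch.nn.Embedding(num_embeddings=100, embedding_dim=16)

# ids produced by the tokenizer; by default these live on the CPU
init_token_ids = torch.tensor([3, 14, 15])

# the fix: align the ids with the device of the embedding weights
# (a no-op on CPU, required when the model is on CUDA)
init_token_ids = init_token_ids.to(word_embeddings.weight.device)

init_embeds = word_embeddings(init_token_ids)
```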

Finally, I removed an unused constant CONFIG_CLASSES from a test.

@HuggingFaceDocBuilderDev

HuggingFaceDocBuilderDev commented Oct 25, 2023

The documentation is not available anymore as the PR was closed or merged.

@pacman100 (Contributor) left a comment

Nice work @BenjaminBossan on adding the support for tokenizer kwargs when using Prompt Tuning along with corresponding tests! ✨

@pacman100 pacman100 merged commit d350a00 into huggingface:main Nov 14, 2023
@BenjaminBossan BenjaminBossan deleted the allow-args-for-autotokenizer-from_pretrained-in-prompt-tuning branch November 14, 2023 11:40
Successfully merging this pull request may close these issues.

about some small bug of prompt_tuning.py
3 participants