About a small bug in prompt_tuning.py #1032

Closed
duxiaoyu555 opened this issue Oct 17, 2023 · 0 comments · Fixed by #1053
Comments


duxiaoyu555 commented Oct 17, 2023

System Info

peft==0.5.0
python==3.9
transformers==4.33.1

Who can help?

No response

Information

  • The official example scripts
  • My own modified scripts

Tasks

  • An officially supported task in the examples folder
  • My own task or dataset (give details below)

Reproduction

from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import PromptTuningConfig, get_peft_model, TaskType, PromptTuningInit
import torch

tokenizer = AutoTokenizer.from_pretrained("/upp/xgen/xgen-7b-8k-base", trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained("/upp/xgen/xgen-7b-8k-base", torch_dtype=torch.bfloat16, trust_remote_code=True)

# Prompt init text: "下面是一段人与机器人的对话。"
# ("Below is a conversation between a human and a robot.")
init_text = "下面是一段人与机器人的对话。"
config = PromptTuningConfig(
    task_type=TaskType.CAUSAL_LM,
    prompt_tuning_init=PromptTuningInit.TEXT,
    prompt_tuning_init_text=init_text,
    num_virtual_tokens=len(tokenizer(init_text)["input_ids"]),
    tokenizer_name_or_path="xxxxx",  # (local file)
)

model = get_peft_model(model, config)

Expected behavior

I have a suggestion about the get_peft_model method: inside this function, the PromptEmbedding class in prompt_tuning.py calls tokenizer = AutoTokenizer.from_pretrained(config.tokenizer_name_or_path) at line 112, and that call should also accept an argument such as trust_remote_code=True.
Because it does not, I ran into the error "Tokenizer class xxxx does not exist or is not currently imported."
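For concreteness, a minimal sketch of the change being suggested (my hand-written illustration, not the actual patch; hardcoding trust_remote_code=True is just the simplest possible fix):

# Sketch of prompt_tuning.py, inside PromptEmbedding.__init__, where `config`
# is the PromptTuningConfig passed to get_peft_model:
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained(
    config.tokenizer_name_or_path,
    trust_remote_code=True,  # hypothetical hardcoded workaround for tokenizers shipping custom code
)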

BenjaminBossan added a commit to BenjaminBossan/peft that referenced this issue Oct 25, 2023
Fixes huggingface#1032

Description

Currently, when using prompt tuning with TEXT, we call
AutoTokenizer.from_pretrained with only the model id. However, it may be
necessary to pass additional arguments, e.g. trust_remote_code=True.
This fix makes it possible to pass extra arguments by setting
tokenizer_kwargs in the PromptTuningConfig.
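For illustration, a minimal sketch of how the new option is used (assuming a peft version that includes this fix; the model path, init text, and token count are placeholders):

from peft import PromptTuningConfig, PromptTuningInit, TaskType

config = PromptTuningConfig(
    task_type=TaskType.CAUSAL_LM,
    prompt_tuning_init=PromptTuningInit.TEXT,
    prompt_tuning_init_text="Below is a conversation between a human and a robot.",
    num_virtual_tokens=8,
    tokenizer_name_or_path="/upp/xgen/xgen-7b-8k-base",
    # Forwarded to AutoTokenizer.from_pretrained when initializing from TEXT:
    tokenizer_kwargs={"trust_remote_code": True},
)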

I also added a check that when tokenizer_kwargs is set, the TEXT option
is actually being used.

Moreover, I noticed that we have no tests for prompt tuning with TEXT,
so I added those tests for decoder models.

Additional changes

There was a bug in PromptEmbedding where the device of the
init_token_ids was not set, which resulted in errors when using CUDA.
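Roughly, that fix amounts to something like the following sketch (my paraphrase of the change, assuming word_embeddings is the model's input embedding module; the exact code in the PR may differ):

import torch

# Sketch, inside PromptEmbedding.__init__: make sure the init token ids live on
# the same device as the embedding weights before the embedding lookup.
init_token_ids = torch.LongTensor(init_token_ids).to(word_embeddings.weight.device)
word_embedding_weights = word_embeddings(init_token_ids).detach().clone()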

Finally, I removed an unused constant CONFIG_CLASSES from a test.
pacman100 pushed a commit that referenced this issue Nov 14, 2023
Fixes #1032