Description
System Info
Whenever I use the following `PrefixTuningConfig` with the Flan-T5-XXL model:

```python
peft_config = PrefixTuningConfig(
    peft_type="PREFIX_TUNING",
    inference_mode=False,
    task_type=TaskType.SEQ_2_SEQ_LM,
    num_virtual_tokens=20,
    token_dim=768,
    num_transformer_submodules=1,
    num_attention_heads=12,
    num_layers=12,
    encoder_hidden_size=768,
)
```

training fails in `forward` with:

```
ValueError: There should be 4 past states. 2 (past / key) for cross attention. Got 2 past key / value states
```

Could you please help me resolve this?
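A likely cause (my reading of the error, not confirmed here): T5 is an encoder-decoder model, so each decoder layer caches key/value tensors for both self-attention and cross-attention, i.e. four past states per layer, while `num_transformer_submodules=1` makes prefix tuning supply only two. A plain-Python sketch of the arithmetic behind the check:

```python
# Plain-Python sketch (no PEFT needed) of why T5's forward pass expects
# 4 past states per decoder layer. Assumption: the check counts one
# (key, value) pair per attention submodule.
def past_states_per_layer(num_transformer_submodules: int) -> int:
    # Each submodule (self-attention, plus cross-attention in an
    # encoder-decoder model) caches one key and one value tensor.
    return 2 * num_transformer_submodules

# Encoder-decoder (T5-style): self-attention + cross-attention
print(past_states_per_layer(2))  # 4 -- what T5's forward pass expects
# The reported config sets num_transformer_submodules=1:
print(past_states_per_layer(1))  # 2 -- "Got 2 past key / value states"
```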
Who can help?
Information
- The official example scripts
- My own modified scripts
Tasks
- An officially supported task in the `examples` folder (such as GLUE/SQuAD, ...)
- My own task or dataset (give details below)
Reproduction
Create the `PrefixTuningConfig` shown above (with `num_transformer_submodules=1` and the 768/12/12 dimensions), apply it to the Flan-T5-XXL model, and start training. The forward pass then raises:

```
ValueError: There should be 4 past states. 2 (past / key) for cross attention. Got 2 past key / value states
```
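A hedged sketch of a configuration that may avoid the error. Assumptions: for `TaskType.SEQ_2_SEQ_LM`, PEFT can fill in `token_dim`, `num_layers`, `num_attention_heads`, and `num_transformer_submodules` from the base model's config when they are left unset; also note that the hand-set values above (768 dims, 12 heads, 12 layers) look like T5-Base sizes, not XXL. This is a sketch, not a confirmed fix:

```python
from transformers import AutoModelForSeq2SeqLM
from peft import PrefixTuningConfig, TaskType, get_peft_model

model = AutoModelForSeq2SeqLM.from_pretrained("google/flan-t5-xxl")

# Sketch: omit the hand-set dimensions and let PEFT derive them from the
# model config. For SEQ_2_SEQ_LM this should use two transformer
# submodules (self- + cross-attention), yielding the 4 past key/value
# states per layer that T5's forward pass checks for.
peft_config = PrefixTuningConfig(
    task_type=TaskType.SEQ_2_SEQ_LM,
    inference_mode=False,
    num_virtual_tokens=20,
)
model = get_peft_model(model, peft_config)
```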
Expected behavior
Training with the `PrefixTuningConfig` above should proceed without raising the `ValueError`; prefix tuning should supply the past key/value states that the model's forward pass expects.