Dear Dr. Arazd @arazd,
Thanks for your great work. I'm trying to replicate your result in Table 1 for order 4 (5 tasks, bert-base-uncased model) in the CL setting (full data) of the main paper with the following command:
python train_cl2.py --task_list ag yelp_review_full amazon yahoo dbpedia \
  --prefix_MLP residual_MLP2 --lr 1e-4 --num_epochs 40 \
  --freeze_weights 1 --freeze_except word_embeddings \
  --prompt_tuning 1 --prefix_len 20 --seq_len 450 --one_head 0 \
  --model_name bert-base-uncased --early_stopping 1 \
  --save_name BERT_order_4_run1 --save_dir ./results
However, when the Progressive Prompts model evaluates accuracy on all datasets, it throws the following error as soon as it starts evaluating on the yahoo dataset:
/opt/conda/conda-bld/pytorch_1639180487213/work/aten/src/ATen/native/cuda/Indexing.cu:699: indexSelectLargeIndex: block: [60,0,0], thread: [96,0,0] Assertion srcIndex < srcSelectDimSize failed.
This issue only appears during evaluation of the 4th task. I tried other settings, such as a shorter task sequence (2 or 3 tasks), removing the ResMLP, etc., and those work as normal. I also printed the input_ids, token_type_ids, and position_ids, but the 4th task's pattern is similar to the previous tasks'. The error stems from this line in your repo: https://github.com/arazd/ProgressivePrompts/blob/01572d6a73c0576b070ceee00dbe4f5bc278423f/BERT_codebase/model_utils.py#L576
Could you give me some insight into this problem? I would really appreciate it if you could help me fix it.
P.S.: After troubleshooting the source of the problem, I found that it only occurs when the task sequence is longer than 4 and evaluation is run on the full validation set.
Hi,
after troubleshooting the problem, I found that this issue stems from the maximum input size of the bert-base-uncased model. In particular, if we set seq_len = 450, then after prepending soft prompts of 20 tokens per task from 4 previous tasks, the input length becomes 450 + 20 * 4 = 530 > 512, where 512 is the maximum number of position embeddings for bert-base-uncased. Would you mind sharing the input sequence length used to reproduce Table 1b? I haven't found it in the main paper. Thanks.
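For reference, here is a minimal sketch (assuming the Hugging Face transformers library, and assuming that evaluating the k-th task prepends k blocks of prefix_len soft-prompt tokens, as described above) that shows where the 512-token limit is exceeded:

```python
from transformers import AutoConfig

# bert-base-uncased is limited to 512 position embeddings
config = AutoConfig.from_pretrained("bert-base-uncased")
limit = config.max_position_embeddings  # 512

seq_len = 450    # --seq_len from the command above
prefix_len = 20  # --prefix_len (soft-prompt tokens per task)

# Assumption: evaluating the k-th task prepends k prompts of prefix_len tokens each
for k in range(1, 6):
    total = seq_len + prefix_len * k
    status = "ok" if total <= limit else "exceeds limit"
    print(f"task {k}: {total} tokens ({status})")
# From task 4 onwards the total (530, 550) exceeds 512, so the position-embedding
# lookup indexes out of range, which matches the CUDA indexSelectLargeIndex assert.
```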
I also ran into the same problem; the BERT model cannot handle the hyperparameters given in this repository (seq_len=450 with prompt length 20). I'm curious what hyperparameters the authors actually used to get the results in the paper.
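As a possible workaround (just a sketch based on the 512-token limit discussed above, not the authors' confirmed setting), seq_len could be chosen so that the input plus the longest progressive prompt still fits:

```python
max_position_embeddings = 512  # bert-base-uncased limit
prefix_len = 20                # soft-prompt tokens per task
num_tasks = 5                  # ag, yelp_review_full, amazon, yahoo, dbpedia

# The longest progressive prompt appears at the last task: prefix_len * num_tasks tokens.
max_seq_len = max_position_embeddings - prefix_len * num_tasks
print(max_seq_len)  # 412 -> e.g. pass --seq_len 412 (or smaller) instead of 450
```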