
trouble with bert-based model #3

Open
CuongNN218 opened this issue Jul 6, 2023 · 2 comments


CuongNN218 commented Jul 6, 2023

Dear Dr. Arazd (@arazd),

Thanks for your great work. I'm trying to replicate your result in Table 1 for order 4 (5 tasks, bert-base-uncased model) in the CL setting (full data) from the main paper, using the following command:

```
python train_cl2.py --task_list ag yelp_review_full amazon yahoo dbpedia \
    --prefix_MLP residual_MLP2 --lr 1e-4 --num_epochs 40 \
    --freeze_weights 1 --freeze_except word_embeddings \
    --prompt_tuning 1 --prefix_len 20 --seq_len 450 --one_head 0 \
    --model_name bert-base-uncased --early_stopping 1 \
    --save_name BERT_order_4_run1 --save_dir ./results
```

However, when the Progressive Prompts model evaluates accuracy on all datasets, it throws the following error as soon as evaluation on the yahoo dataset starts:

```
/opt/conda/conda-bld/pytorch_1639180487213/work/aten/src/ATen/native/cuda/Indexing.cu:699: indexSelectLargeIndex: block: [60,0,0], thread: [96,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
```

This issue only appears during evaluation of the 4th task. I tried other settings, such as shorter task sequences (2 or 3 tasks) and removing the ResMLP, and those run normally. I also printed the input_ids, token_type_ids, and position_ids, but the 4th task's pattern looks similar to the previous tasks'.

The error stems from this line in your repo: https://github.com/arazd/ProgressivePrompts/blob/01572d6a73c0576b070ceee00dbe4f5bc278423f/BERT_codebase/model_utils.py#L576
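For reference, this CUDA assertion is the generic symptom of an embedding lookup receiving an index that is greater than or equal to the size of the embedding table. A minimal sketch (my own repro, not code from this repo) that triggers the same failure mode:

```python
# Minimal sketch (assumed setup, not ProgressivePrompts code): indexing an embedding
# table with an out-of-range id raises IndexError on CPU and, on CUDA, surfaces as the
# device-side assert "srcIndex < srcSelectDimSize failed" shown above.
import torch
import torch.nn as nn

position_embeddings = nn.Embedding(512, 768)    # bert-base-uncased has 512 positions
position_ids = torch.arange(530).unsqueeze(0)   # ids 512..529 are out of range

out = position_embeddings(position_ids)         # fails here
```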

Could you give me some insight into this problem? I would really appreciate it if you could help me fix it.

P.S.: After troubleshooting the source of the problem, I found that it only occurs when the task sequence contains more than 4 tasks and when evaluating on the full validation set.

CuongNN218 (Author) commented:

Hi,
After troubleshooting the problem further, I found that this issue stems from the position-embedding size of the bert-base-uncased model. In particular, if we set seq_len = 450, then after prepending soft prompts of 20 tokens per task from the 4 previous tasks, the input length becomes 450 + 20 * 4 = 530 > 512, and 512 is the maximum number of positions bert-base-uncased supports. Would you mind sharing the input sequence length you used to reproduce Table 1b? I haven't found it in the main paper. Thanks.
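A quick way to check this limit (a sketch assuming the standard Hugging Face transformers API, not code from this repo):

```python
# Sketch: compare the effective input length against bert-base-uncased's position limit.
from transformers import AutoConfig

config = AutoConfig.from_pretrained("bert-base-uncased")
max_positions = config.max_position_embeddings    # 512

seq_len, prefix_len, num_prev_tasks = 450, 20, 4  # values from the command above
total_len = seq_len + prefix_len * num_prev_tasks

print(total_len, ">", max_positions)              # 530 > 512 -> out-of-range position ids
```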


mingyang-wang26 commented Oct 13, 2023


I also ran into the same problem: the BERT model cannot handle the hyperparameters given in this repository (seq_len=450 with prompt length 20). I'm curious which hyperparameters the authors actually used to get the results in the paper.
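One possible workaround, purely my assumption and not a confirmed author setting, is to shrink seq_len so that the text tokens plus all prepended soft prompts still fit within the 512-position limit:

```python
# Hypothetical workaround (not the authors' reported hyperparameters): choose seq_len
# so that text tokens plus every prepended soft prompt fit within 512 positions.
max_positions = 512   # bert-base-uncased position-embedding limit
prefix_len = 20       # soft-prompt tokens per task
num_prompts = 4       # prompts prepended in the failing case above

seq_len = max_positions - prefix_len * num_prompts   # 512 - 80 = 432
print(seq_len)
```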
