About the max length of the "PMC_LLaMA_13B" #29

dyxohjl666 · 2024-01-24T08:05:36Z

Hi @chaoyi-wu, I noticed that the paper sets the max context length to 2048, but when I load the model it reports a max length of 512.
Do you have a version that allows longer input sequences? Or was this one already trained on long inputs, so that I can just change "model_max_length" in the tokenizer configuration?

chaoyi-wu · 2024-03-11T06:05:42Z

It was pre-trained with a context length of 2048 but instruction-tuned with 512. Expanding "model_max_length" is totally fine if you intend to fine-tune our model. If you are going to do zero-shot prompting, it can also work, but it may hurt model performance.
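
For readers who want to try this, here is a minimal sketch of raising "model_max_length" on an already-loaded tokenizer. It assumes the Hugging Face `transformers` library; the hub ID `axiong/PMC_LLaMA_13B` and the helper names `set_max_length` and `load_tokenizer` are illustrative assumptions, not something confirmed in this thread. The cap at 2048 is a cautious choice based on the pre-training length mentioned above.

```python
PRETRAIN_CONTEXT = 2048  # pre-training context length stated in this thread
INSTRUCT_CONTEXT = 512   # instruction-tuning length stated in this thread


def set_max_length(tokenizer, new_max_length=PRETRAIN_CONTEXT):
    """Set tokenizer.model_max_length, refusing values above the pre-training context.

    Hypothetical helper: it only changes the tokenizer's truncation limit
    and its length warnings; it does not change the model weights.
    """
    if new_max_length > PRETRAIN_CONTEXT:
        raise ValueError(
            f"{new_max_length} exceeds the pre-training context of {PRETRAIN_CONTEXT}"
        )
    tokenizer.model_max_length = new_max_length
    return tokenizer


def load_tokenizer(model_id="axiong/PMC_LLaMA_13B", max_length=PRETRAIN_CONTEXT):
    """Load the tokenizer and expand its limit (model_id is an assumed hub ID)."""
    # Imported lazily so the helper above can be used without transformers installed.
    from transformers import AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    return set_max_length(tokenizer, max_length)
```

Calling `load_tokenizer()` should then give a tokenizer that no longer truncates at 512. As noted above, inputs longer than 512 tokens are outside what the instruction-tuned model saw, so zero-shot quality may drop unless you fine-tune.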
