[Bug]: Qwen2.5-7B-Instruct supports 128K tokens, so why does the config say 32k? If I want to train a model with a context longer than 32k on top of Qwen2.5, what do I need to do? #1134
Replies: 3 comments 2 replies
-
Please first read the README/Modelcard: https://huggingface.co/Qwen/Qwen2.5-7B-Instruct#processing-long-texts
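For context, the "Processing Long Texts" section of that model card extends the usable context with YaRN by adding a `rope_scaling` block to `config.json`, roughly like this (values as shown on the card):

```json
{
  "rope_scaling": {
    "factor": 4.0,
    "original_max_position_embeddings": 32768,
    "type": "yarn"
  }
}
```

Note the card describes this as an inference-time setting; static YaRN scaling applies uniformly regardless of input length, which is why the shipped config defaults to 32k.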
-
@jklj077 That's for inference, isn't it? If I'm doing SFT on top of it, can I just change max_position_embeddings to 128K? My training texts are fairly long.
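A minimal sketch of the change being asked about, operating on the relevant field of `config.json` as a plain dict. The 131072 target is an assumption for a "128K" setup, not official fine-tuning guidance; whether raising it alone is sufficient for SFT beyond 32k is exactly the open question here (see the model card linked above).

```python
import json

# The relevant field as shipped in Qwen2.5-7B-Instruct's config.json.
config = {"max_position_embeddings": 32768}

# Hypothetical edit before long-sequence SFT: raise the position limit
# to 128K tokens (131072 = 128 * 1024). This is an assumption, not
# confirmed guidance from the maintainers.
config["max_position_embeddings"] = 131072

print(json.dumps(config))
```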
-
Model Series
Qwen2.5
What are the models used?
Qwen2.5-7B-Instruct
What is the scenario where the problem happened?
SFT (supervised fine-tuning)
Is this a known issue?
Information about environment
no
Log output
Description
Steps to reproduce
This happens to Qwen2.5-xB-Instruct-xxx and xxx.
The problem can be reproduced with the following steps: