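<|im_start|>system During pretraining, do the position ids of the whole sequence run from 0 to the end (0...N), or are they reset to 0 at every turn (0...N1, 0...N2, 0...N3)? My understanding is that SFT should match whatever pretraining did in order to get the best results. Is that right?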
Answered by jklj077 on Dec 4, 2024
Answer selected by zzzm83
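They run from 0 to the end of the sequence (0...N).
You generally don't need to set this by hand. Qwen also uses RoPE, which is a relative position encoding: attention scores depend on the offset between two tokens, so the absolute position values matter very little.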
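
To make the RoPE point concrete, here is a minimal pure-Python sketch. It is not Qwen's actual implementation: the helper names (`position_ids`, `rope`, `score`) are made up for illustration, and it uses the adjacent-pair rotation from the original RoPE formulation with an assumed base of 10000. It only shows that contiguous position ids are just 0..N-1, and that the rotated query/key dot product stays the same when both positions are shifted by the same amount.

```python
import math

def position_ids(seq_len):
    """Contiguous ids 0..seq_len-1 for the whole sequence (no per-turn reset)."""
    return list(range(seq_len))

def rope(vec, pos, base=10000.0):
    """Rotate each adjacent pair (vec[i], vec[i+1]) by the angle pos * base**(-i/d).

    Illustrative only: real model code typically vectorizes this and may pair
    dimensions differently.
    """
    d = len(vec)
    out = []
    for i in range(0, d, 2):
        theta = pos * base ** (-i / d)
        c, s = math.cos(theta), math.sin(theta)
        x, y = vec[i], vec[i + 1]
        out += [x * c - y * s, x * s + y * c]
    return out

def score(q, k, q_pos, k_pos):
    """Dot product of a query and a key after applying RoPE at their positions."""
    return sum(a * b for a, b in zip(rope(q, q_pos), rope(k, k_pos)))

if __name__ == "__main__":
    q = [1.0, 0.0, 0.5, -1.0]
    k = [0.3, 0.7, -0.2, 0.4]
    print(position_ids(5))
    # Same offset (3) at very different absolute positions: the scores agree
    # up to floating-point error.
    print(score(q, k, 5, 2), score(q, k, 103, 100))
```

In practice, when `position_ids` is not passed, common training and inference stacks typically fill in the contiguous range for you, which matches the "no need to set this by hand" advice above.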