At inference time, adding the temperature parameter to the chat function makes the model throw an error #153
Comments
Hi @zhihao-chen, we couldn't reproduce your problem on our side. Could you provide more detailed reproduction code?
```python
import platform
import torch
from transformers.generation import GenerationConfig

os_name = platform.system()

def signal_handler():
    ...

device = torch.device("cuda:6")
```
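For context, a fuller reproduction sketch assembled from the fragment above and the chat call in the traceback below; `SYSTEM_PROMPT`, the model path, and the fp16 dtype are assumptions filled in for illustration (the reporter is on a V100, which implies fp16):

```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

SYSTEM_PROMPT = "You are a helpful assistant."  # placeholder, not the reporter's actual prompt

tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen-7B-Chat", trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen-7B-Chat",
    torch_dtype=torch.float16,  # V100 has no bf16 hardware support
    trust_remote_code=True,
).to("cuda:6").eval()

# With temperature set, generation takes the sampling path and fails,
# matching the traceback in the issue body below:
response, history = model.chat(
    tokenizer, "你好", history=None, system=SYSTEM_PROMPT,
    top_p=0.75, temperature=0.3,
)
```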
Hi, could you share the prompt that triggers the error?
I just entered "你好" ("hello") and it errored.
Not using flash-attn; this is on a V100.
Could you share your Python, PyTorch, transformers, and CUDA versions? Also, according to this issue, setting do_sample=False may help.
do_sample=False does fix it, but then it's just greedy search, and the temperature parameter is pointless anyway.
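For what it's worth, a minimal standalone sketch of why do_sample=False sidesteps the crash (constructed illustration, not code from this repo): sampling draws the next token with torch.multinomial, the exact call in the traceback below, and multinomial rejects inf/nan probabilities, whereas greedy search takes an argmax and never reaches it.

```python
import torch

# Sampling path: torch.multinomial validates the probability tensor and
# raises the RuntimeError reported in this issue when it contains inf/nan.
bad_probs = torch.tensor([[0.5, float("nan"), 0.5]])
try:
    torch.multinomial(bad_probs, num_samples=1)
except RuntimeError as e:
    print(e)  # probability tensor contains either `inf`, `nan` or element < 0

# Greedy path (do_sample=False): the next token is an argmax over logits,
# so multinomial is never called and no error is raised -- though upstream
# inf/nan values would still point to a numerical problem in the forward pass.
next_token = bad_probs.argmax(dim=-1)
```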
We still can't reproduce this on our side. Could you try switching the transformers version to 4.31.0?
I ran into the same error, on a 3090 with transformers 4.31.0.
There are already quite a few related issues. If you are running a quantized model, we suggest not adjusting the temperature; control determinism via top_p instead. If you are running a non-quantized model, we suggest using bf16 rather than fp16, and avoid setting the temperature too low: small temperatures easily cause precision overflow, and fp16 is especially prone to it.
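A sketch of that advice using the stock transformers loading API; the torch_dtype and device_map arguments are standard transformers options rather than anything specific to this repo, and note that bf16 needs hardware support (Ampere or newer), so it is not an option on the V100 mentioned above.

```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen-7B-Chat", trust_remote_code=True)

# bf16 keeps fp32's exponent range, so intermediate values that overflow to
# inf in fp16 (max ~65504) usually stay finite, avoiding inf/nan probabilities.
model = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen-7B-Chat",
    torch_dtype=torch.bfloat16,
    device_map="auto",
    trust_remote_code=True,
).eval()

# Control determinism with top_p rather than a very small temperature:
response, _ = model.chat(tokenizer, "你好", history=None, top_p=0.5)
```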
是否已有关于该错误的issue或讨论? | Is there an existing issue / discussion for this?
该问题是否在FAQ中有解答? | Is there an existing answer for this in FAQ?
当前行为 | Current Behavior
Specifying the temperature parameter causes an error; removing it makes the error go away.
File "/home/aiteam/work2/chenzhihao/kefu_dialogue/examples/qwen_interact.py", line 70, in chatbot
response, history = model.chat(tokenizer, query, history=history, system=SYSTEM_PROMPT, top_p=0.75, temperature=0.3)
File "/home/aiteam/.cache/huggingface/modules/transformers_modules/Qwen-7B-Chat/modeling_qwen.py", line 1010, in chat
outputs = self.generate(
File "/home/aiteam/.cache/huggingface/modules/transformers_modules/Qwen-7B-Chat/modeling_qwen.py", line 1120, in generate
return super().generate(
File "/home/aiteam/anaconda3/envs/llm-py10/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
return func(*args, **kwargs)
File "/home/aiteam/work2/chenzhihao/transformers/src/transformers/generation/utils.py", line 1615, in generate
return self.sample(
File "/home/aiteam/work2/chenzhihao/transformers/src/transformers/generation/utils.py", line 2773, in sample
next_tokens = torch.multinomial(probs, num_samples=1).squeeze(1)
RuntimeError: probability tensor contains either
inf
,nan
or element < 0期望行为 | Expected Behavior
No response
复现方法 | Steps To Reproduce
No response
运行环境 | Environment
备注 | Anything else?
No response