Passing a temperature parameter to the chat function at inference time makes the model error out #153

Closed
2 tasks done
zhihao-chen opened this issue Aug 10, 2023 · 10 comments

@zhihao-chen

Is there an existing issue / discussion for this?

  • I have searched the existing issues / discussions

Is there an existing answer for this in the FAQ?

  • I have searched the FAQ

Current Behavior

The error occurs whenever the temperature parameter is specified; removing it makes the error go away.

File "/home/aiteam/work2/chenzhihao/kefu_dialogue/examples/qwen_interact.py", line 70, in chatbot
response, history = model.chat(tokenizer, query, history=history, system=SYSTEM_PROMPT, top_p=0.75, temperature=0.3)
File "/home/aiteam/.cache/huggingface/modules/transformers_modules/Qwen-7B-Chat/modeling_qwen.py", line 1010, in chat
outputs = self.generate(
File "/home/aiteam/.cache/huggingface/modules/transformers_modules/Qwen-7B-Chat/modeling_qwen.py", line 1120, in generate
return super().generate(
File "/home/aiteam/anaconda3/envs/llm-py10/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
return func(*args, **kwargs)
File "/home/aiteam/work2/chenzhihao/transformers/src/transformers/generation/utils.py", line 1615, in generate
return self.sample(
File "/home/aiteam/work2/chenzhihao/transformers/src/transformers/generation/utils.py", line 2773, in sample
next_tokens = torch.multinomial(probs, num_samples=1).squeeze(1)
RuntimeError: probability tensor contains either inf, nan or element < 0

Expected Behavior

No response

Steps To Reproduce

No response

Environment

- OS:
- Python:
- Transformers:
- PyTorch:
- CUDA (`python -c 'import torch; print(torch.version.cuda)'`):

Anything else?

No response

aleimu added a commit to aleimu/langchain-ChatGLM that referenced this issue Aug 10, 2023
@fyabc
Contributor

fyabc commented Aug 10, 2023

Hi @zhihao-chen, we could not reproduce this issue on our side. Could you provide more detailed reproduction code?

@zhihao-chen
Author

from transformers.generation import GenerationConfig
from transformers import AutoModelForCausalLM,AutoTokenizer
import torch
import os
import platform

os_name = platform.system()
clear_command = 'cls' if os_name == 'Windows' else 'clear'
stop_stream = False

def signal_handler():
    global stop_stream
    stop_stream = True

device = torch.device("cuda:6")
model = AutoModelForCausalLM.from_pretrained("Qwen/Qwen-7B-Chat",
                                             torch_dtype=torch.float16,
                                             trust_remote_code=True,
                                             low_cpu_mem_usage=True, device_map=device)
tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen-7B-Chat", use_fast=False, trust_remote_code=True)
model.generation_config = GenerationConfig.from_pretrained("Qwen/Qwen-7B-Chat", trust_remote_code=True)
model.eval()

history = []
while True:
    query = input("用户:")
    if not query:
        continue
    if query == 'stop':
        break
    if query == "clear":
        history = []
        os.system(clear_command)
        continue
    response, history = model.chat(tokenizer, query, history=history, system="",
                                   top_p=0.75, temperature=0.3)
    print("AI:", response)

@fyabc
Contributor

fyabc commented Aug 10, 2023

Hi, could you provide the prompt that triggered the error?

@zhihao-chen
Author

I just entered "你好" ("Hello") and it errored.

@zhihao-chen
Author

I am not using flash-attn; this is on a V100.

@fyabc
Contributor

fyabc commented Aug 10, 2023

Could you share your Python, PyTorch, transformers, and CUDA versions?

Also, per this issue, does explicitly passing do_sample=False to model.chat() resolve the problem?
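
For reference, a minimal sketch of that workaround, assuming the same model and tokenizer objects as in the reproduction script above (an illustration, not the only possible fix):

# Hedged workaround sketch: do_sample=False switches generation to greedy
# decoding, so torch.multinomial is never called and temperature/top_p are
# effectively ignored.
response, history = model.chat(tokenizer, query, history=history, system="",
                               do_sample=False)
print("AI:", response)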

@zhihao-chen
Author

do_sample=False does work around it, but that is just greedy search, so the temperature parameter becomes pointless anyway.
python = 3.10
pytorch = 2.0.1
transformers = 4.32.0.dev0
cuda = 11.7
GPU: V100

@fyabc
Contributor

fyabc commented Aug 11, 2023

We still cannot reproduce this issue on our side. Could you try changing the transformers version to 4.31.0 and testing again?

@pei55

pei55 commented Aug 21, 2023

I hit the same error, on a 3090 GPU with transformers 4.31.0.

@JustinLin610
Member

There are already quite a few related issues about this. If you are running a quantized model, we recommend not adjusting the temperature and controlling determinism via top_p instead. If you are running a non-quantized model, we recommend using bf16 rather than fp16, and avoiding very low temperature values, which easily cause precision overflow problems, especially with fp16.
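
A minimal sketch of that advice, assuming a GPU with bf16 support and using the standard Hugging Face loading arguments (torch_dtype=torch.bfloat16) rather than any Qwen-specific flags:

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from transformers.generation import GenerationConfig

# Load in bf16 instead of fp16 to reduce the risk of numerical overflow when sampling.
tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen-7B-Chat", trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen-7B-Chat",
    torch_dtype=torch.bfloat16,  # assumes the GPU supports bf16 (e.g. Ampere or newer)
    trust_remote_code=True,
    device_map="auto",
).eval()
model.generation_config = GenerationConfig.from_pretrained("Qwen/Qwen-7B-Chat", trust_remote_code=True)

# Leave temperature at its default and steer determinism with top_p instead.
response, history = model.chat(tokenizer, "你好", history=None, top_p=0.5)
print(response)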
