
RuntimeError: probability tensor contains either inf, nan or element < 0 #56

Open
pagyyuan opened this issue Sep 8, 2023 · 7 comments


@pagyyuan

pagyyuan commented Sep 8, 2023

Deploying the baichuan-13B-chat model throws this error.
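For reference, a minimal sketch (the model id, prompt, and generation settings are assumptions, not taken from the original report) of the kind of call that raises "probability tensor contains either inf, nan or element < 0": the error comes from torch.multinomial inside model.generate() when do_sample=True and the logits contain NaN or Inf.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_path = "baichuan-inc/Baichuan-13B-Chat"  # assumed model id
tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_path,
    torch_dtype=torch.float16,
    device_map="auto",
    trust_remote_code=True,
)

inputs = tokenizer("你好", return_tensors="pt").to(model.device)
# Sampling path: NaN/Inf in the softmax-ed logits triggers the RuntimeError here.
outputs = model.generate(**inputs, max_new_tokens=64, do_sample=True, top_p=0.8)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```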

@ywancit

ywancit commented Sep 8, 2023

The original model works fine for me, but 8-bit quantization throws the same error as yours: https://github.com/baichuan-inc/Baichuan2/issues/48#issue-1885592066.
Let's notify each other if either of us finds a solution.

@bihui9968

I ran into the same problem. Is there a solution yet?

@ChiQiuHong

@pagyyuan @ywancit @bihui9968
model = AutoModelForCausalLM.from_pretrained(quant8_saved_dir, load_in_8bit=True, device_map="auto", trust_remote_code=True)
The problem most likely comes from device_map="auto". My machine has limited RAM and GPU memory, so the model gets split between the CPU and CUDA during loading, and when the model is saved afterwards not all of the weights are written out (the saved 'pytorch_model' is only about 8 GB, which is not a normal size).
I loaded the whole model onto the CPU instead with device_map="cpu"; the saved 'pytorch_model' is 13.9 GB, and inference runs without any error.
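A small sketch of the size check described above (the directory name is a placeholder, not from the thread): add up the weight shards on disk to confirm the save was not truncated by a CPU/GPU-split device_map.

```python
import os

saved_dir = "quant8_saved_dir"  # placeholder for the local checkpoint directory

# Sum the sizes of the pytorch_model shard files; they should add up to the
# full checkpoint size, not the ~8 GB partial dump mentioned above.
shard_bytes = sum(
    os.path.getsize(os.path.join(saved_dir, name))
    for name in os.listdir(saved_dir)
    if name.startswith("pytorch_model") and name.endswith(".bin")
)
print(f"saved weight shards: {shard_bytes / 1e9:.1f} GB")
```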

@pagyyuan
Author

@ChiQiuHong After changing device_map from "auto" to "cpu" I get a new error: ValueError: If passing a string for device_map, please choose 'auto', 'balanced', 'balanced_low_0' or 'sequential'.
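A hedged workaround sketch for this ValueError: in the transformers versions that reject the string "cpu", device_map can still be passed as a dict mapping module names to devices, where an empty key ("") means the whole model. The directory name below is a placeholder.

```python
from transformers import AutoModelForCausalLM

model_dir = "quant8_saved_dir"  # placeholder for the directory used above

# Whole model on the CPU (only sensible for a non-quantized checkpoint, since
# the bitsandbytes 8-bit kernels need a GPU):
model = AutoModelForCausalLM.from_pretrained(
    model_dir,
    device_map={"": "cpu"},
    trust_remote_code=True,
)

# Or, for the 8-bit load, pin the whole model to a single GPU instead of
# letting "auto" spread it across CPU and GPU:
# model = AutoModelForCausalLM.from_pretrained(
#     model_dir,
#     load_in_8bit=True,
#     device_map={"": 0},
#     trust_remote_code=True,
# )
```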

@pagyyuan
Author

I added a print at the point where the error is raised; one of the tensors looks like this: tensor([[nan, nan, nan, ..., nan, nan, nan]], device='cuda:0', dtype=torch.float16)
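For reference, a small debugging sketch (not from the thread; the helper name is made up): it runs a single forward pass and reports whether the next-token logits contain NaN or Inf, which is what makes torch.multinomial fail inside generate().

```python
import torch

@torch.no_grad()
def inspect_next_token_logits(model, input_ids):
    """Report NaN/Inf in the next-token logits before sampling."""
    logits = model(input_ids=input_ids).logits[:, -1, :]
    print("contains nan:", torch.isnan(logits).any().item())
    print("contains inf:", torch.isinf(logits).any().item())
    print("dtype / device:", logits.dtype, logits.device)
    return logits
```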

@baolixiong

Has this been resolved?

@qiu404

qiu404 commented Nov 24, 2023

Take a look at #291 and see whether it solves your problem.
