This repository has been archived by the owner on Oct 9, 2024. It is now read-only.

Wrong prediction from "bloom-deepspeed-inference-int8" #10

Closed
zomux opened this issue Sep 19, 2022 · 9 comments

Comments


zomux commented Sep 19, 2022

I'm running bloom-deepspeed-inference-int8 with the following command on an 8 × 40GB A100 machine.

deepspeed --num_gpus 8 xxx.py --name microsoft/bloom-deepspeed-inference-int8 --dtype int8 --batch_size 8

I get generation results, but they contain a lot of repetition, which is not the case with the accelerate-based BLOOM int8 implementation.

Generate args {'max_new_tokens': 100, 'do_sample': False}
------------------------------------------------------------
in=DeepSpeed is a machine learning framework
out=DeepSpeed is a machine learning framework for deep deep deep deep deep ["deep" repeated for the remainder of the 100 generated tokens]

------------------------------------------------------------
in=He is working on
out=He is working on a new album, and is also working on a new album with his band, and is also working on a new album with his band, and is also working on a new album with his band, and is working on a new album, and is working on a new album,

------------------------------------------------------------
in=He has a
out=He has a lot of money.
He has a lot of money.
He has a lot of money.
He has a lot of money.
He has a lot of money.
He has a

------------------------------------------------------------
in=He got all
out=He got all the way to the top of the mountain, and he was so very very very very very very very very very very very very very very

------------------------------------------------------------
in=Everyone is happy and I can
out=Everyone is happy and I can see that. I am happy too. I am happy too. I am happy too.

------------------------------------------------------------
in=The new movie that got Oscar this year
out=The new movie that got Oscar this year is a movie about a movie about a movie about a movie about

------------------------------------------------------------
in=In the far far distance from our galaxy,
out=In the far far distance from our galaxy, there is a a a a a a a a a galaxy

------------------------------------------------------------
in=Peace is the only way
out=Peace is the only way to live. We must be peaceful and live in
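The degenerate loops in the samples above ("deep deep deep…", "He has a lot of money." repeated) can be flagged programmatically with a simple repeated-n-gram check. A minimal sketch for triaging outputs like these (the helper names and threshold are my own, not from DeepSpeed or transformers):

```python
from collections import Counter

def max_ngram_repeat(text: str, n: int = 3) -> int:
    """Return the highest occurrence count of any word n-gram in `text`."""
    words = text.split()
    if len(words) < n:
        return 0
    grams = Counter(tuple(words[i:i + n]) for i in range(len(words) - n + 1))
    return max(grams.values())

def looks_degenerate(text: str, n: int = 3, threshold: int = 4) -> bool:
    """Heuristic: flag output whose most frequent n-gram repeats too often."""
    return max_ngram_repeat(text, n) >= threshold

print(looks_degenerate("He has a lot of money. " * 5))    # True
print(looks_degenerate("Peace is the only way to live."))  # False
```

A check like this makes it easy to scan a batch of generations for the broken-kernel symptom instead of eyeballing each output.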

zomux commented Sep 19, 2022

I also got the error "probability tensor contains either inf, nan or element < 0" with a longer prompt.

mayank31398 (Collaborator)

@zomux can you try updating to the latest deepspeed (0.7.3)?
microsoft/DeepSpeed#2217 (comment)
This issue was mentioned before and is now fixed.


zomux commented Sep 19, 2022

@mayank31398 I'm running DeepSpeed from the latest GitHub checkout:

➜  pip list | grep deepspeed
deepspeed                     0.7.3+15923810

Thanks for the pointer, I will check the discussion there.


zomux commented Sep 20, 2022

I'm also getting the "CUDA error: an illegal memory access was encountered" error with a slightly longer prompt, same as microsoft/DeepSpeed#2217 (comment).

Is it possible that the checkpoints in https://huggingface.co/microsoft/bloom-deepspeed-inference-int8/tree/main are produced before that fix was merged?

mayank31398 (Collaborator)

The checkpoints don't have anything to do with this.

mayank31398 (Collaborator)

Try using the `ds-inference/support-large-token-length` branch in DeepSpeed. This is still a WIP.


zomux commented Sep 20, 2022

Awesome, thanks, I'll check it out. Let me know if you want more details for reproducing this problem.


zomux commented Sep 20, 2022

@mayank31398 Thanks for the pointers. I think my issue is solved after putting the different pieces together. Thanks!


zomux commented Sep 20, 2022

Resolving this issue.

@zomux zomux closed this as completed Sep 20, 2022