fix: multi gpu gpt not stop generate when end_id #487
Conversation
Hi, xbugliu. Thank you for the feedback. We are fixing the issue now; the root cause is not only here. We will update the fix ASAP.
With one GPU, the GPT model also does not stop immediately when the end_id is hit.
Hi. This fix would lead to a hang with pipeline parallelism, so we cannot merge it into the main branch.
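The hang mentioned above can be sketched with a minimal, purely illustrative Python simulation (this is not FasterTransformer's actual CUDA/MPI code; the names and numbers are hypothetical): under pipeline parallelism every rank must execute the same number of decode steps, so the stop decision has to come from a collective agreement across ranks, not from each rank's local finished flag.

```python
# Hypothetical sketch: why breaking on a *local* flag can desynchronize
# pipeline ranks. Here the stop test simulates an allreduce(AND) across
# ranks, so every rank leaves the loop on the same step.
def run_rank(rank, done_at, max_steps):
    """Simulate one pipeline rank. `done_at[r]` is the step at which
    rank r's local sequences finish. Each rank stops only when *all*
    ranks are done (the simulated collective), never earlier."""
    steps_taken = 0
    for step in range(max_steps):
        steps_taken += 1
        # Simulated allreduce(AND): globally done only if every rank is done.
        globally_done = all(step >= d for d in done_at)
        if globally_done:
            break
    return steps_taken

# Every rank takes the same number of steps, so no rank is left blocked
# waiting on a peer that exited early.
done_at = [2, 5, 3]  # per-rank local finish steps (hypothetical)
counts = [run_rank(r, done_at, max_steps=10) for r in range(3)]
```

If each rank instead broke as soon as its own sequences hit end_id, a rank that finished at step 2 would stop sending activations while downstream ranks still waited on it, which is the deadlock the maintainers describe.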
OK.
@hongqing1986 I deployed a BLOOM model on one GPU; it does not stop immediately when the end_id is hit, and I built the image.
Do you use the branch https://github.com/NVIDIA/FasterTransformer/tree/tmp/fix_gpt_earlystop? |
Yes, I also tried this branch. I changed the FasterTransformer repo in the file and built the triton_with_ft image with
Let's focus on FT's C example first. Can you share how to reproduce the issue with the FT C example? Or do you not encounter the issue on FT itself, only on the backend?
Thank you very much. At present I have only encountered the issue when using the backend; I have not tried the FT C example yet. I will try the FT C example next and provide feedback.
The issue is fixed in MR #584 and merged into the main branch. Sorry for the late fix.
With the multi-GPU GPT model, generation does not stop immediately when the end_id is hit; it keeps running until output_seq_len.
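The expected behavior can be sketched as a minimal, hypothetical generation loop (this is not FasterTransformer's implementation; `step_fn` and all names here are illustrative assumptions): decoding should break as soon as every sequence in the batch has produced end_id, rather than always running for output_seq_len steps.

```python
# Hypothetical sketch of early stopping at end_id (not FasterTransformer code).
def generate(step_fn, batch_size, end_id, output_seq_len):
    """Run decode steps, stopping early once every sequence has hit end_id.

    step_fn() returns one next token per sequence in the batch."""
    finished = [False] * batch_size
    outputs = [[] for _ in range(batch_size)]
    for _ in range(output_seq_len):
        tokens = step_fn()
        for i, tok in enumerate(tokens):
            if not finished[i]:
                outputs[i].append(tok)
                if tok == end_id:
                    finished[i] = True
        if all(finished):
            # The reported bug is equivalent to missing this break:
            # the loop kept running until output_seq_len.
            break
    return outputs
```

For example, with end_id = 0 and a batch of two sequences, the loop above exits on the step where the last unfinished sequence emits 0, instead of padding out to output_seq_len.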