T5 enc/dec example file; linting/formatting #1

Merged: 16 commits, Mar 2, 2024

Conversation

afeldman-nm

SUMMARY
This PR is mainly to set up a process whereby I can PR from my vLLM fork into your fork. It also (1) applies linting and formatting via ./format.sh, and (2) moves your test.py file into examples/ and restructures it to match the other examples.

TESTING
Once this PR is finished, examples/offline_inference_enc_dec.py should compare T5 inference results between vLLM and native PyTorch execution, with the particular T5 variant and the dtype customizable. This should help validate the correctness of the vLLM T5 integration and debug the NaN case for T5-large.
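
For anyone skimming the thread, here is a rough sketch of the kind of comparison the example performs. This is not the actual examples/offline_inference_enc_dec.py; it assumes the fork exposes T5 through the standard vllm.LLM API, and MODEL/DTYPE stand in for the knobs the example lets you customize:

```python
# Hedged sketch of a vLLM vs. native PyTorch T5 comparison (not the real
# example file); assumes the fork supports T5 via the standard vllm.LLM API.
import torch
from transformers import T5ForConditionalGeneration, T5Tokenizer
from vllm import LLM, SamplingParams

MODEL = "t5-small"   # or "t5-large"
DTYPE = "float32"    # or "float16" / "bfloat16"
PROMPTS = ["Who are you?", "How do you like your egg made"]

# Native PyTorch (HuggingFace) reference path, loaded in the same dtype.
tokenizer = T5Tokenizer.from_pretrained(MODEL)
hf_model = T5ForConditionalGeneration.from_pretrained(
    MODEL, torch_dtype=getattr(torch, DTYPE))
for prompt in PROMPTS:
    input_ids = tokenizer(prompt, return_tensors="pt").input_ids
    output_ids = hf_model.generate(input_ids, max_new_tokens=64)
    print("Native PyTorch:",
          tokenizer.decode(output_ids[0], skip_special_tokens=True))

# vLLM path, using greedy decoding so the two outputs are directly comparable.
llm = LLM(model=MODEL, dtype=DTYPE)
outputs = llm.generate(PROMPTS, SamplingParams(temperature=0.0, max_tokens=64))
for out in outputs:
    print("vLLM:", out.outputs[0].text)
```
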

@afeldman-nm
Author

Hello, quick update: as of the last commit, the test at examples/offline_inference_enc_dec.py now yields a comparison between vLLM and native PyTorch completion results. The model size and dtype may be customized for a given run of the example script.

I ran the example script for the following test cases:

  • model = ['t5-small','t5-large']
  • dtype = ['float16','bfloat16','float32'] # There is no bfloat32

The results are below. In summary, vLLM T5 completions consistently match native PyTorch, except for the t5-large/float16 case where only vLLM yields NaNs that result in an empty completion. I am not concerned about this for now, as I suspect T5 was optimized for FP32.

t5-small, float16:

Prompt: 'Who are you?', Native PyTorch generated text: 'Wer bist du?', vLLM generated text: ' Wer bist du?'
Prompt: 'Who are you?', Native PyTorch generated text: 'Wer bist du?', vLLM generated text: ' Wer bist du?'
Prompt: 'How do you like your egg made', Native PyTorch generated text: 'Wie m aime o egg made', vLLM generated text: ' Wie m aime o egg made'
Prompt: 'How do you like your egg made', Native PyTorch generated text: 'Wie m aime o egg made', vLLM generated text: ' Wie m aime o egg made'

t5-small, bfloat16:

Prompt: 'Who are you?', Native PyTorch generated text: 'Wer bist du?', vLLM generated text: ' Wer bist du?'
Prompt: 'Who are you?', Native PyTorch generated text: 'Wer bist du?', vLLM generated text: ' Wer bist du?'
Prompt: 'How do you like your egg made', Native PyTorch generated text: 'Wie m aime o egg made', vLLM generated text: ' Wie m aime o egg made'
Prompt: 'How do you like your egg made', Native PyTorch generated text: 'Wie m aime o egg made', vLLM generated text: ' Wie m aime o egg made'

t5-small, float32:

Prompt: 'Who are you?', Native PyTorch generated text: 'Wer bist du?', vLLM generated text: ' Wer bist du?'
Prompt: 'Who are you?', Native PyTorch generated text: 'Wer bist du?', vLLM generated text: ' Wer bist du?'
Prompt: 'How do you like your egg made', Native PyTorch generated text: 'Wie m aime o egg made', vLLM generated text: ' Wie m aime o egg made'
Prompt: 'How do you like your egg made', Native PyTorch generated text: 'Wie m aime o egg made', vLLM generated text: ' Wie m aime o egg made'

t5-large, float16:

Note: for the vLLM tests in this scenario, intermediate results of the inference process became NaN, which led to the vLLM output being an empty string. This was not the case for the native PyTorch output.

Prompt: 'Who are you?', Native PyTorch generated text: 'Who are you?', vLLM generated text: ''
Prompt: 'Who are you?', Native PyTorch generated text: 'Who are you?', vLLM generated text: ''
Prompt: 'How do you like your egg made', Native PyTorch generated text: '? How do you like your egg made?', vLLM generated text: ''
Prompt: 'How do you like your egg made', Native PyTorch generated text: '? How do you like your egg made?', vLLM generated text: ''

t5-large, bfloat16:

Prompt: 'Who are you?', Native PyTorch generated text: 'Who are you?', vLLM generated text: ' Who are you?'
Prompt: 'Who are you?', Native PyTorch generated text: 'Who are you?', vLLM generated text: ' Who are you?'
Prompt: 'How do you like your egg made', Native PyTorch generated text: '? How do you like your egg made?', vLLM generated text: '? How do you like your egg made?'
Prompt: 'How do you like your egg made', Native PyTorch generated text: '? How do you like your egg made?', vLLM generated text: '? How do you like your egg made?'

t5-large, float32:

Prompt: 'Who are you?', Native PyTorch generated text: 'Who are you?', vLLM generated text: ' Who are you?'
Prompt: 'Who are you?', Native PyTorch generated text: 'Who are you?', vLLM generated text: ' Who are you?'
Prompt: 'How do you like your egg made', Native PyTorch generated text: '? How do you like your egg made?', vLLM generated text: '? How do you like your egg made?'
Prompt: 'How do you like your egg made', Native PyTorch generated text: '? How do you like your egg made?', vLLM generated text: '? How do you like your egg made?'

@afeldman-nm
Author

Also, I just resolved the remaining conflicts with vLLM upstream main.

@js8544
Owner

js8544 commented Mar 2, 2024

+1 Thanks!

@js8544 js8544 merged commit db726e6 into js8544:enc_dec_t5 Mar 2, 2024