
Add vLLM e2e tests #117

Merged
merged 22 commits into from
Aug 28, 2024

Conversation

@dsikka (Collaborator) commented Aug 26, 2024

Summary

  • Add an e2e test with vLLM
  • Add test cases for fp8, int8, w4a16, and w8a16
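The commit history below mentions moving these cases to config files. A minimal sketch of how the four schemes could be parameterized that way is shown here; the directory layout and helper name are assumptions for illustration, not the PR's actual file structure.

```python
# Hypothetical config-file-driven parameterization of the four schemes
# listed in the PR summary. The path convention is an assumption.
SCHEMES = ["FP8", "INT8", "W4A16", "W8A16"]

def config_files_for(schemes):
    """Map each quantization scheme to an assumed per-scheme YAML config,
    mirroring the PR's move to config-file-driven test cases."""
    return [f"tests/e2e/vLLM/configs/{s.lower()}.yaml" for s in schemes]
```

A test harness could then collect one test case per returned config file.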

Testing

@dsikka dsikka requested a review from Satrat August 27, 2024 01:34
# Run vLLM with saved model
print("================= RUNNING vLLM =========================")
sampling_params = SamplingParams(temperature=0.80, top_p=0.95)
llm = LLM(model=self.save_dir)
Collaborator
Having a test for tp>1 would also be a good idea if we can.

Collaborator Author

Yeah, I think that'll be a follow-up test, since the structure will need to change a bit to handle tp>1 within the same process.

I do think that's more of a vLLM test. If anything, we could extend this to publish test models which are then pulled down for all vLLM tests.
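A follow-up tp>1 case would likely need a guard that skips when the machine has fewer GPUs than the requested tensor-parallel size. A minimal sketch of that guard, with hypothetical names (`should_skip_tp` is not from this PR):

```python
def should_skip_tp(tp_size: int, available_gpus: int) -> bool:
    """Return True when a tensor-parallel test case should be skipped
    because fewer GPUs are available than tensor_parallel_size requires."""
    return available_gpus < tp_size

# In a pytest-style harness this could gate each case, e.g. constructing
#   LLM(model=save_dir, tensor_parallel_size=tp_size)
# only when should_skip_tp(tp_size, torch.cuda.device_count()) is False.
```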

llm = LLM(model=self.save_dir)
outputs = llm.generate(self.prompts, sampling_params)
print("================= vLLM GENERATION ======================")
print(outputs)
Collaborator

I would suggest running gsm8k on 200 samples
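One lightweight way to score a GSM8K run is exact match on the final answer: GSM8K reference answers end with a `#### <answer>` marker, and generations can be compared on that extracted value (assuming the prompt format induces the same marker in model output). The helper names below are illustrative, not from this PR.

```python
from typing import Optional

def extract_gsm8k_answer(text: str) -> Optional[str]:
    """GSM8K references end with '#### <answer>'; pull out that final
    answer (commas stripped) so generations can be scored by exact match."""
    marker = "####"
    if marker not in text:
        return None
    return text.rsplit(marker, 1)[1].strip().replace(",", "")

def exact_match_accuracy(preds, refs) -> float:
    """Fraction of prediction/reference pairs whose extracted answers match."""
    pairs = list(zip(preds, refs))
    hits = sum(extract_gsm8k_answer(p) == extract_gsm8k_answer(r) for p, r in pairs)
    return hits / len(pairs)
```

On 200 samples this gives a single accuracy number per scheme that can be asserted against a tolerance rather than comparing raw generations.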

@Satrat (Contributor) left a comment

Looks great and covers all the main cases I can think of! Just had one note on validating output.

tests/e2e/vLLM/test_vllm.py (conversation resolved)
@dsikka dsikka merged commit 8e43aaa into main Aug 28, 2024
4 of 7 checks passed
@dsikka dsikka deleted the e2e_tests branch August 28, 2024 19:29
kylesayrs pushed a commit that referenced this pull request Aug 28, 2024
* add first test

* update tests

* update to use config files

* update test

* update to add int8 tests

* update

* fix condition

* fix typo

* add w8a16

* update

* update to clear session and delete dirs

* conditional import for vllm

* update

* update num samples

* add more test cases; add custom recipe support

* update model

* update recipe modifier

* Update fp8_weight_only.yaml

* add more test cases

* try a larger model

* revert

* add description; save model to hub post testing
markmc pushed a commit to markmc/llm-compressor that referenced this pull request Nov 13, 2024
Labels: none · Projects: none · 4 participants