Add vLLM e2e tests #117
Conversation
# Run vLLM with saved model
print("================= RUNNING vLLM =========================")
sampling_params = SamplingParams(temperature=0.80, top_p=0.95)
llm = LLM(model=self.save_dir)
Having a test for tp>1 (tensor parallelism) is also a good idea if we can.
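A rough sketch of what a tp>1 variant might look like inside the same test class, assuming the existing save_dir and prompts fixtures; tensor_parallel_size is vLLM's knob for sharding the model across GPUs:

from vllm import LLM, SamplingParams

# Hypothetical tp>1 variant of the existing test; requires at least 2 GPUs.
sampling_params = SamplingParams(temperature=0.80, top_p=0.95)
llm = LLM(model=self.save_dir, tensor_parallel_size=2)
outputs = llm.generate(self.prompts, sampling_params)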
Yeah, I think that'll be a follow-up test, since the structure will change a bit to handle tp>1 within the same process.
I do think that's more of a vLLM test. If anything, we could extend this to publish test models which are then pulled down for all vLLM tests.
tests/e2e/vLLM/test_vllm.py (Outdated)
llm = LLM(model=self.save_dir)
outputs = llm.generate(self.prompts, sampling_params)
print("================= vLLM GENERATION ======================")
print(outputs)
Logic for perplexity tests can be borrowed from https://github.com/vllm-project/llm-compressor/blob/main/tests/llmcompressor/transformers/compression/test_quantization.py
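For reference, a minimal perplexity check along those lines might look like the sketch below; save_dir, the sample text, and the threshold are placeholders, not values taken from the linked test:

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Minimal perplexity sketch over the saved compressed model.
model = AutoModelForCausalLM.from_pretrained(save_dir)
tokenizer = AutoTokenizer.from_pretrained(save_dir)
text = "The quick brown fox jumps over the lazy dog."  # placeholder sample
inputs = tokenizer(text, return_tensors="pt")
with torch.no_grad():
    # With labels supplied, the model returns the mean cross-entropy loss;
    # perplexity is exp(loss).
    loss = model(**inputs, labels=inputs["input_ids"]).loss
perplexity = torch.exp(loss).item()
assert perplexity < 50.0  # hypothetical threshold, would need tuning per model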
I would suggest running gsm8k on 200 samples
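One way to wire that up is lm-evaluation-harness's Python API; this is a sketch only, and the result keys and any pass/fail threshold are assumptions rather than part of the PR:

import lm_eval

# Hedged sketch: evaluate the saved model on 200 gsm8k samples via vLLM.
results = lm_eval.simple_evaluate(
    model="vllm",
    model_args=f"pretrained={save_dir}",  # save_dir from the test fixture
    tasks=["gsm8k"],
    limit=200,  # 200 samples, per the review suggestion
)
print(results["results"]["gsm8k"])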
Looks great and covers all the main cases I can think of! Just had one note on validating output.
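For example, a basic validation pass over the vLLM results might look like this sketch, assuming vLLM's RequestOutput objects, where each entry carries the prompt and its completions:

# Sketch of output validation: every prompt should yield non-empty text.
for output in outputs:
    assert len(output.outputs) > 0
    generated_text = output.outputs[0].text
    assert generated_text.strip(), f"Empty generation for prompt: {output.prompt!r}"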
* add first test
* update tests
* update to use config files
* update test
* update to add int8 tests
* update
* fix condition
* fix typo
* add w8a16
* update
* update to clear session and delete dirs
* conditional import for vllm
* update
* update num samples
* add more test cases; add custom recipe support
* update model
* update recipe modifier
* Update fp8_weight_only.yaml
* add more test cases
* try a larger model
* revert
* add description; save model to hub post testing
Summary
Testing
llm-compressor-testing: https://github.com/neuralmagic/llm-compressor-testing/actions/runs/10568408144/workflow