Commit b3c345f

Add block quantization e2e test (#1867)
SUMMARY:
Added e2e testing for block quantization.

TEST PLAN:
Tested locally with the following command:
```
python -m pytest tests/e2e/vLLM/test_vllm.py -vv -s
```

log:
```
================= vLLM GENERATION =================
PROMPT: The capital of France is
GENERATED TEXT: Paris, which is located in the Île-de-France region. The
PROMPT: The president of the US is
GENERATED TEXT: paying for the protests against him. The White House has reportedly cut
PROMPT: My name is
GENERATED TEXT: [insert name], and I am a [insert job title]. I am excited
PASSED
===================================================================================================================== 1 passed in 130.10s (0:02:10) =====================================================================================================================
```

---------

Signed-off-by: shanjiaz <zsjwpianpian@gmail.com>
1 parent 4cfc0e6 commit b3c345f

1 file changed: 4 additions & 0 deletions
@@ -0,0 +1,4 @@
+cadence: "nightly"
+test_type: "regression"
+model: TinyLlama/TinyLlama-1.1B-Chat-v1.0
+scheme: FP8_BLOCK
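
For reviewers unfamiliar with these config files, here is a minimal sketch of how the four added fields could be read into a typed object before a test run. It is an illustration only: `E2EConfig` and `parse_flat_config` are hypothetical names, the parser handles just flat `key: value` lines, and the real harness in `tests/e2e/vLLM/test_vllm.py` is not shown in this commit and may load the file differently.

```python
from dataclasses import dataclass

# The four lines added by this commit, inlined so the sketch is self-contained.
CONFIG_TEXT = """\
cadence: "nightly"
test_type: "regression"
model: TinyLlama/TinyLlama-1.1B-Chat-v1.0
scheme: FP8_BLOCK
"""


@dataclass
class E2EConfig:
    """Hypothetical container for the fields in the new config file."""

    cadence: str
    test_type: str
    model: str
    scheme: str


def parse_flat_config(text: str) -> E2EConfig:
    """Parse flat `key: value` lines (an assumption: no nesting, no lists)."""
    fields = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        key, sep, value = line.partition(":")
        if not sep:
            raise ValueError(f"expected 'key: value', got: {raw_line!r}")
        # Drop surrounding whitespace and optional quotes around the value.
        fields[key.strip()] = value.strip().strip("\"'")
    return E2EConfig(**fields)


config = parse_flat_config(CONFIG_TEXT)
print(config)
```

A missing or unexpected key makes `E2EConfig(**fields)` raise a `TypeError`, which surfaces a malformed config before any model is loaded.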
