[Bug]: flakey test found in #7874 #8051

noooop · 2024-08-31T03:13:28Z

#7874 adjusted the chunked prefill scheduling order.

The fp8_e4m3 model FAILED:

FAILED basic_correctness/test_chunked_prefill.py::test_models_with_fp8_kv_cache[True-1-False-4-4-fp8_e4m3-nm-testing/Qwen2-1.5B-Instruct-FP8-K-V] - AssertionError: Test7:

FAILED basic_correctness/test_chunked_prefill.py::test_models_with_fp8_kv_cache[True-1-True-4-4-fp8_e4m3-nm-testing/Qwen2-1.5B-Instruct-FP8-K-V] - AssertionError: Test7:

but the bf16 model PASSED:

tests/basic_correctness/test_chunked_prefill.py::test_models_with_fp8_kv_cache[True-1-False-4-4-auto-Qwen/Qwen2-1.5B-Instruct]
tests/basic_correctness/test_chunked_prefill.py::test_models_with_fp8_kv_cache[True-1-True-4-4-auto-Qwen/Qwen2-1.5B-Instruct]

Test 7 appears to be limited by the numerical resolution of fp8_e4m3.
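To illustrate the resolution gap between fp8_e4m3 and bf16, here is a minimal sketch that rounds a value to a given number of explicit mantissa bits (3 for e4m3, 7 for bf16). The helper name `round_to_mantissa_bits` is hypothetical, and the sketch deliberately ignores exponent range, subnormals, saturation, and the scaling factors a real FP8 KV cache uses, so it shows only the relative spacing of representable values, not vLLM's actual quantization path.

```python
import math


def round_to_mantissa_bits(x: float, mantissa_bits: int) -> float:
    """Round x to the nearest value with `mantissa_bits` explicit mantissa bits.

    Simplification (assumption): exponent range, subnormals, and
    saturation are ignored; only mantissa precision is modeled.
    """
    if x == 0.0:
        return 0.0
    m, e = math.frexp(x)  # x == m * 2**e with 0.5 <= |m| < 1
    # One implicit leading bit plus the explicit mantissa bits.
    scale = 2 ** (mantissa_bits + 1)
    return math.ldexp(round(m * scale) / scale, e)


value = 0.3
as_e4m3 = round_to_mantissa_bits(value, 3)  # -> 0.3125
as_bf16 = round_to_mantissa_bits(value, 7)  # -> 0.30078125
print(as_e4m3, abs(as_e4m3 - value) / value)
print(as_bf16, abs(as_bf16 - value) / value)
```

With only 3 mantissa bits, the rounding error on each cached value is roughly an order of magnitude larger than with bf16, which is plausibly enough to reorder tokens whose log probs are close.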

🐛 Describe the bug

As @jon-chuang said:

The test using top-k log probs may have been bound to be flaky. Perhaps a testing style like this is more reliable, especially given hardware differences or drift across kernels. #8013

(However, note that this is just top-1 log probs, i.e. greedy decode, so I am not sure whether even this testing strategy will be reliable.)


jon-chuang · 2024-09-03T05:10:06Z

Will try setting a higher NUM_LOGPROBS for fp8_e4m3, since the test simply checks whether the token sampled by one model is in the top-K log probs of the reference, and vice versa.
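The check described in this comment can be sketched roughly as follows. This is an illustration under stated assumptions, not the actual vLLM test helper: the function name `first_top_k_mismatch` and the input layout (one `{token_id: logprob}` dict per position) are hypothetical. The sketch stops at the first divergence because, once the two sequences differ, later positions are conditioned on different prefixes.

```python
from typing import Dict, Sequence


def first_top_k_mismatch(
    tokens_a: Sequence[int],
    topk_a: Sequence[Dict[int, float]],
    tokens_b: Sequence[int],
    topk_b: Sequence[Dict[int, float]],
) -> int:
    """Return the index where the top-K cross-check fails, or -1 if it passes."""
    for i, (tok_a, tok_b) in enumerate(zip(tokens_a, tokens_b)):
        if tok_a == tok_b:
            continue
        # First divergence: each side's token must appear in the
        # other side's top-K candidates at this position.
        if tok_a in topk_b[i] and tok_b in topk_a[i]:
            return -1  # acceptable divergence; later tokens are not comparable
        return i
    return -1


ref_tokens = [1, 2, 3]
fp8_tokens = [1, 2, 4]
ref_topk = [{1: -0.1}, {2: -0.2}, {3: -0.5, 4: -0.9}]

# With a small K, token 3 is missing from the fp8 candidates: the check fails.
fp8_topk_small = [{1: -0.1}, {2: -0.2}, {4: -0.6}]
print(first_top_k_mismatch(ref_tokens, ref_topk, fp8_tokens, fp8_topk_small))  # 2

# With a larger K, token 3 is included and the same outputs pass.
fp8_topk_large = [{1: -0.1}, {2: -0.2}, {4: -0.6, 3: -0.7}]
print(first_top_k_mismatch(ref_tokens, ref_topk, fp8_tokens, fp8_topk_large))  # -1
```

The example shows why raising the number of returned log probs can make the comparison more tolerant: the generated tokens are unchanged, but a wider candidate set is more likely to contain the other model's choice.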

noooop added the bug label Aug 31, 2024

noooop mentioned this issue Aug 31, 2024

[Bugfix] Fix #7592 vllm 0.5.4 enable_chunked_prefill throughput is slightly lower than 0.5.3~0.5.0. #7874

Merged

noooop added a commit to noooop/vllm that referenced this issue Aug 31, 2024

flakey test, see: vllm-project#7874 vllm-project#8051

95aab1c

noooop added a commit to noooop/vllm that referenced this issue Aug 31, 2024

flakey test, see: vllm-project#7874 vllm-project#8051

57dc722

noooop added a commit to noooop/vllm that referenced this issue Aug 31, 2024

flakey test, see: vllm-project#7874 vllm-project#8051

a05dd0b

noooop added a commit to noooop/vllm that referenced this issue Aug 31, 2024

flakey test, see: vllm-project#7874 vllm-project#8051

ad5f1db

jon-chuang mentioned this issue Sep 3, 2024

[MISC] Consolidate FP8 kv-cache tests #8131

Merged

noooop closed this as completed Sep 6, 2024

noooop mentioned this issue Oct 22, 2024

[Bug]: Models produce different output with different batch sizes #9567

Closed

1 task
