Support string-based stopping conditions #92

WoosukKwon · 2023-05-10T08:30:27Z

No description provided.


* Reading the shapes CSV only once and writing only if a new shape is discovered
* fix lint

---------

Co-authored-by: Gregory Shtrasberg <Gregory.Shtrasberg@amd.com>

* Cleanup AttentionMetadata on HPU
* Flat PA - POC
* Decode warmup overhaul
* Debugging OOM
* Experimental profiling
* Fix input_hash calculation
* Block bucket size 32 -> 16
* Improve host time
* Skip UTs
* Add GQA/MQA
* Add mask instead of filling
* 2d block mapping
* Optional flipping in PA
* Runner updated for 2d block mapping
* Restore mark_step
* Eliminate physical transposes
* Disable warmup_mode
* Revert changes to test_attention.py
* POC: build block_bias on device
* Cleanup
* Fix seq_len calculation
* Experimental profiling
* Add missing call to kv_matmul_op
* Fix block_usage calculation
* Change default block bucket step for decode to 128
* Fix max decode block bucket calculation
* Fix block_usage calculations
* Cleanup
* Cleanup profiler code
* Print values for bucketing vars
* Pass block size to HpuModelAdapter

---------

Co-authored-by: barak goldberg <149692267+bgoldberg-habana@users.noreply.github.com>

WoosukKwon self-assigned this May 10, 2023

WoosukKwon changed the title ~~Support stopping conditions with multiple tokens.~~ Support string-based stopping conditions May 10, 2023

WoosukKwon added the P1 label May 10, 2023

WoosukKwon added P0 and removed P1 labels May 17, 2023

WoosukKwon mentioned this issue May 21, 2023

Implement stop strings and best_of #114

Merged

WoosukKwon closed this as completed in #114 May 21, 2023

yukavio pushed a commit to yukavio/vllm that referenced this issue Jul 3, 2024

Benchmarking : Remote push job (vllm-project#92)

fa8e147

SUMMARY: Trigger minimal benchmarking on remote-push jobs.
TEST PLAN: Jobs on this PR

Co-authored-by: Varun Sundar Rabindranath <varun@neuralmagic.com>
