Closed
Labels
P0, bug, community-backlog, core
Description
What happened + What you expected to happen
Running the reproduction script below hits a check failure. It should not.
Versions / Dependencies
master
Reproduction script
ray job submit -- python -m vllm.entrypoints.openai.api_server --model meta-llama/Meta-Llama-3.1-8B-Instruct --enable-chunked-prefill --max-num-batched-tokens 2048 --max-num-seqs 64 --tokenizer-pool-size 2 --trust-remote-code --tensor-parallel-size 1 --max-model-len 8192
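For anyone who would rather drive the reproduction from Python, here is a minimal sketch of the same submission through the Ray Job Submission SDK. It is not from the original report: the helper names `build_entrypoint` and `submit` are hypothetical, the dashboard address `http://127.0.0.1:8265` is an assumed local default, and whether this path hits the same check failure as the CLI has not been verified.

```python
import shlex

# Arguments copied verbatim from the reproduction command in this issue.
VLLM_ARGS = [
    "python", "-m", "vllm.entrypoints.openai.api_server",
    "--model", "meta-llama/Meta-Llama-3.1-8B-Instruct",
    "--enable-chunked-prefill",
    "--max-num-batched-tokens", "2048",
    "--max-num-seqs", "64",
    "--tokenizer-pool-size", "2",
    "--trust-remote-code",
    "--tensor-parallel-size", "1",
    "--max-model-len", "8192",
]


def build_entrypoint(args=VLLM_ARGS):
    """Join the argument list into one shell-safe entrypoint string."""
    return shlex.join(args)


def submit(address="http://127.0.0.1:8265"):
    """Submit the entrypoint as a Ray job and return the job ID.

    Assumption: a Ray cluster is already running and its dashboard is
    reachable at `address` (adjust for your cluster).
    """
    # Imported lazily so the command can be inspected without Ray installed.
    from ray.job_submission import JobSubmissionClient

    client = JobSubmissionClient(address)
    return client.submit_job(entrypoint=build_entrypoint())
```

Calling `build_entrypoint()` alone shows the exact command string that would be submitted; `submit()` needs a running cluster.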
Issue Severity
None