[Bug]: TypeError: '_PlaceholderModuleAttr' object is not callable for RunAI SafetensorsStreamer() #11858
Closed
Your current environment
The output of `python collect_env.py`
Model Input Dumps
No response
🐛 Describe the bug
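I previously hit an S3 pickling bug: #11819
That pickling issue was fixed by applying the change from PR #11825.
However, when loading the model across multiple GPUs again, a new error surfaced: `TypeError: '_PlaceholderModuleAttr' object is not callable`, raised for RunAI `SafetensorsStreamer()`.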
This is the command used:
vllm serve s3://llama-3.3-70b-instruct --load-format runai_streamer --max-num-seqs 8 --tensor-parallel-size 4
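The error names a placeholder object rather than the real streamer class, which would be consistent with an optional dependency failing to import in the process that loads the weights. This is only a guess, not a confirmed diagnosis. A minimal sketch for checking that, assuming the streamer's importable module is called `runai_model_streamer` (an assumption; the name does not appear in this report):

```python
import importlib.util


def is_importable(module_name: str) -> bool:
    """Return True if the module can be located, without importing it."""
    try:
        return importlib.util.find_spec(module_name) is not None
    except (ImportError, ValueError):
        # find_spec raises for a missing parent package or a broken __spec__.
        return False


if __name__ == "__main__":
    # Assumed module name; adjust it if the package exposes a different one.
    name = "runai_model_streamer"
    print(f"{name} importable: {is_importable(name)}")
```

Running this with the same Python interpreter that launches `vllm serve` would show whether the package is visible in that environment.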