Conversation

@pwschuurman
Contributor

@pwschuurman pwschuurman commented Sep 15, 2025

Purpose

  • Install the runai-model-streamer[gcs] pip package by default, to enable GCS support in the nightly / published image.

Test Plan / Test Result

Validated by building the Docker image and running it locally (see the updated documentation).
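
For anyone wanting to try this, loading from GCS should mirror the existing S3 flow, just with a gs:// path. A minimal sketch, assuming the runai-model-streamer[gcs] extra is installed (the bucket path below is hypothetical):

```python
from vllm import LLM

# Stream weights directly from GCS via the Run:ai model streamer.
# The gs:// path is a placeholder; substitute your own bucket/model.
llm = LLM(
    model="gs://my-bucket/Qwen3-8B",
    load_format="runai_streamer",
)
```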

@mergify mergify bot added the documentation, ci/build, and rocm labels Sep 15, 2025
Contributor

@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This pull request updates the runai-model-streamer dependency to version 0.14.0 and adds runai-model-streamer-gcs to enable loading models from Google Cloud Storage. The changes in the requirements files and documentation are consistent with this goal. I've found one high-severity issue related to packaging: the optional dependency group [runai] likely needs to be updated to include the new GCS package to make the feature available to users. Please see the detailed comment.
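
For illustration, the change the bot is asking for would look roughly like this in a setuptools extras group — a sketch only; the actual group in vLLM's setup.py may be structured differently:

```python
# Hypothetical extras_require entry adding the GCS backend alongside S3.
extras_require = {
    "runai": [
        "runai-model-streamer",
        "runai-model-streamer-s3",
        "runai-model-streamer-gcs",
    ],
}
```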

@pwschuurman
Contributor Author

> I've found one high-severity issue related to packaging: the optional dependency group [runai] likely needs to be updated to include the new GCS package to make the feature available to users. Please see the detailed comment.

Package group was updated in #23845

Contributor


With runai-model-streamer==0.14.0 I cannot download a model from S3; I'm not sure whether it's my environment or something else.

ERROR 09-16 02:53:54 [v1/engine/core.py:712] Exception: Could not receive runai_response from libstreamer due to: b'File access error'
(EngineCore_DP0 pid=375431) Process EngineCore_DP0:
(EngineCore_DP0 pid=375431) Traceback (most recent call last):
(EngineCore_DP0 pid=375431)   File "/usr/lib/python3.12/multiprocessing/process.py", line 314, in _bootstrap
(EngineCore_DP0 pid=375431)     self.run()
(EngineCore_DP0 pid=375431)   File "/usr/lib/python3.12/multiprocessing/process.py", line 108, in run
(EngineCore_DP0 pid=375431)     self._target(*self._args, **self._kwargs)
(EngineCore_DP0 pid=375431)   File "/root/code/vllm/vllm/v1/engine/core.py", line 716, in run_engine_core
(EngineCore_DP0 pid=375431)     raise e
(EngineCore_DP0 pid=375431)   File "/root/code/vllm/vllm/v1/engine/core.py", line 703, in run_engine_core
(EngineCore_DP0 pid=375431)     engine_core = EngineCoreProc(*args, **kwargs)
(EngineCore_DP0 pid=375431)                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=375431)   File "/root/code/vllm/vllm/v1/engine/core.py", line 502, in __init__
(EngineCore_DP0 pid=375431)     super().__init__(vllm_config, executor_class, log_stats,
(EngineCore_DP0 pid=375431)   File "/root/code/vllm/vllm/v1/engine/core.py", line 81, in __init__
(EngineCore_DP0 pid=375431)     self.model_executor = executor_class(vllm_config)
(EngineCore_DP0 pid=375431)                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=375431)   File "/root/code/vllm/vllm/executor/executor_base.py", line 55, in __init__
(EngineCore_DP0 pid=375431)     self._init_executor()
(EngineCore_DP0 pid=375431)   File "/root/code/vllm/vllm/executor/uniproc_executor.py", line 55, in _init_executor
(EngineCore_DP0 pid=375431)     self.collective_rpc("load_model")
(EngineCore_DP0 pid=375431)   File "/root/code/vllm/vllm/executor/uniproc_executor.py", line 83, in collective_rpc
(EngineCore_DP0 pid=375431)     return [run_method(self.driver_worker, method, args, kwargs)]
(EngineCore_DP0 pid=375431)             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=375431)   File "/root/code/vllm/vllm/utils/__init__.py", line 3067, in run_method
(EngineCore_DP0 pid=375431)     return func(*args, **kwargs)
(EngineCore_DP0 pid=375431)            ^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=375431)   File "/root/code/vllm/vllm/v1/worker/gpu_worker.py", line 214, in load_model
(EngineCore_DP0 pid=375431)     self.model_runner.load_model(eep_scale_up=eep_scale_up)
(EngineCore_DP0 pid=375431)   File "/root/code/vllm/vllm/v1/worker/gpu_model_runner.py", line 2390, in load_model
(EngineCore_DP0 pid=375431)     self.model = model_loader.load_model(
(EngineCore_DP0 pid=375431)                  ^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=375431)   File "/root/code/vllm/vllm/model_executor/model_loader/base_loader.py", line 50, in load_model
(EngineCore_DP0 pid=375431)     self.load_weights(model, model_config)
(EngineCore_DP0 pid=375431)   File "/root/code/vllm/vllm/model_executor/model_loader/runai_streamer_loader.py", line 103, in load_weights
(EngineCore_DP0 pid=375431)     model.load_weights(
(EngineCore_DP0 pid=375431)   File "/root/code/vllm/vllm/model_executor/models/qwen3.py", line 344, in load_weights
(EngineCore_DP0 pid=375431)     return loader.load_weights(weights)
(EngineCore_DP0 pid=375431)            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=375431)   File "/root/code/vllm/vllm/model_executor/models/utils.py", line 291, in load_weights
(EngineCore_DP0 pid=375431)     autoloaded_weights = set(self._load_module("", self.module, weights))
(EngineCore_DP0 pid=375431)                          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=375431)   File "/root/code/vllm/vllm/model_executor/models/utils.py", line 249, in _load_module
(EngineCore_DP0 pid=375431)     yield from self._load_module(prefix,
(EngineCore_DP0 pid=375431)   File "/root/code/vllm/vllm/model_executor/models/utils.py", line 222, in _load_module
(EngineCore_DP0 pid=375431)     loaded_params = module_load_weights(weights)
(EngineCore_DP0 pid=375431)                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=375431)   File "/root/code/vllm/vllm/model_executor/models/qwen2.py", line 392, in load_weights
(EngineCore_DP0 pid=375431)     for name, loaded_weight in weights:
(EngineCore_DP0 pid=375431)                                ^^^^^^^
(EngineCore_DP0 pid=375431)   File "/root/code/vllm/vllm/model_executor/models/utils.py", line 136, in <genexpr>
(EngineCore_DP0 pid=375431)     for parts, weights_data in group),
(EngineCore_DP0 pid=375431)                                ^^^^^
(EngineCore_DP0 pid=375431)   File "/root/code/vllm/vllm/model_executor/models/utils.py", line 127, in <genexpr>
(EngineCore_DP0 pid=375431)     for weight_name, weight_data in weights)
(EngineCore_DP0 pid=375431)                                     ^^^^^^^
(EngineCore_DP0 pid=375431)   File "/root/code/vllm/vllm/model_executor/models/utils.py", line 288, in <genexpr>
(EngineCore_DP0 pid=375431)     weights = ((name, weight) for name, weight in weights
(EngineCore_DP0 pid=375431)                                                   ^^^^^^^
(EngineCore_DP0 pid=375431)   File "/root/code/vllm/vllm/model_executor/model_loader/weight_utils.py", line 595, in runai_safetensors_weights_iterator
(EngineCore_DP0 pid=375431)     yield from tensor_iter
(EngineCore_DP0 pid=375431)   File "/usr/local/lib/python3.12/dist-packages/tqdm/std.py", line 1181, in __iter__
(EngineCore_DP0 pid=375431)     for obj in iterable:
(EngineCore_DP0 pid=375431)                ^^^^^^^^
(EngineCore_DP0 pid=375431)   File "/usr/local/lib/python3.12/dist-packages/runai_model_streamer/safetensors_streamer/safetensors_streamer.py", line 84, in get_tensors
(EngineCore_DP0 pid=375431)     for file_path, ready_chunk_index, buffer in self.file_streamer.get_chunks():
(EngineCore_DP0 pid=375431)                                                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=375431)   File "/usr/local/lib/python3.12/dist-packages/runai_model_streamer/file_streamer/file_streamer.py", line 116, in get_chunks
(EngineCore_DP0 pid=375431)     yield from self.request_ready_chunks()
(EngineCore_DP0 pid=375431)   File "/usr/local/lib/python3.12/dist-packages/runai_model_streamer/file_streamer/file_streamer.py", line 137, in request_ready_chunks
(EngineCore_DP0 pid=375431)     file_relative_index, chunk_relative_index = runai_response(self.streamer)
(EngineCore_DP0 pid=375431)                                                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=375431)   File "/usr/local/lib/python3.12/dist-packages/runai_model_streamer/libstreamer/libstreamer.py", line 93, in runai_response
(EngineCore_DP0 pid=375431)     raise Exception(
(EngineCore_DP0 pid=375431) Exception: Could not receive runai_response from libstreamer due to: b'File access error'

Contributor Author


Can you provide more details on how you encountered this error (e.g., the command you ran that produced it)? And did the same command work with an older version of the model streamer? From the error, this looks like an authentication/permission problem.

Contributor


With runai-model-streamer==0.13.0 vLLM runs fine, but upgrading to 0.14.0 produces this error.

I can access MinIO, as described in #23845 (comment). Not sure if it is related to this bug: run-ai/runai-model-streamer#81
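
For reference, this is roughly how I point the streamer at MinIO — a sketch assuming the streamer's S3 backend honors the standard AWS SDK environment variables (the endpoint and credentials below are placeholders):

```python
import os

# Placeholder MinIO endpoint and credentials; assumes the S3 backend
# reads the standard AWS SDK environment variables.
os.environ["AWS_ENDPOINT_URL"] = "http://localhost:9000"
os.environ["AWS_ACCESS_KEY_ID"] = "minioadmin"
os.environ["AWS_SECRET_ACCESS_KEY"] = "minioadmin"

from vllm import LLM

llm = LLM(
    model="s3://models/Qwen3-8B",  # hypothetical bucket path
    load_format="runai_streamer",
)
```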

Contributor


Make sure you upgrade runai-model-streamer-s3 to version 0.14.0 as well
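
A quick way to confirm both packages are actually on matching versions:

```python
# Print the installed versions of the streamer packages so a
# core/backend version mismatch is easy to spot.
from importlib.metadata import PackageNotFoundError, version

for pkg in ("runai-model-streamer", "runai-model-streamer-s3"):
    try:
        print(pkg, version(pkg))
    except PackageNotFoundError:
        print(pkg, "not installed")
```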

Contributor


Sorry, I didn't make it clear.
runai-model-streamer==0.13.0 + runai-model-streamer-s3==0.14.0 runs successfully, but runai-model-streamer==0.14.0 + runai-model-streamer-s3==0.14.0 fails with b'File access error'.

@mergify

mergify bot commented Sep 16, 2025

This pull request has merge conflicts that must be resolved before it can be
merged. Please rebase the PR, @pwschuurman.

https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/syncing-a-fork

@hmellor
Member

hmellor commented Sep 18, 2025

@22quinn could you please give this a review? Thanks!

@hmellor hmellor requested a review from 22quinn September 18, 2025 10:29
Collaborator

@22quinn 22quinn left a comment


Sorry, I missed this. LGTM.
It seems this test is currently not running in CI? Can we add it to https://github.com/vllm-project/vllm/blob/main/.buildkite/test-pipeline.yaml?

Separately, we should consolidate all model loader related tests.

@hmellor hmellor added the ready label Sep 26, 2025
@pwschuurman pwschuurman changed the title from Runai version update to Run:ai model streamer add GCS package support Sep 29, 2025
@pwschuurman
Contributor Author

@DarkLight1337 I rebased this PR with your changes from #25765. Thanks for simplifying and wiring up the tests!

@pwschuurman pwschuurman force-pushed the runai-version-update branch 2 times, most recently from d2d6e87 to af4cedf September 30, 2025 16:54
@pwschuurman pwschuurman force-pushed the runai-version-update branch from af4cedf to b1cb285 October 1, 2025 15:39
@22quinn 22quinn merged commit be22bb6 into vllm-project:main Oct 2, 2025
7 checks passed
pdasigi pushed a commit to pdasigi/vllm that referenced this pull request Oct 2, 2025
yewentao256 pushed a commit that referenced this pull request Oct 3, 2025
tomeras91 pushed a commit to tomeras91/vllm that referenced this pull request Oct 6, 2025
southfreebird pushed a commit to southfreebird/vllm that referenced this pull request Oct 7, 2025
xuebwang-amd pushed a commit to xuebwang-amd/vllm that referenced this pull request Oct 10, 2025
lywa1998 pushed a commit to lywa1998/vllm that referenced this pull request Oct 20, 2025
alhridoy pushed a commit to alhridoy/vllm that referenced this pull request Oct 24, 2025
xuebwang-amd pushed a commit to xuebwang-amd/vllm that referenced this pull request Oct 24, 2025