Run:ai model streamer add GCS package support #24909
Conversation
Code Review
This pull request updates the runai-model-streamer dependency to version 0.14.0 and adds runai-model-streamer-gcs to enable loading models from Google Cloud Storage. The changes in the requirements files and documentation are consistent with this goal. I've found one high-severity issue related to packaging: the optional dependency group [runai] likely needs to be updated to include the new GCS package to make the feature available to users. Please see the detailed comment.
Package group was updated in #23845
requirements/test.in
With runai-model-streamer==0.14.0 I cannot download a model from S3; I'm not sure whether it's my environment or something else.
ERROR 09-16 02:53:54 [v1/engine/core.py:712] Exception: Could not receive runai_response from libstreamer due to: b'File access error'
(EngineCore_DP0 pid=375431) Process EngineCore_DP0:
(EngineCore_DP0 pid=375431) Traceback (most recent call last):
(EngineCore_DP0 pid=375431) File "/usr/lib/python3.12/multiprocessing/process.py", line 314, in _bootstrap
(EngineCore_DP0 pid=375431) self.run()
(EngineCore_DP0 pid=375431) File "/usr/lib/python3.12/multiprocessing/process.py", line 108, in run
(EngineCore_DP0 pid=375431) self._target(*self._args, **self._kwargs)
(EngineCore_DP0 pid=375431) File "/root/code/vllm/vllm/v1/engine/core.py", line 716, in run_engine_core
(EngineCore_DP0 pid=375431) raise e
(EngineCore_DP0 pid=375431) File "/root/code/vllm/vllm/v1/engine/core.py", line 703, in run_engine_core
(EngineCore_DP0 pid=375431) engine_core = EngineCoreProc(*args, **kwargs)
(EngineCore_DP0 pid=375431) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=375431) File "/root/code/vllm/vllm/v1/engine/core.py", line 502, in __init__
(EngineCore_DP0 pid=375431) super().__init__(vllm_config, executor_class, log_stats,
(EngineCore_DP0 pid=375431) File "/root/code/vllm/vllm/v1/engine/core.py", line 81, in __init__
(EngineCore_DP0 pid=375431) self.model_executor = executor_class(vllm_config)
(EngineCore_DP0 pid=375431) ^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=375431) File "/root/code/vllm/vllm/executor/executor_base.py", line 55, in __init__
(EngineCore_DP0 pid=375431) self._init_executor()
(EngineCore_DP0 pid=375431) File "/root/code/vllm/vllm/executor/uniproc_executor.py", line 55, in _init_executor
(EngineCore_DP0 pid=375431) self.collective_rpc("load_model")
(EngineCore_DP0 pid=375431) File "/root/code/vllm/vllm/executor/uniproc_executor.py", line 83, in collective_rpc
(EngineCore_DP0 pid=375431) return [run_method(self.driver_worker, method, args, kwargs)]
(EngineCore_DP0 pid=375431) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=375431) File "/root/code/vllm/vllm/utils/__init__.py", line 3067, in run_method
(EngineCore_DP0 pid=375431) return func(*args, **kwargs)
(EngineCore_DP0 pid=375431) ^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=375431) File "/root/code/vllm/vllm/v1/worker/gpu_worker.py", line 214, in load_model
(EngineCore_DP0 pid=375431) self.model_runner.load_model(eep_scale_up=eep_scale_up)
(EngineCore_DP0 pid=375431) File "/root/code/vllm/vllm/v1/worker/gpu_model_runner.py", line 2390, in load_model
(EngineCore_DP0 pid=375431) self.model = model_loader.load_model(
(EngineCore_DP0 pid=375431) ^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=375431) File "/root/code/vllm/vllm/model_executor/model_loader/base_loader.py", line 50, in load_model
(EngineCore_DP0 pid=375431) self.load_weights(model, model_config)
(EngineCore_DP0 pid=375431) File "/root/code/vllm/vllm/model_executor/model_loader/runai_streamer_loader.py", line 103, in load_weights
(EngineCore_DP0 pid=375431) model.load_weights(
(EngineCore_DP0 pid=375431) File "/root/code/vllm/vllm/model_executor/models/qwen3.py", line 344, in load_weights
(EngineCore_DP0 pid=375431) return loader.load_weights(weights)
(EngineCore_DP0 pid=375431) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=375431) File "/root/code/vllm/vllm/model_executor/models/utils.py", line 291, in load_weights
(EngineCore_DP0 pid=375431) autoloaded_weights = set(self._load_module("", self.module, weights))
(EngineCore_DP0 pid=375431) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=375431) File "/root/code/vllm/vllm/model_executor/models/utils.py", line 249, in _load_module
(EngineCore_DP0 pid=375431) yield from self._load_module(prefix,
(EngineCore_DP0 pid=375431) File "/root/code/vllm/vllm/model_executor/models/utils.py", line 222, in _load_module
(EngineCore_DP0 pid=375431) loaded_params = module_load_weights(weights)
(EngineCore_DP0 pid=375431) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=375431) File "/root/code/vllm/vllm/model_executor/models/qwen2.py", line 392, in load_weights
(EngineCore_DP0 pid=375431) for name, loaded_weight in weights:
(EngineCore_DP0 pid=375431) ^^^^^^^
(EngineCore_DP0 pid=375431) File "/root/code/vllm/vllm/model_executor/models/utils.py", line 136, in <genexpr>
(EngineCore_DP0 pid=375431) for parts, weights_data in group),
(EngineCore_DP0 pid=375431) ^^^^^
(EngineCore_DP0 pid=375431) File "/root/code/vllm/vllm/model_executor/models/utils.py", line 127, in <genexpr>
(EngineCore_DP0 pid=375431) for weight_name, weight_data in weights)
(EngineCore_DP0 pid=375431) ^^^^^^^
(EngineCore_DP0 pid=375431) File "/root/code/vllm/vllm/model_executor/models/utils.py", line 288, in <genexpr>
(EngineCore_DP0 pid=375431) weights = ((name, weight) for name, weight in weights
(EngineCore_DP0 pid=375431) ^^^^^^^
(EngineCore_DP0 pid=375431) File "/root/code/vllm/vllm/model_executor/model_loader/weight_utils.py", line 595, in runai_safetensors_weights_iterator
(EngineCore_DP0 pid=375431) yield from tensor_iter
(EngineCore_DP0 pid=375431) File "/usr/local/lib/python3.12/dist-packages/tqdm/std.py", line 1181, in __iter__
(EngineCore_DP0 pid=375431) for obj in iterable:
(EngineCore_DP0 pid=375431) ^^^^^^^^
(EngineCore_DP0 pid=375431) File "/usr/local/lib/python3.12/dist-packages/runai_model_streamer/safetensors_streamer/safetensors_streamer.py", line 84, in get_tensors
(EngineCore_DP0 pid=375431) for file_path, ready_chunk_index, buffer in self.file_streamer.get_chunks():
(EngineCore_DP0 pid=375431) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=375431) File "/usr/local/lib/python3.12/dist-packages/runai_model_streamer/file_streamer/file_streamer.py", line 116, in get_chunks
(EngineCore_DP0 pid=375431) yield from self.request_ready_chunks()
(EngineCore_DP0 pid=375431) File "/usr/local/lib/python3.12/dist-packages/runai_model_streamer/file_streamer/file_streamer.py", line 137, in request_ready_chunks
(EngineCore_DP0 pid=375431) file_relative_index, chunk_relative_index = runai_response(self.streamer)
(EngineCore_DP0 pid=375431) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=375431) File "/usr/local/lib/python3.12/dist-packages/runai_model_streamer/libstreamer/libstreamer.py", line 93, in runai_response
(EngineCore_DP0 pid=375431) raise Exception(
(EngineCore_DP0 pid=375431) Exception: Could not receive runai_response from libstreamer due to: b'File access error'
Can you provide more details on how you encountered this error (e.g. the command you ran that resulted in it)? And did the same command work successfully with an older version of the model streamer? From the error, this looks like an authentication/permission problem.
With runai-model-streamer==0.13.0 vLLM runs fine, but after upgrading to 0.14.0 I get this error.
I can access MinIO, as described in #23845 (comment). Not sure if it is related to run-ai/runai-model-streamer#81.
Make sure you upgrade runai-model-streamer-s3 to version 0.14.0 as well
Sorry, I didn't make that clear.
With runai-model-streamer==0.13.0 + runai-model-streamer-s3==0.14.0 everything runs successfully, but with runai-model-streamer==0.14.0 + runai-model-streamer-s3==0.14.0 I get the error b'File access error'.
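A quick way to check for the mixed-version situation described in this sub-thread is to list the installed streamer packages and upgrade them together. This is a generic pip workflow sketch, not a step from the PR; output format varies by pip version:

```shell
# List installed runai streamer packages; core and backend versions should match.
pip list | grep runai

# If they differ, upgrade both to the same release together.
pip install --upgrade "runai-model-streamer==0.14.0" "runai-model-streamer-s3==0.14.0"
```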
This pull request has merge conflicts that must be resolved before it can be merged.
tests/model_executor/model_loader/runai_model_streamer/test_runai_utils.py
@22quinn could you please give this a review? Thanks!
Sorry, missed this. LGTM.
Seems this test is currently not running in CI? Can we add it in: https://github.com/vllm-project/vllm/blob/main/.buildkite/test-pipeline.yaml
Separately, we should consolidate all model loader related tests.
@DarkLight1337 I rebased this PR with your changes from #25765. Thanks for simplifying and wiring up the tests!
Purpose
Install the runai-model-streamer[gcs] pip package by default, to enable GCS support in the nightly / published image.
Test Plan / Test Result
Validated locally by building the Docker image and running it locally (see updated documentation).
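The validation flow described above might look roughly like the following. This is a hedged sketch: the bucket name, model path, and credentials file are placeholders, and it assumes vLLM's documented Run:ai streamer interface (`--load-format runai_streamer`) accepts GCS paths once the GCS backend package is installed:

```shell
# Hypothetical usage sketch; bucket and model names are placeholders.
pip install "vllm[runai]" "runai-model-streamer-gcs==0.14.0"

# Standard GCS auth via a service account key (assumption; other auth methods exist).
export GOOGLE_APPLICATION_CREDENTIALS=/path/to/service-account.json

# Stream model weights directly from a GCS bucket.
vllm serve gs://my-bucket/models/Qwen3-8B --load-format runai_streamer
```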