[Bug][Failing Test]: LoRA 2 - lora/test_lora_functions.py::test_lora_functions_sync #18498

@DarkLight1337

Your current environment

N/A

🐛 Describe the bug

https://buildkite.com/vllm/ci/builds/20460/steps?jid=0196f343-0fdb-4d91-80da-728e0fb8174c

Summary:

[2025-05-21T16:00:09Z] FAILED lora/test_lora_functions.py::test_lora_functions_sync[True] - Exception: Call to add_lora method failed: CUDA error: an illegal memory access was encountered
[2025-05-21T16:00:09Z] CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
[2025-05-21T16:00:09Z] For debugging consider passing CUDA_LAUNCH_BLOCKING=1
[2025-05-21T16:00:09Z] Compile with `TORCH_USE_CUDA_DSA` to enable device-side assertions.

Stack:

[2025-05-21T15:50:19Z] ERROR 05-21 08:50:19 [core.py:559] Invocation of add_lora method failed
[2025-05-21T15:50:19Z] ERROR 05-21 08:50:19 [core.py:559] Traceback (most recent call last):
[2025-05-21T15:50:19Z] ERROR 05-21 08:50:19 [core.py:559]   File "/usr/local/lib/python3.12/dist-packages/vllm/v1/engine/core.py", line 556, in _handle_client_request
[2025-05-21T15:50:19Z] ERROR 05-21 08:50:19 [core.py:559]     output.result = method(
[2025-05-21T15:50:19Z] ERROR 05-21 08:50:19 [core.py:559]                     ^^^^^^^
[2025-05-21T15:50:19Z] ERROR 05-21 08:50:19 [core.py:559]   File "/usr/local/lib/python3.12/dist-packages/vllm/v1/engine/core.py", line 314, in add_lora
[2025-05-21T15:50:19Z] ERROR 05-21 08:50:19 [core.py:559]     return self.model_executor.add_lora(lora_request)
[2025-05-21T15:50:19Z] ERROR 05-21 08:50:19 [core.py:559]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[2025-05-21T15:50:19Z] ERROR 05-21 08:50:19 [core.py:559]   File "/usr/local/lib/python3.12/dist-packages/vllm/executor/executor_base.py", line 150, in add_lora
[2025-05-21T15:50:19Z] ERROR 05-21 08:50:19 [core.py:559]     return all(self.collective_rpc("add_lora", args=(lora_request, )))
[2025-05-21T15:50:19Z] ERROR 05-21 08:50:19 [core.py:559]                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[2025-05-21T15:50:19Z] ERROR 05-21 08:50:19 [core.py:559]   File "/usr/local/lib/python3.12/dist-packages/vllm/executor/uniproc_executor.py", line 56, in collective_rpc
[2025-05-21T15:50:19Z] ERROR 05-21 08:50:19 [core.py:559]     answer = run_method(self.driver_worker, method, args, kwargs)
[2025-05-21T15:50:19Z] ERROR 05-21 08:50:19 [core.py:559]              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[2025-05-21T15:50:19Z] ERROR 05-21 08:50:19 [core.py:559]   File "/usr/local/lib/python3.12/dist-packages/vllm/utils.py", line 2598, in run_method
[2025-05-21T15:50:19Z] ERROR 05-21 08:50:19 [core.py:559]     return func(*args, **kwargs)
[2025-05-21T15:50:19Z] ERROR 05-21 08:50:19 [core.py:559]            ^^^^^^^^^^^^^^^^^^^^^
[2025-05-21T15:50:19Z] ERROR 05-21 08:50:19 [core.py:559]   File "/usr/local/lib/python3.12/dist-packages/vllm/v1/worker/gpu_worker.py", line 300, in add_lora
[2025-05-21T15:50:19Z] ERROR 05-21 08:50:19 [core.py:559]     return self.model_runner.add_lora(lora_request)
[2025-05-21T15:50:19Z] ERROR 05-21 08:50:19 [core.py:559]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[2025-05-21T15:50:19Z] ERROR 05-21 08:50:19 [core.py:559]   File "/usr/local/lib/python3.12/dist-packages/vllm/v1/worker/lora_model_runner_mixin.py", line 130, in add_lora
[2025-05-21T15:50:19Z] ERROR 05-21 08:50:19 [core.py:559]     return self.lora_manager.add_adapter(lora_request)
[2025-05-21T15:50:19Z] ERROR 05-21 08:50:19 [core.py:559]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[2025-05-21T15:50:19Z] ERROR 05-21 08:50:19 [core.py:559]   File "/usr/local/lib/python3.12/dist-packages/vllm/lora/worker_manager.py", line 235, in add_adapter
[2025-05-21T15:50:19Z] ERROR 05-21 08:50:19 [core.py:559]     lora = self._load_adapter(lora_request)
[2025-05-21T15:50:19Z] ERROR 05-21 08:50:19 [core.py:559]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[2025-05-21T15:50:19Z] ERROR 05-21 08:50:19 [core.py:559]   File "/usr/local/lib/python3.12/dist-packages/vllm/lora/worker_manager.py", line 141, in _load_adapter
[2025-05-21T15:50:19Z] ERROR 05-21 08:50:19 [core.py:559]     raise e
[2025-05-21T15:50:19Z] ERROR 05-21 08:50:19 [core.py:559]   File "/usr/local/lib/python3.12/dist-packages/vllm/lora/worker_manager.py", line 117, in _load_adapter
[2025-05-21T15:50:19Z] ERROR 05-21 08:50:19 [core.py:559]     lora = self._lora_model_cls.from_local_checkpoint(
[2025-05-21T15:50:19Z] ERROR 05-21 08:50:19 [core.py:559]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[2025-05-21T15:50:19Z] ERROR 05-21 08:50:19 [core.py:559]   File "/usr/local/lib/python3.12/dist-packages/vllm/lora/models.py", line 290, in from_local_checkpoint
[2025-05-21T15:50:19Z] ERROR 05-21 08:50:19 [core.py:559]     return cls.from_lora_tensors(
[2025-05-21T15:50:19Z] ERROR 05-21 08:50:19 [core.py:559]            ^^^^^^^^^^^^^^^^^^^^^^
[2025-05-21T15:50:19Z] ERROR 05-21 08:50:19 [core.py:559]   File "/usr/local/lib/python3.12/dist-packages/vllm/lora/models.py", line 145, in from_lora_tensors
[2025-05-21T15:50:19Z] ERROR 05-21 08:50:19 [core.py:559]     lora_embeddings_tensor.pin_memory())
[2025-05-21T15:50:19Z] ERROR 05-21 08:50:19 [core.py:559]     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[2025-05-21T15:50:19Z] ERROR 05-21 08:50:19 [core.py:559] RuntimeError: CUDA error: an illegal memory access was encountered
[2025-05-21T15:50:19Z] ERROR 05-21 08:50:19 [core.py:559] CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
[2025-05-21T15:50:19Z] ERROR 05-21 08:50:19 [core.py:559] For debugging consider passing CUDA_LAUNCH_BLOCKING=1
[2025-05-21T15:50:19Z] ERROR 05-21 08:50:19 [core.py:559] Compile with `TORCH_USE_CUDA_DSA` to enable device-side assertions.
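
Since the log notes that CUDA errors can be reported asynchronously, the `pin_memory()` frame above may not be the real culprit. One way to narrow this down is to rerun the failing test with `CUDA_LAUNCH_BLOCKING=1`, as the error message suggests. Below is a minimal sketch, not an official repro script: the helper names are made up for illustration, and it assumes `pytest` is installed and that it is run from the `tests/` directory of a vLLM checkout on a GPU machine.

```python
import os
import subprocess
import sys

# Test ID taken from the CI failure summary above.
TEST_ID = "lora/test_lora_functions.py::test_lora_functions_sync"


def build_repro_env(base_env=None):
    """Return a copy of the environment with synchronous CUDA launches enabled."""
    env = dict(os.environ if base_env is None else base_env)
    # Makes kernel launches blocking so the Python traceback points at the
    # call that actually triggered the illegal memory access.
    env["CUDA_LAUNCH_BLOCKING"] = "1"
    return env


def build_repro_command(test_id):
    """Build a pytest command: stop at first failure (-x), show output (-s)."""
    return [sys.executable, "-m", "pytest", "-x", "-s", test_id]


def run_repro(test_id=TEST_ID):
    """Run the test in a subprocess and return its exit code."""
    result = subprocess.run(
        build_repro_command(test_id),
        env=build_repro_env(),
        check=False,  # a failing test is expected; do not raise
    )
    return result.returncode
```

Calling `run_repro()` launches the test; if the failure reproduces, the traceback should then point closer to the call that actually faulted.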

Labels: bug, ci-failure

Status: Done