[Bug]: Capture CudaGraph with LoRA

### Your current environment

<details>
<summary>The output of `python collect_env.py`</summary>

```text
Your output of `python collect_env.py` here
```

</details>


### 🐛 Describe the bug

When I use LoRA with `enforce_eager=False` (which means CUDA graphs should be captured), the following code in `vllm/vllm/worker/model_runner.py` appears to cause a problem:

    if self.lora_config:
        lora_mapping = LoRAMapping(
            **dict(index_mapping=[0] * batch_size,
                   prompt_mapping=[0] * batch_size,
                   is_prefill=False))
        self.set_active_loras(set(), lora_mapping)

I then printed `token_lora_indices` via `self.lora_manager._adapter_manager.punica_wrapper._token_lora_indices`, but only got `tensor([-1, -1, -1,  ...,  0,  0,  0], device='cuda:0')`. A token with a LoRA index of -1 does not seem right.
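To make the observation easier to discuss, here is a minimal pure-Python sketch of how a tensor of this shape could arise. It is not vLLM's code: `to_slot_indices` and `fill_buffer` are hypothetical helpers, and the sketch rests on two assumptions that would need to be confirmed against the source. The first is that a LoRA ID of 0 means "no adapter" and is translated to slot index -1. The second is that the indices are written into the front of a larger preallocated buffer, so the trailing zeros are simply unwritten entries.

```python
from typing import List


def to_slot_indices(index_mapping: List[int], active_lora_ids: List[int]) -> List[int]:
    """Hypothetical helper: map per-token LoRA IDs to adapter slot indices.

    Assumption: a non-positive ID (such as 0) means "no adapter" and maps to -1.
    """
    return [active_lora_ids.index(i) if i > 0 else -1 for i in index_mapping]


def fill_buffer(buffer: List[int], slots: List[int]) -> List[int]:
    """Hypothetical helper: write slots into the front of a preallocated buffer."""
    buffer[: len(slots)] = slots
    return buffer


batch_size = 3
buffer = [0] * 8  # assumed preallocated buffer, larger than the batch

# Mirrors the capture path above: every token has LoRA ID 0, no active adapters.
slots = to_slot_indices([0] * batch_size, active_lora_ids=[])
fill_buffer(buffer, slots)
print(buffer)  # [-1, -1, -1, 0, 0, 0, 0, 0]
```

If both assumptions hold, the leading -1 values would be the dummy capture batch and the trailing zeros would be the unused part of the buffer, but I have not verified this.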

### Before submitting a new issue...

- [x] Make sure you already searched for relevant issues, and asked the chatbot living at the bottom right corner of the [documentation page](https://docs.vllm.ai/en/latest/), which can answer lots of frequently asked questions.

[Bug]: Capture CudaGraph with LoRA #15090

Your current environment

🐛 Describe the bug

Before submitting a new issue...

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Uh oh!

[Bug]: Capture CudaGraph with LoRA #15090

Description

Your current environment

🐛 Describe the bug

Before submitting a new issue...

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions