Your current environment

The output of `python collect_env.py` was not provided.
🐛 Describe the bug
I am running the vLLM OpenAI-compatible server v0.8.1 on Kubernetes (see the example YAML below). After vLLM is running, I tried to fetch the metrics via:
```bash
kubectl port-forward <vllm pod> 8000
curl localhost:8000/metrics
```
With vLLM V1, `num_gpu_blocks` in `cache_config_info` is reported as `None` (see the scrape below), whereas after switching to vLLM V0 the metric shows the correct value. Is this a regression in V1?
```text
# TYPE vllm:cache_config_info gauge
vllm:cache_config_info{block_size="16",cache_dtype="auto",calculate_kv_scales="False",cpu_offload_gb="0",enable_prefix_caching="True",gpu_memory_utilization="0.9",is_attention_free="False",num_cpu_blocks="None",num_gpu_blocks="None",num_gpu_blocks_override="None",sliding_window="None",swap_space_bytes="4294967296"} 1.0
```
The relevant part of the Kubernetes deployment:

```yaml
containers:
      - args:
        - --port
        - "8000"
        - --max-num-seqs
        - "2048"
        - --max_model_len
        - "4096"
        - --compilation-config
        - "3"
        - --tensor-parallel-size
        - "1"
        - --model
        - "meta-llama/Llama-2-7b-hf"
        - "--enable-lora"
        - "--max-loras"
        - "10"
        - "--max-cpu-loras"
        - "12"
        command:
        - python3
        - -m
        - vllm.entrypoints.openai.api_server
        env:
        - name: PORT
          value: "8000"
        - name: HUGGING_FACE_HUB_TOKEN
          valueFrom:
            secretKeyRef:
              key: token
              name: hf-token
        - name: VLLM_ALLOW_RUNTIME_LORA_UPDATING
          value: "true"
        - name: VLLM_USE_V1
          value: "1"
        image: vllm/vllm-openai:v0.8.1
```
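For reference, a minimal sketch of scripting the same check in Python. It assumes the `kubectl port-forward` to `localhost:8000` shown above is already running, and uses `requests` plus the `prometheus_client` text parser; the metric and label names are taken from the scrape shown earlier.

```python
# Sketch: fetch /metrics from the port-forwarded vLLM pod and print the
# cache_config_info labels, to see whether num_gpu_blocks is a number or "None".
# Assumes `kubectl port-forward <vllm pod> 8000` is running (see above) and that
# `requests` and `prometheus_client` are installed.
import requests
from prometheus_client.parser import text_string_to_metric_families

resp = requests.get("http://localhost:8000/metrics", timeout=10)
resp.raise_for_status()

for family in text_string_to_metric_families(resp.text):
    if family.name == "vllm:cache_config_info":
        for sample in family.samples:
            print("num_gpu_blocks =", sample.labels.get("num_gpu_blocks"))
            print("num_cpu_blocks =", sample.labels.get("num_cpu_blocks"))
```

With the V1 deployment above this prints the string `None` for `num_gpu_blocks`, matching the scrape, while on V0 it shows the computed block count as described in the report.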
Before submitting a new issue...
- Make sure you already searched for relevant issues, and asked the chatbot living at the bottom right corner of the documentation page, which can answer lots of frequently asked questions.
 