[Bug]: num_gpu_blocks metric is None in V1 #15719

@liu-cong

Your current environment

The output of `python collect_env.py`
Your output of `python collect_env.py` here

🐛 Describe the bug

I am running the vLLM OpenAI API server v0.8.1 on Kubernetes (see the example YAML below). Once vLLM was running, I fetched the metrics with:

kubectl port-forward <vllm pod> 8000

curl localhost:8000/metrics

With vLLM V1, `num_gpu_blocks` in `cache_config_info` is reported as None. After switching to vLLM V0, the metric shows the correct value. Is this a regression in V1?

# TYPE vllm:cache_config_info gauge
vllm:cache_config_info{block_size="16",cache_dtype="auto",calculate_kv_scales="False",cpu_offload_gb="0",enable_prefix_caching="True",gpu_memory_utilization="0.9",is_attention_free="False",num_cpu_blocks="None",num_gpu_blocks="None",num_gpu_blocks_override="None",sliding_window="None",swap_space_bytes="4294967296"} 1.0
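To check for this programmatically rather than by eye, something like the following might work. It is only a rough sketch: the helper names `parse_info_labels` and `unset_labels` are hypothetical (not part of vLLM), the `sample` string is an abbreviated copy of the output above, and the label regex is simplified and does not handle escaped quotes inside label values.

```python
import re

# Matches one label pair inside the braces, e.g. block_size="16".
# Simplified: escaped quotes inside label values are not handled.
_LABEL_RE = re.compile(r'(\w+)="([^"]*)"')


def parse_info_labels(metrics_text: str, metric_name: str) -> dict[str, str]:
    """Return the labels of the first sample line for `metric_name` (hypothetical helper)."""
    for line in metrics_text.splitlines():
        if line.startswith(metric_name + "{"):
            label_part = line[line.index("{") + 1 : line.rindex("}")]
            return dict(_LABEL_RE.findall(label_part))
    return {}


def unset_labels(labels: dict[str, str], names: list[str]) -> list[str]:
    """Return the names whose value is missing or the literal string 'None'."""
    return [n for n in names if labels.get(n, "None") == "None"]


# Abbreviated copy of the /metrics output shown above.
sample = '''# TYPE vllm:cache_config_info gauge
vllm:cache_config_info{block_size="16",num_cpu_blocks="None",num_gpu_blocks="None",swap_space_bytes="4294967296"} 1.0
'''

labels = parse_info_labels(sample, "vllm:cache_config_info")
print(unset_labels(labels, ["block_size", "num_gpu_blocks"]))  # ['num_gpu_blocks']
```

Against a live server, the text passed to `parse_info_labels` would be the body returned by `curl localhost:8000/metrics`.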

    containers:
      - args:
        - --port
        - "8000"
        - --max-num-seqs
        - "2048"
        - --max_model_len
        - "4096"
        - --compilation-config
        - "3"
        - --tensor-parallel-size
        - "1"
        - --model
        - "meta-llama/Llama-2-7b-hf"
        - "--enable-lora"
        - "--max-loras"
        - "10"
        - "--max-cpu-loras"
        - "12"
        command:
        - python3
        - -m
        - vllm.entrypoints.openai.api_server
        env:
        - name: PORT
          value: "8000"
        - name: HUGGING_FACE_HUB_TOKEN
          valueFrom:
            secretKeyRef:
              key: token
              name: hf-token
        - name: VLLM_ALLOW_RUNTIME_LORA_UPDATING
          value: "true"
        - name: VLLM_USE_V1
          value: "1"
        image: vllm/vllm-openai:v0.8.1

Before submitting a new issue...

  • Make sure you already searched for relevant issues, and asked the chatbot living at the bottom right corner of the documentation page, which can answer lots of frequently asked questions.

Labels: bug (Something isn't working)
