[Bug]: Gemma model is giving empty responses with new version of docker image vllm-openai:v0.8.5 #17718

@FernandoDorado

Description

Current environment

Kubernetes Cluster on Azure with A100 GPUs

Bug

Hello team,

After upgrading the Docker image from vllm-openai:v0.8.4 to v0.8.5, I observed an issue when running the google/gemma-3-27b-it model (Hugging Face Model Link).

The model successfully returns metadata (e.g., finish reason and token usage), but the content field in the response is consistently an empty string. No changes were made to the Kubernetes deployment manifest other than the image version bump.

When reverting to v0.8.4, the model responds correctly with expected text completions, confirming that the issue is specific to the new image version.

Steps to Reproduce:

  1. Deploy vllm-openai:v0.8.5 with the gemma-3-27b-it model.

  2. Send a chat completion request.

  3. Observe that the content field is empty in the response.
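The steps above can be sketched as a minimal reproduction script. The endpoint URL, port, and the `has_empty_content` helper are assumptions for illustration and should be adjusted to match your vLLM deployment; the payload follows the standard OpenAI-compatible chat completions schema that vLLM serves.

```python
import json
import urllib.request

# Hypothetical endpoint: adjust host/port to match your vLLM service.
VLLM_URL = "http://localhost:8000/v1/chat/completions"

# Minimal chat-completion payload for the affected model.
PAYLOAD = {
    "model": "google/gemma-3-27b-it",
    "messages": [{"role": "user", "content": "Say hello."}],
    "max_tokens": 64,
}


def has_empty_content(response: dict) -> bool:
    """Return True when the response carries metadata (a finish reason)
    but the assistant message content is an empty string, matching the
    symptom observed on v0.8.5."""
    choice = response["choices"][0]
    return (
        choice.get("finish_reason") is not None
        and choice["message"]["content"] == ""
    )


def send_request() -> dict:
    """POST the payload to the vLLM server and return the parsed JSON."""
    req = urllib.request.Request(
        VLLM_URL,
        data=json.dumps(PAYLOAD).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


if __name__ == "__main__":
    result = send_request()
    print("empty content:", has_empty_content(result))
```

On v0.8.5 the script prints `empty content: True` for the reporter's setup, while on v0.8.4 the content field holds the expected completion text.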

Metadata

Assignees

No one assigned

    Labels

    bug: Something isn't working

    Projects

    No projects

    Milestone

    No milestone
