Current environment
Kubernetes Cluster on Azure with A100 GPUs
Bug
Hello team,
After upgrading the Docker image from vllm-openai:v0.8.4 to v0.8.5, I observed an issue when running the google/gemma-3-27b-it model (Hugging Face Model Link).
The model successfully returns response metadata (e.g., finish reason and token usage), but the content field in the response is consistently an empty string. No changes were made to the Kubernetes deployment manifest apart from the image version bump.
After reverting to v0.8.4, the model responds with the expected text completions, confirming that the issue is specific to the new image version.
Steps to Reproduce:
- Deploy vllm-openai:v0.8.5 with the gemma-3-27b-it model.
- Send a chat completion request.
- Observe that the content field is empty in the response.
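For reference, here is a minimal sketch of the request payload and the empty-content check from the steps above. The endpoint URL and the exact response shape are assumptions based on the standard OpenAI-compatible API that vLLM serves; adjust them to match your deployment.

```python
import json

# Assumed default vLLM OpenAI-compatible endpoint; adjust for your cluster/service.
VLLM_URL = "http://localhost:8000/v1/chat/completions"

# Minimal chat completion payload for the affected model.
payload = {
    "model": "google/gemma-3-27b-it",
    "messages": [{"role": "user", "content": "Say hello."}],
    "max_tokens": 32,
}

def has_empty_content(response_json: dict) -> bool:
    """Return True when the response shows the reported symptom:
    metadata (finish_reason) present but an empty content string."""
    choice = response_json["choices"][0]
    return (
        choice.get("finish_reason") is not None
        and choice["message"]["content"] == ""
    )

# Illustrative response shape matching the symptom observed on v0.8.5
# (hypothetical values, not captured from a real run).
broken_response = {
    "choices": [
        {
            "finish_reason": "stop",
            "message": {"role": "assistant", "content": ""},
        }
    ],
    "usage": {"prompt_tokens": 5, "completion_tokens": 0, "total_tokens": 5},
}

print(has_empty_content(broken_response))  # True
```

The payload can be POSTed to `VLLM_URL` with curl or any HTTP client; on v0.8.4 the same check returns False because content is populated.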