[CI Failure]: Quantized Models Test - models/quantization/test_gguf.py::test_models[1-5-32-half-model0] #19458

@mgoin

Description

Name of failing test

models/quantization/test_gguf.py::test_models[1-5-32-half-model0]

Basic information

  • Flaky test
  • Can reproduce locally
  • Caused by external libraries (e.g. bug in transformers)

🧪 Describe the failing test

This specific Llama 1B GGUF model test has been failing consistently across multiple PRs. Example build: https://buildkite.com/vllm/ci/builds/21800/steps/waterfall?jid=01975af4-f581-4d43-a1e5-7175d960b2b7#01975af4-f581-4d43-a1e5-7175d960b2b7/212-6971


[2025-06-10T18:40:56Z] FAILED models/quantization/test_gguf.py::test_models[1-5-32-half-model0] - AssertionError: Test0:
[2025-06-10T18:40:56Z] Matched tokens:	[4897, 596, 4495, 13, 650, 4178, 44, 13656, 369]
[2025-06-10T18:40:56Z] original:	"That's correct. VLLM stands for Vision and Language Model, which is a type of large language model designed for both inference and serving. It's a"	{31541: Logprob(logprob=-1.6094070672988892, rank=1, decoded_token='ĠVision'), 28968: Logprob(logprob=-2.0000319480895996, rank=2, decoded_token='ĠVari'), 8519: Logprob(logprob=-2.5000319480895996, rank=3, decoded_token='ĠVideo'), 21382: Logprob(logprob=-2.6562819480895996, rank=4, decoded_token='ĠVirtual'), 20796: Logprob(logprob=-2.7187819480895996, rank=5, decoded_token='ĠVisual')}
[2025-06-10T18:40:56Z] gguf:	"That's correct. VLLM stands for Virtual Language Learning Model, which is a type of large language model designed for high-throughput and memory-efficient inference and"	{21382: Logprob(logprob=-1.9463169574737549, rank=1, decoded_token='ĠVirtual'), 330: Logprob(logprob=-2.274441957473755, rank=2, decoded_token='Ġ"'), 15668: Logprob(logprob=-2.383816957473755, rank=3, decoded_token='ĠVery'), 4196: Logprob(logprob=-2.446316957473755, rank=4, decoded_token='ĠVal'), 28968: Logprob(logprob=-2.540066957473755, rank=5, decoded_token='ĠVari')}
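
To make the log above easier to read, here is a minimal sketch of the kind of check that appears to be failing. It assumes the test tolerates a divergence between the two runs only when each model's chosen token appears in the other model's top-5 logprobs at the first mismatched position; that rule is inferred from the assertion message, not copied from the test source. The helper name `divergence_is_tolerated` is hypothetical, and the token IDs and logprobs are copied from the log.

```python
# Top-5 candidates at the first mismatched position (after the 9 matched
# tokens), copied from the CI log as {token_id: logprob}.
original_top5 = {
    31541: -1.6094070672988892,  # 'ĠVision', rank 1 for the original model
    28968: -2.0000319480895996,  # 'ĠVari'
    8519: -2.5000319480895996,   # 'ĠVideo'
    21382: -2.6562819480895996,  # 'ĠVirtual'
    20796: -2.7187819480895996,  # 'ĠVisual'
}
gguf_top5 = {
    21382: -1.9463169574737549,  # 'ĠVirtual', rank 1 for the GGUF model
    330: -2.274441957473755,     # 'Ġ"'
    15668: -2.383816957473755,   # 'ĠVery'
    4196: -2.446316957473755,    # 'ĠVal'
    28968: -2.540066957473755,   # 'ĠVari'
}


def divergence_is_tolerated(token_a, top_a, token_b, top_b):
    """Hypothetical helper (assumed rule, not the real test code).

    A mismatch is tolerated only if each model's chosen token is among
    the other model's top-k candidates.
    """
    return token_a in top_b and token_b in top_a


# Assumption: the token each model emitted is its rank-1 candidate.
original_token = max(original_top5, key=original_top5.get)
gguf_token = max(gguf_top5, key=gguf_top5.get)

tolerated = divergence_is_tolerated(
    original_token, original_top5, gguf_token, gguf_top5
)
print("original token in GGUF top-5:", original_token in gguf_top5)
print("GGUF token in original top-5:", gguf_token in original_top5)
print("divergence tolerated:", tolerated)
```

Under this assumed rule the mismatch is one-sided: 'ĠVirtual' (21382) is rank 4 for the original model, but 'ĠVision' (31541) is absent from the GGUF model's top-5, which would be enough to trip the assertion.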

📝 History of failing test

The earliest failure I found was on Mon 26th May at 8:27 AM:
[CI/Build] Split pooling and generation extended language models tests in CI (#18705)
https://buildkite.com/organizations/vllm/analytics/suites/ci-1/tests/94a54396-ec5f-8d47-8b48-6c88a2d4e5cb?period=28days&tags=scm.branch%3Amain&execution_id=01970c90-0b2c-7f2b-b3ad-d7bcc06f340b

CC List.

No response

Metadata

Labels

ci-failure: Issue about an unexpected test failure in CI
stale: Over 90 days of inactivity

Projects

Status

Done
