[CI/CD] fix: test metric tag setting #13717
base: main
Conversation
Signed-off-by: Sungjae Lee <33976427+llsj14@users.noreply.github.com>
👋 Hi! Thank you for contributing to the vLLM project. 💬 Join our developer Slack at https://slack.vllm.ai to discuss your PR in #pr-reviews, coordinate on features in #feat- channels, or join special interest groups in #sig- channels. Just a reminder: PRs would not trigger a full CI run by default; they only run fastcheck CI, a small and essential subset of tests to quickly catch errors. Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging. To run CI, PR reviewers can either add a ready label to the PR or enable auto-merge. 🚀
@khluu @DarkLight1337 Could you check this PR please?
Where do you see the failures? This test passed on the latest commit on main: https://buildkite.com/vllm/ci/builds/14022#0195312e-1004-462a-984a-954af4465df4
I'm having test failures in this PR.
After this change, one test failure was resolved, but there are still test failures, so I'm investigating.
This is also an issue I encountered in the previous PR.
[2025-02-23T06:05:14Z] metrics/test_metrics.py:219:
[2025-02-23T06:05:14Z] _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
[2025-02-23T06:05:14Z]
[2025-02-23T06:05:14Z] model = 's3://vllm-ci-model-weights/distilbert/distilgpt2'
[2025-02-23T06:05:14Z] engine = <vllm.engine.llm_engine.LLMEngine object at 0x7fa8f1f259d0>
[2025-02-23T06:05:14Z] disable_log_stats = False, num_requests = 8
[2025-02-23T06:05:14Z]
[2025-02-23T06:05:14Z] def assert_metrics(model: str, engine: LLMEngine, disable_log_stats: bool,
[2025-02-23T06:05:14Z]                    num_requests: int) -> None:
[2025-02-23T06:05:14Z]     if disable_log_stats:
[2025-02-23T06:05:14Z]         with pytest.raises(AttributeError):
[2025-02-23T06:05:14Z]             _ = engine.stat_loggers
[2025-02-23T06:05:14Z]     else:
[2025-02-23T06:05:14Z]         assert (engine.stat_loggers
[2025-02-23T06:05:14Z]                 is not None), "engine.stat_loggers should be set"
[2025-02-23T06:05:14Z]         # Ensure the count bucket of request-level histogram metrics matches
[2025-02-23T06:05:14Z]         # the number of requests as a simple sanity check to ensure metrics are
[2025-02-23T06:05:14Z]         # generated
[2025-02-23T06:05:14Z]         labels = {'model_name': model}
[2025-02-23T06:05:14Z]         request_histogram_metrics = [
[2025-02-23T06:05:14Z]             "vllm:e2e_request_latency_seconds",
[2025-02-23T06:05:14Z]             "vllm:request_prompt_tokens",
[2025-02-23T06:05:14Z]             "vllm:request_generation_tokens",
[2025-02-23T06:05:14Z]             "vllm:request_params_n",
[2025-02-23T06:05:14Z]             "vllm:request_params_max_tokens",
[2025-02-23T06:05:14Z]         ]
[2025-02-23T06:05:14Z]         for metric_name in request_histogram_metrics:
[2025-02-23T06:05:14Z]             metric_value = REGISTRY.get_sample_value(f"{metric_name}_count",
[2025-02-23T06:05:14Z]                                                      labels)
[2025-02-23T06:05:14Z] >           assert (
[2025-02-23T06:05:14Z]                 metric_value == num_requests), "Metrics should be collected"
[2025-02-23T06:05:14Z] E           AssertionError: Metrics should be collected
[2025-02-23T06:05:14Z] E           assert None == 8
[2025-02-23T06:05:14Z]
[2025-02-23T06:05:14Z] metrics/test_metrics.py:384: AssertionError
[2025-02-23T06:05:14Z] =========================== short test summary info ============================
[2025-02-23T06:05:14Z] FAILED metrics/test_metrics.py::test_engine_log_metrics_regression[False-4-half-distilbert/distilgpt2] - AssertionError: Metrics should be collected
[2025-02-23T06:05:14Z] assert None == 8
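
(Context for the traceback above: prometheus_client's `REGISTRY.get_sample_value(name, labels)` returns `None` when no sample matches both the metric name and the complete label set, so `assert None == 8` typically means the queried labels do not match what was recorded, not that metrics were never emitted. A minimal sketch of that behavior, assuming plain `prometheus_client` and reusing the model string from the log purely as an illustrative label value:)

```python
from prometheus_client import CollectorRegistry, Histogram

registry = CollectorRegistry()
latency = Histogram(
    "vllm:e2e_request_latency_seconds",
    "End-to-end request latency (illustrative copy of the vLLM metric)",
    labelnames=["model_name"],
    registry=registry,
)

# Record one observation under a specific model_name label value.
latency.labels(model_name="distilbert/distilgpt2").observe(0.5)

# Querying with the exact same label set finds the histogram's count sample.
assert registry.get_sample_value(
    "vllm:e2e_request_latency_seconds_count",
    {"model_name": "distilbert/distilgpt2"},
) == 1.0

# Querying with any other label value (e.g. the S3-prefixed path from the
# log) matches nothing and returns None, the same shape as `None == 8`.
assert registry.get_sample_value(
    "vllm:e2e_request_latency_seconds_count",
    {"model_name": "s3://vllm-ci-model-weights/distilbert/distilgpt2"},
) is None
```

If the engine records metrics under a served model name that differs from the raw `s3://` path, the test's `labels = {'model_name': model}` lookup would return `None` exactly as in the log.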
Yeah, can you merge that PR with main? The fix for that test was merged to main a few hours ago.
Yeah, I merged the latest commit from #13278.
I still have some failures while running the tests.
#13716
There are some test failures in test_metrics.py. This PR aims to fix these errors.
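
(For anyone debugging a label mismatch like this, a sketch of one way to inspect which label sets actually exist, assuming the test process uses prometheus_client's default `REGISTRY`:)

```python
from prometheus_client import REGISTRY

# Print every vllm:* histogram count sample together with its label set,
# so the labels the test passes to get_sample_value can be compared
# against what the engine actually exported.
for metric in REGISTRY.collect():
    if not metric.name.startswith("vllm:"):
        continue
    for sample in metric.samples:
        if sample.name.endswith("_count"):
            print(sample.name, sample.labels, sample.value)
```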