-
-
Notifications
You must be signed in to change notification settings - Fork 11k
[Misc] Have AsyncLLM custom_stat_loggers extend default logger list
#20952
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Signed-off-by: Seiji Eicher <seiji@anyscale.com>
|
👋 Hi! Thank you for contributing to the vLLM project. 💬 Join our developer Slack at https://slack.vllm.ai to discuss your PR in #pr-reviews, coordinate on features in #feat- channels, or join special interest groups in #sig- channels. Just a reminder: PRs would not trigger full CI run by default. Instead, it would only run Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging. To run CI, PR reviewers can either: Add 🚀 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Summary of Changes
Hello @eicherseiji, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!
This pull request refactors how statistical loggers are integrated into the AsyncLLM engine by introducing a dedicated asynchronous API for adding them. This change provides more flexibility in managing loggers and includes built-in validation to prevent common misconfigurations.
Highlights
- New API for Logger Management: Introduced a new asynchronous method,
add_logger, to theAsyncLLMclass. This method allows for dynamically adding statistical loggers to the engine after its initialization. - Refactored Logger Initialization: The way statistical loggers are added to
AsyncLLMhas been changed. Instead of passingstat_loggersdirectly to theAsyncLLM.from_engine_argsconstructor, loggers are now added via the newadd_loggermethod post-instantiation. - Logger Validation and Duplication Prevention: The
add_loggermethod includes checks to ensure that stat logging is enabled before adding a logger. It also prevents the addition of multiple loggers of the same type, raising aKeyErrorif a duplicate is detected.
Using Gemini Code Assist
The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.
Invoking Gemini
You can request assistance from Gemini at any point in your pull request via creating an issue comment (i.e. comment on the pull request page) using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands.
| Feature | Command | Description |
|---|---|---|
| Code Review | /gemini review |
Performs a code review for the current pull request in its current state. |
| Pull Request Summary | /gemini summary |
Provides a summary of the current pull request in its current state. |
| Comment | @gemini-code-assist | Responds in comments when explicitly tagged, both in issue comments and review comments. |
| Help | /gemini help |
Displays a list of available commands. |
Customization
To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.
Limitations & Feedback
Gemini Code Assist is currently in preview and may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments to provide feedback.
You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.
Footnotes
-
Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution. ↩
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code Review
This pull request introduces a new add_logger API to AsyncLLM, allowing for dynamic addition of stat loggers after engine initialization. The change is well-contained and includes a corresponding update to the tests to use the new API.
I've found a critical issue in the implementation of the duplicate logger check within the new add_logger method, which I've detailed in a specific comment. Addressing this will ensure the new feature works as intended.
Signed-off-by: Seiji Eicher <seiji@anyscale.com>
|
Why do we want this? I think we should be careful about adding a new API like this before 1.0 |
Hi @robertgshaw2-redhat! Thanks for taking a look. I updated the PR body with a more detailed problem statement. Basically my perspective is that it should be possible to add additional loggers without implicitly removing default ones. Totally understand if this is not a safe change to make just yet, though. Are there docs/discussion somewhere I can reference for more background on our unstable/alpha API designs pre-1.0? Or maybe just extra context why this is too early? |
|
This pull request has merge conflicts that must be resolved before it can be |
|
@eicherseiji this needs to be revisited considering #21257 ? |
Yes I think the current interface isn't ideal in this regard. Either you default to built-in ones or provide completely separate list. IMO we should address this via the constructor args though instead of a new method. Perhaps even a dedicated flag to toggle whether the default statsloggers should be included. |
Signed-off-by: Seiji Eicher <seiji@anyscale.com>
Signed-off-by: Seiji Eicher <seiji@anyscale.com>
|
Thanks all for the feedback so far. Indeed @ruisearch42, #21257 changed things slightly:
@njhill I think the suggestion makes sense, the drawback is that we pollute the Otherwise, I see |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks. Please make sure @robertgshaw2-redhat @njhill agree with the design change.
|
Hi @eicherseiji @ruisearch42 @njhill @robertgshaw2-redhat, I’ve opened this PR to extend (Initial plugin-based commit: #22456) I’m wondering how this should integrate with the current implementation and the changes here. Since I’m planning to introduce a dedicated plugin group that dynamically loads NOTE: This PR fixes the limitation of having to choose between default or custom loggers. That’s a great improvement, and the plugin system will need to align with this change as well—since Appreciate any thoughts, and am happy to adjust based on the direction you think is best. |
|
Thanks @ptovam for helping coordinate! @njhill seems like the constructor/non-method implementation you proposed may be more compatible with the plugin system than the |
|
Thanks! I like the idea of a new config field, as it could also make it easier to pass parameters (port, path, etc.) to custom loggers in the future — similar to how |
|
Thanks @eicherseiji @ptovam and sorry for taking so long to get back to this. For an immediate change WDYT about the following:
I think that will be minimally disruptive to existing users and address your original concern without requiring API additions. Actually there is another open PR #22227 which addresses incompatibility of the logging stats logger with DP and in its current state also addresses (1). Then as a follow-on we can decide how to extend this to enable plugging in custom loggers in the server case, i.e. #22456. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @eicherseiji sorry for the checking again, I'll try to turn around quicker...
vllm/v1/metrics/loggers.py
Outdated
| self.dp_shared_loggers = [] | ||
| if enable_default_loggers: | ||
| self.dp_shared_loggers.append( | ||
| PrometheusStatLogger(vllm_config, engine_idxs)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Any reason not to keep this as a single prometheus_logger field?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Makes sense, addressed in https://github.com/vllm-project/vllm/pull/20952/files#r2271790049
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the above self.dp_shared_loggers code can also be removed now? It's not used right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yup, thanks.
Co-authored-by: Nick Hill <nhill@redhat.com> Signed-off-by: Seiji Eicher <58963096+eicherseiji@users.noreply.github.com>
Signed-off-by: Seiji Eicher <seiji@anyscale.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @njhill for the comments, my apologies for the delayed reply (OOO + urgent work last week). Incorporated the feedback, lmk what you think.
vllm/v1/metrics/loggers.py
Outdated
| if enable_default_loggers and logger.isEnabledFor(logging.INFO): | ||
| factories.append(LoggingStatLogger) | ||
|
|
||
| # For Prometheus, need to share the metrics between EngineCores. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Point taken. I think it makes sense to keep the change minimal/narrow for now and defer a dp_shared_loggers concept until it's clearly needed; keeping the single prometheus_logger field.
vllm/v1/metrics/loggers.py
Outdated
| self.dp_shared_loggers = [] | ||
| if enable_default_loggers: | ||
| self.dp_shared_loggers.append( | ||
| PrometheusStatLogger(vllm_config, engine_idxs)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Makes sense, addressed in https://github.com/vllm-project/vllm/pull/20952/files#r2271790049
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @eicherseiji, no worries about the delay. I have one remaining comment
vllm/v1/metrics/loggers.py
Outdated
| self.dp_shared_loggers = [] | ||
| if enable_default_loggers: | ||
| self.dp_shared_loggers.append( | ||
| PrometheusStatLogger(vllm_config, engine_idxs)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the above self.dp_shared_loggers code can also be removed now? It's not used right?
Signed-off-by: Seiji Eicher <seiji@anyscale.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated tests.
vllm/v1/metrics/loggers.py
Outdated
| self.dp_shared_loggers = [] | ||
| if enable_default_loggers: | ||
| self.dp_shared_loggers.append( | ||
| PrometheusStatLogger(vllm_config, engine_idxs)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yup, thanks.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @eicherseiji
|
@eicherseiji looks like there is a test failure, presumably test needs updating: https://buildkite.com/vllm/ci/builds/29369#0199110d-18c2-49ab-9820-6c9995a41de8 |
Signed-off-by: Seiji Eicher <seiji@anyscale.com>
|
Thanks @njhill! Fixed. |
custom_stat_loggers extend default logger list
…vllm-project#20952) Signed-off-by: Seiji Eicher <seiji@anyscale.com> Signed-off-by: Seiji Eicher <58963096+eicherseiji@users.noreply.github.com> Co-authored-by: Nick Hill <nhill@redhat.com>
…vllm-project#20952) Signed-off-by: Seiji Eicher <seiji@anyscale.com> Signed-off-by: Seiji Eicher <58963096+eicherseiji@users.noreply.github.com> Co-authored-by: Nick Hill <nhill@redhat.com>
…vllm-project#20952) Signed-off-by: Seiji Eicher <seiji@anyscale.com> Signed-off-by: Seiji Eicher <58963096+eicherseiji@users.noreply.github.com> Co-authored-by: Nick Hill <nhill@redhat.com>
…vllm-project#20952) Signed-off-by: Seiji Eicher <seiji@anyscale.com> Signed-off-by: Seiji Eicher <58963096+eicherseiji@users.noreply.github.com> Co-authored-by: Nick Hill <nhill@redhat.com> Signed-off-by: xuebwang-amd <xuebwang@amd.com>
…vllm-project#20952) Signed-off-by: Seiji Eicher <seiji@anyscale.com> Signed-off-by: Seiji Eicher <58963096+eicherseiji@users.noreply.github.com> Co-authored-by: Nick Hill <nhill@redhat.com> Signed-off-by: xuebwang-amd <xuebwang@amd.com>
Essential Elements of an Effective PR Description Checklist
supported_models.mdandexamplesfor a new model.Purpose
Problem: The current implementation forces users to choose between default loggers or custom loggers, but not both. It's possible to recreate the default logger list (and append a custom logger), but this creates a manual maintenance effort if the default loggers change.
Based on discussion in #14661, I see this was deprioritized to start, but thought it worth raising again as a minimally invasive ergonomics change. Would love to get thoughts on this. Thanks!
Implements feature request #17702.
Ref:
Test Plan
pytest -vs v1/metrics/test_engine_logger_apis.pyTest Result
(Optional) Documentation Update