-
-
Couldn't load subscription status.
- Fork 10.9k
[V1][Metrics] Allow V1 AsyncLLM to use custom logger #14661
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
👋 Hi! Thank you for contributing to the vLLM project. 💬 Join our developer Slack at https://slack.vllm.ai to discuss your PR in #pr-reviews, coordinate on features in #feat- channels, or join special interest groups in #sig- channels. Just a reminder: PRs would not trigger full CI run by default. Instead, it would only run Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging. To run CI, PR reviewers can either: Add 🚀 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, this quite closely matches the V0 interface, but not quite - e.g. the record() method is new, and we're using new classes like IterationStats, and SchedulerStats
Personally, I'm not a fan of having all of this be part of the public API in such an ad-hoc way ... the logger behaviors and the stats classes etc. are all liable to change in future IMO, breaking users of the API
However, if this is something we want to retain from V0, then this PR looks reasonable
vllm/third_party/pynvml.py
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Remove this change from the PR
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is the auto format for imports ordering from isort. Is this formatting not desired?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@liuzijing2014 the formatting change is to an unrelated file so shouldn't be included in the same PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, it is not desired
This code is copied from nvidia-ml-py and as per #12977:
If there are bugfixes in nvidia-ml-py in the future, we can periodically sync the code.
So, we would prefer not to modify this code from the original
We do use isort, but only using pre-commit and in .pre-commit-config.yaml we exclude this directory:
exclude: 'vllm/third_party/.*'
So, for example, if I checkout your branch and do:
$ pre-commit run --all-files isort
this file is not modified because of the exclude
Hope that helps!
vllm/v1/engine/async_llm.py
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about we move all of this into the loggers module so that we just do e.g.
self.stat_loggers = loggers.setup(vllm_config, stat_loggers)
|
Agree with @markmc these interfaces / data structures may still be in flux so we may want to be careful exposing as part of an external API at this stage. If we do it should be clearly documented as unstable. |
@markmc The main motivation is to allow users to have basic logging parity between V0 and V1 (v1 feedback thread ). We have some customized logging backend that we need to log stats to, and switching to V1 breaks the stats logging support for us. @njhill Proper documentation sounds reasonable. Besides inline code comments, is there any other place that you think we need to call this part out? |
|
This pull request has merge conflicts that must be resolved before it can be |
38bf2d6 to
c9b95af
Compare
|
See liuzijing2014#1 for some changes to finish this PR up. Hope that helps! Commits at https://github.com/markmc/vllm/commits/5caa8fbfb00e3a23f4d461fac2f26e059c5efd7e |
In liuzijing2014#1 I added: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@njhill Proper documentation sounds reasonable. Besides inline code comments, is there any other place that you think we need to call this part out?
I don't think so. The main place is in the docstring for the new stat_loggers arg to AsyncLLM (and also the add/remove methods but I would prefer to just omit those for now).
vllm/third_party/pynvml.py
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@liuzijing2014 the formatting change is to an unrelated file so shouldn't be included in the same PR.
vllm/v1/engine/async_llm.py
Outdated
| self.stat_loggers: dict[str, StatLoggerBase] = dict() | ||
| if self.log_stats: | ||
| if logger.isEnabledFor(logging.INFO): | ||
| self.stat_loggers.append(LoggingStatLogger()) | ||
| self.stat_loggers.append(PrometheusStatLogger(vllm_config)) | ||
| if stat_loggers is not None: | ||
| self.stat_loggers = stat_loggers | ||
| else: | ||
| setup_default_loggers(vllm_config, | ||
| logger.isEnabledFor(logging.INFO), | ||
| self.stat_loggers) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Suggest moving all of this logic into the setup function:
self.stat_loggers = setup_default_loggers(vllm_config, log_stats, stat_loggers)Also:
- No need to pass in
logger.isEnabledFor(logging.INFO), that can be done inside the function - It would probably be cleaner to eliminate
self.log_stats. Emptyself.stat_loggerscan serve that purpose (or possiblyNone)
vllm/v1/metrics/loggers.py
Outdated
| PrometheusStatLogger(vllm_config)) | ||
|
|
||
| if enable_stats_log: | ||
| stat_loggers[STANDARD_LOGGING_LOGGER_NAME] = (LoggingStatLogger()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| stat_loggers[STANDARD_LOGGING_LOGGER_NAME] = (LoggingStatLogger()) | |
| stat_loggers[STANDARD_LOGGING_LOGGER_NAME] = LoggingStatLogger() |
vllm/v1/metrics/loggers.py
Outdated
| STANDARD_LOGGING_LOGGER_NAME = "logging" | ||
| PROMETHEUS_LOGGING_LOGGER_NAME = "prometheus" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: unnecessarily long names
| STANDARD_LOGGING_LOGGER_NAME = "logging" | |
| PROMETHEUS_LOGGING_LOGGER_NAME = "prometheus" | |
| STANDARD_LOGGER_NAME = "logging" | |
| PROMETHEUS_LOGGER_NAME = "prometheus" |
vllm/v1/engine/async_llm.py
Outdated
| def add_logger(self, logger_name: str, logger: StatLoggerBase) -> None: | ||
| if not self.log_stats: | ||
| raise RuntimeError( | ||
| "Stat logging is disabled. Set `disable_log_stats=False` " | ||
| "argument to enable.") | ||
| if logger_name in self.stat_loggers: | ||
| raise KeyError(f"Logger with name {logger_name} already exists.") | ||
| self.stat_loggers[logger_name] = logger | ||
|
|
||
| def remove_logger(self, logger_name: str) -> None: | ||
| if not self.log_stats: | ||
| raise RuntimeError( | ||
| "Stat logging is disabled. Set `disable_log_stats=False` " | ||
| "argument to enable.") | ||
| if logger_name not in self.stat_loggers: | ||
| raise KeyError(f"Logger with name {logger_name} does not exist.") | ||
| del self.stat_loggers[logger_name] | ||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are these methods really necessary if you can pass in the dict?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I add them here to keep the interface on parity as V0. Either way could work.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My preference would be to exclude them for now, since the API is different to V0 anyhow. We're trying to avoid as much unnecessary code in V1 as possible.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @liuzijing2014 it looks good now, just have a couple more minor comments.
vllm/v1/engine/async_llm.py
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Move the logger.isEnabledFor(logging.INFO) check inside the method?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@liuzijing2014 WDYT about this suggestion?
|
@njhill FYI |
|
This pull request has merge conflicts that must be resolved before it can be |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@markmc @liuzijing2014 I think this is ready now?
86a1452 to
9f08bd5
Compare
Signed-off-by: Zijing Liu <liuzijing2014@gmail.com> Signed-off-by: Mark McLoughlin <markmc@redhat.com> Signed-off-by: Zijing Liu <liuzijing2014@gmail.com>
More general than passing sub-class types, requiring sub-class constructors to conform to a protocol. Signed-off-by: Mark McLoughlin <markmc@redhat.com> Signed-off-by: Zijing Liu <liuzijing2014@gmail.com>
Signed-off-by: Zijing Liu <liuzijing2014@gmail.com>
Signed-off-by: Nick Hill <nhill@redhat.com> Signed-off-by: Zijing Liu <liuzijing2014@gmail.com>
9f08bd5 to
303e7ef
Compare
) Signed-off-by: Zijing Liu <liuzijing2014@gmail.com> Signed-off-by: Mark McLoughlin <markmc@redhat.com> Signed-off-by: Nick Hill <nhill@redhat.com> Co-authored-by: Mark McLoughlin <markmc@redhat.com> Co-authored-by: Nick Hill <nhill@redhat.com> Signed-off-by: Agata Dobrzyniewicz <adobrzyniewicz@habana.ai>
Content
Update V1 AsynLLM to take in custom logger that extends
StatLoggerBase. This would help to achieve function on parity as V0 regarding customization support for logging.Follow V0 style on customizing loggers:
stat_loggersfrominit.add_loggerandremove_loggerfunctions.Test Plan