♻️ Compatibility with vllm main #338
Conversation
The pooling task changes are already part of the embedding PR. There are more upstream changes around the pooling tasks, and I think they should be part of a follow-up PR. I don't want to pile more changes into a PR that is already big enough.

Yeah, I was trying to get a minimal set of changes in so that the tests pass against vllm:main. I think the pooling tasks are non-breaking, I need to get
Based on upstream vllm changes in vllm-project/vllm#20588 Signed-off-by: Prashant Gupta <prashantgupta@us.ibm.com>
Signed-off-by: Prashant Gupta <prashantgupta@us.ibm.com>
```python
if hasattr(Pooler, "from_config_with_defaults"):
    # TODO: remove this when we no longer support
    # vllm version v0.9.2
    self.pooler = Pooler.from_config_with_defaults(
        pooler_config,
        pooling_type=PoolingType.CLS,
        normalize=True,
        softmax=False)
else:
    self.pooler = Pooler.for_embed(
        pooler_config=pooler_config,
        default_pooling_type=PoolingType.CLS,
        default_normalize=True,
        default_softmax=False)
```
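The `hasattr` check in the snippet above is a general feature-detection pattern for straddling a breaking API change without pinning a version. A minimal self-contained sketch of the same idea (the `PoolerV1`/`PoolerV2` classes below are hypothetical stand-ins, not the real vllm `Pooler` API):

```python
class PoolerV1:
    """Stand-in for an older API exposing from_config_with_defaults."""
    @classmethod
    def from_config_with_defaults(cls, pooler_config, **defaults):
        return cls()


class PoolerV2:
    """Stand-in for a newer API exposing for_embed."""
    @classmethod
    def for_embed(cls, pooler_config, **defaults):
        return cls()


def build_pooler(pooler_cls, pooler_config):
    # Feature-detect the available constructor at runtime instead of
    # branching on a version string: older classes get the old call,
    # everything else falls through to the new one.
    if hasattr(pooler_cls, "from_config_with_defaults"):
        return pooler_cls.from_config_with_defaults(pooler_config,
                                                    normalize=True)
    return pooler_cls.for_embed(pooler_config=pooler_config,
                                default_normalize=True)
```

Detecting the method on the class rather than comparing version numbers keeps the shim working even for installs from source, where the version string may not change.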
@maxdebayser I think we just need this piece of code from your PR to enable the vllm:main breaking changes; I've removed all other pooling changes from this PR. I'm fine with waiting for your PR to get in first, in which case I'll rebase and remove this change.
Signed-off-by: Prashant Gupta <prashantgupta@us.ibm.com>
| if "generate" in self.model_config.supported_tasks: | ||
| tasks.extend(["generate"]) |
Ideally we would want this to come from the model directly:

```python
if is_text_generation_model(model):
    supported_tasks.append("generate")
```

but SpyreCausalLM doesn't seem to support it at the moment:

```python
>>> type(self.model)
<class 'vllm_spyre.model_executor.model_loader.spyre.SpyreCausalLM'>
>>> is_text_generation_model(self.model)
False
```
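The fallback discussed here can be sketched as follows: prefer interface-based detection, but trust the tasks declared in the model config when the model class does not pass the check. The helper names and the `is_generation` attribute below are hypothetical illustrations, not the real vllm detection mechanism:

```python
def is_text_generation_model(model) -> bool:
    # Stand-in for vllm's interface check; the real check inspects the
    # model class, which SpyreCausalLM currently does not satisfy.
    return getattr(model, "is_generation", False)


def get_supported_tasks(model, config_supported_tasks):
    # Interface detection first, config declaration as the fallback
    # (mirroring the `"generate" in supported_tasks` branch above).
    tasks = []
    if (is_text_generation_model(model)
            or "generate" in config_supported_tasks):
        tasks.append("generate")
    return tasks


class SpyreLikeModel:
    # Mimics a model that fails the interface check.
    is_generation = False
```

With this shape, a model like `SpyreLikeModel` still reports `"generate"` as long as the config declares it, and the interface check can take over once the model class gains support.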
Signed-off-by: Prashant Gupta <prashantgupta@us.ibm.com>
lgtm! thanks for taking that on!
Changes

- Remove prompt adapter config
- Implement `get_supported_tasks` in model_runner for the online API
Related Issues
fix #336