vllm inference plugin #2967
Signed-off-by: Daniel Sola <daniel.sola@union.ai>
Codecov Report
All modified and coverable lines are covered by tests ✅
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #2967      +/-   ##
==========================================
+ Coverage   75.71%   76.45%   +0.73%
==========================================
  Files         214      200      -14
  Lines       21598    20922     -676
  Branches     2693     2694       +1
==========================================
- Hits        16352    15995     -357
+ Misses       4489     4202     -287
+ Partials      757      725      -32
☔ View full report in Codecov by Sentry.
This is huge!
plugins/flytekit-inference/flytekitplugins/inference/vllm/serve.py
Signed-off-by: Daniel Sola <daniel.sola@union.ai>
        mem: str = "10Gi",
    ):
        """
        Initialize NIM class for managing a Kubernetes pod template.
Suggested change:
-         Initialize NIM class for managing a Kubernetes pod template.
+         Initialize VLLM class for managing a Kubernetes pod template.
lovely!
This would work really well with actors on Union.
* Store protos in local cache (#3022)
  * Store proto obj instead of model Literal in local cache
  * Remove unused file
  Signed-off-by: Eduardo Apolinario <eapolinario@users.noreply.github.com>
  Co-authored-by: Eduardo Apolinario <eapolinario@users.noreply.github.com>
* Bump aiohttp from 3.9.5 to 3.10.11 (#3018)
  Bumps [aiohttp](https://github.com/aio-libs/aiohttp) from 3.9.5 to 3.10.11.
  - [Release notes](https://github.com/aio-libs/aiohttp/releases)
  - [Changelog](https://github.com/aio-libs/aiohttp/blob/master/CHANGES.rst)
  - [Commits](aio-libs/aiohttp@v3.9.5...v3.10.11)
  ---
  updated-dependencies:
  - dependency-name: aiohttp
    dependency-type: indirect
  ...
  Signed-off-by: dependabot[bot] <support@github.com>
  Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
* Fix bug in FlyteDirectory.listdir on local files (#2926)
  * Fix issue in FlyteDirectory.listdir
    Fixes flyteorg/flyte#6005
  * Added test
  * Run make lint
  Signed-off-by: Pim de Haan <pim@cusp.ai>
  Signed-off-by: Eduardo Apolinario <eapolinario@users.noreply.github.com>
  Co-authored-by: Eduardo Apolinario <eapolinario@users.noreply.github.com>
* Fix unit tests in airflow plugin (#3024)
  Signed-off-by: Kevin Su <pingsutw@apache.org>
* fix: Fix resource meta typos for async agent (#3023)
  Signed-off-by: JiaWei Jiang <waynechuang97@gmail.com>
* fix: format commands output (#3026)
* Fix pydantic basemodel default input (#3013)
  * Fix pydantic default input
  * add pydantic integration test
  * Use duck typing by Thomas's advice
  * lint
  Signed-off-by: Future-Outlier <eric901201@gmail.com>
  Co-authored-by: Thomas J. Fan <thomasjpfan@gmail.com>
* [BUG] Open FlyteFile from remote path (#2991)
  * fix: Open FlyteFile from remote path
  * Add integration test
  * refactor: Use ctx as param instead of recreation
  * refactor: Clean test logic
    1. Remove redundant prints
    2. Use `mock.patch.dict` to setup `os.environ` for the current test fn
       * Avoid contaminating other tests running in the same process
  * refactor: Setup local path and downloader in constructor
  * refactor: Move SimpleFileTransfer to an utility file
  * Remove redundant env var setup
    Please refer to #3001
  * test: Add another ff use case
    Create ff in one task pod and read it in another task pod.
  Signed-off-by: JiaWei Jiang <waynechuang97@gmail.com>
* vllm inference plugin (#2967)
  * vllm inference plugin
  * fixed default value
  Signed-off-by: Daniel Sola <daniel.sola@union.ai>
* Add poetry to image spec (#3025)
  * Add poetry to image spec
  * Add stricter check
  Signed-off-by: Thomas J. Fan <thomasjpfan@gmail.com>
* [test] Add integration test for accessing sd sttr in dc (#2969)
  * test: Add integration test for attr access of sd
  * Correct file path
  * test: Support interaction with minio s3 bucket
    1. Upload a local parquet file to minio s3 bucket
    2. Access StructuredDataset attr from a dataclass
    3. Open StructuredDataset from a remote path
  * Delete an unmerged integration test
  * Try imagespec with commit sha of corresponding fix
  * Remove redundant test
  * Remove default_factory and create sd dc from input uri
  * refactor: Clean test logic
    1. Remove redundant prints
    2. Use `mock.patch.dict` to setup `os.environ` for the current test fn
       * Avoid contaminating other tests running in the same process
  * Remove redundant minio env var setup and add test comments
  * Support uploading tmp pqt file
  * Udpate deprecated module
  * Remove redundant and unused imports
  Signed-off-by: JiaWei Jiang <waynechuang97@gmail.com>

Signed-off-by: Eduardo Apolinario <eapolinario@users.noreply.github.com>
Signed-off-by: dependabot[bot] <support@github.com>
Signed-off-by: Pim de Haan <pim@cusp.ai>
Signed-off-by: Kevin Su <pingsutw@apache.org>
Signed-off-by: JiaWei Jiang <waynechuang97@gmail.com>
Signed-off-by: Future-Outlier <eric901201@gmail.com>
Signed-off-by: Daniel Sola <daniel.sola@union.ai>
Signed-off-by: Thomas J. Fan <thomasjpfan@gmail.com>
Co-authored-by: Eduardo Apolinario <653394+eapolinario@users.noreply.github.com>
Co-authored-by: Eduardo Apolinario <eapolinario@users.noreply.github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Pim de Haan <pimdehaan@gmail.com>
Co-authored-by: Kevin Su <pingsutw@apache.org>
Co-authored-by: 江家瑋 <36886416+JiangJiaWei1103@users.noreply.github.com>
Co-authored-by: V <0426vincent@gmail.com>
Co-authored-by: Han-Ru Chen (Future-Outlier) <eric901201@gmail.com>
Co-authored-by: Thomas J. Fan <thomasjpfan@gmail.com>
Co-authored-by: Daniel Sola <40698988+dansola@users.noreply.github.com>
* vllm inference plugin
* fixed default value

Signed-off-by: Daniel Sola <daniel.sola@union.ai>
Signed-off-by: Shuying Liang <shuying.liang@gmail.com>
Why are the changes needed?
A vLLM addition to the existing flytekitplugins-inference plugin, which already supports NIM and Ollama.
What changes were proposed in this pull request?
A vLLM plugin that lets you easily create a pod template that serves a vLLM model in an init container for a Flyte task. The user passes a Hugging Face secret name and the Hugging Face model they want to serve.
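To make the idea concrete, here is a minimal, standard-library-only sketch of the kind of pod spec such a template might describe. It is not the plugin's API: the function name `build_vllm_pod_spec`, the Secret key `token`, the container names, the port, and the example model are all hypothetical, and only the `10Gi` memory default comes from the diff under review. The sketch assumes the public `vllm/vllm-openai` image, whose server accepts `--model` and `--port`, and a Kubernetes version that supports restartable (sidecar) init containers.

```python
from typing import Any, Dict


def build_vllm_pod_spec(
    model: str,
    hf_secret_name: str,
    mem: str = "10Gi",  # default taken from the diff in this PR
    port: int = 8000,   # assumption: port chosen for illustration
) -> Dict[str, Any]:
    """Return a Kubernetes-style pod spec (plain dict) with a vLLM init container.

    Hypothetical helper for illustration; the real plugin may differ.
    """
    vllm_container = {
        "name": "vllm",
        "image": "vllm/vllm-openai",
        # Arguments passed to the vLLM OpenAI-compatible server.
        "args": ["--model", model, "--port", str(port)],
        "env": [
            {
                # Token used to download the model from Hugging Face.
                "name": "HUGGING_FACE_HUB_TOKEN",
                "valueFrom": {
                    # Assumption: the token lives under the key "token".
                    "secretKeyRef": {"name": hf_secret_name, "key": "token"}
                },
            }
        ],
        "ports": [{"containerPort": port}],
        "resources": {"requests": {"memory": mem}, "limits": {"memory": mem}},
        # A restartable init container keeps serving while the task runs.
        "restartPolicy": "Always",
    }
    return {
        "initContainers": [vllm_container],
        # Assumption: the task's own container is named "primary".
        "containers": [{"name": "primary"}],
    }


spec = build_vllm_pod_spec(model="google/gemma-2b-it", hf_secret_name="hf-token")
print(spec["initContainers"][0]["args"])
```

In this sketch the task container would reach the model server over localhost on the chosen port, since both containers share the pod's network namespace.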
How was this patch tested?
Unit tests and running a remote workflow from the README.
Setup process
Screenshots
Check all the applicable boxes
Related PRs
Docs link