Adding support for VLLM server #495
Conversation
Please post a video demo in Slack. Maybe update the branch so lint is resolved?
@@ -71,6 +71,41 @@ def create_tgi_build_dir(config: TrussConfig, build_dir: Path):
     supervisord_filepath.write_text(supervisord_contents)
 
 
+def create_vllm_build_dir(config: TrussConfig, build_dir: Path):
+    server_endpoint_config = {
nit: constant should be all caps
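For illustration, the change the reviewer is asking for looks like the following. The actual contents of `server_endpoint_config` are not fully visible in this diff, so the key and value below are placeholders:

```python
# Placeholder illustration of the reviewer's nit (PEP 8 constant naming).
# The real contents of server_endpoint_config are not visible in this diff.

# Before: a fixed mapping with a lowercase, variable-style name.
server_endpoint_config = {"predict_endpoint": "/generate"}

# After: hoisted to module level and renamed in ALL_CAPS to signal a constant.
SERVER_ENDPOINT_CONFIG = {"predict_endpoint": "/generate"}
```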
+    )
+    nginx_template = read_template_from_fs(TEMPLATES_DIR, "vllm/proxy.conf.jinja")
+
+    dockerfile_content = dockerfile_template.render(hf_access_token=hf_access_token)
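Piecing the visible fragments together, here is a minimal, self-contained sketch of what `create_vllm_build_dir` might do. Only the function signature, the `vllm/proxy.conf.jinja` template path, and the `dockerfile_template.render(hf_access_token=...)` call appear in the diff itself; the Dockerfile template name, the `read_template_from_fs` stand-in, the `TEMPLATES_DIR` location, and the output filenames are assumptions:

```python
from pathlib import Path

from jinja2 import Environment, FileSystemLoader

TEMPLATES_DIR = Path("truss/templates")  # assumption: location of Truss templates


def read_template_from_fs(templates_dir: Path, template_name: str):
    """Stand-in for Truss's internal template loader."""
    env = Environment(loader=FileSystemLoader(str(templates_dir)))
    return env.get_template(template_name)


def create_vllm_build_dir(config, build_dir: Path) -> None:
    """Sketch: render the vLLM server's build assets into build_dir."""
    build_dir.mkdir(parents=True, exist_ok=True)

    # Assumption: the token is read off the TrussConfig / secrets; only the
    # render kwarg hf_access_token is visible in the diff.
    hf_access_token = getattr(config, "hf_access_token", None)

    # Render the nginx proxy config that fronts the vLLM server.
    nginx_template = read_template_from_fs(TEMPLATES_DIR, "vllm/proxy.conf.jinja")
    (build_dir / "proxy.conf").write_text(nginx_template.render())

    # Render the Dockerfile; this template filename is a guess.
    dockerfile_template = read_template_from_fs(TEMPLATES_DIR, "vllm/vllm.Dockerfile.jinja")
    dockerfile_content = dockerfile_template.render(hf_access_token=hf_access_token)
    (build_dir / "Dockerfile").write_text(dockerfile_content)
```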
We should write a helper function at this point; we've copied code like this 7 times now. I can do that in a follow-up.
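A possible shape for that follow-up helper, folding the repeated render-and-write pattern into one function (all names here are hypothetical):

```python
from pathlib import Path

from jinja2 import Environment, FileSystemLoader


def render_template_to_build_dir(
    templates_dir: Path,
    template_name: str,
    build_dir: Path,
    output_name: str,
    **render_kwargs,
) -> None:
    """Render a Jinja template and write the result into the build directory."""
    env = Environment(loader=FileSystemLoader(str(templates_dir)))
    template = env.get_template(template_name)
    (build_dir / output_name).write_text(template.render(**render_kwargs))


# Example: the copied blocks would collapse to single calls, e.g.
# render_template_to_build_dir(TEMPLATES_DIR, "vllm/proxy.conf.jinja",
#                              build_dir, "proxy.conf")
# render_template_to_build_dir(TEMPLATES_DIR, "vllm/vllm.Dockerfile.jinja",
#                              build_dir, "Dockerfile",
#                              hf_access_token=hf_access_token)
```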
* Cleanup old asyncio threadpool settings.
* Add `version.txt` into build to prevent cache migrations (#490)
* added hf token to dockerfile
* add prints
* move hf token to the part where its needed
* add version.txt
* added version.txt
* remove unnecessary files
* adding files back
* reverting
* Adding `allow_patterns` and `ignore_patterns` to the Hugging Face cache. (#480)
* remove assert
* both ignore and allow working
* fix for verbose
* remove prints
* Fix streaming issues.
* Added documentation for HF caching (#492)
* added hf_cache docs
* fix ticks
* Update configuration.md
* clean up Truss docs (#491)
* clean up Truss docs
* fix links and lints
* lint
* Comment updates.
* Bump version.
* Controlling supervisord retries (#496)
* update live reload docs (#494)
* update live reload docs
* remove leading space
* remove dead links from flan-t5 readme
* Update README.md
* Removing extraneous file (#498)
* Enable Hugging Face secrets during build from Truss (#499)
* added hf token to dockerfile
* add prints
* move hf token to the part where its needed
* successfully mounting secrets
* update cache warmer to grab secret
* match data dir copy
* bump pyproject
* add os to cache_warmer
* bump pyproject
* add is_trusted
* revert version
* change names to be hf_access_token
* rename is_trusted and use Path
* Adding support for VLLM server (#495)

---------

Co-authored-by: Sidharth Shanker <sid.shanker@baseten.co>
Co-authored-by: Varun Shenoy <vnshenoy@stanford.edu>
Co-authored-by: Philip Kiely - Baseten <98474633+philipkiely-baseten@users.noreply.github.com>
Co-authored-by: joostinyi <63941848+joostinyi@users.noreply.github.com>
This PR adds support for deploying models with the vLLM server.
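As a rough usage sketch, not confirmed by the visible diff: once merged, producing a vLLM build directory could look something like this, assuming `TrussConfig.from_yaml` as the config loader and `truss.truss_config` as its module path:

```python
from pathlib import Path

from truss.truss_config import TrussConfig  # assumption: module path

# Load the user's Truss config; assumption: a vLLM model server is selected
# somewhere in config.yaml (the exact keys are not visible on this PR page).
config = TrussConfig.from_yaml(Path("config.yaml"))

build_dir = Path("/tmp/vllm_build")
create_vllm_build_dir(config, build_dir)  # the function added in this PR

# build_dir should now contain the rendered Dockerfile and nginx proxy.conf
# used to build and run the vLLM server image.
```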