Bundling Private Weights from GCP #552
Conversation
Force-pushed from 8d5deb5 to b02228d
examples/vllm-gcs/config.yaml (outdated)

build:
  arguments:
    endpoint: Completions
    model: /app/hf_cache/llama-2-7b
Can we just make the model gs://varuns-llama2-whatever, do the swap for the args under the hood so the user doesn't have to worry about it, and add the hf_cache option below as well?
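As a sketch of what that swap could look like, a small helper could rewrite a gs:// model reference to its local cache path at build time (the function name, cache root, and bucket layout here are illustrative assumptions, not the PR's code):

```python
def resolve_model_arg(model: str, cache_root: str = "/app/hf_cache") -> str:
    """Rewrite a gs://bucket model reference to its local cache path.

    Plain Hugging Face repo ids pass through unchanged, so private HF
    models would keep their current behavior.
    """
    prefix = "gs://"
    if model.startswith(prefix):
        # Only the bucket name determines the cache directory
        bucket_name = model[len(prefix):].split("/")[0]
        return f"{cache_root}/{bucket_name}"
    return model
```

With this, the user-facing config could keep `model: gs://varuns-llama2-whatever` while the rendered vLLM args receive the local path.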
What if the model was a private HF model? What would the expected behavior be then?
@@ -91,7 +99,38 @@ def create_vllm_build_dir(config: TrussConfig, build_dir: Path):
     )
     nginx_template = read_template_from_fs(TEMPLATES_DIR, "vllm/proxy.conf.jinja")

     dockerfile_content = dockerfile_template.render(hf_access_token=hf_access_token)
     (build_dir / "cache_requirements.txt").write_text(spec.requirements_txt)
Let's not do this. I think we should make a cache_requirements.txt in templates/ and just copy it directly. It should have the Google Cloud Storage client and huggingface_hub.
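A hypothetical sketch of what that templates/cache_requirements.txt could contain (package pinning is illustrative; the PR's actual file may differ):

```text
google-cloud-storage
huggingface_hub
```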
Add the hf_transfer package alongside huggingface_hub and set the HF_HUB_ENABLE_HF_TRANSFER=1 environment variable to speed up downloads.
{%- if hf_cache != None %}
COPY ./cache_warmer.py /cache_warmer.py
RUN chmod +x /cache_warmer.py
This line is not necessary since you're doing python3 /cache_warmer.py below. +x is only needed if you want to execute the file directly.
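The distinction can be demonstrated with a scratch script (the /tmp path and "warmed" output are purely illustrative):

```shell
# Create a throwaway script with a shebang line
cat > /tmp/demo_warmer.py <<'EOF'
#!/usr/bin/env python3
print("warmed")
EOF

# 1) Via the interpreter: the execute bit is irrelevant
python3 /tmp/demo_warmer.py

# 2) Directly: requires BOTH chmod +x and the shebang
chmod +x /tmp/demo_warmer.py
/tmp/demo_warmer.py
```

Since the Dockerfile invokes the script through python3, the RUN chmod layer only adds image build time.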
approved for when you add the missing requirements file. Good work!
# Connect to GCS storage
try:
    storage_client = storage.Client.from_service_account_json(key_file)
to optimize later: would be great if we only make this client once and re-use it for all the file downloads.
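One way to sketch that optimization is a small per-key cache around the client constructor (the names here are hypothetical; in cache_warmer.py the factory would be storage.Client.from_service_account_json):

```python
# Hypothetical sketch, not the PR's code: build the client once per
# key file and reuse it for every subsequent file download.
_client_cache = {}

def get_storage_client(key_file, factory):
    """Return a cached client for key_file, creating it on first use.

    `factory` stands in for storage.Client.from_service_account_json so
    the pattern can be shown without a GCS dependency.
    """
    if key_file not in _client_cache:
        _client_cache[key_file] = factory(key_file)
    return _client_cache[key_file]
```

All download calls then share one authenticated client instead of re-reading the service account JSON per file.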
@@ -75,7 +82,14 @@ def create_tgi_build_dir(config: TrussConfig, build_dir: Path):
     supervisord_filepath.write_text(supervisord_contents)


-def create_vllm_build_dir(config: TrussConfig, build_dir: Path):
+def create_vllm_build_dir(
+    config: TrussConfig, build_dir: Path, truss_dir: Path, spec: TrussSpec
nit: spec is unnecessary to add here.
filtered_repo_files = list(
    filter_repo_objects(
        items=list_files(
            repo_id, truss_dir / spec.config.data_dir, revision=revision
You can just use config.data_dir instead of spec and drop the extra arg.
Flow:
1. Place a service_account.json file in your data directory.
2. Specify a gs://... bucket for repo_id under hf_cache.
3. Add google-cloud-storage under requirements.
4. Weights land in the app/hf_cache/{{bucket_name}} directory.

Takes around ~180s to build the image for Llama 2 7B.
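Putting the flow together, a config might look roughly like this (the hf_cache schema and bucket name are assumptions based on the snippets in this conversation, not the PR's exact format):

```yaml
# Hypothetical example; bucket name is illustrative
build:
  arguments:
    endpoint: Completions
    model: gs://my-llama-weights  # resolved to /app/hf_cache/my-llama-weights at build time
hf_cache:
  - repo_id: gs://my-llama-weights
requirements:
  - google-cloud-storage
```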
Next PR will be focused on documenting this feature. We should also rename hf_cache to something possibly more general. Maybe bucket_cache or model_cache? I don't want people to confuse it with external_data.