
Eval local HF models with flag, add LLaMA and Alpaca #1505

Merged: 55 commits from local-hf-models into main on Jun 30, 2023

Conversation

julian-q (Contributor) commented Apr 24, 2023

Addresses #1486 and adds LLaMA and Alpaca

This PR allows you to evaluate any HuggingFace model you have stored on the machine where you are running HELM.

Instructions

You have two options for telling HELM where your model is stored:

  1. Specify the path(s) of your local model(s) via the command-line flag --enable-local-huggingface-models <path 1> [<path 2> ...]
  2. Place the model (or a symlink to it) in the default directory. Currently, this is ./huggingface_models, which is set by the constant LOCAL_HUGGINGFACE_MODEL_DIR in huggingface_model_registry.py. This only works for local models that are already listed in models.py, since they need to be in ALL_MODELS in order to create RunSpecs. Currently I've listed huggingface/llama-7b and huggingface/alpaca-7b.

Then, you can refer to your model in any run spec as huggingface/<model_name>, where <model_name> either comes from models.py or is the name of the directory that directly contains the model in the path passed to --enable-local-huggingface-models. For example, if I pass --enable-local-huggingface-models /home/quevedo/llama-13b, I can refer to this model in a run spec as huggingface/llama-13b.
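A full invocation might then look like the following (a sketch only; apart from --enable-local-huggingface-models, the helm-run flags shown here, --run-specs, --suite, and --max-eval-instances, are assumed from the standard HELM CLI and may differ in your version):

    # Evaluate the local LLaMA-13B checkout on a hypothetical MMLU run spec.
    # The directory name (llama-13b) becomes the model name under the huggingface/ namespace.
    helm-run \
        --enable-local-huggingface-models /home/quevedo/llama-13b \
        --run-specs "mmlu:subject=philosophy,model=huggingface/llama-13b" \
        --suite local-llama-test \
        --max-eval-instances 10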

Feedback request

Making the command-line option optional led to some messy code. Is there any way we can avoid having to recreate the HuggingFaceModelConfig multiple times throughout the code? Maybe I can register the model beforehand, as if the user had registered it with the flag. Now we pre-register local models from models.py in the _huggingface_model_registry dict.
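A rough sketch of what that pre-registration could look like (not the PR's exact code; register_local_model is a hypothetical helper, while LOCAL_HUGGINGFACE_MODEL_DIR, HuggingFaceModelConfig, and _huggingface_model_registry are names used in this PR):

    import os
    from dataclasses import dataclass
    from typing import Dict, Optional

    LOCAL_HUGGINGFACE_MODEL_DIR = "./huggingface_models"

    @dataclass(frozen=True)
    class HuggingFaceModelConfig:
        namespace: Optional[str]
        model_name: str
        revision: Optional[str] = None
        path: Optional[str] = None  # None means "load from the Hugging Face Hub"

    _huggingface_model_registry: Dict[str, HuggingFaceModelConfig] = {}

    def register_local_model(model_name: str, path: Optional[str] = None) -> None:
        """Build the config once at registration time so it is not recreated later."""
        resolved_path = path or os.path.join(LOCAL_HUGGINGFACE_MODEL_DIR, model_name)
        _huggingface_model_registry[f"huggingface/{model_name}"] = HuggingFaceModelConfig(
            namespace="huggingface", model_name=model_name, path=resolved_path
        )

    # Pre-register the local models already listed in models.py.
    for name in ["llama-7b", "alpaca-7b"]:
        register_local_model(name)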

DavdGao added a commit to DavdGao/helm that referenced this pull request Apr 27, 2023
julian-q marked this pull request as ready for review on May 9, 2023.
yifanmai (Collaborator) left a comment:

Looks promising!

Resolved review thread: src/helm/proxy/retry.py
@@ -10,6 +10,8 @@
FULL_FUNCTIONALITY_TEXT_MODEL_TAG,
)

# The path where local HuggingFace models should be downloaded or symlinked, e.g. ./helm-models/llama-7b
LOCAL_MODEL_DIR = "./helm-models"
Collaborator comment:

"./huggingface_models"?

Reasoning:

  • _ for consistency with prod_env and benchmark_output
  • huggingface for clarity that this only supports huggingface format
  • Don't need helm because we are already in a HELM working directory

Model(
group="local",
creator_organization="Meta",
name="local/llama-7b",
Collaborator comment:

I think these should all be group="huggingface" and name="huggingface/llama-7b" for consistency.

I would like this to be meta/llama-7b but unfortunately the pre-existing code base ties the group to the "client" used to serve the model, which is why we have huggingface/gpt2 instead of openai/gpt2 or together/t5-11b instead of google/t5-11b. I would like to make everything be the correct group eventually, but for now we should be consistent with existing conventions. It's also not obvious to me that "local" means "local Hugging Face" as opposed to "local some other framework".

Also unfortunately these model names are annoying to change because we have to do a MongoDB migration every time we change one of these.

cc @percyliang for opinions about naming
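Concretely, the suggested entry would look roughly like this (a sketch mirroring the diff hunk above in src/helm/proxy/models.py; remaining fields such as display_name, description, and tags are omitted here):

    Model(
        group="huggingface",
        creator_organization="Meta",
        name="huggingface/llama-7b",
    )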

Resolved review threads: src/helm/proxy/models.py (×5), src/helm/proxy/clients/huggingface_client.py, src/helm/proxy/clients/huggingface_model_registry.py
DavdGao commented May 10, 2023:

I notice that in HuggingFaceServer, all the models are loaded on "cuda:0". Since we will run the model locally (maybe with --num-threads > 1), maybe it's better to load the models onto different GPUs?
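An illustrative sketch of that idea (not part of this PR; assumes PyTorch is installed): choose a device per server instance instead of hard-coding cuda:0.

    import torch

    def pick_device(instance_index: int) -> str:
        """Round-robin a model instance onto one of the visible GPUs, or fall back to CPU."""
        num_gpus = torch.cuda.device_count()
        if num_gpus == 0:
            return "cpu"
        return f"cuda:{instance_index % num_gpus}"

    # e.g. with 4 GPUs, the sixth server instance would land on cuda:1
    device = pick_device(5)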

JosselinSomervilleRoberts (Contributor) left a comment:

What's the status of this? I know we said we wanted to move fast on adding new models, but maybe I should wait for the changes to be made and the PR to be merged before diving deep into this? @yifanmai, what do you think?

Comment on lines 71 to 73
elif organization == "local":
# HACK since we want to use the huggingface sqlite file. TODO avoid this
client = HuggingFaceClient(cache_config=cache_config)
Contributor comment:

Are we not planning to have other local models than HuggingFace btw?

Resolved review threads: src/helm/proxy/clients/auto_client.py, src/helm/proxy/clients/huggingface_client.py (×3)
@@ -28,6 +30,9 @@ class HuggingFaceModelConfig:

If None, use the default revision."""

path: Optional[str] = None
"""Local path to the Hugging Face model weights"""
Contributor comment:

Maybe add the default path here: ./huggingface_models

julian-q (Author):

Thanks! I added a comment explaining how this parameter is set and what it means:)

julian-q (Author):

I don't set it directly to the local path by default (like path: str = "./huggingface_models") because None means to load the model from the Hub instead.
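A minimal sketch of the resulting loading behavior (assuming the transformers library; load_model_and_tokenizer is a hypothetical helper, and the config field names follow the HuggingFaceModelConfig sketch earlier in this thread):

    from transformers import AutoModelForCausalLM, AutoTokenizer

    def load_model_and_tokenizer(config):  # config: HuggingFaceModelConfig
        # path set  -> load weights from the local directory
        # path None -> download from the Hugging Face Hub by model name
        source = config.path if config.path is not None else config.model_name
        tokenizer = AutoTokenizer.from_pretrained(source)
        model = AutoModelForCausalLM.from_pretrained(source)
        return model, tokenizer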

Resolved review thread: src/helm/proxy/clients/huggingface_model_registry.py
model_name = match.group("model_name")
assert model_name
return HuggingFaceModelConfig(
namespace="local",
Contributor comment:

Why don't we put something more explicit? I personally feel like "None" is not super clear. Could we maybe put something like "internal" or "local-model"?

Resolved review thread: src/helm/proxy/retry.py

def _get_singleton_server(model_config: HuggingFaceModelConfig) -> HuggingFaceServer:
global _servers_lock
Collaborator comment:

nit: docstring should go after the method definition. See PEP 257

def _get_singleton_server(model_config: HuggingFaceModelConfig) -> HuggingFaceServer:
    """Lookup or create a new HuggingFaceServer that will be shared among all threads.

    When --num-threads > 1, multiple threads will attempt to instantiate etc."""
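For reference, a hedged sketch of the shared-server pattern under discussion (not the PR's exact code; the HuggingFaceServer stub and the choice of cache key are assumptions):

    import threading
    from typing import Dict

    class HuggingFaceServer:  # stand-in for the real server class in huggingface_client.py
        def __init__(self, model_config):
            self.model_config = model_config

    _servers_lock = threading.Lock()
    _servers: Dict[str, HuggingFaceServer] = {}

    def _get_singleton_server(model_config) -> HuggingFaceServer:
        """Lookup or create a HuggingFaceServer shared among all threads.

        With --num-threads > 1, the lock ensures only one thread builds the
        (expensive) server for a given model; the rest reuse it.
        """
        key = model_config.model_name  # hypothetical cache key
        with _servers_lock:
            if key not in _servers:
                _servers[key] = HuggingFaceServer(model_config)
            return _servers[key]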

)
else:
raise Exception(f"Unknown HuggingFace model: {model}")

return self.model_server_instances[model]
else:
Collaborator comment:

The else is still here...

yifanmai merged commit 3cd8573 into main on Jun 30, 2023.
yifanmai deleted the local-hf-models branch on June 30, 2023.
julian-q (Author) commented Jul 3, 2023:

Thank you for helping get this merged, @yifanmai!

patrickc3000:
Is --enable-local-huggingface-models still available in the latest version of HELM?

yifanmai (Collaborator):
Hi, it will be included in v0.2.3, which should be released later this week.
