Configurable models for NeurIPS Efficiency Challenge #1861

Merged: 7 commits into main from yifanmai/fix-neurips-config, Oct 3, 2023

Conversation

@yifanmai (Collaborator) commented Sep 26, 2023

This supports running a model for the NeurIPS Efficiency Challenge with a user-configurable model name, HTTP service URL, window service, and tokenizer.


For models that use a built-in WindowService:

prod_env/model_deployments.yaml:

model_deployments:
  - name: neurips/my-pythia-model
    window_service_spec:
      class_name: "helm.benchmark.window_services.gptneox_window_service.GPTNeoXWindowService"
      args: {}
    client_spec:
      class_name: "helm.proxy.clients.http_model_client.HTTPModelClient"
      args: {
        base_url: "http://localhost:2345"
      }

The following models are supported, with the corresponding class_name to use inside window_service_spec (a sketch of how such a spec is resolved follows this list):

  • LLaMA: helm.benchmark.window_services.llama_window_service.LlamaWindowService
  • Llama 2: helm.benchmark.window_services.llama_window_service.Llama2WindowService
  • Red Pajama Base (not instruction tuned models): helm.benchmark.window_services.gptneox_window_service.GPTNeoXWindowService
  • MPT: helm.benchmark.window_services.gptneox_window_service.GPTNeoXWindowService
  • OPT: helm.benchmark.window_services.opt_window_service.OPTWindowService
  • Bloom: helm.benchmark.window_services.bloom_window_service.BloomWindowService
  • GPT Neo, J, NeoX, Pythia: helm.benchmark.window_services.gptneox_window_service.GPTNeoXWindowService
  • GPT2: helm.benchmark.window_services.gpt2_window_service.GPT2WindowService
  • T5 (not Flan-T5): helm.benchmark.window_services.t511b_window_service.T511bWindowService
  • UL2: helm.benchmark.window_services.ul2_window_service.UL2WindowService
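
Under the hood, a window_service_spec (and client_spec) simply names a class to instantiate with the given args. A minimal sketch of that idea, for intuition only (this is not HELM's actual spec-resolution code, and the helper name is made up):

import importlib
from typing import Any, Dict


def create_object_from_spec(class_name: str, args: Dict[str, Any]) -> Any:
    # Split "some.module.path.ClassName" into module path and class name,
    # import the module, look up the class, and instantiate it with args.
    module_name, _, cls_name = class_name.rpartition(".")
    cls = getattr(importlib.import_module(module_name), cls_name)
    return cls(**args)


# Hypothetical usage mirroring the YAML above (the real window services
# take additional constructor arguments that HELM supplies at runtime):
# window_service = create_object_from_spec(
#     "helm.benchmark.window_services.gptneox_window_service.GPTNeoXWindowService", {}
# )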

For models that use Hugging Face AutoTokenizer:

prod_env/model_deployments.yaml:

model_deployments:
  - name: neurips/my-falcon-7b-model
    tokenizer_name: "tiiuae/falcon-7b"
    sequence_length: 2048
    window_service_spec:
      class_name: "helm.benchmark.window_services.huggingface_window_service.HuggingFaceWindowService"
      args: {}
    client_spec:
      class_name: "helm.proxy.clients.http_model_client.HTTPModelClient"
      args: {
        base_url: "http://localhost:2345"
      }

Change tokenizer_name and sequence_length accordingly. Refer to the Hugging Face model card for the correct values.

If sequence_length is not set, it will be auto-inferred from the model's Hugging Face AutoTokenizer; this can produce incorrect values, because many AutoTokenizers on the Hugging Face Hub carry incorrect metadata.

The following models are supported, with corresponding tokenizer_name:

  • Falcon: tiiuae/falcon-7b
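
Before writing sequence_length into the config, one way to sanity-check the value is to look at the tokenizer's own metadata with the transformers library (a quick sketch; note that model_max_length is exactly the metadata auto-inference relies on, so it can be missing or set to a huge sentinel value, and the model card should win in case of disagreement):

from transformers import AutoTokenizer

# Load the tokenizer referenced by tokenizer_name in the YAML above.
tokenizer = AutoTokenizer.from_pretrained("tiiuae/falcon-7b")

# This is the value auto-inference would pick up; compare it against the
# context length documented on the model card before trusting it.
print(tokenizer.model_max_length)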

For any models not listed, you can fall back to the HTTP service tokenizer:

prod_env/model_deployments.yaml:

model_deployments:
  - name: neurips/my-model
    tokenizer_name: neurips/my-tokenizer
    max_sequence_length: 2048
    client_spec:
      class_name: "helm.proxy.clients.http_model_client.HTTPModelClient"
      args: {
        base_url: "http://localhost:2345"
      }

prod_env/tokenizer_configs.yaml:

tokenizer_configs:
  - name: neurips/my-tokenizer
    tokenizer_spec:
      class_name: "helm.proxy.clients.http_model_client.HTTPModelClient"
      args: {
        base_url: "http://localhost:1234"
      }
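
In this fallback setup, the tokenizer_name in model_deployments.yaml refers to an entry in tokenizer_configs.yaml, so the two names have to line up. A small cross-check sketch (illustrative only; it assumes both files live under prod_env/, and deployments that use built-in or Hugging Face tokenizers will simply not have a matching entry):

import yaml

with open("prod_env/model_deployments.yaml") as f:
    deployments = yaml.safe_load(f)["model_deployments"]
with open("prod_env/tokenizer_configs.yaml") as f:
    configured = {t["name"] for t in yaml.safe_load(f)["tokenizer_configs"]}

for deployment in deployments:
    # Print each deployment, the tokenizer it points at, and whether that
    # tokenizer is defined in tokenizer_configs.yaml.
    tokenizer_name = deployment.get("tokenizer_name")
    print(deployment["name"], tokenizer_name, tokenizer_name in configured)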

@yifanmai (Collaborator, Author) commented Sep 26, 2023

cc @drisspg @msaroufim for the NeurIPS Efficiency Challenge.

cc @aniketmaurya This will eventually allow the user to configure parameters for the Lit-GPT client directly.

client_spec: ClientSpec
"""Specification for instantiating the client for this model deployment."""

max_sequence_length: Optional[int]
"""Maximum equence length for this model deployment."""
model_name: Optional[str] = None
Contributor:

Why is this moved down?

Collaborator (Author):

The ordering should be that all the required parameters come first, then all the optional parameters with default arguments.
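
This is a constraint of Python dataclasses themselves: fields without defaults must be declared before fields with defaults, otherwise class definition fails with a TypeError. A small standalone example of the ordering being discussed (the field names are illustrative, not the actual ModelDeployment definition):

from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ExampleDeployment:
    # Required fields (no default value) must come first.
    name: str
    max_sequence_length: Optional[int]

    # Fields with defaults must come after all required fields; declaring
    # them earlier raises "non-default argument follows default argument".
    model_name: Optional[str] = None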

client_spec: ClientSpec
"""Specification for instantiating the client for this model deployment."""

max_sequence_length: Optional[int]
"""Maximum equence length for this model deployment."""
model_name: Optional[str] = None
Contributor:

I'm wondering if we should put an example in the docstring so people have a sense of what the difference between name and model_name is, etc.

Collaborator (Author):

Going to defer this until we actually implement the multi-deployments feature, which isn't on the roadmap yet.



def maybe_register_model_metadata_from_base_path(base_path: str) -> None:
    path = os.path.join(base_path, MODEL_METADATA_FILE)
Contributor:

Add docstring?

Collaborator (Author):

Added docstring.
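
For context, the shape of that function with its docstring is roughly the following; this is a hedged reconstruction from the excerpt above, not the exact merged code, and both the MODEL_METADATA_FILE value and the registration helper are assumptions here:

import os

# Assumed filename, by analogy with tokenizer_configs.yaml.
MODEL_METADATA_FILE = "model_metadata.yaml"


def register_model_metadata_from_path(path: str) -> None:
    # Stand-in for HELM's actual registration logic, which reads the YAML
    # file and registers each model's metadata.
    ...


def maybe_register_model_metadata_from_base_path(base_path: str) -> None:
    """Register model metadata from MODEL_METADATA_FILE under base_path, if that file exists."""
    path = os.path.join(base_path, MODEL_METADATA_FILE)
    if os.path.exists(path):
        register_model_metadata_from_path(path)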

try:
    model = get_model(run_spec.adapter_spec.model)
except ValueError:
    # Models registered from configs cannot have expanders applied to them,
Contributor:

ValueError means that the model has not been loaded yet? I was a bit confused by the comment at first, maybe connect the dots a bit more.

Collaborator (Author):

It means the model has not been registered yet. I'll add more docs.
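
Spelled out with comments, the intent is roughly the following self-contained sketch; get_model here is a stand-in that mirrors only the behavior relied on above (raising ValueError for names that have not been registered), not HELM's real implementation:

_REGISTERED_MODELS = {"example/registered-model"}  # illustrative registry contents


def get_model(name: str) -> str:
    # Unknown (unregistered) model names raise ValueError.
    if name not in _REGISTERED_MODELS:
        raise ValueError(f"Model {name} is not registered")
    return name


def lookup_model(model_name: str):
    try:
        # Succeeds only for models that were registered ahead of time.
        return get_model(model_name)
    except ValueError:
        # Models defined only in a user config file are not registered at
        # this point, so fall back instead of failing the run.
        return None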


@dataclass(frozen=True)
class TokenizerConfigs:
    tokenizers: List[TokenizerConfig]
Collaborator (Author):

should this field be tokenizers or tokenizer_configs?

from helm.common.object_spec import ObjectSpec


TOKENIZER_CONFIGS_FILE = "tokenizer_configs.yaml"
Collaborator (Author):

Should this file be tokenizer_configs.yaml or tokenizers.yaml?

Collaborator (Author):

Making this tokenizer_configs.yaml.

name: str
"""Name of the tokenizer."""

tokenizer_spec: TokenizerSpec
Collaborator (Author):

This is tokenizer_spec instead of client_spec because I think that eventually Tokenizers and Clients should be separate classes...
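
Putting the pieces together, the tokenizer_configs.yaml from the PR description maps onto these classes roughly as follows (a simplified, hand-rolled parsing sketch for illustration; HELM's actual dataclasses and config loader differ in the details):

from dataclasses import dataclass
from typing import Any, Dict, List

import yaml


@dataclass(frozen=True)
class TokenizerSpec:
    class_name: str
    args: Dict[str, Any]


@dataclass(frozen=True)
class TokenizerConfig:
    name: str
    tokenizer_spec: TokenizerSpec


def read_tokenizer_configs(path: str) -> List[TokenizerConfig]:
    with open(path) as f:
        raw = yaml.safe_load(f)
    return [
        TokenizerConfig(
            name=entry["name"],
            tokenizer_spec=TokenizerSpec(**entry["tokenizer_spec"]),
        )
        for entry in raw["tokenizer_configs"]
    ]


# configs = read_tokenizer_configs("prod_env/tokenizer_configs.yaml")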

find one with a key matching the missing parameter's name.
If found in constant_bindings, add the corresponding value to args.
If found in provider_bindings, call the corresponding value and add the return values to args.

Contributor:

Could you provide an example or two of usage?

Collaborator (Author):

Added example.

)
return deployment_api_keys[model]

client_spec = inject_object_spec_args(
Contributor:

Can you write some comments on why this injection is needed? My initial impression is that it seems a bit complicated / fancy...

Collaborator (Author):

Added a passage.

Dependency injection is needed here for these reasons:

  1. Different clients have different parameters. Dependency injection provides arguments that match the parameters of the client.
  2. Some arguments, such as the tokenizer, are not static data objects that can be written in the user's configuration file. Instead, they have to be constructed dynamically at runtime.
  3. The providers must be lazily evaluated, because eager evaluation can raise an exception. For instance, some clients do not require an API key, so eagerly fetching the API key from configuration would fail for users who have not configured one. (See the sketch below.)
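
For intuition, a stripped-down version of the injection described above could look like this (a sketch only, not HELM's actual inject_object_spec_args; the names and signature are simplified):

import inspect
from typing import Any, Callable, Dict


def inject_args(
    constructor: Callable,
    explicit_args: Dict[str, Any],
    constant_bindings: Dict[str, Any],
    provider_bindings: Dict[str, Callable[[], Any]],
) -> Dict[str, Any]:
    args = dict(explicit_args)
    for name in inspect.signature(constructor).parameters:
        if name in args:
            continue
        if name in constant_bindings:
            # Constants are static values, e.g. a cache path.
            args[name] = constant_bindings[name]
        elif name in provider_bindings:
            # Providers are called lazily, only when the constructor
            # actually has a parameter with that name (e.g. an API key).
            args[name] = provider_bindings[name]()
    return args


# Hypothetical usage: build the client's kwargs from the spec's args plus bindings.
# client = HTTPModelClient(**inject_args(HTTPModelClient, spec_args, constants, providers))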

@yifanmai (Collaborator, Author) commented:

@msaroufim @drisspg I will merge this soon to unblock other work, but feel free to leave post-merge comments and I'll make requested changes in a follow-up PR.

@drisspg (Collaborator) commented Sep 28, 2023

> @msaroufim @drisspg I will merge this soon to unblock other work, but feel free to leave post-merge comments and I'll make requested changes in a follow-up PR.

Hey, that makes sense. Let's say we wanted all models to use the HTTP tokenizer. I don't think we have had to set up

prod_env/model_deployments.yaml
or
prod_env/tokenizer_configs.yaml

Does the workflow change from starting the LLM service on a local port -> helm-run with some given run_spec.conf?

@JosselinSomervilleRoberts (Contributor) commented:

I thought the description of this PR was really useful; should we add it to a README somewhere so that it's easier for people to add their own models?

@yifanmai (Collaborator, Author) commented Oct 3, 2023

@drisspg The current documented workflow does not change. Basically, if people were using neurips/local before, they can continue to do so.

This new integration provides a new workflow with the following benefits:

  1. The model name can be set by each user individually, so each submission can have a different model name rather than all of them being neurips/local.
  2. It allows running against a different URL or port, so the model can be hosted on a different machine from the HELM machine.
  3. It allows using local tokenizers instead of making an HTTP call for each tokenization request, which should provide some speedup.

@yifanmai (Collaborator, Author) commented Oct 3, 2023

@JosselinSomervilleRoberts I'll move most of this to documentation when this is a little more baked and less experimental. I think that some of this API might still be subject to change.

@yifanmai merged commit 10a27a6 into main on Oct 3, 2023
@yifanmai deleted the yifanmai/fix-neurips-config branch on October 3, 2023 21:26