Configurable models for NeurIPS Efficiency Challenge #1861
Conversation
cc @drisspg @msaroufim for the NeurIPS Efficiency Challenge. cc @aniketmaurya This will eventually allow the user to configure parameters for the Lit-GPT client directly.
```python
client_spec: ClientSpec
"""Specification for instantiating the client for this model deployment."""

max_sequence_length: Optional[int]
"""Maximum sequence length for this model deployment."""
model_name: Optional[str] = None
```
Why is this moved down?
The ordering should be that all the required parameters come first, then all the optional parameters with default arguments.
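For illustration (not code from this PR), a minimal sketch of why that ordering is required for Python dataclasses:

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Example:
    # Required fields (no default) must come first.
    client_spec: str
    max_sequence_length: Optional[int]
    # Fields with default values come last; placing a non-default field after a
    # default one raises "non-default argument follows default argument" when the
    # class is defined.
    model_name: Optional[str] = None
```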
I'm wondering if we should put an example in the docstring so people have a sense of what the difference between name and model_name is, etc.
Going to defer this until we actually implement the multi-deployments feature, which isn't on the roadmap yet.
```python
def maybe_register_model_metadata_from_base_path(base_path: str) -> None:
    path = os.path.join(base_path, MODEL_METADATA_FILE)
```
Add docstring?
Added docstring.
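For readers following along, a sketch of what the function with its docstring might look like; the body beyond the two lines quoted above, the MODEL_METADATA_FILE value, and the register_model_metadata_from_path helper are assumptions rather than the PR's actual code:

```python
import os

MODEL_METADATA_FILE = "model_metadata.yaml"  # illustrative value


def maybe_register_model_metadata_from_base_path(base_path: str) -> None:
    """Register model metadata from the model metadata file under base_path, if it exists."""
    path = os.path.join(base_path, MODEL_METADATA_FILE)
    if os.path.exists(path):
        # Hypothetical helper; the actual registration call in the PR may differ.
        register_model_metadata_from_path(path)
```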
```python
try:
    model = get_model(run_spec.adapter_spec.model)
except ValueError:
    # Models registered from configs cannot have expanders applied to them,
```
ValueError means that the model has not been loaded yet? I was a bit confused by the comment at first, maybe connect the dots a bit more.
It means the model has not been registered yet. I'll add more docs.
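A sketch of how the expanded comment might read (the wording is illustrative, not the PR's final text):

```python
try:
    model = get_model(run_spec.adapter_spec.model)
except ValueError:
    # get_model() raises ValueError when the model has not been registered yet.
    # Models registered from user configs may not be registered at this point,
    # which is why such models cannot have expanders applied to them.
    ...
```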
```python
@dataclass(frozen=True)
class TokenizerConfigs:
    tokenizers: List[TokenizerConfig]
```
Should this field be tokenizers or tokenizer_configs?
```python
from helm.common.object_spec import ObjectSpec


TOKENIZER_CONFIGS_FILE = "tokenizer_configs.yaml"
```
Should this file be tokenizer_configs.yaml or tokenizers.yaml?
Making this tokenizer_configs.yaml.
```python
name: str
"""Name of the tokenizer."""

tokenizer_spec: TokenizerSpec
```
This is tokenizer_spec instead of client_spec because I think that eventually Tokenizers and Clients should be separate classes...
```
find one with a key matching the missing parameter's name.
If found in constant_bindings, add the corresponding value to args.
If found in provider_bindings, call the corresponding value and add the return values to args.
```
Could you provide an example or two of usage?
Added example.
```python
)
return deployment_api_keys[model]

client_spec = inject_object_spec_args(
```
Can you write some comments on why this injection is needed? My initial impression is that it seems a bit complicated / fancy...
Added a passage.
Dependency injection is needed here for these reasons:
- Different clients have different parameters. Dependency injection provides arguments that match the parameters of the client.
- Some arguments, such as the tokenizer, are not static data objects that can be placed in the user's configuration file. Instead, they have to be constructed dynamically at runtime.
- The providers must be lazily evaluated, because eager evaluation can result in an exception. For instance, some clients do not require an API key, so eagerly trying to fetch the API key from configuration would raise an exception for users who have not configured one.
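To make the passage concrete, here is a hedged sketch of how the injection might be used. The client class name, its constructor parameters, and the lookup_api_key helper are illustrative assumptions; the constant_bindings / provider_bindings parameter names follow the docstring quoted earlier.

```python
from helm.common.object_spec import ObjectSpec, inject_object_spec_args


def lookup_api_key() -> str:
    # Stand-in for reading the key from the user's credentials file.
    # Because providers are lazily evaluated, this only runs if the client's
    # constructor actually declares an api_key parameter.
    return "dummy-key"


# Spec as it might look after parsing an entry from model_deployments.yaml.
client_spec = ObjectSpec(
    class_name="helm.proxy.clients.http_model_client.HTTPModelClient",
    args={"base_url": "http://localhost:8080"},
)

# Fill in constructor parameters that are not present in the user's config:
# constant_bindings supplies ready-made values, provider_bindings supplies
# zero-argument callables that are only invoked when a matching parameter exists.
client_spec = inject_object_spec_args(
    client_spec,
    constant_bindings={"tokenizer_name": "neurips/local"},
    provider_bindings={"api_key": lookup_api_key},
)
```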
@msaroufim @drisspg I will merge this soon to unblock other work, but feel free to leave post-merge comments and I'll make requested changes in a follow-up PR.
Hey, that makes sense. Let's say we wanted all models to use the HTTP tokenizer. I don't think we have had to set that up before. Does the workflow change?
I thought the description of this PR was really useful. Should we add this to a README somewhere so that it's easier for people to add their own model?
@drisspg The current documented workflow does not change. Basically, if people were using the existing workflow, they can keep using it. This new integration provides a new workflow with several additional benefits.
@JosselinSomervilleRoberts I'll move most of this to documentation when this is a little more baked and less experimental. I think that some of this API might still be subject to change.
This supports running a model for the NeurIPS Efficiency Challenge with a user-configurable model name, HTTP service URL, window service, and tokenizer.
For models that use a built-in WindowService:

prod_env/model_deployments.yaml:
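For concreteness, a hedged sketch of what an entry might look like; field names follow the ModelDeployment dataclass discussed above, while the top-level key, deployment name, client class path, and values are illustrative assumptions:

```yaml
model_deployments:
  - name: neurips/local
    max_sequence_length: 2048
    client_spec:
      class_name: "helm.proxy.clients.http_model_client.HTTPModelClient"
      args: {}
    window_service_spec:
      class_name: "helm.benchmark.window_services.llama_window_service.LlamaWindowService"
      args: {}
```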
The following models are supported, with corresponding class_name inside window_service_spec:
helm.benchmark.window_services.llama_window_service.LlamaWindowService
helm.benchmark.window_services.llama_window_service.Llama2WindowService
helm.benchmark.window_services.gptneox_window_service.GPTNeoXWindowService
helm.benchmark.window_services.opt_window_service.OPTWindowService
helm.benchmark.window_services.bloom_window_service.BloomWindowService
helm.benchmark.window_services.gpt2_window_service.GPT2WindowService
helm.benchmark.window_services.t511b_window_service.T511bWindowService
helm.benchmark.window_services.ul2_window_service.UL2WindowService
For models that use a Hugging Face AutoTokenizer:

prod_env/model_deployments.yaml:
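For concreteness, a hedged sketch of such an entry; the top-level key, deployment name, and client class path are illustrative assumptions, and tokenizer_name uses the supported tokenizer listed below:

```yaml
model_deployments:
  - name: neurips/local
    tokenizer_name: tiiuae/falcon-7b
    max_sequence_length: 2048
    client_spec:
      class_name: "helm.proxy.clients.http_model_client.HTTPModelClient"
      args: {}
```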
Change tokenizer_name and max_sequence_length accordingly. Refer to the Hugging Face model card for the correct values. If max_sequence_length is not set, it will be auto-inferred from Hugging Face Hub's AutoTokenizer, which can result in incorrect values because many Hugging Face Hub AutoTokenizers have incorrect metadata.

The following models are supported, with corresponding tokenizer_name:

tiiuae/falcon-7b
For any models not listed, you can fall back to the HTTP service tokenizer:
prod_env/model_deployments.yaml:
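A hedged sketch of such an entry; the top-level key, the neurips/local deployment and tokenizer names, and the client class path are illustrative assumptions:

```yaml
model_deployments:
  - name: neurips/local
    tokenizer_name: neurips/local
    max_sequence_length: 2048
    client_spec:
      class_name: "helm.proxy.clients.http_model_client.HTTPModelClient"
      args: {}
```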
prod_env/tokenizer_configs.yaml:
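And a hedged sketch of the matching tokenizer config; the top-level key follows the TokenizerConfigs.tokenizers field shown earlier in this PR, and the HTTPModelTokenizer class path is an illustrative assumption:

```yaml
tokenizers:
  - name: neurips/local
    tokenizer_spec:
      class_name: "helm.proxy.clients.http_model_client.HTTPModelTokenizer"
      args: {}
```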