Update README.md
Co-authored-by: Jingya HUANG <44135271+JingyaHuang@users.noreply.github.com>
philschmid and JingyaHuang authored May 8, 2024
1 parent a406f81 commit 8c9fc10
Showing 1 changed file with 1 addition and 1 deletion.
README.md
@@ -160,7 +160,7 @@ The custom module can override the following methods:
## 🏎️ Deploy Models on AWS Inferentia2

The SageMaker Hugging Face Inference Toolkit provides support for deploying Hugging Face models on AWS Inferentia2. To deploy a model on Inferentia2, you have 3 options (see the sketch after this list):
- * Provide an already compiled model with a `model.neuron` file as `HF_MODEL_ID`, e.g. `optimum/tiny_random_bert_neuron`
+ * Provide `HF_MODEL_ID`, the model repo id on huggingface.co that contains the compiled model in `.neuron` format, e.g. `optimum/bge-base-en-v1.5-neuronx`
* Provide the `HF_OPTIMUM_BATCH_SIZE` and `HF_OPTIMUM_SEQUENCE_LENGTH` environment variables to compile the model on the fly, e.g. `HF_OPTIMUM_BATCH_SIZE=1 HF_OPTIMUM_SEQUENCE_LENGTH=128`
* Include a `neuron` dictionary in the [config.json](https://huggingface.co/optimum/tiny_random_bert_neuron/blob/main/config.json) file in the model archive, e.g. `"neuron": {"static_batch_size": 1, "static_sequence_length": 128}`
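
Not part of the original README or this change, but as a rough illustration of the compile-on-the-fly option above: a minimal sketch of passing these environment variables when deploying with the SageMaker Python SDK. The model id, task, IAM role, container versions, and instance type below are placeholders/assumptions and must match a Neuron-compatible Hugging Face DLC available in your account and region.

```python
from sagemaker.huggingface import HuggingFaceModel

# Environment for on-the-fly Neuron compilation.
# Model id and task are illustrative placeholders.
env = {
    "HF_MODEL_ID": "distilbert-base-uncased-finetuned-sst-2-english",
    "HF_TASK": "text-classification",
    "HF_OPTIMUM_BATCH_SIZE": "1",
    "HF_OPTIMUM_SEQUENCE_LENGTH": "128",
}

# Versions below are assumptions; choose ones that resolve to an
# Inferentia2 (Neuronx) Hugging Face DLC, or pass image_uri explicitly.
model = HuggingFaceModel(
    env=env,
    role="arn:aws:iam::111122223333:role/SageMakerExecutionRole",  # placeholder role ARN
    transformers_version="4.36",
    pytorch_version="2.1",
    py_version="py310",
)

# Deploy to an Inferentia2 instance.
predictor = model.deploy(
    initial_instance_count=1,
    instance_type="ml.inf2.xlarge",
)

print(predictor.predict({"inputs": "I love using Inferentia2!"}))
```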

