Hi there,

I'm experimenting with the Dolly model and trying to deploy it in SageMaker. The deployment itself works fine, but I'm struggling to run inference: there's something going on with the data format I'm passing, but I cannot figure out what!
import json
import boto3
import sagemaker
from sagemaker.huggingface import HuggingFaceModel

# %% Deploy new model
role = sagemaker.get_execution_role()

hub = {"HF_MODEL_ID": "databricks/dolly-v2-12b", "HF_TASK": "text-generation"}

# Create Hugging Face Model Class
huggingface_model = HuggingFaceModel(
    transformers_version="4.17.0",
    pytorch_version="1.10.2",
    py_version="py38",
    env=hub,
    role=role,
)

# Deploy model to SageMaker Inference
predictor = huggingface_model.deploy(
    initial_instance_count=1,  # number of instances
    instance_type="ml.m5.xlarge",  # ec2 instance type
)

predictor.predict({"inputs": "Once upon a time there "})
results in:
ModelError: An error occurred (ModelError) when calling the InvokeEndpoint operation: Received client error (400) from primary with message "{
  "code": 400,
  "type": "InternalServerException",
  "message": "\u0027gpt_neox\u0027"
}
I've also tried passing JSON strings instead, but no luck either.
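For reference, this is roughly what I mean by passing JSON strings: a minimal sketch where I call the endpoint with the low-level boto3 client instead of predictor.predict() (the payload shape is just my guess, reusing the endpoint name from the predictor created above).

import json
import boto3

# Invoke the deployed endpoint directly with a JSON-encoded body
runtime = boto3.client("sagemaker-runtime")
response = runtime.invoke_endpoint(
    EndpointName=predictor.endpoint_name,  # endpoint created by .deploy() above
    ContentType="application/json",
    Body=json.dumps({"inputs": "Once upon a time there "}),
)
print(json.loads(response["Body"].read()))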
Any help appreciated!
Cheers.