[Enhancement]: one to one mapping with sagemaker jumpstart model creation #35011
Comments
Hey @Bryson14 👋 Thank you for taking the time to raise this! As a heads up, we consider adding additional arguments to existing resources to be an enhancement, so I've updated the labels with that in mind. |
Hi @Bryson14, are you able to provide the ECR image you used for sagemaker_mistral_public_image? Also, if you can provide a working example, either in the CLI or anywhere else, that would be great. Most models I have tried do not support managed instance scaling, so it's blocking me from writing a test case to enable this feature. I have used the example here - https://repost.aws/questions/QUODaQEyKNTbqWLYszAIYCIg/creating-jumpstart-sagemaker-endpoint-with-terraform-fails-with-model-needs-flash-attention
|
Isn't enabling the network isolation done in the SageMaker model and not the endpoint config? |
Yes. When a model is specified in the endpoint config, VPC/subnet details and network isolation cannot also be specified there; the two are mutually exclusive. The endpoint config inherits the VPC config and network isolation from the model definition. |
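A minimal sketch of that split, written as boto3 request payloads rather than Terraform. The helper names (build_model_request, build_endpoint_config_request) are hypothetical, and the image URI, S3 URI, and role ARN are placeholders, not values from this issue. It only illustrates that EnableNetworkIsolation and VpcConfig sit on the model, while the endpoint config just references the model by name.

```python
from typing import Any, Dict, List, Optional


def build_model_request(
    name: str,
    image_uri: str,
    model_s3_uri: str,
    role_arn: str,
    network_isolation: bool = True,
    subnets: Optional[List[str]] = None,
    security_group_ids: Optional[List[str]] = None,
) -> Dict[str, Any]:
    """Build keyword arguments for the SageMaker CreateModel call."""
    request: Dict[str, Any] = {
        "ModelName": name,
        "ExecutionRoleArn": role_arn,
        "PrimaryContainer": {"Image": image_uri, "ModelDataUrl": model_s3_uri},
        # Network isolation is a property of the model.
        "EnableNetworkIsolation": network_isolation,
    }
    # VPC settings are also declared on the model when both lists are given.
    if subnets and security_group_ids:
        request["VpcConfig"] = {
            "Subnets": subnets,
            "SecurityGroupIds": security_group_ids,
        }
    return request


def build_endpoint_config_request(
    name: str, model_name: str, instance_type: str
) -> Dict[str, Any]:
    """Build keyword arguments for CreateEndpointConfig that reference a model."""
    return {
        "EndpointConfigName": name,
        "ProductionVariants": [
            {
                "VariantName": "AllTraffic",
                "ModelName": model_name,
                "InstanceType": instance_type,
                "InitialInstanceCount": 1,
            }
        ],
        # No network isolation or VPC keys here: they come from the model.
    }


# Placeholder values for illustration only.
model_req = build_model_request(
    name="example-model",
    image_uri="<ecr-image-uri>",
    model_s3_uri="s3://<bucket>/<prefix>/model.tar.gz",
    role_arn="arn:aws:iam::<account-id>:role/<role-name>",
)
config_req = build_endpoint_config_request(
    name="example-config",
    model_name=model_req["ModelName"],
    instance_type="ml.g5.2xlarge",
)
```

With real values, these dicts could be passed as boto3.client("sagemaker").create_model(**model_req) and create_endpoint_config(**config_req).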
Not sure if it helps, but at least it may help someone who stumbles upon this issue later. To answer the point raised above, that "the values are these jumpstart images and s3 locations are not published": I was able to retrieve these programmatically like this (VS Code Jupyter notebook script formatting):
# %%
from sagemaker.jumpstart.notebook_utils import list_jumpstart_models
from sagemaker import image_uris, model_uris
# %%
region = "us-west-2" # Your region.
instance_type = "ml.g5.2xlarge" # Your desired instance type. Note image will be different for gpu vs cpu instances.
# %%
# find model_id for a given search string
[m for m in list_jumpstart_models(region=region) if "mistral" in m]
# %%
model_id = "huggingface-llm-mistral-7b-instruct"
# %%
# find latest version of model_id
[m for m in list_jumpstart_models(filter=f"model_id=={model_id}", list_versions=True, region=region)]
# %%
model_version = "3.1.0"
# %%
image_uris.retrieve(framework=None, instance_type=instance_type, image_scope="inference", model_id=model_id, model_version=model_version, region=region)
# %%
model_uris.retrieve(instance_type=instance_type, model_scope="inference", model_id=model_id, model_version=model_version, region=region) |
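As a small add-on to the script above, the latest version could be picked automatically instead of being hard-coded. This is only a sketch: latest_version is a hypothetical helper, and it assumes that list_jumpstart_models(..., list_versions=True) yields (model_id, version) pairs and that versions are purely numeric dotted strings such as "3.1.0". The sample pairs are made up for illustration.

```python
from typing import Iterable, Tuple


def latest_version(pairs: Iterable[Tuple[str, str]], model_id: str) -> str:
    """Return the highest dotted-numeric version listed for model_id."""
    versions = [version for mid, version in pairs if mid == model_id]
    if not versions:
        raise ValueError(f"no versions found for {model_id!r}")
    # Compare numerically per component so that "2.10.0" sorts above "2.9.0".
    return max(versions, key=lambda v: tuple(int(part) for part in v.split(".")))


# Illustrative input only; real pairs would come from list_jumpstart_models.
sample_pairs = [
    ("huggingface-llm-mistral-7b-instruct", "2.9.0"),
    ("huggingface-llm-mistral-7b-instruct", "3.1.0"),
    ("huggingface-llm-mistral-7b-instruct", "2.10.0"),
    ("huggingface-llm-mistral-7b", "4.0.0"),
]
picked = latest_version(sample_pairs, "huggingface-llm-mistral-7b-instruct")
```

The result could then be passed as model_version to image_uris.retrieve and model_uris.retrieve.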
Warning This issue has been closed, meaning that any additional comments are hard for our team to see. Please assume that the maintainers will not see them. Ongoing conversations amongst community members are welcome, however, the issue will be locked after 30 days. Moving conversations to another venue, such as the AWS Provider forum, is recommended. If you have additional concerns, please open a new issue, referencing this one where needed. |
This functionality has been released in v5.67.0 of the Terraform AWS Provider. Please see the Terraform documentation on provider versioning or reach out if you need any assistance upgrading. For further feature requests or bug reports with this functionality, please create a new GitHub issue following the template. Thank you! |
I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues. |
Terraform Core Version
1.6.5
AWS Provider Version
5.31.0
Affected Resource(s)
SageMaker endpoint config.
Expected Behavior
When creating a JumpStart endpoint through SageMaker Studio, you can create an LLM (like Mistral) on a managed endpoint. There are a few hacks you have to do to get this to work with Terraform, because the values for these JumpStart images and S3 locations are not published. But by deploying a model on Studio, then using the aws cli to get the model's primary_container.environment and model_data_source, Terraform can copy it.
The issue is that aws_sagemaker_endpoint_configuration cannot support the configuration that SageMaker Studio creates by default.
Here is the described endpoint configuration made by Studio:
Actual Behavior
With Terraform, it is not possible to specify ManagedInstanceScaling.
It is also not possible to specify NetworkIsolation.
This is the endpoint configuration created by Terraform
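A hedged sketch of how the two endpoint configurations could be compared once each has been fetched as a dict. The helper names are hypothetical, and the two sample dicts are illustrative stand-ins, not the actual describe output from this issue. fetch_endpoint_config uses boto3's describe_endpoint_config and needs AWS credentials; the diff helper itself runs offline.

```python
from typing import Any, Dict, List


def fetch_endpoint_config(name: str) -> Dict[str, Any]:
    """Fetch an endpoint config by name (requires boto3 and AWS credentials)."""
    import boto3  # imported lazily so the diff helper below works offline

    return boto3.client("sagemaker").describe_endpoint_config(
        EndpointConfigName=name
    )


def missing_variant_keys(
    studio: Dict[str, Any], terraform: Dict[str, Any]
) -> Dict[str, List[str]]:
    """Map variant name to keys set on the Studio variant but not on Terraform's."""
    result: Dict[str, List[str]] = {}
    pairs = zip(studio["ProductionVariants"], terraform["ProductionVariants"])
    for studio_variant, terraform_variant in pairs:
        missing = sorted(set(studio_variant) - set(terraform_variant))
        if missing:
            result[studio_variant["VariantName"]] = missing
    return result


# Illustrative configs only; real ones would come from fetch_endpoint_config.
studio_cfg = {
    "ProductionVariants": [
        {
            "VariantName": "AllTraffic",
            "ModelName": "example-model",
            "InstanceType": "ml.g5.2xlarge",
            "InitialInstanceCount": 1,
            "ManagedInstanceScaling": {
                "Status": "ENABLED",
                "MinInstanceCount": 1,
                "MaxInstanceCount": 1,
            },
        }
    ]
}
terraform_cfg = {
    "ProductionVariants": [
        {
            "VariantName": "AllTraffic",
            "ModelName": "example-model",
            "InstanceType": "ml.g5.2xlarge",
            "InitialInstanceCount": 1,
        }
    ]
}
difference = missing_variant_keys(studio_cfg, terraform_cfg)
```

Here difference lists, per variant, the settings that only the Studio-created configuration carries.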
Relevant Error/Panic Output Snippet
No response
Terraform Configuration Files
Steps to Reproduce
Run the standard terraform init, plan, and apply, then compare the endpoint configurations deployed by Terraform and by the SageMaker Studio UI.
Debug Output
No response
Panic Output
No response
Important Factoids
No response
References
No response
Would you like to implement a fix?
None