
Issue: How to configure Amazon SageMaker endpoint #10184

Closed
shenshaoyong opened this issue Sep 4, 2023 · 3 comments
Labels
Ɑ: models Related to LLMs or chat model modules 🤖:question A specific question about the codebase, product, project, or how to use a feature

Comments

@shenshaoyong

Issue you'd like to raise.

#1. Open a terminal and install the Bedrock-specific builds of the boto3 SDK, plus langchain
curl -sS https://d2eo22ngex1n9g.cloudfront.net/Documentation/SDK/bedrock-python-sdk.zip > sdk.zip
sudo yum install unzip -y
unzip sdk.zip -d sdk
pip install --no-build-isolation --force-reinstall ./sdk/awscli-*-py3-none-any.whl ./sdk/boto3-*-py3-none-any.whl ./sdk/botocore-*-py3-none-any.whl
pip install --quiet langchain==0.0.249
#pip install 'jupyter-ai>=1.0,<2.0'  # If you use JupyterLab 3
#pip install jupyter-ai              # If you use JupyterLab 4
#2. Change the default max token count to 1024
vi ~/anaconda3/lib/python3.11/site-packages/langchain/llms/sagemaker_endpoint.py
Insert the lines below after body = self.content_handler.transform_input(prompt, _model_kwargs):
parameters = {"max_new_tokens": 1024, "top_p": 0.9, "temperature": 0.6, "return_full_text": True}
t = json.loads(body)
t["parameters"] = parameters
body = json.dumps(t)
Insert the line CustomAttributes='accept_eula=true', between Accept=accepts, and **_endpoint_kwargs,
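To see what the patched snippet does to the request before committing to editing library source, the body transformation can be sketched and run standalone (the sample body simulates what `content_handler.transform_input` would produce):

```python
import json

def inject_parameters(body: str) -> str:
    """Mimic the patch above: merge generation parameters into the request body."""
    parameters = {
        "max_new_tokens": 1024,
        "top_p": 0.9,
        "temperature": 0.6,
        "return_full_text": True,
    }
    t = json.loads(body)
    t["parameters"] = parameters
    return json.dumps(t)

# Example: a body as produced by content_handler.transform_input
body = json.dumps({"inputs": "Tell me a joke"})
patched = inject_parameters(body)
print(patched)
```

The endpoint then receives the original payload plus a top-level "parameters" object, which is the shape the Llama 2 JumpStart containers expect.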
#3. Configure the AWS default profile; make sure the access key/secret key pair has sufficient permissions (e.g. AmazonSageMakerFullAccess)
aws configure
#4. Run %%ai in a *.ipynb file on EC2 instead of a SageMaker notebook instance / SageMaker Studio (it can also run in VS Code), after making sure your Amazon SageMaker endpoint is healthy
%load_ext jupyter_ai
%%ai sagemaker-endpoint:jumpstart-dft-meta-textgeneration-llama-2-7b --region-name=us-east-1 --request-schema={"inputs":""} --response-path=[0]['generation']
write something on Humor
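The `--response-path=[0]['generation']` argument tells jupyter-ai where to find the generated text inside the endpoint's JSON reply. A minimal illustration of that extraction, using an invented payload shaped like a Llama 2 JumpStart response:

```python
import json

# Invented sample of what the endpoint might return
raw = '[{"generation": "Humor is the spark that keeps us human."}]'
response = json.loads(raw)

# Equivalent of --response-path=[0]['generation']:
# first element of the list, then the "generation" field
text = response[0]["generation"]
print(text)
```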

Suggestion:

No response

@dosubot dosubot bot added Ɑ: models Related to LLMs or chat model modules 🤖:question A specific question about the codebase, product, project, or how to use a feature labels Sep 4, 2023
@dosubot

dosubot bot commented Sep 4, 2023

🤖

Hello,

Thank you for reaching out. It seems like you're trying to configure an Amazon SageMaker endpoint using the LangChain framework. Here's how you can do it:

  1. Install the necessary packages: langchain and boto3 using pip.
!pip3 install langchain boto3
  2. Import the necessary classes and functions.
from typing import Dict
from langchain import SagemakerEndpoint
from langchain.llms.sagemaker_endpoint import LLMContentHandler
from langchain.chains.question_answering import load_qa_chain
import json
  3. Define the ContentHandler class which will handle the transformation of input and output for the language model.
class ContentHandler(LLMContentHandler):
    content_type = "application/json"
    accepts = "application/json"

    def transform_input(self, prompt: str, model_kwargs: Dict) -> bytes:
        input_str = json.dumps({"prompt": prompt, **model_kwargs})
        return input_str.encode("utf-8")

    def transform_output(self, output: bytes) -> str:
        response_json = json.loads(output.read().decode("utf-8"))
        return response_json[0]["generated_text"]
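The handler's two transformations can be sanity-checked locally without a deployed endpoint. A sketch with standalone stand-ins for the two methods, simulating the streaming response body with `io.BytesIO` (the output payload is invented):

```python
import io
import json

# Stand-ins for the handler methods above, runnable without langchain
def transform_input(prompt: str, model_kwargs: dict) -> bytes:
    return json.dumps({"prompt": prompt, **model_kwargs}).encode("utf-8")

def transform_output(output) -> str:
    response_json = json.loads(output.read().decode("utf-8"))
    return response_json[0]["generated_text"]

body = transform_input("What is LangChain?", {"temperature": 0.5})
fake_response = io.BytesIO(b'[{"generated_text": "LangChain is a framework..."}]')
print(transform_output(fake_response))
```

Note that the exact key expected in the request ("prompt" vs. "inputs") depends on the model container; check your endpoint's input schema.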
  4. Create an instance of SagemakerEndpoint with the necessary parameters.
content_handler = ContentHandler()

chain = load_qa_chain(
    llm=SagemakerEndpoint(
        endpoint_name="endpoint-name",
        credentials_profile_name="credentials-profile-name",
        region_name="us-west-2",
        model_kwargs={"temperature": 1e-10},
        content_handler=content_handler,
    ),
    prompt=PROMPT,  # PROMPT is a previously defined PromptTemplate
)

In the SagemakerEndpoint call, you need to provide the following parameters:

  • endpoint_name: The name of the endpoint from the deployed Sagemaker model. Must be unique within an AWS Region.
  • credentials_profile_name: The name of the profile in the ~/.aws/credentials or ~/.aws/config files, which has either access keys or role information specified. If not specified, the default credential profile or, if on an EC2 instance, credentials from IMDS will be used.
  • region_name: The AWS region where your SageMaker endpoint is deployed.
  • model_kwargs: A dictionary of additional arguments to pass to the model.
  • content_handler: An instance of a class that handles the transformation of input and output for the language model.
  5. Use the chain function to process your documents and question.
chain({"input_documents": docs, "question": query}, return_only_outputs=True)

In this example, docs is a list of Document objects and query is the question you want to ask.

Please note that the SagemakerEndpoint class in the LangChain framework is designed to interact with a deployed Sagemaker model. It is used to authenticate with AWS, send requests to the Sagemaker endpoint, and handle the responses.

I hope this helps! If you have any other questions, feel free to ask.

Best,
Dosu

This response is meant to be useful and save you time. It is not meant to be a precise solution, but rather a starting point for your own research.


@darshisen-wbd

When I create a boto3 client using credential profile, it works fine:

import boto3

session = boto3.Session(profile_name='my_llm_profile')

# Create a low-level client representing Amazon SageMaker Runtime
sagemaker_runtime = session.client(
    "sagemaker-runtime", region_name='us-east-1')

# The endpoint name must be unique within 
# an AWS Region in your AWS account. 
endpoint_name='my-llm-ept'

# Gets inference from the model hosted at the specified endpoint:
response = sagemaker_runtime.invoke_endpoint(
    EndpointName=endpoint_name,
    Body=bytes('{"inputs": ["what is the meaning of life?"]}', 'utf-8')
    )

I get the expected output:
{"generated_text": "\nThe meaning of life is to live a life that is meaningful to you.\nWhat is the meaning of life? What is the meaning of"}
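The `Body` field of the boto3 response is a streaming object that must be read and decoded before parsing. The decoding step can be sketched with a stubbed response whose payload mirrors the output shown above:

```python
import io
import json

# Stub of the invoke_endpoint response; real responses carry a
# botocore StreamingBody in the "Body" field, which reads the same way
stub_response = {
    "Body": io.BytesIO(b'{"generated_text": "\\nThe meaning of life is..."}')
}

payload = json.loads(stub_response["Body"].read().decode("utf-8"))
print(payload["generated_text"])
```

Note that a StreamingBody can only be read once; store the result if you need it more than once.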

But when I use the same profile through langchain Sagemaker Endpoint, it gives me an error:

content_handler = ContentHandler()

llm=SagemakerEndpoint(
        endpoint_name="my-llm-ept",
        credentials_profile_name="my_llm_profile",
        region_name="us-east-1",
        model_kwargs={"temperature": 1e-10},
        content_handler=content_handler,
    )

This is the error I get:

ValueError: Error raised by inference endpoint: An error occurred (AccessDeniedException) when calling the InvokeEndpoint operation: User: arn:aws:sts::<accountid>:assumed-role/AWSReservedSSO_<my role>/<my user> is not authorized to perform: sagemaker:InvokeEndpoint on resource: arn:aws:sagemaker:us-east-1:<accountid>:endpoint/my-llm-ept because no identity-based policy allows the sagemaker:InvokeEndpoint action
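The error above means the caller's IAM identity has no identity-based policy granting `sagemaker:InvokeEndpoint`. A minimal policy document that would grant it, built as a Python dict so it can be inspected before attaching (the account ID and endpoint ARN are placeholders to substitute):

```python
import json

# Placeholder account ID and endpoint name; substitute your own values
policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": "sagemaker:InvokeEndpoint",
            "Resource": "arn:aws:sagemaker:us-east-1:123456789012:endpoint/my-llm-ept",
        }
    ],
}
print(json.dumps(policy, indent=2))
```

Attach the resulting JSON to the assumed role (here, the AWS SSO permission set behind `AWSReservedSSO_<my role>`); the plain boto3 call worked because it ran under a different profile whose identity already had this permission.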

baskaryan pushed a commit that referenced this issue Sep 19, 2023
…t scenarios (#10728)

- **Description: Allow to inject boto3 client for Cross account access
type of scenarios in using Sagemaker Endpoint **
  - **Issue:#10634 #10184** 
  - **Dependencies: None** 
  - **Tag maintainer:** 
  - **Twitter handle:lethargicoder**

Co-authored-by: Vikram(VS) <vssht@amazon.com>
baskaryan pushed a commit that referenced this issue Oct 23, 2023
…ntEmbeddings (#12146)

**Description: Allow to inject boto3 client for Cross account access
type of scenarios in using SagemakerEndpointEmbeddings and also updated
the documentation for same in the sample notebook**

**Issue:SagemakerEndpointEmbeddings cross account capability #10634
#10184**

Dependencies: None
Tag maintainer:
Twitter handle:lethargicoder

Co-authored-by: Vikram(VS) <vssht@amazon.com>

dosubot bot commented Dec 18, 2023

Hi, @shenshaoyong! I'm Dosu, and I'm helping the LangChain team manage their backlog. I'm marking this issue as stale.

It looks like the issue is about configuring an Amazon SageMaker endpoint using the LangChain framework. You provided detailed steps for installation, configuration, and running the endpoint on an EC2 instance. Dosubot also provided a detailed response with code examples, and another user, darshisen-wbd, shared their experience with encountering an error when using the same profile through LangChain's Sagemaker Endpoint.

Could you please confirm if this issue is still relevant to the latest version of the LangChain repository? If it is, please let the LangChain team know by commenting on the issue. Otherwise, feel free to close the issue yourself, or it will be automatically closed in 7 days. Thank you!

@dosubot dosubot bot added the stale Issue has not had recent activity or appears to be solved. Stale issues will be automatically closed label Dec 18, 2023
@dosubot dosubot bot closed this as not planned Won't fix, can't repro, duplicate, stale Dec 25, 2023
@dosubot dosubot bot removed the stale Issue has not had recent activity or appears to be solved. Stale issues will be automatically closed label Dec 25, 2023
hoanq1811 pushed a commit to hoanq1811/langchain that referenced this issue Feb 2, 2024 (langchain-ai#12146; same change as the Oct 23, 2023 commit above)