
Issue: How to configure Amazon SageMaker endpoint #10184

Closed
shenshaoyong opened this issue Sep 4, 2023 · 3 comments
Labels
Ɑ: models Related to LLMs or chat model modules 🤖:question A specific question about the codebase, product, project, or how to use a feature

Comments

@shenshaoyong

Issue you'd like to raise.

#1. Open a terminal and install the Bedrock-specific builds of the boto3 SDK, plus langchain
curl -sS https://d2eo22ngex1n9g.cloudfront.net/Documentation/SDK/bedrock-python-sdk.zip > sdk.zip
sudo yum install unzip -y
unzip sdk.zip -d sdk
pip install --no-build-isolation --force-reinstall ./sdk/awscli-*-py3-none-any.whl ./sdk/boto3-*-py3-none-any.whl ./sdk/botocore-*-py3-none-any.whl
pip install --quiet langchain==0.0.249
#pip install 'jupyter-ai>=1.0,<2.0'  # If you use JupyterLab 3
#pip install jupyter-ai              # If you use JupyterLab 4
#2. Change the default max token count to 1024
vi ~/anaconda3/lib/python3.11/site-packages/langchain/llms/sagemaker_endpoint.py
Insert the lines below after body = self.content_handler.transform_input(prompt, _model_kwargs):
parameters = {"max_new_tokens": 1024, "top_p": 0.9, "temperature": 0.6, "return_full_text": True}
t = json.loads(body)
t["parameters"] = parameters
body = json.dumps(t)
Insert the line CustomAttributes='accept_eula=true', between Accept=accepts, and **_endpoint_kwargs,
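To see what the patched snippet does to the request before committing to editing library source, the body transformation can be sketched and run standalone (the sample body simulates what `content_handler.transform_input` would produce):

```python
import json

def inject_parameters(body: str) -> str:
    """Mimic the patch above: merge generation parameters into the request body."""
    parameters = {
        "max_new_tokens": 1024,
        "top_p": 0.9,
        "temperature": 0.6,
        "return_full_text": True,
    }
    t = json.loads(body)
    t["parameters"] = parameters
    return json.dumps(t)

# Example: a body as produced by content_handler.transform_input
body = json.dumps({"inputs": "Tell me a joke"})
patched = inject_parameters(body)
print(patched)
```

The endpoint then receives the original payload plus a top-level "parameters" object, which is the shape the Llama 2 JumpStart containers expect.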
#3. Configure the AWS default profile; make sure the access key/secret key pair has sufficient permissions (e.g. AmazonSageMakerFullAccess)
aws configure
#4. Run %%ai in a *.ipynb file on EC2 instead of a SageMaker notebook instance / SageMaker Studio (it can also run in VS Code), after making sure your Amazon SageMaker endpoint is healthy
%load_ext jupyter_ai
%%ai sagemaker-endpoint:jumpstart-dft-meta-textgeneration-llama-2-7b --region-name=us-east-1 --request-schema={"inputs":""} --response-path=[0]['generation']
write something on Humor
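The `--response-path=[0]['generation']` argument tells jupyter-ai where to find the generated text inside the endpoint's JSON reply. A minimal illustration of that extraction, using an invented payload shaped like a Llama 2 JumpStart response:

```python
import json

# Invented sample of what the endpoint might return
raw = '[{"generation": "Humor is the spark that keeps us human."}]'
response = json.loads(raw)

# Equivalent of --response-path=[0]['generation']:
# first element of the list, then the "generation" field
text = response[0]["generation"]
print(text)
```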

Suggestion:

No response

@dosubot dosubot bot added Ɑ: models Related to LLMs or chat model modules 🤖:question A specific question about the codebase, product, project, or how to use a feature labels Sep 4, 2023
@dosubot

dosubot bot commented Sep 4, 2023

🤖

Hello,

Thank you for reaching out. It seems like you're trying to configure an Amazon SageMaker endpoint using the LangChain framework. Here's how you can do it:

  1. Install the necessary packages: langchain and boto3 using pip.
!pip3 install langchain boto3
  2. Import the necessary classes and functions.
from typing import Dict
from langchain import SagemakerEndpoint
from langchain.llms.sagemaker_endpoint import LLMContentHandler
from langchain.chains.question_answering import load_qa_chain
import json
  3. Define the ContentHandler class which will handle the transformation of input and output for the language model.
class ContentHandler(LLMContentHandler):
    content_type = "application/json"
    accepts = "application/json"

    def transform_input(self, prompt: str, model_kwargs: Dict) -> bytes:
        input_str = json.dumps({"prompt": prompt, **model_kwargs})
        return input_str.encode("utf-8")

    def transform_output(self, output: bytes) -> str:
        response_json = json.loads(output.read().decode("utf-8"))
        return response_json[0]["generated_text"]
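The handler's two transformations can be sanity-checked locally without a deployed endpoint. A sketch with standalone stand-ins for the two methods, simulating the streaming response body with `io.BytesIO` (the output payload is invented):

```python
import io
import json

# Stand-ins for the handler methods above, runnable without langchain
def transform_input(prompt: str, model_kwargs: dict) -> bytes:
    return json.dumps({"prompt": prompt, **model_kwargs}).encode("utf-8")

def transform_output(output) -> str:
    response_json = json.loads(output.read().decode("utf-8"))
    return response_json[0]["generated_text"]

body = transform_input("What is LangChain?", {"temperature": 0.5})
fake_response = io.BytesIO(b'[{"generated_text": "LangChain is a framework..."}]')
print(transform_output(fake_response))
```

Note that the exact key expected in the request ("prompt" vs. "inputs") depends on the model container; check your endpoint's input schema.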
  4. Create an instance of SagemakerEndpoint with the necessary parameters.
content_handler = ContentHandler()

chain = load_qa_chain(
    llm=SagemakerEndpoint(
        endpoint_name="endpoint-name",
        credentials_profile_name="credentials-profile-name",
        region_name="us-west-2",
        model_kwargs={"temperature": 1e-10},
        content_handler=content_handler,
    ),
    prompt=PROMPT,  # PROMPT is a previously defined PromptTemplate
)

In the SagemakerEndpoint call, you need to provide the following parameters:

  • endpoint_name: The name of the endpoint from the deployed Sagemaker model. Must be unique within an AWS Region.
  • credentials_profile_name: The name of the profile in the ~/.aws/credentials or ~/.aws/config files, which has either access keys or role information specified. If not specified, the default credential profile or, if on an EC2 instance, credentials from IMDS will be used.
  • region_name: The AWS region where your SageMaker endpoint is deployed.
  • model_kwargs: A dictionary of additional arguments to pass to the model.
  • content_handler: An instance of a class that handles the transformation of input and output for the language model.
  5. Use the chain function to process your documents and question.
chain({"input_documents": docs, "question": query}, return_only_outputs=True)

In this example, docs is a list of Document objects and query is the question you want to ask.

Please note that the SagemakerEndpoint class in the LangChain framework is designed to interact with a deployed Sagemaker model. It is used to authenticate with AWS, send requests to the Sagemaker endpoint, and handle the responses.

I hope this helps! If you have any other questions, feel free to ask.

Best,
Dosu

This response is meant to be useful and save you time. It is not meant to be a precise solution, but rather a starting point for your own research.


@darshisen-wbd

When I create a boto3 client using credential profile, it works fine:

import boto3

session = boto3.Session(profile_name='my_llm_profile')

# Create a low-level client representing Amazon SageMaker Runtime
sagemaker_runtime = session.client(
    "sagemaker-runtime", region_name='us-east-1')

# The endpoint name must be unique within 
# an AWS Region in your AWS account. 
endpoint_name='my-llm-ept'

# Gets inference from the model hosted at the specified endpoint:
response = sagemaker_runtime.invoke_endpoint(
    EndpointName=endpoint_name,
    Body=bytes('{"inputs": ["what is the meaning of life?"]}', 'utf-8')
    )

I get the expected output:
{"generated_text": "\nThe meaning of life is to live a life that is meaningful to you.\nWhat is the meaning of life? What is the meaning of"}
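The `Body` field of the boto3 response is a streaming object that must be read and decoded before parsing. The decoding step can be sketched with a stubbed response whose payload mirrors the output shown above:

```python
import io
import json

# Stub of the invoke_endpoint response; real responses carry a
# botocore StreamingBody in the "Body" field, which reads the same way
stub_response = {
    "Body": io.BytesIO(b'{"generated_text": "\\nThe meaning of life is..."}')
}

payload = json.loads(stub_response["Body"].read().decode("utf-8"))
print(payload["generated_text"])
```

Note that a StreamingBody can only be read once; store the result if you need it more than once.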

But when I use the same profile through langchain Sagemaker Endpoint, it gives me an error:

content_handler = ContentHandler()

llm=SagemakerEndpoint(
        endpoint_name="my-llm-ept",
        credentials_profile_name="my_llm_profile",
        region_name="us-east-1",
        model_kwargs={"temperature": 1e-10},
        content_handler=content_handler,
    )

This is the error I get:

ValueError: Error raised by inference endpoint: An error occurred (AccessDeniedException) when calling the InvokeEndpoint operation: User: arn:aws:sts::<accountid>:assumed-role/AWSReservedSSO_<my role>/<my user> is not authorized to perform: sagemaker:InvokeEndpoint on resource: arn:aws:sagemaker:us-east-1:<accountid>:endpoint/my-llm-ept because no identity-based policy allows the sagemaker:InvokeEndpoint action
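The error above means the caller's IAM identity has no identity-based policy granting `sagemaker:InvokeEndpoint`. A minimal policy document that would grant it, built as a Python dict so it can be inspected before attaching (the account ID and endpoint ARN are placeholders to substitute):

```python
import json

# Placeholder account ID and endpoint name; substitute your own values
policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": "sagemaker:InvokeEndpoint",
            "Resource": "arn:aws:sagemaker:us-east-1:123456789012:endpoint/my-llm-ept",
        }
    ],
}
print(json.dumps(policy, indent=2))
```

Attach the resulting JSON to the assumed role (here, the AWS SSO permission set behind `AWSReservedSSO_<my role>`); the plain boto3 call worked because it ran under a different profile whose identity already had this permission.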

baskaryan pushed a commit that referenced this issue Sep 19, 2023
…t scenarios (#10728)

- **Description: Allow to inject boto3 client for Cross account access
type of scenarios in using Sagemaker Endpoint **
  - **Issue:#10634 #10184** 
  - **Dependencies: None** 
  - **Tag maintainer:** 
  - **Twitter handle:lethargicoder**

Co-authored-by: Vikram(VS) <vssht@amazon.com>
baskaryan pushed a commit that referenced this issue Oct 23, 2023
…ntEmbeddings (#12146)

**Description: Allow to inject boto3 client for Cross account access
type of scenarios in using SagemakerEndpointEmbeddings and also updated
the documentation for same in the sample notebook**

**Issue:SagemakerEndpointEmbeddings cross account capability #10634
#10184**

Dependencies: None
Tag maintainer:
Twitter handle:lethargicoder

Co-authored-by: Vikram(VS) <vssht@amazon.com>

dosubot bot commented Dec 18, 2023

Hi, @shenshaoyong! I'm Dosu, and I'm helping the LangChain team manage their backlog. I'm marking this issue as stale.

It looks like the issue is about configuring an Amazon SageMaker endpoint using the LangChain framework. You provided detailed steps for installation, configuration, and running the endpoint on an EC2 instance. Dosubot also provided a detailed response with code examples, and another user, darshisen-wbd, shared their experience with encountering an error when using the same profile through LangChain's Sagemaker Endpoint.

Could you please confirm if this issue is still relevant to the latest version of the LangChain repository? If it is, please let the LangChain team know by commenting on the issue. Otherwise, feel free to close the issue yourself, or it will be automatically closed in 7 days. Thank you!

@dosubot dosubot bot added the stale Issue has not had recent activity or appears to be solved. Stale issues will be automatically closed label Dec 18, 2023
@dosubot dosubot bot closed this as not planned Won't fix, can't repro, duplicate, stale Dec 25, 2023
@dosubot dosubot bot removed the stale Issue has not had recent activity or appears to be solved. Stale issues will be automatically closed label Dec 25, 2023
hoanq1811 pushed a commit to hoanq1811/langchain that referenced this issue Feb 2, 2024 (langchain-ai#12146; same change as the Oct 23, 2023 commit above)