@@ -11,8 +11,10 @@
"This module contains:\n",
"1. [Overview](#1-Overview)\n",
"2. [Pre-requisites](#2-Pre-requisites)\n",
"3. [How to leverage maximum number of results](#3-how-to-leverage-the-maximum-number-of-results-feature)\n",
"4. [How to use custom prompting](#4-how-to-use-the-custom-prompting-feature)"
"3. [Understanding RetrieveAndGenerate API](#understanding-retrieveandgenerate-api)\n",
"4. [Sreaming response using RetrieveAndGenerate API](#streaming-response-with-retrieveandgenerate-api)\n",
"5. [Adjust 'maximum number of results' retrieval parameter](#3-how-to-leverage-the-maximum-number-of-results-feature)\n",
"6. [How to use custom prompting](#4-how-to-use-the-custom-prompting-feature)"
]
},
{
@@ -107,6 +109,7 @@
"import json\n",
"import boto3\n",
"import pprint\n",
"import sys\n",
"from botocore.exceptions import ClientError\n",
"from botocore.client import Config\n",
"\n",
@@ -134,8 +137,8 @@
},
"outputs": [],
"source": [
"%store -r kb_id\n",
"# kb_id = \"<<knowledge_base_id>>\" # Replace with your knowledge base id here."
"# %store -r kb_id\n",
"kb_id = \"<<knowledge_base_id>>\" # Replace with your knowledge base id here."
]
},
{
@@ -159,7 +162,7 @@
{
"cell_type": "code",
"execution_count": null,
"id": "ca915234",
"id": "bf0243f5",
"metadata": {},
"outputs": [],
"source": [
@@ -174,9 +177,17 @@
"$search_results$\n",
"\n",
"$output_format_instructions$\n",
"\"\"\"\n",
"\n",
"def retrieve_and_generate(query, kb_id, model_arn, max_results, prompt_template = default_prompt):\n",
"\"\"\""
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "ca915234",
"metadata": {},
"outputs": [],
"source": [
"def retrieve_and_generate(query, kb_id, model_arn, max_results=5, prompt_template = default_prompt):\n",
" response = bedrock_agent_client.retrieve_and_generate(\n",
" input={\n",
" 'text': query\n",
@@ -202,24 +213,10 @@
" return response\n"
]
},
{
"cell_type": "markdown",
"id": "a58b7808",
"metadata": {},
"source": [
"### How to leverage the maximum number of results feature\n",
"\n",
"In some use cases; the FM responses might be lacking enough context to provide relevant answers or relying that it couldn't find the requested info. Which could be fixed by modifying the maximum number of retrieved results.\n",
"\n",
"In the following example, we are going to run the following query with a few number of results (5):\n",
"\\\n",
"```Provide a list of risks for Octank financial in bulleted points.```\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "e2918161",
"id": "ccd657e6",
"metadata": {},
"outputs": [],
"source": [
@@ -241,6 +238,104 @@
" pprint.pp(contexts)\n"
]
},
{
"cell_type": "markdown",
"id": "5f1d6784",
"metadata": {},
"source": [
"### Test RetrieveAndGenerate API"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "dbefffdd",
"metadata": {},
"outputs": [],
"source": [
"query = \"\"\"Provide a list of risks for Octank financial in numbered list without description.\"\"\"\n",
"\n",
"results = retrieve_and_generate(query = query, kb_id = kb_id, model_arn = model_arn)\n",
"\n",
"print_generation_results(results)"
]
},
{
"cell_type": "markdown",
"id": "f6d8439e",
"metadata": {},
"source": [
"### Streaming response with RetrieveAndGenerate API\n",
"\n",
"Using new [streaming API](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_agent-runtime_RetrieveAndGenerateStream.html) customers can use `retrieve_and_generate_stream` API from Amazon Bedrock Knowledge Bases to receive the response as it is being generated by the Foundation Model (FM), rather than waiting for the complete response. This will help customers to reduce the time to first token in case of latency sensitive applications."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "86a3a94a",
"metadata": {},
"outputs": [],
"source": [
"def retrieve_and_generate_stream(query, kb_id, model_arn, max_results=5, prompt_template = default_prompt):\n",
" response = bedrock_agent_client.retrieve_and_generate_stream(\n",
" input={\n",
" 'text': query\n",
" },\n",
" retrieveAndGenerateConfiguration={\n",
" 'type': 'KNOWLEDGE_BASE',\n",
" 'knowledgeBaseConfiguration': {\n",
" 'knowledgeBaseId': kb_id,\n",
" 'modelArn': model_arn, \n",
" 'retrievalConfiguration': {\n",
" 'vectorSearchConfiguration': {\n",
" 'numberOfResults': max_results # will fetch top N documents which closely match the query\n",
" }\n",
" },\n",
" 'generationConfiguration': {\n",
" 'promptTemplate': {\n",
" 'textPromptTemplate': prompt_template\n",
" }\n",
" }\n",
" }\n",
" }\n",
" )\n",
"\n",
" for event in response['stream']:\n",
" if 'output' in event:\n",
" chunk = event['output']\n",
" sys.stdout.write(chunk['text'])\n",
" sys.stdout.flush()\n",
"\n",
" \n"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "a55d95ce",
"metadata": {},
"outputs": [],
"source": [
"query = \"\"\"Provide a list of risks for Octank financial in numbered list without description.\"\"\"\n",
"\n",
"retrieve_and_generate_stream(query = query, kb_id = kb_id, model_arn = model_arn)"
]
},
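The stream-consuming loop above can also be instrumented to measure the time-to-first-token benefit mentioned earlier. A minimal sketch, assuming only the event shape used in this notebook (`{'output': {'text': ...}}`) and a simulated stream in place of a live `response['stream']`:

```python
import sys
import time

def consume_stream(events):
    """Print streamed text as it arrives; return (full_text, seconds_to_first_token).

    `events` is any iterable yielding dicts shaped like RetrieveAndGenerateStream
    response events, i.e. {'output': {'text': '...'}} for text chunks.
    """
    start = time.monotonic()
    ttft = None
    parts = []
    for event in events:
        if 'output' in event:
            if ttft is None:
                ttft = time.monotonic() - start  # latency until the first chunk arrived
            chunk = event['output']['text']
            parts.append(chunk)
            sys.stdout.write(chunk)
            sys.stdout.flush()
    return ''.join(parts), ttft

# Simulated stream standing in for response['stream'] from the real API call.
fake_stream = [{'output': {'text': '1. Market risk\n'}},
               {'output': {'text': '2. Credit risk\n'}}]
text, ttft = consume_stream(fake_stream)
```

With a live call, pass `response['stream']` instead of `fake_stream`; `ttft` then reflects genuine model latency rather than iteration overhead.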
{
"cell_type": "markdown",
"id": "a58b7808",
"metadata": {},
"source": [
"### Adjust 'maximum number of results' retrieval parameter\n",
"\n",
"In some use cases; the FM responses might be lacking enough context to provide relevant answers or relying that it couldn't find the requested info. Which could be fixed by modifying the maximum number of retrieved results.\n",
"\n",
"In the following example, we are going to run the following query with a few number of results (3):\n",
"\\\n",
"```Provide a list of risks for Octank financial in bulleted points.```\n"
]
},
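Under the hood, this parameter is the `numberOfResults` field inside `vectorSearchConfiguration`, exactly as set by the `retrieve_and_generate` helper defined earlier. A minimal sketch of just that request fragment (no API call is made; the helper name is ours):

```python
def build_retrieval_config(max_results):
    """Build the retrievalConfiguration fragment used by retrieve_and_generate,
    fetching the top `max_results` documents that most closely match the query."""
    return {
        'vectorSearchConfiguration': {
            'numberOfResults': max_results
        }
    }

# A small value (3) may starve the model of context; a larger one (10)
# gives it more passages to ground its answer in.
few = build_retrieval_config(3)
many = build_retrieval_config(10)
```

The two configs can then be passed through `max_results` in the helper to compare how the answer changes with more retrieved context.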
{
"cell_type": "code",
"execution_count": null,
@@ -990,9 +1085,9 @@
],
"instance_type": "ml.t3.medium",
"kernelspec": {
"display_name": "Python 3 (Data Science 3.0)",
"display_name": "Python 3",
"language": "python",
"name": "python3__SAGEMAKER_INTERNAL__arn:aws:sagemaker:us-west-2:236514542706:image/sagemaker-data-science-310-v1"
"name": "python3"
},
"language_info": {
"codemirror_mode": {
@@ -1004,7 +1099,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.6"
"version": "3.10.13"
}
},
"nbformat": 4,