Add basic information extraction example #159

Merged
merged 4 commits into from
May 30, 2024
318 changes: 318 additions & 0 deletions examples/prompting/Basic_Information_Extraction.ipynb
@@ -0,0 +1,318 @@
{
"cells": [
{
"cell_type": "markdown",
"metadata": {
"id": "p3A9q4LNh1bL"
},
"source": [
"##### Copyright 2024 Google LLC."
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {
"cellView": "form",
"id": "KGxPOhGBh2Xy"
},
"outputs": [],
"source": [
"# @title Licensed under the Apache License, Version 2.0 (the \"License\");\n",
"# you may not use this file except in compliance with the License.\n",
"# You may obtain a copy of the License at\n",
"#\n",
"# https://www.apache.org/licenses/LICENSE-2.0\n",
"#\n",
"# Unless required by applicable law or agreed to in writing, software\n",
"# distributed under the License is distributed on an \"AS IS\" BASIS,\n",
"# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n",
"# See the License for the specific language governing permissions and\n",
"# limitations under the License."
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "sP8PQnz1QrcF"
},
"source": [
"# Gemini API: Basic information extraction"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "bxGr_x3MRA0z"
},
"source": [
"<table class=\"tfo-notebook-buttons\" align=\"left\">\n",
" <td>\n",
" <a target=\"_blank\" href=\"https://colab.research.google.com/github/google-gemini/cookbook/blob/main/examples/prompting/Basic_Information_Extraction.ipynb\"><img src = \"https://www.tensorflow.org/images/colab_logo_32px.png\"/>Run in Google Colab</a>\n",
" </td>\n",
"</table>"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "ysy--KfNRrCq"
},
"source": [
"This example notebook shows how to use the Gemini API's Python SDK to extract information from a block of text and return it in a defined structure.\n",
"\n",
"In this notebook, the LLM is given a recipe and asked to extract all the ingredients to create a shopping list. Complex tasks are usually handled better when divided into separate steps, so the work is split in two:\n",
"\n",
"1. First, the model will extract all the groceries into a list.\n",
"\n",
"2. Then, you will prompt it to convert this list into a shopping list.\n",
"\n",
"You can find more tips for writing prompts [here](https://ai.google.dev/gemini-api/docs/prompting-intro).\n"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {
"id": "Ne-3gnXqR0hI"
},
"outputs": [],
"source": [
"!pip install -U -q google-generativeai"
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {
"id": "EconMHePQHGw"
},
"outputs": [],
"source": [
"import google.generativeai as genai\n",
"\n",
"from IPython.display import Markdown"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "eomJzCa6lb90"
},
"source": [
"## Configure your API key\n",
"\n",
"To run the following cell, your API key must be stored in a Colab Secret named `GOOGLE_API_KEY`. If you don't already have an API key, or you're not sure how to create a Colab Secret, see [Authentication](https://github.com/google-gemini/cookbook/blob/main/quickstarts/Authentication.ipynb) for an example."
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {
"id": "v-JZzORUpVR2"
},
"outputs": [],
"source": [
"from google.colab import userdata\n",
"GOOGLE_API_KEY=userdata.get('GOOGLE_API_KEY')\n",
"\n",
"genai.configure(api_key=GOOGLE_API_KEY)"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "L-Wt23A_uzFZ"
},
"source": [
"## Example\n",
"\n",
"First, extract all the groceries. To do this, set the system instructions when defining the model."
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {
"id": "x-Mf5-Vsw2Ft"
},
"outputs": [],
"source": [
"groceries_system_prompt = \"\"\"\n",
"Your task is to extract all the groceries and their quantities from the provided recipe into a list.\n",
"Make sure that groceries are in the order of appearance.\n",
"\"\"\"\n",
"grocery_extraction_model = genai.GenerativeModel(model_name='gemini-1.5-flash-latest',\n",
" system_instruction=groceries_system_prompt)"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "4YJRWBbviSeC"
},
"source": [
"Next, define the recipe and pass it into `generate_content`. The output shows the list of groceries extracted from the input."
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {
"id": "yebFPUvcxDdZ"
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"- 3 garlic cloves\n",
"- knob of fresh ginger\n",
"- 3 spring onions\n",
"- 2 tbsp clear honey\n",
"- 1 orange\n",
"- 1 tbsp light soy sauce\n",
"- 2 tbsp vegetable oil\n",
"- 4 small chicken breast fillets\n",
"- 20 button mushrooms\n",
"- 20 cherry tomatoes\n",
"- 2 large red peppers\n",
"- 20 wooden skewers \n",
"\n"
]
}
],
"source": [
"recipe = \"\"\"\n",
"Step 1:\n",
"Grind 3 garlic cloves, knob of fresh ginger, roughly chopped, 3 spring onions to a paste in a food processor.\n",
"Add 2 tbsp of clear honey, juice from one orange, 1 tbsp of light soy sauce and 2 tbsp of vegetable oil, then blend again.\n",
"Pour the mixture over the cubed chicken from 4 small breast fillets and leave to marinate for at least 1hr.\n",
"Toss in the 20 button mushrooms for the last half an hour so they take on some of the flavour, too.\n",
"\n",
"Step 2:\n",
"Thread the chicken, 20 cherry tomatoes, mushrooms and 2 large red peppers onto 20 wooden skewers,\n",
"then cook on a griddle pan for 7-8 mins each side or until the chicken is thoroughly cooked and golden brown.\n",
"Turn the kebabs frequently and baste with the marinade from time to time until evenly cooked.\n",
"Arrange on a platter, and eat with your fingers.\n",
"\"\"\"\n",
"\n",
"grocery_list = grocery_extraction_model.generate_content(recipe)\n",
"print(grocery_list.text)"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "w0IH1dd3jSes"
},
"source": [
"The next step is to further format the shopping list based on the ingredients extracted."
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {
"id": "sU0pld4QQqOe"
},
"outputs": [],
"source": [
"shopping_list_system_prompt = \"\"\"\n",
"You are given a list of groceries. Complete the following:\n",
"- Organize groceries into categories for easier shopping.\n",
"- List each item one under another with a checkbox [].\n",
"\"\"\"\n",
"\n",
"shopping_list_model = genai.GenerativeModel(model_name='gemini-1.5-flash-latest',\n",
" system_instruction=shopping_list_system_prompt)"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "Ea84Nf2rkWX9"
},
"source": [
"Now that you have defined the instructions, you can also decide how you want to format your grocery list. Include a couple of examples in the prompt (few-shot prompting) so the model understands how to format your grocery list."
]
},
{
"cell_type": "code",
"execution_count": 8,
"metadata": {
"id": "3QSf7m5QxmC-"
},
"outputs": [
{
"data": {
"text/markdown": [
"## PRODUCE\n",
"- [ ] 3 garlic cloves\n",
"- [ ] knob of fresh ginger\n",
"- [ ] 3 spring onions\n",
"- [ ] 1 orange\n",
"- [ ] 20 button mushrooms\n",
"- [ ] 20 cherry tomatoes\n",
"- [ ] 2 large red peppers\n",
"\n",
"## PANTRY\n",
"- [ ] 2 tbsp clear honey\n",
"- [ ] 1 tbsp light soy sauce\n",
"- [ ] 2 tbsp vegetable oil\n",
"\n",
"## MEAT\n",
"- [ ] 4 small chicken breast fillets\n",
"\n",
"## OTHER\n",
"- [ ] 20 wooden skewers \n"
],
"text/plain": [
"<IPython.core.display.Markdown object>"
]
},
"execution_count": 8,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"shopping_list_prompt = f\"\"\"\n",
"LIST: 3 tomatoes, 1 turkey, 4 tomatoes\n",
"OUTPUT:\n",
"## VEGETABLES\n",
"- [ ] 7 tomatoes\n",
"## MEAT\n",
"- [ ] 1 turkey\n",
"\n",
"LIST: {grocery_list.text}\n",
"OUTPUT:\n",
"\"\"\"\n",
"Markdown(shopping_list_model.generate_content(shopping_list_prompt).text)"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "PhttRO0TD9mN"
},
"source": [
"## Next steps\n",
"\n",
"Be sure to explore other examples of prompting in the repository. Try creating your own prompts for information extraction or adapting the ones provided in the notebook."
]
}
],
"metadata": {
"colab": {
"name": "Basic_Information_Extraction.ipynb",
"toc_visible": true
},
"kernelspec": {
"display_name": "Python 3",
"name": "python3"
}
},
"nbformat": 4,
"nbformat_minor": 0
}
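
The two-step flow in this notebook (extract the groceries, then format them with a few-shot prompt) could be wrapped up roughly as follows. This is only a sketch, not part of the merged notebook: `build_shopping_list_prompt` and `recipe_to_shopping_list` are hypothetical helper names, and the two model calls are passed in as plain callables so the logic can be tried without an API key. The few-shot example text is copied from the notebook's `shopping_list_prompt`.

```python
from typing import Callable

# Few-shot example copied from the notebook's shopping_list_prompt.
FEW_SHOT_EXAMPLE = """LIST: 3 tomatoes, 1 turkey, 4 tomatoes
OUTPUT:
## VEGETABLES
- [ ] 7 tomatoes
## MEAT
- [ ] 1 turkey
"""


def build_shopping_list_prompt(grocery_text: str) -> str:
    """Append the extracted groceries to the few-shot example (hypothetical helper)."""
    return f"{FEW_SHOT_EXAMPLE}\nLIST: {grocery_text.strip()}\nOUTPUT:\n"


def recipe_to_shopping_list(
    recipe: str,
    extract: Callable[[str], str],
    format_list: Callable[[str], str],
) -> str:
    """Run the two steps in order: extract the groceries, then format them.

    `extract` and `format_list` are assumptions of this sketch: any callables
    that take a prompt string and return the model's text response.
    """
    grocery_text = extract(recipe)
    return format_list(build_shopping_list_prompt(grocery_text))
```

With the models defined in the notebook, the callables would presumably be `lambda r: grocery_extraction_model.generate_content(r).text` and `lambda p: shopping_list_model.generate_content(p).text`. Keeping them as parameters makes it easy to swap in stubs when checking the prompt construction.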