From 57c72085e39f70c9866692c0c58e2957d5f4d31d Mon Sep 17 00:00:00 2001
From: Giom-V
Date: Wed, 18 Dec 2024 15:05:56 +0100
Subject: [PATCH 1/4] Adding links to the video analysis notebooks

---
 examples/README.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/examples/README.md b/examples/README.md
index c70265c13..ed7b25d29 100644
--- a/examples/README.md
+++ b/examples/README.md
@@ -5,7 +5,7 @@
 This is a collection of fun examples for the Gemini API.

 * [Agents and Automatic Function Calling](https://github.com/google-gemini/cookbook/blob/main/examples/Agents_Function_Calling_Barista_Bot.ipynb): Create an agent (Barrista-bot) to take your coffee order.
-* [Classify and Analyze a Video](https://github.com/google-gemini/cookbook/blob/main/examples/Analyze_a_Video_Classification.ipynb): This notebook uses multimodal capabilities of the Gemini model to classify the species of animals shown in a video.
+* Video Analysis: Three notebooks using multimodal capabilities of the Gemini model to [classify the species of animals](./Analyze_a_Video_Classification.ipynb) for a video, [summarize one](./Analyze_a_Video_Summarization.ipynb) or [recognizing when it happened](./Analyze_a_Video_Historic_Event_Recognition.ipynb),
 * [Anomaly Detection](https://github.com/google-gemini/cookbook/blob/main/examples/Anomaly_detection_with_embeddings.ipynb): Use embeddings to detect anomalies in your datasets.
 * [Analyze a Video with Summarization](https://github.com/google-gemini/cookbook/blob/main/examples/Analyze_a_Video_Summarization.ipynb): This notebook shows how you can use Gemini API's multimodal capabilities for video summarization.
 * [Apollo 11 - long context example](https://github.com/google-gemini/cookbook/blob/main/examples/Apollo_11.ipynb): Search a 400 page transcript from Apollo 11.
From bbf0a0c698dcdf76e6e84f036f0159b24425db33 Mon Sep 17 00:00:00 2001
From: Giom-V
Date: Wed, 18 Dec 2024 15:07:01 +0100
Subject: [PATCH 2/4] Adding prompts so the notebook works.

---
 examples/Analyze_a_Video_Historic_Event_Recognition.ipynb | 2 +-
 examples/Analyze_a_Video_Summarization.ipynb | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/examples/Analyze_a_Video_Historic_Event_Recognition.ipynb b/examples/Analyze_a_Video_Historic_Event_Recognition.ipynb
index f42266242..20fdef42d 100644
--- a/examples/Analyze_a_Video_Historic_Event_Recognition.ipynb
+++ b/examples/Analyze_a_Video_Historic_Event_Recognition.ipynb
@@ -268,7 +268,7 @@
 "source": [
 "model = genai.GenerativeModel(model_name=\"models/gemini-1.5-flash\", safety_settings=safety_settings,\n",
 " system_instruction=system_prompt)\n",
- "response = model.generate_content([video_file])\n",
+ "response = model.generate_content([\"Analyze that video please\",video_file])\n",
 "print(response.text)"
 ]
 },
diff --git a/examples/Analyze_a_Video_Summarization.ipynb b/examples/Analyze_a_Video_Summarization.ipynb
index 603944372..a33cfebe9 100644
--- a/examples/Analyze_a_Video_Summarization.ipynb
+++ b/examples/Analyze_a_Video_Summarization.ipynb
@@ -269,7 +269,7 @@
 ],
 "source": [
 "model = genai.GenerativeModel(model_name=\"models/gemini-1.5-flash\", system_instruction=system_prompt)\n",
- "response = model.generate_content([video_file])\n",
+ "response = model.generate_content([\"Summarise that video please.\",video_file])\n",
 "print(response.text)"
 ]
 },
From 1e5699623895aea787731d58c5ac6f97b66d72e9 Mon Sep 17 00:00:00 2001
From: Giom-V
Date: Wed, 18 Dec 2024 15:07:08 +0100
Subject: [PATCH 3/4] Relative links

---
 examples/README.md | 46 +++++++++++++++++++++++-----------------------
 1 file changed, 23 insertions(+), 23 deletions(-)

diff --git a/examples/README.md b/examples/README.md
index ed7b25d29..1c52219be 100644
--- a/examples/README.md
+++ b/examples/README.md
@@ -4,25 +4,25 @@

 This is a collection of fun examples for the Gemini API.

-* [Agents and Automatic Function Calling](https://github.com/google-gemini/cookbook/blob/main/examples/Agents_Function_Calling_Barista_Bot.ipynb): Create an agent (Barrista-bot) to take your coffee order.
+* [Agents and Automatic Function Calling](./Agents_Function_Calling_Barista_Bot.ipynb): Create an agent (Barrista-bot) to take your coffee order.
 * Video Analysis: Three notebooks using multimodal capabilities of the Gemini model to [classify the species of animals](./Analyze_a_Video_Classification.ipynb) for a video, [summarize one](./Analyze_a_Video_Summarization.ipynb) or [recognizing when it happened](./Analyze_a_Video_Historic_Event_Recognition.ipynb),
-* [Anomaly Detection](https://github.com/google-gemini/cookbook/blob/main/examples/Anomaly_detection_with_embeddings.ipynb): Use embeddings to detect anomalies in your datasets.
-* [Analyze a Video with Summarization](https://github.com/google-gemini/cookbook/blob/main/examples/Analyze_a_Video_Summarization.ipynb): This notebook shows how you can use Gemini API's multimodal capabilities for video summarization.
-* [Apollo 11 - long context example](https://github.com/google-gemini/cookbook/blob/main/examples/Apollo_11.ipynb): Search a 400 page transcript from Apollo 11.
-* [Clasify text with emeddings](https://github.com/google-gemini/cookbook/blob/main/examples/Classify_text_with_embeddings.ipynb): Use embeddings from the Gemini API with Keras.
-* [Guess the shape](https://github.com/google-gemini/cookbook/blob/main/examples/Guess_the_shape.ipynb): A simple example of using images in prompts.
-* [Market a Jet Backpack](https://github.com/google-gemini/cookbook/blob/main/examples/Market_a_Jet_Backpack.ipynb): Create a marketing campaign from a product sketch.
-* [Object detection](https://github.com/google-gemini/cookbook/blob/main/examples/Object_detection.ipynb): Extensive examples with object detection, including with multiple classes, OCR, visual question answering, and even an interactive demo.
-* [Opossum search](https://github.com/google-gemini/cookbook/blob/main/examples/Opossum_search.ipynb): Code generation with the Gemini API. Just for fun, you'll prompt the model to create a web app called "Opossum Search" that searches Google with "opossum" appended to the query.
-* [Search Wikipedia with ReAct](https://github.com/google-gemini/cookbook/blob/main/examples/Search_Wikipedia_using_ReAct.ipynb): Use ReAct prompting with Gemini 1.5 Flash to search Wikipedia interactively.
-* [Search Re-ranking with Embeddings](https://github.com/google-gemini/cookbook/blob/main/examples/Search_reranking_using_embeddings.ipynb): Use embeddings to re-rank search results.
-* [Story writing with prompt chaining.ipynb](https://github.com/google-gemini/cookbook/blob/main/examples/Story_Writing_with_Prompt_Chaining.ipynb): Write a story using two powerful tools: prompt chaining and iterative generation.
-* [Talk to documents](https://github.com/google-gemini/cookbook/blob/main/examples/Talk_to_documents_with_embeddings.ipynb): This is a basic intro to Retrieval Augmented Generation (RAG). Use embeddings to search through a custom database.
-* [Upload files to Colab](https://github.com/google-gemini/cookbook/blob/main/examples/Upload_files_to_Colab.ipynb): This is a helper notebook that shows how to upload files from your local computer to Colab. Note: to upload files to the Gemini API (text, code, images, audio, video), check out the [Files quickstart](https://github.com/google-gemini/cookbook/blob/main/quickstarts/File_API.ipynb).
-* [Voice Memos](https://github.com/google-gemini/cookbook/blob/main/examples/Voice_memos.ipynb): You'll use the Gemini API to help you generate ideas for your next blog post, based on voice memos you recorded on your phone, and previous articles you've written.
-* [Translate a public domain](https://github.com/google-gemini/cookbook/blob/main/examples/Translate_a_Public_Domain_Book.ipynb): In this notebook, you will explore Gemini model as a translation tool, demonstrating how to prepare data, create effective prompts, and save results into a `.txt` file.
-* [Working with Charts, Graphs, and Slide Decks](https://github.com/google-gemini/cookbook/blob/main/examples/Working_with_Charts_Graphs_and_Slide_Decks.ipynb): Gemini models are powerful multimodal LLMs that can process both text and image inputs. This notebook shows how Gemini 1.5 Flash model is capable of extracting data from various images.
-* [Entity extraction](https://github.com/google-gemini/cookbook/blob/main/examples/Entity_Extraction.ipynb): Use Gemini API to speed up some of your tasks, such as searching through text to extract needed information. Entity extraction with a Gemini model is a simple query, and you can ask it to retrieve its answer in the form that you prefer.
+* [Anomaly Detection](./Anomaly_detection_with_embeddings.ipynb): Use embeddings to detect anomalies in your datasets.
+* [Analyze a Video with Summarization](./Analyze_a_Video_Summarization.ipynb): This notebook shows how you can use Gemini API's multimodal capabilities for video summarization.
+* [Apollo 11 - long context example](./Apollo_11.ipynb): Search a 400 page transcript from Apollo 11.
+* [Clasify text with emeddings](./Classify_text_with_embeddings.ipynb): Use embeddings from the Gemini API with Keras.
+* [Guess the shape](./Guess_the_shape.ipynb): A simple example of using images in prompts.
+* [Market a Jet Backpack](./Market_a_Jet_Backpack.ipynb): Create a marketing campaign from a product sketch.
+* [Object detection](./Object_detection.ipynb): Extensive examples with object detection, including with multiple classes, OCR, visual question answering, and even an interactive demo.
+* [Opossum search](./Opossum_search.ipynb): Code generation with the Gemini API. Just for fun, you'll prompt the model to create a web app called "Opossum Search" that searches Google with "opossum" appended to the query.
+* [Search Wikipedia with ReAct](./Search_Wikipedia_using_ReAct.ipynb): Use ReAct prompting with Gemini 1.5 Flash to search Wikipedia interactively.
+* [Search Re-ranking with Embeddings](./Search_reranking_using_embeddings.ipynb): Use embeddings to re-rank search results.
+* [Story writing with prompt chaining.ipynb](./Story_Writing_with_Prompt_Chaining.ipynb): Write a story using two powerful tools: prompt chaining and iterative generation.
+* [Talk to documents](./Talk_to_documents_with_embeddings.ipynb): This is a basic intro to Retrieval Augmented Generation (RAG). Use embeddings to search through a custom database.
+* [Upload files to Colab](./Upload_files_to_Colab.ipynb): This is a helper notebook that shows how to upload files from your local computer to Colab. Note: to upload files to the Gemini API (text, code, images, audio, video), check out the [Files quickstart](https://github.com/google-gemini/cookbook/blob/main/quickstarts/File_API.ipynb).
+* [Voice Memos](./Voice_memos.ipynb): You'll use the Gemini API to help you generate ideas for your next blog post, based on voice memos you recorded on your phone, and previous articles you've written.
+* [Translate a public domain](./Translate_a_Public_Domain_Book.ipynb): In this notebook, you will explore Gemini model as a translation tool, demonstrating how to prepare data, create effective prompts, and save results into a `.txt` file.
+* [Working with Charts, Graphs, and Slide Decks](./Working_with_Charts_Graphs_and_Slide_Decks.ipynb): Gemini models are powerful multimodal LLMs that can process both text and image inputs. This notebook shows how Gemini 1.5 Flash model is capable of extracting data from various images.
+* [Entity extraction](./Entity_Extraction.ipynb): Use Gemini API to speed up some of your tasks, such as searching through text to extract needed information. Entity extraction with a Gemini model is a simple query, and you can ask it to retrieve its answer in the form that you prefer.

 ### Integrations

@@ -30,9 +30,9 @@ This is a collection of fun examples for the Gemini API.

 ### Folders

-* [Prompting examples](https://github.com/google-gemini/cookbook/tree/main/examples/prompting): A directory with examples of various prompting techniques.
-* [JSON Capabilities](https://github.com/google-gemini/cookbook/tree/main/examples/json-capabilities): A directory with guides containing different types of tasks you can do with JSON schemas.
-* [Automate Google Workspace tasks with the Gemini API](https://github.com/google-gemini/cookbook/tree/main/examples/Apps_script_and_Workspace_codelab): This codelabs shows you how to connect to the Gemini API using Apps Script, and uses the function calling, vision and text capabilities to automate Google Workspace tasks - summarizing a document, analyzing a chart, sending an email and generating some slides directly. All of this is done from a free text input.
-* [Langchain examples](https://github.com/google-gemini/cookbook/tree/main/examples/langchain): A directory with multiple examples using Gemini with Langchain.
+* [Prompting examples](./prompting): A directory with examples of various prompting techniques.
+* [JSON Capabilities](./json-capabilities): A directory with guides containing different types of tasks you can do with JSON schemas.
+* [Automate Google Workspace tasks with the Gemini API](./Apps_script_and_Workspace_codelab): This codelabs shows you how to connect to the Gemini API using Apps Script, and uses the function calling, vision and text capabilities to automate Google Workspace tasks - summarizing a document, analyzing a chart, sending an email and generating some slides directly. All of this is done from a free text input.
+* [Langchain examples](./langchain): A directory with multiple examples using Gemini with Langchain.

-There are even more examples in the [quickstarts](https://github.com/google-gemini/cookbook/tree/main/quickstarts) folder and in the [Awesome Gemini page](../Awesome_gemini.md).
\ No newline at end of file
+There are even more examples in the [quickstarts](../quickstarts) folder and in the [Awesome Gemini page](../Awesome_gemini.md).
\ No newline at end of file
From c5eeea842b7aed5f92f2f45b8c0ce0594b6981cd Mon Sep 17 00:00:00 2001
From: Giom-V
Date: Wed, 18 Dec 2024 15:19:13 +0100
Subject: [PATCH 4/4] Removing duplicates

---
 examples/README.md | 20 --------------------
 1 file changed, 20 deletions(-)

diff --git a/examples/README.md b/examples/README.md
index 514e7f79f..9bf51af08 100644
--- a/examples/README.md
+++ b/examples/README.md
@@ -4,26 +4,6 @@

 This is a collection of fun examples for the Gemini API.

-* [Agents and Automatic Function Calling](https://github.com/google-gemini/cookbook/blob/main/examples/Agents_Function_Calling_Barista_Bot.ipynb): Create an agent (Barrista-bot) to take your coffee order.
-* [Classify and Analyze a Video](https://github.com/google-gemini/cookbook/blob/main/examples/Analyze_a_Video_Classification.ipynb): This notebook uses multimodal capabilities of the Gemini model to classify the species of animals shown in a video.
-* [Anomaly Detection](https://github.com/google-gemini/cookbook/blob/main/examples/Anomaly_detection_with_embeddings.ipynb): Use embeddings to detect anomalies in your datasets.
-* [Analyze a Video with Summarization](https://github.com/google-gemini/cookbook/blob/main/examples/Analyze_a_Video_Summarization.ipynb): This notebook shows how you can use Gemini API's multimodal capabilities for video summarization.
-* [Apollo 11 - long context example](https://github.com/google-gemini/cookbook/blob/main/examples/Apollo_11.ipynb): Search a 400 page transcript from Apollo 11.
-* [Clasify text with emeddings](https://github.com/google-gemini/cookbook/blob/main/examples/Classify_text_with_embeddings.ipynb): Use embeddings from the Gemini API with Keras.
-* [Guess the shape](https://github.com/google-gemini/cookbook/blob/main/examples/Guess_the_shape.ipynb): A simple example of using images in prompts.
-* [Market a Jet Backpack](https://github.com/google-gemini/cookbook/blob/main/examples/Market_a_Jet_Backpack.ipynb): Create a marketing campaign from a product sketch.
-* [Object detection](https://github.com/google-gemini/cookbook/blob/main/examples/Object_detection.ipynb): Extensive examples with object detection, including with multiple classes, OCR, visual question answering, and even an interactive demo.
-* [Opossum search](https://github.com/google-gemini/cookbook/blob/main/examples/Opossum_search.ipynb): Code generation with the Gemini API. Just for fun, you'll prompt the model to create a web app called "Opossum Search" that searches Google with "opossum" appended to the query.
-* [Search Wikipedia with ReAct](https://github.com/google-gemini/cookbook/blob/main/examples/Search_Wikipedia_using_ReAct.ipynb): Use ReAct prompting with Gemini 1.5 Flash to search Wikipedia interactively.
-* [Search Re-ranking with Embeddings](https://github.com/google-gemini/cookbook/blob/main/examples/Search_reranking_using_embeddings.ipynb): Use embeddings to re-rank search results.
-* [Story writing with prompt chaining.ipynb](https://github.com/google-gemini/cookbook/blob/main/examples/Story_Writing_with_Prompt_Chaining.ipynb): Write a story using two powerful tools: prompt chaining and iterative generation.
-* [Talk to documents](https://github.com/google-gemini/cookbook/blob/main/examples/Talk_to_documents_with_embeddings.ipynb): This is a basic intro to Retrieval Augmented Generation (RAG). Use embeddings to search through a custom database.
-* [Upload files to Colab](https://github.com/google-gemini/cookbook/blob/main/examples/Upload_files_to_Colab.ipynb): This is a helper notebook that shows how to upload files from your local computer to Colab. Note: to upload files to the Gemini API (text, code, images, audio, video), check out the [Files quickstart](https://github.com/google-gemini/cookbook/blob/main/quickstarts/File_API.ipynb).
-* [Voice Memos](https://github.com/google-gemini/cookbook/blob/main/examples/Voice_memos.ipynb): You'll use the Gemini API to help you generate ideas for your next blog post, based on voice memos you recorded on your phone, and previous articles you've written.
-* [Translate a public domain](https://github.com/google-gemini/cookbook/blob/main/examples/Translate_a_Public_Domain_Book.ipynb): In this notebook, you will explore Gemini model as a translation tool, demonstrating how to prepare data, create effective prompts, and save results into a `.txt` file.
-* [Working with Charts, Graphs, and Slide Decks](https://github.com/google-gemini/cookbook/blob/main/examples/Working_with_Charts_Graphs_and_Slide_Decks.ipynb): Gemini models are powerful multimodal LLMs that can process both text and image inputs. This notebook shows how Gemini 1.5 Flash model is capable of extracting data from various images.
-* [Entity extraction](https://github.com/google-gemini/cookbook/blob/main/examples/Entity_Extraction.ipynb): Use Gemini API to speed up some of your tasks, such as searching through text to extract needed information. Entity extraction with a Gemini model is a simple query, and you can ask it to retrieve its answer in the form that you prefer.
-* [Generate a company research report using search grounding](https://github.com/google-gemini/cookbook/blob/main/examples/search_grounding_for_research_report.ipynb): Use search grounding to write a company research report with Gemini 1.5 Flash.
 * [Agents and Automatic Function Calling](./Agents_Function_Calling_Barista_Bot.ipynb): Create an agent (Barrista-bot) to take your coffee order.
 * Video Analysis: Three notebooks using multimodal capabilities of the Gemini model to [classify the species of animals](./Analyze_a_Video_Classification.ipynb) for a video, [summarize one](./Analyze_a_Video_Summarization.ipynb) or [recognizing when it happened](./Analyze_a_Video_Historic_Event_Recognition.ipynb),
 * [Anomaly Detection](./Anomaly_detection_with_embeddings.ipynb): Use embeddings to detect anomalies in your datasets.