This example demonstrates how to use the Atomic Agents framework to analyze images with text, specifically focusing on extracting structured information from nutrition labels using GPT-4 Vision capabilities.
- Image Analysis: Process nutrition label images using GPT-4 Vision
- Structured Data Extraction: Convert visual information into structured Pydantic models
- Multi-Image Processing: Analyze multiple nutrition labels simultaneously
- Comprehensive Nutritional Data: Extract detailed nutritional information including:
- Basic nutritional facts (calories, fats, proteins, etc.)
- Serving size information
- Vitamin and mineral content
- Product details
-
Clone the main Atomic Agents repository:
git clone https://github.com/BrainBlend-AI/atomic-agents
-
Navigate to the basic-multimodal directory:
cd atomic-agents/atomic-examples/basic-multimodal
-
Install dependencies using Poetry:
poetry install
-
Set up environment variables: Create a
.env
file in thebasic-multimodal
directory with the following content:OPENAI_API_KEY=your_openai_api_key
Replace
your_openai_api_key
with your actual OpenAI API key. -
Run the example:
poetry run python basic_multimodal/main.py
Defines the structure for storing nutrition information, including:
- Macronutrients (fats, proteins, carbohydrates)
- Micronutrients (vitamins and minerals)
- Serving information
- Product details
NutritionAnalysisInput
: Handles input images and analysis instructionsNutritionAnalysisOutput
: Structures the extracted nutrition information
A specialized agent configured with:
- GPT-4 Vision capabilities
- Custom system prompts for nutrition label analysis
- Structured data validation
The example includes test images in the test_images
directory:
nutrition_label_1.png
: Example nutrition label imagenutrition_label_2.jpg
: Another example nutrition label image
Running the example will:
- Load the test images
- Process them through the nutrition analyzer
- Display structured nutritional information for each label
You can modify the example by:
- Adding your own nutrition label images to the
test_images
directory - Adjusting the
NutritionLabel
schema to capture additional information - Modifying the system prompt to focus on specific aspects of nutrition labels
Contributions are welcome! Please fork the repository and submit a pull request with your enhancements or bug fixes.
This project is licensed under the MIT License. See the LICENSE file for details.