This project is a proof of concept that explores how OpenAI's function calling can be integrated with external APIs to automate operations, connections, and implementations through natural language. It is an interactive application designed to generate and edit images using APIs such as Ideogram and OpenAI. It operates on the context provided by the user, enabling dynamic decision-making for image generation and editing. Additionally, it remembers previously generated or user-provided images, ensuring a seamless and intuitive workflow.
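As a rough illustration of the function-calling pattern this proof of concept relies on, the sketch below registers a hypothetical generate_image tool with the OpenAI chat API. The tool name, parameter schema, and model name are illustrative assumptions, not the project's actual definitions.

```python
# Minimal sketch: expose an image-generation tool to the model via function calling.
# Tool name, description, schema, and model are illustrative assumptions.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

tools = [
    {
        "type": "function",
        "function": {
            "name": "generate_image",
            "description": "Create an image from a text prompt.",
            "parameters": {
                "type": "object",
                "properties": {
                    "prompt": {"type": "string"},
                    "aspect_ratio": {"type": "string", "description": "e.g. '1:1', '16:9'"},
                    "style": {"type": "string"},
                },
                "required": ["prompt"],
            },
        },
    }
]

response = client.chat.completions.create(
    model="gpt-4o-mini",  # any function-calling-capable model works here
    messages=[{"role": "user", "content": "I want a cat on a pool table."}],
    tools=tools,
    tool_choice="auto",   # let the model decide whether to call the tool
)
print(response.choices[0].message.tool_calls)
```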
-
Image Generation:
- Allows users to create images from text descriptions (prompts).
- Supports optional parameters like style, model, aspect ratio, and more for customization.
- The generated image is stored in the context for future actions.
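A minimal sketch of what the generation call might look like, assuming an Ideogram-style REST endpoint. The URL, header name, payload fields, and response shape are assumptions and should be checked against the Ideogram API documentation.

```python
# Sketch of a generation request against an assumed Ideogram-style endpoint.
# Endpoint URL, header name, payload fields, and response shape are assumptions.
import os
import requests

def generate_image(prompt: str, aspect_ratio: str = "1:1") -> str:
    resp = requests.post(
        "https://api.ideogram.ai/generate",           # assumed endpoint
        headers={"Api-Key": os.environ["IDEOGRAM_API_KEY"]},
        json={"image_request": {"prompt": prompt, "aspect_ratio": aspect_ratio}},
        timeout=60,
    )
    resp.raise_for_status()
    # Assumed response shape: {"data": [{"url": "..."}]}
    return resp.json()["data"][0]["url"]
```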
-
Image Editing:
- Users can edit previously generated images or those provided via a URL.
- A white mask covering the entire image is generated automatically, so the whole image can be modified.
- Enables users to apply styles and adjustments based on their needs.
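As a sketch of the editing step, the snippet below sends the original image plus a fully white mask to an assumed Ideogram-style edit endpoint. Which API actually handles edits, the endpoint, the multipart field names, and the mask convention (the README describes a white mask marking the whole image as editable) are all assumptions.

```python
# Sketch of an edit request against an assumed Ideogram-style edit endpoint.
# Endpoint, field names, and mask convention are assumptions taken from this README.
import os
import requests

def edit_image(image_path: str, mask_path: str, prompt: str) -> str:
    with open(image_path, "rb") as image, open(mask_path, "rb") as mask:
        resp = requests.post(
            "https://api.ideogram.ai/edit",               # assumed endpoint
            headers={"Api-Key": os.environ["IDEOGRAM_API_KEY"]},
            data={"prompt": prompt},
            files={"image_file": image, "mask": mask},    # assumed field names
            timeout=120,
        )
    resp.raise_for_status()
    return resp.json()["data"][0]["url"]                  # assumed response shape
```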
-
Context Management:
- The application stores:
  - Last generated image: an image created by the application.
  - Last provided image: an image manually provided by the user.
- This context allows dynamic image editing, whether for the last generated or the last provided image.
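One plausible way to hold this context is a small dataclass; the attribute and method names below are illustrative, not the project's actual ones.

```python
# Illustrative context holder; attribute names are assumptions.
from dataclasses import dataclass
from typing import Optional

@dataclass
class ImageContext:
    last_generated_image: Optional[str] = None  # URL of the last image the app created
    last_provided_image: Optional[str] = None   # URL the user supplied in a message

    def image_for_editing(self) -> Optional[str]:
        # A user-provided URL takes priority over the last generated image.
        return self.last_provided_image or self.last_generated_image
```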
-
Interactive History:
- Keeps a record of actions performed, allowing users to review the conversation flow.
- The user describes their desired action: generate, edit, or ask questions.
- If the input includes a valid URL, it is detected and stored in the context as the last provided image.
- The application uses a model (see the decision sketch after this list) to decide whether the user intends to:
  - Generate a new image.
  - Edit an existing image.
  - Take no action.
- If image generation is chosen:
  - A request is constructed with the parameters provided by the user.
  - The generated image is saved in the context as the last generated image.
  - The URL of the created image is displayed to the user.
- If image editing is chosen:
  - An image is automatically selected from the context:
    - If the user provided a URL, it takes priority.
    - Otherwise, the last image generated by the application is used.
  - A white mask is generated to cover the entire image.
  - The editing request is sent to the API with the defined parameters.
  - The newly edited image is saved in the context as the last generated image.
- Before performing any action, the parameters are presented to the user.
- The user can confirm or adjust the parameters before proceeding.
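Building on the tool definition sketched earlier, the decision and confirmation step might look like the following. The function name and the dispatch targets are illustrative assumptions.

```python
# Sketch of the decision step: inspect whether the model chose a tool call.
# Function name and dispatch targets are illustrative assumptions.
import json

def handle_model_reply(message) -> None:
    """`message` is response.choices[0].message from a chat completion with tools."""
    if message.tool_calls:
        call = message.tool_calls[0]
        params = json.loads(call.function.arguments)
        print(f"Action: {call.function.name}")
        print(f"Parameters: {params}")
        if input("Do you want to proceed with these parameters? (y/n) ").lower() == "y":
            pass  # dispatch to generate_image(...) or edit_image(...) here
    else:
        # No tool call: the model answered conversationally, so take no action.
        print(message.content)
```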
-
Start:
- The user provides an initial request.
- If a URL is included, it is automatically saved in the context.
-
Decision:
- The request is analyzed to determine if it is a generation, editing, or general inquiry action.
-
Action Execution:
- Generate:
  - A new image is created based on the prompt and optional parameters.
- Edit:
  - An image from the context (provided or previously generated) is used.
  - A mask is generated, and the editing request is sent.
- No Action:
  - The conversation continues without performing any operation.
-
History and Context:
- Every action is logged in an interactive history.
- Generated or edited images are saved in the context for future reference.
-
Image URL Detection:
- Analyzes user input to identify valid URLs.
- Detected URLs are stored as the last provided image.
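A simple way to detect URLs in free text is a regular expression; the pattern below is a rough, deliberately permissive sketch, and the project's actual detection may differ.

```python
# Rough sketch of URL detection in user input; the regex is an assumption.
import re

URL_PATTERN = re.compile(r"https?://\S+", re.IGNORECASE)

def extract_image_url(text: str):
    match = URL_PATTERN.search(text)
    return match.group(0) if match else None
```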
-
Automatic Mask:
- Generates a fully white mask that covers the original image.
- Allows complete edits without manual intervention.
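Generating a fully white mask with the same dimensions as the source image can be done with Pillow. This is a sketch; the mask's file format and color mode are assumptions.

```python
# Sketch of building a fully white mask matching the source image's size.
# Output file name, format, and mode are assumptions.
from PIL import Image

def make_white_mask(image_path: str, mask_path: str = "mask.png") -> str:
    with Image.open(image_path) as img:
        mask = Image.new("RGB", img.size, color=(255, 255, 255))
    mask.save(mask_path)
    return mask_path
```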
-
Context Persistence:
- Remembers previously used images, enabling continuous workflows and history-dependent actions.
-
Parameter Confirmation:
- Users can review and adjust parameters before executing any action.
-
Automatic Temporary File Management:
- Downloaded images and generated masks are automatically deleted after each operation.
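Cleanup of downloaded images and generated masks could be as simple as removing them after each operation; the file paths and surrounding helpers shown here are illustrative.

```python
# Sketch of temporary-file cleanup after an operation; file names and the
# helpers referenced in the comment are illustrative assumptions.
import os

def cleanup(*paths: str) -> None:
    for path in paths:
        if path and os.path.exists(path):
            os.remove(path)

# Typical usage: always clean up, even if the edit request fails.
# try:
#     edited_url = edit_image("downloaded.png", "mask.png", prompt)
# finally:
#     cleanup("downloaded.png", "mask.png")
```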
User:
"I want a cat on a pool table."
System:
"Image generation detected.
Prompt: A cat on a pool table.
Do you want to proceed with these parameters? (y/n)"
Result:
The image is generated and saved in the context.
User:
"I want the cat to be orange."
System:
"Image editing detected.
Prompt: Change the color of the cat to orange.
Image used: URL of the last generated image.
Do you want to proceed with these parameters? (y/n)"
Result:
The image is edited, and the new version is saved in the context.
-
Dependencies:
- Install the dependencies using the following command:
  pip install -r requirements.txt
-
Environment Variables:
- Create a .env file with the following variables:
  IDEOGRAM_API_KEY=<your_ideogram_api_key>
  OPENAI_API_KEY=<your_openai_api_key>
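If the project loads these keys with python-dotenv (an assumption; it is a common companion to a .env file and may be listed in requirements.txt), the setup looks like this:

```python
# Sketch of loading the API keys, assuming python-dotenv is among the dependencies.
import os
from dotenv import load_dotenv

load_dotenv()  # reads .env from the working directory

IDEOGRAM_API_KEY = os.getenv("IDEOGRAM_API_KEY")
OPENAI_API_KEY = os.getenv("OPENAI_API_KEY")
```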