This repository has been archived by the owner on Nov 27, 2024. It is now read-only.
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
Merge pull request #1 from josebenitezg/add_tts
add new features
Showing 6 changed files with 266 additions and 30 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -3,7 +3,9 @@ venv/ | |
data/ | ||
*.png | ||
*.mp4 | ||
*.mp3 | ||
*.jpg | ||
build/ | ||
dist/ | ||
*.egg-info/ | ||
*.egg-info/ | ||
test_api.py |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,56 +1,136 @@ | ||
## VisionAPI 👀 🚧 | ||
# VisionAPI 👓✨ - AI Vision & Language Processing | ||
|
||
#### Hey there | ||
### Welcome to the Future of AI Vision 🌟 | ||
|
||
This is a Work In Progress Project. | ||
The goal is to bring GPT-based Models to a simple API | ||
Hello and welcome to VisionAPI, where cutting-edge GPT-based models meet simplicity in a sleek API interface. Our mission is to harness the power of AI to work with images, videos, and audio so you can create apps faster than ever. | ||
|
||
### How to use | ||
### 🚀 Getting Started | ||
|
||
##### Installation | ||
#### Prerequisites | ||
|
||
Make sure you have Python installed on your system and you're ready to dive into the world of AI. | ||
|
||
#### 📦 Installation | ||
|
||
To install VisionAPI, simply run the following command in your terminal: | ||
|
||
```bash | ||
pip install visionapi | ||
``` | ||
##### Authentication | ||
##### 🔑 Authentication | ||
Before you begin, make your OpenAI API key available as an environment variable with the following command: | ||
|
||
```bash | ||
export OPENAI_API_KEY=<your key> | ||
export OPENAI_API_KEY='your-api-key-here' | ||
``` | ||
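If the key is missing, the first request is likely to fail with a less obvious error, so it can help to check the environment up front. The helper below is a minimal sketch and not part of VisionAPI: `require_api_key` is a hypothetical name, and it only assumes the key is read from the `OPENAI_API_KEY` variable exported above.

```python
import os


def require_api_key(env=None, name="OPENAI_API_KEY"):
    """Return the API key from the environment, or raise a clear error.

    Hypothetical helper for illustration; VisionAPI does not ship it.
    `env` defaults to os.environ and can be replaced with a dict in tests.
    """
    env = os.environ if env is None else env
    key = env.get(name, "").strip()
    if not key:
        raise RuntimeError(
            f"{name} is not set; export it before creating visionapi.Inference()"
        )
    return key
```

Calling `require_api_key()` once at startup, before `visionapi.Inference()` is created, surfaces a missing key early.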
##### Image Inference | ||
We can use an image url, local image path or numpy array to make an inference. | ||
#### 🔩 Usage | ||
##### 🖼️ Image Inference | ||
Empower your applications to understand and describe images with precision. | ||
|
||
```python | ||
import visionapi | ||
|
||
inference_endpoint = visionapi.Inference() | ||
# Initialize the Inference Engine | ||
inference = visionapi.Inference() | ||
|
||
# Provide an image URL or a local path | ||
image = "https://upload.wikimedia.org/wikipedia/commons/thumb/d/dd/Gfp-wisconsin-madison-the-nature-boardwalk.jpg/2560px-Gfp-wisconsin-madison-the-nature-boardwalk.jpg" | ||
|
||
prompt = "Describe the image" | ||
# Set your descriptive prompt | ||
prompt = "What is this image about?" | ||
|
||
response = inference_endpoint.image_inference(image, prompt) | ||
# Get the AI's perspective | ||
response = inference.image(image, prompt) | ||
|
||
# Revel in the AI-generated description | ||
print(response.message.content) | ||
|
||
|
||
``` | ||
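Since the image argument may be either a URL or a local path, it can be useful to validate it before sending a request. The sketch below shows one way to tell the two apart using only the standard library. `classify_image_source` is a hypothetical helper, and this is not necessarily how VisionAPI distinguishes the inputs internally.

```python
from pathlib import Path
from urllib.parse import urlparse


def classify_image_source(image):
    """Return "url" for http(s) URLs and "file" for existing local files.

    Hypothetical helper for illustration; raises ValueError for anything else.
    """
    parsed = urlparse(str(image))
    if parsed.scheme in ("http", "https") and parsed.netloc:
        return "url"
    if Path(image).is_file():
        return "file"
    raise ValueError(f"Not a reachable URL scheme or an existing file: {image!r}")
```

A caller could run this check first and only then pass the same value to `inference.image(image, prompt)`.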
##### Video Inference | ||
##### 🎥 Video Inference | ||
Narrate the stories unfolding in your videos with our AI-driven descriptions. | ||
|
||
```python | ||
import visionapi | ||
|
||
inference_endpoint = visionapi.Inference() | ||
# Gear up the Inference Engine | ||
inference = visionapi.Inference() | ||
|
||
prompt = "These are frames from a video that I want to upload. Generate a compelling description that I can upload along with the video." | ||
# Craft a captivating prompt | ||
prompt = "Summarize the key moments in this video." | ||
|
||
video = "video.mp4" | ||
# Point to your video file | ||
video = "path/to/video.mp4" | ||
|
||
response = inference_endpoint.video_inference(video, prompt) | ||
# Let the AI weave the narrative | ||
response = inference.video(video, prompt) | ||
|
||
# Display the narrative | ||
print(response.message.content) | ||
|
||
``` | ||
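The earlier prompt in this README described the input as "frames from a video", which suggests the video is reduced to a set of frames before being sent to the model. How VisionAPI selects those frames is not shown in this diff; the sketch below only illustrates one common approach, evenly spaced sampling, and `sample_frame_indices` and its parameters are assumptions made for the example.

```python
def sample_frame_indices(total_frames, max_frames):
    """Pick up to `max_frames` evenly spaced frame indices from a video.

    Illustrative only; not taken from the VisionAPI source.
    """
    if total_frames <= 0 or max_frames <= 0:
        return []
    if max_frames >= total_frames:
        # Short video: keep every frame.
        return list(range(total_frames))
    step = total_frames / max_frames
    return [int(i * step) for i in range(max_frames)]


print(sample_frame_indices(10, 5))  # [0, 2, 4, 6, 8]
```

Capping the number of frames keeps the request size bounded regardless of video length, at the cost of possibly missing short events between sampled frames.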
|
||
##### 🎨 Image Generation | ||
Watch your words paint pictures with our intuitive image generation capabilities. | ||
|
||
```python | ||
import visionapi | ||
|
||
# Activate the Inference Engine | ||
inference = visionapi.Inference() | ||
|
||
# Describe your vision | ||
prompt = "A tranquil lake at sunset with mountains in the background." | ||
|
||
# Bring your vision to life | ||
image_urls = inference.generate_image(prompt, save=True) # Set `save=True` to store locally | ||
|
||
# Behold the AI-crafted imagery | ||
print(image_urls) | ||
``` | ||
|
||
##### 🗣️ TTS (Text to Speech) | ||
Transform your text into natural-sounding speech with just a few lines of code. | ||
|
||
```python | ||
import visionapi | ||
|
||
# Power up the Inference Engine | ||
inference = visionapi.Inference() | ||
|
||
# Specify where to save the audio | ||
save_path = "output/speech.mp3" | ||
|
||
# Type out what you need to vocalize | ||
text = "Hey, ready to explore AI-powered speech synthesis?" | ||
|
||
# Make the AI speak | ||
inference.TTS(text, save_path) | ||
``` | ||
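The example saves to `output/speech.mp3`, and this README does not say whether `TTS` creates the `output/` directory when it is missing. A cautious sketch, assuming it does not, is to create the parent directory first. `prepare_output_path` is a hypothetical helper, not part of VisionAPI.

```python
from pathlib import Path


def prepare_output_path(save_path):
    """Create the parent directory of `save_path` if needed and return the path.

    Hypothetical helper for illustration; safe to call repeatedly.
    """
    path = Path(save_path)
    path.parent.mkdir(parents=True, exist_ok=True)
    return path
```

The returned path can then be passed on, for example as `inference.TTS(text, str(prepare_output_path("output/speech.mp3")))`.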
|
||
##### 🎧 STT (Speech to Text) | ||
Convert audio into text with unparalleled clarity, opening up a world of possibilities. | ||
|
||
```python | ||
import visionapi | ||
|
||
# Initialize the Inference Engine | ||
inference = visionapi.Inference() | ||
|
||
# Convert spoken words to written text | ||
text = inference.STT('path/to/audio.mp3') | ||
|
||
# Marvel at the transcription | ||
print(text) | ||
``` | ||
|
||
## 🌐 Contribute | ||
Add cool stuff: | ||
|
||
- Fork the repository. | ||
- Extend the capabilities by integrating more models. | ||
- Enhance existing features or add new ones. | ||
- Submit a pull request with your improvements. | ||
|
||
Your contributions are what make VisionAPI not just a tool, but a community. | ||
|
||
Contribute to this project by adding more models and features. |
Binary file not shown.
Binary file not shown.