This repository has been archived by the owner on Apr 5, 2024. It is now read-only.

Welcome to gpt-voice-assistant

This software builds on the carter-voice-assistant project, replacing the Carter API with the 👾 OpenAI API. With this integration, the assistant can give more accurate and sophisticated responses to user input.

gpt-voice-assistant pixel-art by JVPC0D3R

🛠 how it works

GPT-3.5 is the core of the assistant, but this project uses other AI models to extract more data from the user and their environment:

  • The first model implemented is 🦻 Whisper, which was already built into the original Carter project. Whisper's job is to listen to the user and transcribe their voice into text.

  • To give the assistant vision, I used the 👁 Ultralytics YOLOv8 model, which can detect, classify and track objects in real time.

  • To give the assistant access to the Internet, I implemented a 🔍 SerpAPI-based module.

  • So that the assistant knows which action the user wants, I implemented a 📑 text classification model, which decides whether the user input is a chat, a vision query, a Google search or a farewell.

  • If the user command needs a Google search before calling GPT, the assistant has to get the arguments for the SerpAPI call. To do that, I used a 🔑 keyword extraction model.
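
The routing described in the list above can be sketched roughly as follows. This is only an illustration: the project uses trained models for classification and keyword extraction, whereas the functions below (classify_intent, extract_keywords, route) are hypothetical rule-based stand-ins that show the control flow, not the actual implementation.

```python
from typing import Tuple

# Words dropped by the toy keyword extractor (an assumption for this sketch).
STOPWORDS = {"the", "a", "an", "for", "of", "on", "to", "me",
             "please", "search", "google", "what", "is", "are"}


def classify_intent(text: str) -> str:
    """Toy stand-in for the text classification model.

    Returns one of the four categories from the list above:
    "chat", "vision", "search" or "farewell".
    """
    lowered = text.lower()
    if any(cue in lowered for cue in ("bye", "see you")):
        return "farewell"
    if any(cue in lowered for cue in ("what do you see", "look at")):
        return "vision"
    if any(cue in lowered for cue in ("search", "google", "latest", "news")):
        return "search"
    return "chat"


def extract_keywords(text: str) -> str:
    """Toy stand-in for the keyword extraction model: drop stopwords."""
    words = [w.strip("?.,!").lower() for w in text.split()]
    return " ".join(w for w in words if w and w not in STOPWORDS)


def route(text: str) -> Tuple[str, str]:
    """Pick an action and the payload that would be passed to it.

    For a search, the payload is the extracted query that would go to the
    SerpAPI module; for every other intent it is the raw user input.
    """
    intent = classify_intent(text)
    if intent == "search":
        return intent, extract_keywords(text)
    return intent, text


if __name__ == "__main__":
    for utterance in ("Tell me a joke",
                      "What do you see on my desk?",
                      "Search for the latest YOLOv8 release",
                      "Goodbye!"):
        print(utterance, "->", route(utterance))
```

In the real assistant, the "vision" branch would hand off to the YOLOv8 detections and the "search" branch to the SerpAPI module before the final GPT call.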

🛹 getting started

To run the gpt-voice-assistant, you will need an OpenAI API key and a SerpAPI key. I suggest creating a Python file named keys.py to store the API key variables.

📦 installation

To install and run the gpt-voice-assistant, first clone the repository:

git clone https://github.com/JVPRUGBIER/gpt-voice-assistant

Install the required dependencies:

pip install -r requirements.txt

Create a 'keys.py' file in the project directory and add your OpenAI and SerpAPI keys:

OPENAI_API_KEY = "your_api_key"
SERP_API_KEY = "your_api_key"
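
As a rough sketch of how these variables might be read at startup, the helper below looks a key up in keys.py and fails early if it is missing or still set to the placeholder. The load_key function and the fallback to an environment variable of the same name are assumptions made for this example, not part of the project.

```python
import importlib
import os


def load_key(name: str) -> str:
    """Return an API key from keys.py, falling back to the environment.

    Hypothetical helper: the environment-variable fallback is an
    assumption for this sketch, not something the project provides.
    """
    try:
        keys = importlib.import_module("keys")
        value = getattr(keys, name, None)
    except ImportError:
        # No keys.py on the import path.
        value = None
    value = value or os.environ.get(name)
    if not value or value == "your_api_key":
        raise RuntimeError(f"{name} is not configured")
    return value


if __name__ == "__main__":
    for key_name in ("OPENAI_API_KEY", "SERP_API_KEY"):
        try:
            load_key(key_name)
            print(key_name, "found")
        except RuntimeError as err:
            print(err)
```

Since keys.py holds secrets, it is worth keeping it out of version control.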

🏃 run the assistant:

Chat with GPT using text:

python chat.py -t

Chat with GPT using text and have the assistant read the response aloud:

python chat.py -t -v

Have a full speech conversation with the gpt-voice-assistant:

python chat.py -l -v
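
For readers curious how the three flag combinations above could map to input and output modes, here is a minimal argparse sketch. Only the short flags -t, -v and -l come from the commands above; the long names (--text, --voice, --listen), the help strings and the describe_mode helper are assumptions, and the real chat.py may parse its arguments differently.

```python
import argparse
from typing import List, Tuple


def build_parser() -> argparse.ArgumentParser:
    """Build a parser with the three short flags used in the commands above."""
    parser = argparse.ArgumentParser(prog="chat.py")
    # Long option names are hypothetical; only -t, -v and -l appear in the README.
    parser.add_argument("-t", "--text", action="store_true",
                        help="type the input on the keyboard")
    parser.add_argument("-l", "--listen", action="store_true",
                        help="capture the input from the microphone")
    parser.add_argument("-v", "--voice", action="store_true",
                        help="read the response out loud")
    return parser


def describe_mode(argv: List[str]) -> Tuple[str, str]:
    """Return (input source, output sink) for a given argument list."""
    args = build_parser().parse_args(argv)
    source = "microphone" if args.listen else "keyboard"
    sink = "speech" if args.voice else "text"
    return source, sink


if __name__ == "__main__":
    for argv in (["-t"], ["-t", "-v"], ["-l", "-v"]):
        print(" ".join(argv), "->", describe_mode(argv))
```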