
Commit

Merge pull request #3 from ajhous44/feature/inputconsolidation
Cody Consolidation + Rework
ajhous44 authored Aug 13, 2023
2 parents 2218109 + 5e1e7f0 commit b2d6b6c
Showing 7 changed files with 85 additions and 204 deletions.
1 change: 1 addition & 0 deletions .gitignore
Original file line number Diff line number Diff line change
@@ -1 +1,2 @@
.venv
.env
1 change: 1 addition & 0 deletions .local.env
@@ -0,0 +1 @@
OPENAI_API_KEY=YOUR_API_KEY_HERE
21 changes: 10 additions & 11 deletions README.md
@@ -12,17 +12,17 @@ Cody continuously updates its knowledge base every time you save a file, ensurin

## 🚀 Getting Started

1. Set the environment variable `OPENAI_API_KEY` in a `.env` file with your OpenAI API key.
2. Modify the `ignore_list` in the `if __name__ == "__main__":` section of the script to specify directories and files you wish to exclude from monitoring.
3. Run the script using Python: python codyv4.py or codyv5.py

4. Once the script is running, type 'Q' and press enter to switch to question mode. Cody is ready to answer your queries!
1. Clone the repo
2. (Optionally) Set up a virtual environment by running `python -m venv .venv`, activating it, and then running `pip install -r requirements.txt` in a terminal from the root of your directory
3. Rename the `.local.env` file to `.env` and replace `YOUR_API_KEY_HERE` with your OpenAI API key.
4. Modify the `IGNORE_THESE` global variable at the top of the script to specify directories and files you wish to exclude from monitoring. (Add any large entries such as a virtual environment, caches, downloaded JS libraries, etc. to this list.)
5. Run the script with `python cody.py` and follow the terminal prompts. It will ask whether you want text chat (terminal) or conversational mode (speech I/O), and it will warn you if you have removed `.env` from the ignore list.

## 🎯 Features

- **File Monitoring**: Real-time monitoring of all files in your project's directory and subdirectories. 👀
- **Embedding-based Knowledge Base**: Create a knowledge base using OpenAI Embeddings. Cody collects the contents of all text and JSON files and adds them to this knowledge base. 📚
- **Interactive Querying**: Listen to user inputs. Ask questions, and Cody will generate a response using the knowledge base. 🧠
- **Interactive Q&A**: Listen to user inputs. Ask questions, and Cody will generate a response using the knowledge base. 🧠
- **Customizable**: Easily specify files or directories to ignore during monitoring.
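The customizable ignore behaviour can be sketched as a simple path-component check (a sketch only; `IGNORE_THESE` mirrors the default list in `cody.py`, but this helper is illustrative rather than the exact implementation):

```python
IGNORE_THESE = ['.venv', '.env', 'static', 'dashboard/static',
                'audio', 'license.md', '.github', '__pycache__']

def should_ignore(path, ignore_list=IGNORE_THESE):
    """Return True if any component of the path is in the ignore list."""
    parts = path.replace('\\', '/').split('/')
    return any(part in ignore_list for part in parts)

print(should_ignore('./.venv/lib/site.py'))  # True: '.venv' is ignored
print(should_ignore('./src/app.py'))         # False: monitored
```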

## 🛠 Dependencies
Expand All @@ -37,17 +37,16 @@ Cody continuously updates its knowledge base every time you save a file, ensurin

## 💡 Usage

- For version V4 (Text Interaction), type 'Q' and press enter. Cody will prompt you to input your question. Once you've entered your query, Cody will generate a response based on its knowledge base. Use Cody to debug code, troubleshoot errors, ask for help in adding new features, understand how functions interact across files, and more.

- For version V5 (Voice Interaction), simply speak to Cody, and it will respond accordingly. Cody is here to assist you with various programming tasks, making it a valuable tool in your coding journey.

To stop the script, type 'exit' or speak the word 'exit' and press enter. Cody will gracefully terminate the program.
- To stop the script, type 'exit' or speak the word 'exit' and press enter. Cody will gracefully terminate the program.
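The interaction loop described above reduces to a small input dispatcher (a hedged sketch; `classify_input` is a hypothetical helper for illustration, not a function in `cody.py`):

```python
def classify_input(text):
    """Map raw user input (typed or transcribed speech) to an action.
    'exit' terminates the program; anything else becomes a question
    for the knowledge base."""
    cleaned = text.strip().lower()
    if cleaned == 'exit':
        return ('exit', None)
    return ('question', text.strip())

print(classify_input('exit'))             # ('exit', None)
print(classify_input('What does cody.py do?')[0])  # 'question'
```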

## ⚠️ Notes & Tips

- Cody uses the FAISS library for efficient similarity search over the stored vectors. Please ensure you have sufficient memory available, especially when monitoring a large number of files.
- Additionally, be sure to monitor your OpenAI API usage. A helpful tip is to set a monthly spend limit inside your OpenAI account to prevent surprise bills. As an additional helper, Cody prints the number of tokens used in each call you make.
- **"Live" coding questions**: To use Cody to its full potential, I recommend opening a separate terminal (or command prompt), `cd`'ing into your project directory, and launching `python cody.py`. Place it split-screen with your code in a small window on the far left or right. This way you can use a separate terminal for actually running your code without worrying about Cody or having to relaunch him (er... it) each time. Cody will still update on every file save, so it is always using the latest data.
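To stay inside a spend limit you can pre-check prompt size with a rough heuristic (a sketch; ~4 characters per token is only an approximation for English text, and `MAX_TOKENS_PER_CALL` mirrors the constant at the top of `cody.py`):

```python
MAX_TOKENS_PER_CALL = 2500  # mirrors the constant in cody.py

def rough_token_estimate(text):
    """Crude estimate: English text averages roughly 4 characters/token."""
    return max(1, len(text) // 4)

def within_budget(prompt, budget=MAX_TOKENS_PER_CALL):
    return rough_token_estimate(prompt) <= budget

print(within_budget('Short question about my code'))  # True
print(within_budget('x' * 20_000))                    # False: ~5000 tokens
```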

## Contributing

Contributions are welcome. Please submit a pull request or open an issue for any bugs or feature requests.

Happy Coding with Cody! 💡🚀🎉
Binary file removed audio/response.mp3
Binary file not shown.
127 changes: 73 additions & 54 deletions codyV5.py → cody.py
@@ -4,6 +4,7 @@
from langchain.vectorstores import FAISS
from watchdog.observers import Observer
from watchdog.events import FileSystemEventHandler
import tempfile
import json
import time
import threading
@@ -15,6 +16,11 @@

load_dotenv()
OPENAI_API_KEY = os.getenv("OPENAI_API_KEY")

### USER OPTIONS ###
MAX_TOKENS_PER_CALL = 2500  # max tokens to use per OpenAI call
IGNORE_THESE = ['.venv', '.env', 'static', 'dashboard/static', 'audio', 'license.md', '.github', '__pycache__']
r = sr.Recognizer()

class FileChangeHandler(FileSystemEventHandler):
@@ -38,12 +44,18 @@ def should_ignore(self, filename):
def on_modified(self, event):
if "response.mp3" not in event.src_path:
if not self.should_ignore(event.src_path):
print(f'\n \U0001F4BE The file {event.src_path} has changed!')
print(f'\n\U0001F4BE The file {event.src_path} has changed!')
self.update_file_content()

def update_file_content(self):
print("\U0001F4C1 Collecting files...")
print("\n\U0001F4C1 Collecting files...")
all_files_data = {}
# Check if ".env" is in ignore list, if not prompt warning "Are you sure you want to include your .env in your api call to OpenAI?"
if ".env" not in self.ignore_list:
response = input("😨 You removed .env from ignore list. This may expose .env variables to OpenAI. Confirm? (1 for Yes, 2 for exit):")
if response != "1":
print("\n😅 Phew. Close one... Operation aborted. Please add '.env' to your ignore list and try again.")
exit()
for root, dirs, files in os.walk('.'):
# Remove directories in the ignore list
dirs[:] = [d for d in dirs if d not in self.ignore_list]
@@ -54,6 +66,7 @@ def update_file_content(self):
with open(file_path, 'r') as file:
if filename.endswith('.json'):
json_data = json.load(file)
all_files_data[file_path] = json_data # Store JSON data in the dictionary
else:
lines = file.readlines()
line_data = {}
@@ -81,8 +94,8 @@ def update_file_content(self):
self.knowledge_base = FAISS.from_texts(chunks, self.embeddings)

print("\U00002705 All set!")
create_audio("Files updated. Ready for questions", "audio/response.mp3")
play_audio("audio/response.mp3")
audio_stream = create_audio("Files updated. Ready for questions")
play_audio(audio_stream)

def play_audio(file_path):
"""
@@ -96,83 +109,90 @@ def play_audio(file_path):
continue

pygame.mixer.music.unload()
os.unlink(file_path) # Delete the temporary file
print("Deleted temp audio file in: " + file_path)

def create_audio(text, filename):
def create_audio(text):
"""
Create an audio file from text
Create an audio file from text and return the path to a temporary file
"""
try:
temp_file = tempfile.NamedTemporaryFile(delete=False, suffix=".mp3")
print(f"\nCreated temp audio file at: {temp_file.name}")
try:
speech = gTTS(text=text, lang='en', slow=False)
speech.save(filename)
speech.save(temp_file.name)
except Exception as e:
print(f"Error in creating audio: {e}")
print(f"\nError in creating audio: {e}")

return temp_file.name

def generate_response(prompt):
def generate_response(prompt, speak_response=True):
openai.api_key = OPENAI_API_KEY
# # For debugging if you want to view the full data being passed
# print("\U00002753 Received question: " + str(prompt))
try:
completion = openai.ChatCompletion.create(
model="gpt-3.5-turbo",
messages=[{"role": "user", "content": prompt}]
messages=[{"role": "user", "content": prompt}],
max_tokens=MAX_TOKENS_PER_CALL,
)
print("\U0001F4B0 Tokens used:", completion.usage.total_tokens)
# print total tokens used all time
print('\U0001F916', completion.choices[0].message.content)
create_audio(completion.choices[0].message.content, "audio/response.mp3")
play_audio("audio/response.mp3")
print("\n\U0001F4B0 Tokens used:", completion.usage.total_tokens)
response_text = completion.choices[0].message.content
print('\U0001F916', response_text)
if speak_response:
audio_stream = create_audio(response_text)
play_audio(audio_stream)
except Exception as e:
print(f"\U000026A0 Error in generating response: {e}")

def collect_files(handler):
# print("\U0001F4C1 Collecting files...")
handler.update_file_content()
# print("\U00002705 Initial file collection done.")

def monitor_input(handler):

def monitor_input(handler, terminal_input=True):
while True:
try:
with sr.Microphone() as source:
print("Listening...")
audio_data = r.listen(source)
text = r.recognize_google(audio_data)


if text == "exit":
print("\U0001F44B Exiting the program...")
os._exit(0)
elif text:
print(f"You said: {text}")
question = text
print("\U0001F9E0 You asked: " + question)
docs = handler.knowledge_base.similarity_search(question)
response = f"You are an expert programmer who is aware of this much of the code base:{str(docs)}. \n"
response += "Please answer this: " + question + " Keep your answer under 20 words if possible. Speak in bullet points if you can to help with conciseness. Your main priority is to answer their questions using the info provided including line numbers if possible. Also note that when you give answers please include the file path if it makes sense. If the question is not relevant or not a question simply respond with 'Skipping'. Do not include any special text like _'s or ''s as this will be read by text to speech. Only include text in the response without non character letters. Even function names with _ in them should be replaced with a space so it is more readable audibly."
ai_response = generate_response(response)
text = ""
if terminal_input:
text = input("\U00002753 Please type your question (or 'exit' to quit): ")
else:
with sr.Microphone() as source:
print("\nListening...")
audio_data = r.listen(source)
text = r.recognize_google(audio_data)

if text.lower() == 'exit':
print("\n\U0001F44B Exiting the program...")
os._exit(0)
else:
print(f"You said: {text}")
question = text
print("\n\U0001F9E0 You asked: " + question)
docs = handler.knowledge_base.similarity_search(question)
response = f"You are an expert programmer who is aware of this much of the code base:{str(docs)}. \n"
response += "Please answer this: " + question + "..." # Add the rest of your instructions here
generate_response(response, speak_response=not terminal_input)
except sr.UnknownValueError:
print("Could not understand audio")
print("\nCould not understand audio")
except sr.RequestError as e:
print("Could not request results; {0}".format(e))

print("\nCould not request results; {0}".format(e))
except Exception as e:
print(f"An error occurred: {e}")

def start_cody(ignore_list=[]):
handler = FileChangeHandler(ignore_list)
handler = FileChangeHandler(ignore_list=ignore_list)

# Collect files before starting the observer
collect_files(handler)
handler.update_file_content() # Directly call the update_file_content method

# Start a new thread to monitor user input
input_thread = threading.Thread(target=monitor_input, args=(handler,))
# Prompt user for interaction method
interaction_method = input("\nHow should I talk to you? Enter 1 for Terminal or 2 for Speech I/O: ")

terminal_input = interaction_method == '1'

# Start a new thread to monitor input
input_thread = threading.Thread(target=monitor_input, args=(handler, terminal_input))
input_thread.start()

# Initialize the observer
observer = Observer()
observer.schedule(handler, path='.', recursive=True)
observer.start()

# Continue to observe for file changes. Adding time.sleep to reduce CPU usage as well as prevent 'duplicate' file change events (false flags)
# Continue to observe for file changes
try:
while True:
time.sleep(5)
@@ -181,6 +201,5 @@ def start_cody(ignore_list=[]):

observer.join()

if __name__ == "__main__": # Set this to False to ignore .env file
ignore_list = ['static', 'dashboard/static', 'audio', 'license.md', '.github', '__pycache__']
start_cody(ignore_list)
if __name__ == "__main__":
start_cody(ignore_list=IGNORE_THESE)
