
Commit

Merge pull request #3 from ajhous44/feature/inputconsolidation
Cody Consolidation + Rework
ajhous44 authored Aug 13, 2023
2 parents 2218109 + 5e1e7f0 commit b2d6b6c
Showing 7 changed files with 85 additions and 204 deletions.
1 change: 1 addition & 0 deletions .gitignore
Original file line number Diff line number Diff line change
@@ -1 +1,2 @@
.venv
.env
1 change: 1 addition & 0 deletions .local.env
@@ -0,0 +1 @@
OPENAI_API_KEY=YOUR_API_KEY_HERE
21 changes: 10 additions & 11 deletions README.md
@@ -12,17 +12,17 @@ Cody continuously updates its knowledge base every time you save a file, ensurin

## 🚀 Getting Started

1. Set the environment variable `OPENAI_API_KEY` in a `.env` file with your OpenAI API key.
2. Modify the `ignore_list` in the `if __name__ == "__main__":` section of the script to specify directories and files you wish to exclude from monitoring.
3. Run the script using Python: python codyv4.py or codyv5.py

4. Once the script is running, type 'Q' and press enter to switch to question mode. Cody is ready to answer your queries!
1. Clone the repo
2. (Optionally) Set up a virtual environment by running `python -m venv .venv`, activating it, and then running `pip install -r requirements.txt` in a terminal from the root of your directory
3. Rename the `.local.env` file to `.env` and replace `YOUR_API_KEY_HERE` with your OpenAI API key.
4. Modify the `IGNORE_THESE` global variable at the top of the script to specify directories and files you wish to exclude from monitoring. (Add any large entries such as a virtual environment, caches, downloaded JS libraries, etc. to this list.)
5. Run the script with `python cody.py` and follow the terminal prompts. It will ask whether you want text chat (terminal) or conversational mode (speech I/O), and it will warn you if you have removed `.env` from the ignore list.

## 🎯 Features

- **File Monitoring**: Real-time monitoring of all files in your project's directory and subdirectories. 👀
- **Embedding-based Knowledge Base**: Create a knowledge base using OpenAI Embeddings. Cody collects the contents of all text and JSON files and adds them to this knowledge base. 📚
- **Interactive Querying**: Listen to user inputs. Ask questions, and Cody will generate a response using the knowledge base. 🧠
- **Interactive Q&A**: Listen to user inputs. Ask questions, and Cody will generate a response using the knowledge base. 🧠
- **Customizable**: Easily specify files or directories to ignore during monitoring.
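The customizable ignore behaviour can be sketched as a simple path-component check (a sketch only; `IGNORE_THESE` mirrors the default list in `cody.py`, but this helper is illustrative rather than the exact implementation):

```python
IGNORE_THESE = ['.venv', '.env', 'static', 'dashboard/static',
                'audio', 'license.md', '.github', '__pycache__']

def should_ignore(path, ignore_list=IGNORE_THESE):
    """Return True if any component of the path is in the ignore list."""
    parts = path.replace('\\', '/').split('/')
    return any(part in ignore_list for part in parts)

print(should_ignore('./.venv/lib/site.py'))  # True: '.venv' is ignored
print(should_ignore('./src/app.py'))         # False: monitored
```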

## 🛠 Dependencies
Expand All @@ -37,17 +37,16 @@ Cody continuously updates its knowledge base every time you save a file, ensurin

## 💡 Usage

- For version V4 (Text Interaction), type 'Q' and press enter. Cody will prompt you to input your question. Once you've entered your query, Cody will generate a response based on its knowledge base. Use Cody to debug code, troubleshoot errors, ask for help in adding new features, understand how functions interact across files, and more.

- For version V5 (Voice Interaction), simply speak to Cody, and it will respond accordingly. Cody is here to assist you with various programming tasks, making it a valuable tool in your coding journey.

To stop the script, type 'exit' or speak the word 'exit' and press enter. Cody will gracefully terminate the program.
- To stop the script, type 'exit' or speak the word 'exit' and press enter. Cody will gracefully terminate the program.
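The interaction loop described above reduces to a small input dispatcher (a hedged sketch; `classify_input` is a hypothetical helper for illustration, not a function in `cody.py`):

```python
def classify_input(text):
    """Map raw user input (typed or transcribed speech) to an action.
    'exit' terminates the program; anything else becomes a question
    for the knowledge base."""
    cleaned = text.strip().lower()
    if cleaned == 'exit':
        return ('exit', None)
    return ('question', text.strip())

print(classify_input('exit'))             # ('exit', None)
print(classify_input('What does cody.py do?')[0])  # 'question'
```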

## ⚠️ Notes & Tips

- Cody uses the FAISS library for efficient similarity search over the stored vectors. Please ensure you have sufficient memory available, especially when monitoring a large number of files.
- Additionally, be sure to monitor your OpenAI API usage. A helpful tip is to set a monthly spend limit inside your OpenAI account to prevent surprise bills. As an additional helper, Cody prints the number of tokens used in each call you make.
- **"Live" coding questions**: To use Cody to its full potential, I recommend opening a separate terminal (or command prompt), `cd`'ing into your project directory, and launching `python cody.py`. Place it split-screen with your code in a small window on the far left or right. This way you can use a separate terminal for actually running your code without worrying about Cody or having to relaunch him (er... it) each time. Cody will still update on every file save, so it is always using the latest data.
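To stay inside a spend limit you can pre-check prompt size with a rough heuristic (a sketch; ~4 characters per token is only an approximation for English text, and `MAX_TOKENS_PER_CALL` mirrors the constant at the top of `cody.py`):

```python
MAX_TOKENS_PER_CALL = 2500  # mirrors the constant in cody.py

def rough_token_estimate(text):
    """Crude estimate: English text averages roughly 4 characters/token."""
    return max(1, len(text) // 4)

def within_budget(prompt, budget=MAX_TOKENS_PER_CALL):
    return rough_token_estimate(prompt) <= budget

print(within_budget('Short question about my code'))  # True
print(within_budget('x' * 20_000))                    # False: ~5000 tokens
```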

## Contributing

Contributions are welcome. Please submit a pull request or open an issue for any bugs or feature requests.

Happy Coding with Cody! 💡🚀🎉
Binary file removed audio/response.mp3
Binary file not shown.
127 changes: 73 additions & 54 deletions codyV5.py → cody.py
@@ -4,6 +4,7 @@
from langchain.vectorstores import FAISS
from watchdog.observers import Observer
from watchdog.events import FileSystemEventHandler
import tempfile
import json
import time
import threading
@@ -15,6 +16,11 @@

load_dotenv()
OPENAI_API_KEY = os.getenv("OPENAI_API_KEY")

### USER OPTIONS ###
MAX_TOKENS_PER_CALL = 2500  # max tokens to use per OpenAI call
IGNORE_THESE = ['.venv', '.env', 'static', 'dashboard/static', 'audio', 'license.md', '.github', '__pycache__']
r = sr.Recognizer()

class FileChangeHandler(FileSystemEventHandler):
@@ -38,12 +44,18 @@ def should_ignore(self, filename):
def on_modified(self, event):
if "response.mp3" not in event.src_path:
if not self.should_ignore(event.src_path):
print(f'\n \U0001F4BE The file {event.src_path} has changed!')
print(f'\n\U0001F4BE The file {event.src_path} has changed!')
self.update_file_content()

def update_file_content(self):
print("\U0001F4C1 Collecting files...")
print("\n\U0001F4C1 Collecting files...")
all_files_data = {}
# Check if ".env" is in ignore list, if not prompt warning "Are you sure you want to include your .env in your api call to OpenAI?"
if ".env" not in self.ignore_list:
response = input("😨 You removed .env from ignore list. This may expose .env variables to OpenAI. Confirm? (1 for Yes, 2 for exit):")
if response != "1":
print("\n😅 Phew. Close one... Operation aborted. Please add '.env' to your ignore list and try again.")
exit()
for root, dirs, files in os.walk('.'):
# Remove directories in the ignore list
dirs[:] = [d for d in dirs if d not in self.ignore_list]
@@ -54,6 +66,7 @@ def update_file_content(self):
with open(file_path, 'r') as file:
if filename.endswith('.json'):
json_data = json.load(file)
all_files_data[file_path] = json_data # Store JSON data in the dictionary
else:
lines = file.readlines()
line_data = {}
@@ -81,8 +94,8 @@ def update_file_content(self):
self.knowledge_base = FAISS.from_texts(chunks, self.embeddings)

print("\U00002705 All set!")
create_audio("Files updated. Ready for questions", "audio/response.mp3")
play_audio("audio/response.mp3")
audio_stream = create_audio("Files updated. Ready for questions")
play_audio(audio_stream)

def play_audio(file_path):
"""
@@ -96,83 +109,90 @@ def play_audio(file_path):
continue

pygame.mixer.music.unload()
os.unlink(file_path) # Delete the temporary file
print("Deleted temp audio file in: " + file_path)

def create_audio(text, filename):
def create_audio(text):
"""
Create an audio file from text
Create an audio file from text and return the path to a temporary file
"""
try:
temp_file = tempfile.NamedTemporaryFile(delete=False, suffix=".mp3")
print(f"\nCreated temp audio file at: {temp_file.name}")
try:
speech = gTTS(text=text, lang='en', slow=False)
speech.save(filename)
speech.save(temp_file.name)
except Exception as e:
print(f"Error in creating audio: {e}")
print(f"\nError in creating audio: {e}")

return temp_file.name

def generate_response(prompt):
def generate_response(prompt, speak_response=True):
openai.api_key = OPENAI_API_KEY
# # For debugging if you want to view the full data being passed
# print("\U00002753 Received question: " + str(prompt))
try:
completion = openai.ChatCompletion.create(
model="gpt-3.5-turbo",
messages=[{"role": "user", "content": prompt}]
messages=[{"role": "user", "content": prompt}],
max_tokens=MAX_TOKENS_PER_CALL,
)
print("\U0001F4B0 Tokens used:", completion.usage.total_tokens)
# print total tokens used all time
print('\U0001F916', completion.choices[0].message.content)
create_audio(completion.choices[0].message.content, "audio/response.mp3")
play_audio("audio/response.mp3")
print("\n\U0001F4B0 Tokens used:", completion.usage.total_tokens)
response_text = completion.choices[0].message.content
print('\U0001F916', response_text)
if speak_response:
audio_stream = create_audio(response_text)
play_audio(audio_stream)
except Exception as e:
print(f"\U000026A0 Error in generating response: {e}")

def collect_files(handler):
# print("\U0001F4C1 Collecting files...")
handler.update_file_content()
# print("\U00002705 Initial file collection done.")

def monitor_input(handler):

def monitor_input(handler, terminal_input=True):
while True:
try:
with sr.Microphone() as source:
print("Listening...")
audio_data = r.listen(source)
text = r.recognize_google(audio_data)


if text == "exit":
print("\U0001F44B Exiting the program...")
os._exit(0)
elif text:
print(f"You said: {text}")
question = text
print("\U0001F9E0 You asked: " + question)
docs = handler.knowledge_base.similarity_search(question)
response = f"You are an expert programmer who is aware of this much of the code base:{str(docs)}. \n"
response += "Please answer this: " + question + " Keep your answer under 20 words if possible. Speak in bullet points if you can to help with conciseness. Your main priority is to answer their questions using the info provided including line numbers if possible. Also note that when you give answers please include the file path if it makes sense. If the question is not relevant or not a question simply respond with 'Skipping'. Do not include any special text like _'s or ''s as this will be read by text to speech. Only include text in the response without non character letters. Even function names with _ in them should be replaced with a space so it is more readable audibly."
ai_response = generate_response(response)
text = ""
if terminal_input:
text = input("\U00002753 Please type your question (or 'exit' to quit): ")
else:
with sr.Microphone() as source:
print("\nListening...")
audio_data = r.listen(source)
text = r.recognize_google(audio_data)

if text.lower() == 'exit':
print("\n\U0001F44B Exiting the program...")
os._exit(0)
else:
print(f"You said: {text}")
question = text
print("\n\U0001F9E0 You asked: " + question)
docs = handler.knowledge_base.similarity_search(question)
response = f"You are an expert programmer who is aware of this much of the code base:{str(docs)}. \n"
response += "Please answer this: " + question + "..." # Add the rest of your instructions here
generate_response(response, speak_response=not terminal_input)
except sr.UnknownValueError:
print("Could not understand audio")
print("\nCould not understand audio")
except sr.RequestError as e:
print("Could not request results; {0}".format(e))

print("\nCould not request results; {0}".format(e))
except Exception as e:
print(f"An error occurred: {e}")

def start_cody(ignore_list=[]):
handler = FileChangeHandler(ignore_list)
handler = FileChangeHandler(ignore_list=ignore_list)

# Collect files before starting the observer
collect_files(handler)
handler.update_file_content() # Directly call the update_file_content method

# Start a new thread to monitor user input
input_thread = threading.Thread(target=monitor_input, args=(handler,))
# Prompt user for interaction method
interaction_method = input("\nHow should I talk to you? Enter 1 for Terminal or 2 for Speech I/O: ")

terminal_input = interaction_method == '1'

# Start a new thread to monitor input
input_thread = threading.Thread(target=monitor_input, args=(handler, terminal_input))
input_thread.start()

# Initialize the observer
observer = Observer()
observer.schedule(handler, path='.', recursive=True)
observer.start()

# Continue to observe for file changes. Adding time.sleep to reduce CPU usage as well as prevent 'duplicate' file change events (false flags)
# Continue to observe for file changes
try:
while True:
time.sleep(5)
@@ -181,6 +201,5 @@ def start_cody(ignore_list=[]):

observer.join()

if __name__ == "__main__": # Set this to False to ignore .env file
ignore_list = ['static', 'dashboard/static', 'audio', 'license.md', '.github', '__pycache__']
start_cody(ignore_list)
if __name__ == "__main__":
start_cody(ignore_list=IGNORE_THESE)
