GitHub - AlexD4110/AI-Project: This simulates a video and audio aware model using existing LLM vision models. (It takes images and text as input, and generates text as output. Using models like whisper, the text can "speak".

AlexD4110 / AI-Project Public

Notifications You must be signed in to change notification settings
Fork 0
Star 1

This simulates a video and audio aware model using existing LLM vision models. (It takes images and text as input, and generates text as output. Using models like whisper, the text can "speak".

1 star 0 forks Branches Tags Activity

Star

Notifications

Branches Tags

Name		Name	Last commit message	Last commit date
Latest commit History 3 Commits
.ipynb_checkpoints		.ipynb_checkpoints
__pycache__		__pycache__
.DS_Store		.DS_Store
.gitignore		.gitignore
Nicholas.mov		Nicholas.mov
app.py		app.py
app_bakeoff.py		app_bakeoff.py
datauri.py		datauri.py
input.mov		input.mov
input1.MOV		input1.MOV
media_extractor.py		media_extractor.py
query.py		query.py
query_bakeoff.py		query_bakeoff.py
requirements.txt		requirements.txt
system_prompt.txt		system_prompt.txt
tempCodeRunnerFile.py		tempCodeRunnerFile.py
test_app.py		test_app.py

About

This simulates a video and audio aware model using existing LLM vision models. (It takes images and text as input, and generates text as output. Using models like whisper, the text can "speak".

Activity

1 star

1 watching

0 forks

Report repository