This project was written to test audio analysis methods in Python. To create data to work with, it starts by scraping the audio from YouTube videos. This is a work in progress: at this point, it can scrape audio from the most popular videos in different YouTube video categories. Audio mining has yet to be completed.
- The Python package used to scrape audio from YouTube videos (youtube-dl) has software dependencies. We used these steps to get set up:
  - Download youtube-dl from: https://yt-dl.org/
  - Add the location of youtube-dl.exe to the system PATH environment variable.
  - Download LIBAV from this link and copy the .exe and all DLL files to the location of youtube-dl.exe: http://builds.libav.org/windows/release-gpl/
- Set up a YouTube API key at: https://console.cloud.google.com/
- After getting the software dependencies and an API key taken care of, we stored our API key in a .env file. To run download_audio_from_youtube.py, replace the file path in its load_dotenv() call with the location of your own .env file.
- Install Python dependencies with pip install -r requirements.txt
- Use PyAudioAnalysis to extract features from audio files
- Use features for classification
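
The setup steps above (youtube-dl and LIBAV on PATH, API key in a .env file) can be sanity-checked before running the download script. The following is a minimal sketch and is not part of the project: the function names and the `YOUTUBE_API_KEY` variable name are hypothetical, `avconv` is assumed to be the LIBAV executable that was copied next to youtube-dl.exe, and `load_dotenv` is assumed to come from the python-dotenv package.

```python
import os
import shutil


def find_missing_tools(tools=("youtube-dl", "avconv")):
    """Return the entries of `tools` that cannot be found on PATH.

    The default tool names are assumptions; adjust them to match
    the executables you actually installed.
    """
    return [tool for tool in tools if shutil.which(tool) is None]


def load_api_key(env_path, var_name="YOUTUBE_API_KEY"):
    """Load a .env file and return the value stored under `var_name`.

    `var_name` is a hypothetical name; use whichever key your .env defines.
    """
    # Imported here so the PATH check above still works before
    # `pip install -r requirements.txt` has been run.
    from dotenv import load_dotenv

    load_dotenv(env_path)
    key = os.getenv(var_name)
    if not key:
        raise RuntimeError(f"{var_name} not found after loading {env_path}")
    return key
```

Calling `find_missing_tools()` returns an empty list when both executables are reachable. Calling `load_api_key` with the path to your own .env file raises an error if the key was not loaded.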