Search the transcripts of any YouTube channel for keywords or phrases from your terminal with Python.
Run the following to clone this repo:
git clone https://github.com/olincollege/youtube-transcript-search
Install the packages used in this project by running the following:
pip install -r requirements.txt
- Create a free Google Cloud project at [https://console.cloud.google.com/projectcreate]
- Enable the YouTube Data API V3 for your project at [https://console.cloud.google.com/apis/library/youtube.googleapis.com]
- Create an API key for your project at [https://console.cloud.google.com/apis/credentials]
- Copy the API key to your clipboard
- In the root directory of the repo, add a file named `.env` and add the line `YOUTUBE_API_KEY=<my-api-key>`, replacing `<my-api-key>` with the key that you copied.
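For readers curious how a key stored this way can be picked up at runtime, here is a minimal sketch using only the standard library. It is not taken from this repo: the helper names `read_env_file` and `get_api_key` are hypothetical, and the project itself may well use a package such as python-dotenv instead.

```python
import os
from pathlib import Path


def read_env_file(path=".env"):
    """Parse KEY=VALUE lines from a .env-style file into a dict (hypothetical helper)."""
    values = {}
    env_path = Path(path)
    if not env_path.is_file():
        return values
    for raw_line in env_path.read_text(encoding="utf-8").splitlines():
        line = raw_line.strip()
        # Skip blank lines, comments, and lines without an '='.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip("'\"")
    return values


def get_api_key(path=".env"):
    """Prefer a real environment variable, then fall back to the .env file."""
    return os.environ.get("YOUTUBE_API_KEY") or read_env_file(path).get("YOUTUBE_API_KEY")
```

If `get_api_key()` returns `None`, the `.env` file is probably missing, misnamed, or not in the directory you are running from.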
Navigate to the repository directory in your terminal and run `python run_transcript_search.py` or `python3 run_transcript_search.py`.
The program requires transcript data to be downloaded locally before it can search. Follow the prompts to either search existing channels or download a new one (Note: make sure channel names are spelled exactly as they appear on YouTube). Channels with a lot of data may take some time to download for the initial search.
To search, enter comma-separated keywords or phrases. These strings are matched against the transcript data in several capitalization variants, so you do not need to match the speaker's casing exactly.
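As a rough illustration of this input format, the sketch below splits a comma-separated query and counts matches. The function names are hypothetical, and it assumes full case folding, which only approximates the "various versions of capitalization" the program actually tries.

```python
def parse_keywords(raw):
    """Split a comma-separated query into trimmed, non-empty keywords or phrases."""
    return [part.strip() for part in raw.split(",") if part.strip()]


def count_occurrences(text, keyword):
    """Count non-overlapping, case-insensitive occurrences of keyword in text."""
    # Assumption: lowercasing both sides stands in for the program's
    # capitalization handling.
    return text.lower().count(keyword.lower())
```

Phrases containing spaces, such as `machine learning`, are kept as a single search string; only commas separate entries.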
Results are scored by the total number of occurrences of all keywords within a video. Only the top 5 videos are displayed by default; this can be changed in the `draw_results` method of the `ViewTerminal` class.
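The scoring rule described above can be sketched as follows. This is an illustrative reimplementation under stated assumptions, not the code in `ViewTerminal`: the names `score_videos` and `top_results`, the dict-of-transcripts input, and the case-insensitive counting are all assumed here.

```python
def score_videos(transcripts, keywords):
    """Map each video ID to the total keyword occurrences in its transcript.

    transcripts: dict of video ID -> transcript text (assumed shape).
    keywords: list of non-empty search strings.
    """
    scores = {}
    for video_id, text in transcripts.items():
        lowered = text.lower()
        scores[video_id] = sum(lowered.count(k.lower()) for k in keywords)
    return scores


def top_results(scores, limit=5):
    """Return up to `limit` (video_id, score) pairs, highest score first."""
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    # Videos with no matches are dropped rather than shown with a zero score.
    return [(video_id, score) for video_id, score in ranked if score > 0][:limit]
```

Passing a different `limit` here is the analogue of changing the number of displayed videos in `draw_results`.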
When a new channel's transcript data is downloaded, a new directory with the channel name is created under `transcript_data`. To remove a channel's downloaded data, simply delete that channel's directory. Do not delete `transcript_data` itself.
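If you would rather manage downloaded channels from Python than from a file browser, a small sketch like the one below would do it. The helper names are hypothetical and not part of this repo; the only thing taken from the text above is the layout of one directory per channel under `transcript_data`.

```python
import shutil
from pathlib import Path


def list_channels(data_dir="transcript_data"):
    """Return the names of channel directories found under data_dir."""
    root = Path(data_dir)
    if not root.is_dir():
        return []
    return sorted(p.name for p in root.iterdir() if p.is_dir())


def delete_channel(channel_name, data_dir="transcript_data"):
    """Delete one channel's directory; never data_dir itself."""
    root = Path(data_dir).resolve()
    target = (root / channel_name).resolve()
    # Only a direct child of data_dir may be removed. This rejects an empty
    # name, ".", and any path that escapes the data directory.
    if target.parent != root:
        raise ValueError(f"refusing to delete {target}")
    if not target.is_dir():
        return False
    shutil.rmtree(target)
    return True
```

The guard exists because an empty or malformed channel name would otherwise resolve to `transcript_data` itself.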
This branch, `main`, is not configured to run tests with pytest. The `testing` branch contains the same core code, but also ships with a few sets of pre-downloaded channel data and the pytest files themselves. To run the tests, run the following in the root of the `testing` branch:
pytest *.py
For the primary release version of the program, please use `main`.