IntraVideo Search is a search engine that takes a video source and search terms as inputs and produces short video clips as outputs. Using ML image classification APIs, we can search through a video and retrieve only the parts relevant to the user-specified search terms.
These are hard requirements; the program will not work without them.
- Anaconda (Python 3.7)
- Linux (Ubuntu 18.04)
- an internet connection (for API calls and YouTube downloads)
Click the links above to download and install the software. If you are having trouble or would rather not install locally, please take advantage of CSIL's Linux room. Those computers already have Ubuntu and Anaconda installed. If you are not sure whether Anaconda is installed, please ask CSIL staff for help in confirming. You can even ask them for help in making a new virtual environment (which is described in the next step).
To get set up, please read the Guide to Git.
If all prerequisites are met, follow these instructions to clone the repo and install the necessary Python packages. From this point on, we will assume `python` is the command for your Anaconda Python 3.7 distribution.
Here is a YouTube link to an installation guide for the repo in case you get stuck:
https://www.youtube.com/watch?v=SulPm5PEae8&feature=youtu.be
Feel free to use the resources at CSIL as well to get started, including CSIL staff.
- Clone the repository:
$ git clone https://github.com/suchak1/intravideo_search.git
- Navigate into the main directory (`intravideo_search/`):
$ cd intravideo_search
- Create a new (fresh) Python 3.7 virtual environment for Anaconda (called `intra` in this case):
$ conda create -n intra python=3.7
- Activate the new environment:
$ conda activate intra
To make sure the environment switched to `intra`, you can run `conda env list` and check that the star is on the same row as `intra`. If not, close bash and try Step 4 again.
- Confirm that running `python -V` yields Python 3.7. If not, make sure you specified the right Python version in Step 3 and have Anaconda for Python 3.7.
- Install the necessary packages by running the following command:
$ pip install -r requirements.txt
If the installation hangs for more than 10 minutes, cancel the command (Ctrl + C) and try again, or ask staff at CSIL for help installing packages if you are at Crerar.
If there is an error installing `torch` specifically, use this command:
$ pip install torch===1.3.1 torchvision===0.4.2 -f https://download.pytorch.org/whl/torch_stable.html
then install the rest of the requirements as necessary.
Note: If you have multiple Python installations, make sure you specify the right Python version when running these commands, i.e. check that `python -V` yields Python 3.7; otherwise your relevant command may be `python3`.
To remake the `requirements.txt` file, run:
$ pipreqs ./ --force
- To run all tests, first make the test suite executable (it is a bash file), then run it:
$ chmod +x test_suite.sh
$ PYTHON=python ./test_suite.sh
  - where `python` is your Python 3 installation (might be `python3` if you have more than one installation)
- Note: to run a single test file, just append the test file path like so:
$ python -m pytest test/test_controller.py -vv
and to run a single function, add the `-k` flag and the function name:
$ python -m pytest test/test_controller.py -k "test_classify_img" -vv
This might be necessary since functions like `classify_img` take a while, because loading an ML model for the first time is expensive.
Note: If you have multiple Python installations, make sure you use the right Python version for these commands, i.e. your command may be `python3`.
To maintain consistent code style, consider executing the command:
$ autopep8 -i -r ./
This will automatically update/rewrite all Python code in the entire repo to follow PEP8 style guidelines.
(Make sure you run the program and any utilities from the main directory, `intravideo_search/`.)
One limitation of the application stems from using free, pretrained ML models that are only trained on 1000 objects. This means that our program only classifies objects, and it is especially good at animals. We try to match your search term to the model's classification labels, but we do not always do so perfectly, and often 1000 objects is simply too small a sample of the English language to cover your search term. If you search for `beach`, we will also automatically search for `seashore`, `sea`, `coast`, `coastline`, etc. on your behalf.
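Under the hood, this related-word expansion uses the Datamuse API (described further in the known-issues list below). Here is a minimal sketch of that kind of lookup, assuming the `requests` package is available; the exact expansion logic in `src/` may differ:

```python
import requests

def related_terms(term, limit=5):
    """Ask the Datamuse API for words with similar meaning ("ml" = means-like)."""
    resp = requests.get(
        "https://api.datamuse.com/words",
        params={"ml": term, "max": limit},
        timeout=10,
    )
    resp.raise_for_status()
    return [entry["word"] for entry in resp.json()]

print(related_terms("beach"))  # e.g. ['seashore', 'coast', 'sand', ...]
```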
However, this expansion sometimes fails. For example, if you search for `waterfall` in the `test/COSTA_RICA.mp4` video included in the repo, you may not get any results; if you search for `fountain` or `spring` instead, you should get multiple clips of waterfalls/fountains. To be sure, you can query the black box and receive a "model-safe" word or phrase for a search term you want to use.
Run the following to get started:
$ python src/analyze.py
Here is the waterfall / fountain example:
Note: This feature requires an internet connection, just like the rest of the application. Now, back to the main GUI application!
Run `python src/start.py` to begin. The following steps can then be taken from within the GUI.
- Select a video by:
  - entering a YouTube video URL and clicking `Add` to add the video to the pending Job. To clear the video from the Job, click `Clear`. Note: While hypothetically you can enter any YouTube video you'd like, keep in mind that entering an extremely long video (e.g. baby shark 10 hour version) will likely take a long time to download and a long time to classify all the images. YouTube at your own risk.
  - clicking `Browse` and choosing a local video on your hard drive.

  You will see your selected video on the right in the Settings box.
- Enter a search term (or multiple search terms separated by commas; whitespace doesn't matter).
- Adjust the settings by moving the sliders in the Settings box.
- Confidence (%): Refers to the confidence score given by the ML classifier (i.e. how confident the classifier is that your searched term appears within a given image). The program will only return clips that have at least the confidence level given in the settings.
A high confidence level will likely result in fewer false positives, but also fewer clips. A low confidence level will likely result in more false positives, but more clips.
- Poll Rate (sec): Refers to the frequency with which the program will pull a frame to check for searched items (e.g. a poll rate of 5 seconds will check for any searched items every 5 seconds). The poll rate also determines the length of the resulting clips.
A low poll rate will likely result in more precise clip lengths and classifications, but also longer run time. A high poll rate will likely result in less precise clip lengths and classifications, but also shorter run time.
- To enable multiprocessing (CPU-intensive concurrency), check the multiprocessing checkbox. This will drastically speed up the Job but will also eat up CPU resources.
- Press `Start Job` to start a Job with the specified settings and inputs (video and search terms). You can cancel at any time, as the `Start Job` button automatically becomes a `Cancel` button.
- When the Job finishes, the Log at the bottom of the window will report whether any relevant clips were found and how many. These clips will be saved in the source video's original filepath.
Extra:
- Press `Choose Clip` to choose a video clip to caption. This action is available even while a Job is running.
Note: Multiprocessing/parallelized functions allow for quicker searches; otherwise, the application uses slower, non-parallel versions of our functions. To speed up processing at the cost of less accurate results, increase the polling rate.
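To make the poll rate / confidence trade-off concrete, here is a simplified sketch of what such a polling-and-classification loop could look like. This is an illustration only, not the actual `src/controller.py` implementation: `classifier` stands in for the project's `classify_img`, is assumed to return a dict mapping labels to confidence scores, and must be picklable for the parallel path to work.

```python
import cv2
from functools import partial
from multiprocessing import Pool

def _score(classifier, sample):
    """Run the classifier on one (timestamp, frame) pair."""
    timestamp, frame = sample
    return timestamp, classifier(frame)  # assumed: {label: confidence}

def find_hits(video_path, terms, classifier,
              confidence=0.3, poll_rate=5.0, parallel=True):
    """Sample one frame every `poll_rate` seconds; return the timestamps
    where any searched term scores at or above `confidence`."""
    cap = cv2.VideoCapture(video_path)
    fps = cap.get(cv2.CAP_PROP_FPS) or 30.0  # fall back if FPS is unreadable
    step = max(1, int(fps * poll_rate))      # frames skipped between polls
    samples, idx = [], 0
    while True:
        cap.set(cv2.CAP_PROP_POS_FRAMES, idx)
        ok, frame = cap.read()
        if not ok:
            break
        samples.append((idx / fps, frame))
        idx += step
    cap.release()
    score = partial(_score, classifier)
    if parallel:  # the multiprocessing checkbox
        with Pool() as pool:
            scored = pool.map(score, samples)
    else:
        scored = [score(s) for s in samples]
    return [t for t, labels in scored
            if any(labels.get(term, 0.0) >= confidence for term in terms)]
```

A lower `poll_rate` means more samples (tighter clips, but a longer runtime); a higher `confidence` means fewer, more certain hits.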
Here are 3 acceptance tests:
First, run `python src/start.py` to start the GUI. Then, choose the test video `test/COSTA_RICA.mp4` using the `Upload` button. Input each of the following scenarios and click Start. The output should match the expected outputs below.
- Test 1
  - Input:
    - Confidence: 30%
    - Polling Rate: 2 sec
    - Anti: 5 sec (default)
    - Search Terms: beach
  - Output:
    - 3 clips in the `test/` folder, all of the beach
- Test 2
  - Input:
    - Confidence: 50%
    - Polling Rate: 2 sec
    - Anti: 5 sec (default)
    - Search Terms: fountain
  - Output:
    - 3 clips in the `test/` folder; the 1st and 3rd are waterfalls, the 2nd clip is the beach
- Test 3
  - Input:
    - Confidence: 80%
    - Polling Rate: 2 sec
    - Anti: 5 sec (default)
    - Search Terms: frog
  - Output:
    - 5 clips in the `test/` folder; the first 2 are frogs, the middle is a komodo dragon, and the last 2 are snakes
The following are known functionality, not bugs.
- The Log updates for most actions a user takes, for example starting a job or saving clips (including the number of videos).
- Clips save automatically to the video's original source path (the path at the time the job was started).
- The progress bar does not measure actual numerical progress, only that a job is still running.
- Some search term inputs or confidence levels can yield no clips (which will be relayed on the GUI log: "No relevant clips found"). You can use the black box utility in `src/analyze.py` (explained in the section above) to find more model-friendly search terms. Also, output on your bash terminal will hint at the confidence levels at which the program finds your search terms, so you can adjust on the next run.
- Large video files may fill up your memory and cause the program to crash, so only videos your machine can handle are recommended. If your computer has insufficient memory, use the CSIL machines.
- To find related words (words that relate to your search terms), we use an API called Datamuse. Datamuse allows 100,000 requests per day per IP address; keep in mind that the program will probably stop working if you exceed this limit.
- If you cancel a job, make sure you wait for all output on your bash terminal to stop. (This may include "Broken pipe" errors if you cancel during the classification stage, as we try to shut down multiple subprocesses, each holding its own copy of the TensorFlow models, all at once as gracefully as we can; a generic sketch of this shutdown pattern appears after this list.) Not waiting will probably not affect classification for the next job, but it may lead to memory leaks that cause system instability.
- Sometimes YouTube videos will not download. The GUI will display an error message on the Log explaining that this is a network error or an issue with `pytube`. This could be a problem with a specific video or with your wireless connection. We suggest trying the download at CSIL or trying a different video.
- Sometimes the clip captioner gives nonsense captions. This is a limitation of the pretrained model.
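For the curious, graceful cancellation of a pool of workers generally follows the pattern sketched below. This is a generic Python example of the shutdown behavior described in the cancellation note above, not the project's actual code:

```python
import time
from multiprocessing import Pool

def work(x):
    time.sleep(1)  # stand-in for an expensive classification call
    return x * x

if __name__ == "__main__":
    pool = Pool()
    try:
        result = pool.map_async(work, range(100))
        result.get()       # blocks; Ctrl+C here raises KeyboardInterrupt
    except KeyboardInterrupt:
        pool.terminate()   # stop workers immediately; may surface broken pipes
    else:
        pool.close()       # normal completion: accept no new tasks
    finally:
        pool.join()        # always wait for workers to exit before moving on
```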
We have also included a Jupyter notebook, so you can run part of the backend dynamically. Simply run `jupyter notebook` in the main directory (make sure you have Jupyter installed; it is not one of the Python packages in `requirements.txt`) and select `classify_img.ipynb` in your browser. Click Cell in the menu bar and choose Run All. Note that the notebook successfully recognizes a goldfish with 99% confidence. Feel free to replace the URL with one of your choice to test object detection.
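If you prefer a quick script over the notebook, the snippet below performs the same kind of check with a pretrained ImageNet classifier. This is a sketch only: the image URL is a placeholder you must replace, and the notebook may well use a different model than the `resnet18` assumed here.

```python
from io import BytesIO

import requests
import torch
from PIL import Image
from torchvision import models, transforms

IMG_URL = "https://example.com/goldfish.jpg"  # placeholder: use your own image URL

model = models.resnet18(pretrained=True)  # 1000-class ImageNet model
model.eval()

preprocess = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])

img = Image.open(BytesIO(requests.get(IMG_URL).content)).convert("RGB")
with torch.no_grad():
    logits = model(preprocess(img).unsqueeze(0))
probs = torch.nn.functional.softmax(logits[0], dim=0)
score, idx = probs.max(0)
# ImageNet class index 1 corresponds to "goldfish"
print(f"predicted class index {idx.item()} with confidence {score.item():.0%}")
```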
- `docs/` - markdown files for software dev
- `pics/` - any screenshots used in markdown files
- `src/` - source code for the project
  - `model.py` - main driver
  - `view.py` - GUI
  - `controller.py` - workhorse
- `utils/` - various scripts to help in dev work
  - `update.py` - updates all packages that `pip` considers "outdated"
  - `dev_dep.py` - lists dev dependencies (packages that aren't explicitly used in the source code but that the project needs); this allows `requirements.txt` to be complete when we remake it with `pipreqs`
- `.travis.yml` - build pipeline
This project is licensed under the MIT License - see the `LICENSE.md` file for details.