An annotated dataset of YouTube videos designed as a benchmark for Fine-grained Incident Video Retrieval. The dataset comprises 225,960 videos associated with 4,687 Wikipedia events and 100 selected video queries.
Project Website: [link]
Paper: [publisher] [arXiv] [pdf]
- Clone this repo:
```bash
git clone https://github.com/MKLab-ITI/FIVR-200K
cd FIVR-200K
```
- You can install all the dependencies by
```bash
pip install -r requirements.txt
```
or
```bash
conda install --file requirements.txt
```
- Install yt-dlp (make sure it is up-to-date)
The files that contain the dataset can be found in the `dataset` folder.
- The video annotations are in the file `annotation.json`, which has the following format:
```json
{
  "5MBA_7vDhII": {
    "ND": [
      "_0uCw0B2AgM",
      ...],
    "DS": [
      "hc0XIE1aY0U",
      ...],
    "CS": [
      "ydEqiuDiuyc",
      ...],
    "IS": [
      "d_ZNjE7B4Wo",
      ...],
    "DA": [
      "rLvVYdtc73Q",
      ...],
  },
  ....
}
```
- The events crawled from Wikipedia's Current Events page are in the file `events.json`, which has the following format:
```json
[
  {
    "headline": "iraqi insurgency",
    "topic": "armed conflict and attack",
    "date": "2013-01-22",
    "text": [
      "car bombings in baghdad kill at least 17 people and injure dozens of others."
    ],
    "href": [
      "http://www.bbc.co.uk/news/world-middle-east-21141242",
      "https://www.reuters.com/article/2013/01/22/us-iraq-violence-idUSBRE90L0BQ20130122"
    ],
    "youtube": [
      "ZpjqUq-EnbQ",
      ...
    ]
  },
  ...
]
```
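For example, a small sketch (file path and topic value only illustrative) that loads the events and gathers the YouTube IDs associated with a topic:
```python
import json
from collections import Counter

# Load the events crawled from Wikipedia's Current Events page
with open("dataset/events.json") as f:
    events = json.load(f)

# Number of events per topic
print(Counter(event["topic"] for event in events).most_common(5))

# All YouTube IDs associated with a single topic
ids_for_topic = {
    yt_id
    for event in events
    if event["topic"] == "armed conflict and attack"
    for yt_id in event["youtube"]
}
print(len(ids_for_topic))
```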
- The YouTube IDs of the videos in the dataset are in the file `youtube_ids.txt`
- The global features of the benchmarked approaches in the paper can be found here. Their ordering follows the order of `youtube_ids.txt`
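The exact storage format of the released features is not described here, so the following is only a sketch under the assumption that they come as a single NumPy array (the name `features.npy` is a placeholder) with one row per video, in the order of `youtube_ids.txt`:
```python
import numpy as np

# Assumption: features are a single NumPy array with one row per video,
# ordered exactly as the lines of youtube_ids.txt ("features.npy" is a placeholder name).
features = np.load("features.npy")

with open("dataset/youtube_ids.txt") as f:
    youtube_ids = [line.strip() for line in f if line.strip()]

assert features.shape[0] == len(youtube_ids)

# Map each YouTube ID to its feature vector
id_to_feature = dict(zip(youtube_ids, features))
```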
- Run the following command to download videos:
```bash
python download_dataset.py --video_dir VIDEO_DIR [--dataset_ids DATASET_FILE] [--cores NUMBER_OF_CORES] [--resolution RESOLUTION]
```
- An example to run the download script:
```bash
python download_dataset.py --video_dir ./videos --dataset_ids dataset/youtube_ids.txt --cores 4 --resolution 360
```
- Videos will be saved with the following directory structure:
```
VIDEO_DIR/YT_ID.mp4
```
- Videos that are no longer available are recorded in the text file `missing_videos.txt`
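A small sketch to check which dataset videos have been downloaded and which remain unaccounted for; the paths match the example above and the assumption that `missing_videos.txt` lists one YouTube ID per line is ours, so adjust as needed.
```python
from pathlib import Path

VIDEO_DIR = Path("./videos")  # same directory passed to --video_dir

with open("dataset/youtube_ids.txt") as f:
    all_ids = {line.strip() for line in f if line.strip()}

# Assumption: missing_videos.txt lists one YouTube ID per line;
# adjust the path to wherever the download script writes it.
missing_ids = set()
missing_file = Path("missing_videos.txt")
if missing_file.exists():
    missing_ids = {line.strip() for line in missing_file.read_text().splitlines() if line.strip()}

downloaded = {p.stem for p in VIDEO_DIR.glob("*.mp4")}
unaccounted = all_ids - downloaded - missing_ids
print(f"{len(downloaded)} downloaded, {len(missing_ids)} missing, {len(unaccounted)} unaccounted for")
```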
Generation of the result file
- The result file contains a dictionary whose keys are the YouTube IDs of the query videos; each value is another dictionary that maps the YouTube IDs of the dataset videos to their similarity to the query.
- Results can be stored in a JSON file with the following format:
```json
{
  "wrC_Uqk3juY": {"KQh6RCW_nAo": 0.716, "0q82oQa3upE": 0.300, ...},
  "k_NT43aJ_Jw": {"-KuR8y1gjJQ": 1.0, "Xb19O5Iur44": 0.417, ...},
  ....
}
```
- An implementation for generating the JSON file can be found here
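For reference, here is a minimal sketch of how such a result file could be produced; `similarity` is a placeholder for your own retrieval approach and the output path is arbitrary.
```python
import json

def similarity(query_id, video_id):
    """Placeholder for your own retrieval approach; must return a similarity score."""
    raise NotImplementedError

with open("dataset/annotation.json") as f:
    query_ids = list(json.load(f).keys())

with open("dataset/youtube_ids.txt") as f:
    dataset_ids = [line.strip() for line in f if line.strip()]

# Query YT id -> {dataset YT id -> similarity score}
results = {q: {v: similarity(q, v) for v in dataset_ids if v != q} for q in query_ids}

with open("results/my_approach.json", "w") as f:
    json.dump(results, f)
```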
Evaluation of the results
- Run the following command to perform the evaluation:
```bash
python evaluation.py --result_file RESULT_FILE --relevant_labels RELEVANT_LABELS
```
- An example to run the evaluation script:
```bash
python evaluation.py --result_file ./results/lbow_vgg.json --relevant_labels ND,DS
```
- Add the `--help` flag to display a detailed description of the arguments of the evaluation script
Evaluation on the three retrieval tasks
- Provide different values to the `relevant_labels` argument to evaluate your results for the three visual-based retrieval tasks (see the sketch below):
  - DSVR: `ND,DS`
  - CSVR: `ND,DS,CS`
  - ISVR: `ND,DS,CS,IS`
- For the Duplicate Audio Video Retrieval (DAVR) task, provide `DA` to the `relevant_labels` argument
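A convenience sketch that runs the evaluation script for all four label sets in one go; the result file path is a placeholder.
```python
import subprocess

RESULT_FILE = "./results/my_approach.json"  # placeholder path to your result file

# Label sets for the three visual-based tasks and the audio-based DAVR task
TASKS = {
    "DSVR": "ND,DS",
    "CSVR": "ND,DS,CS",
    "ISVR": "ND,DS,CS,IS",
    "DAVR": "DA",
}

for task, labels in TASKS.items():
    print(f"=== {task} ===")
    subprocess.run(
        ["python", "evaluation.py", "--result_file", RESULT_FILE, "--relevant_labels", labels],
        check=True,
    )
```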
Reported results
- To reproduce the results of the paper, run the following command:
```bash
bash evaluate_run.sh APPROACH_NAME FEATURES_NAME
```
- An example to run the evaluation script:
```bash
bash evaluate_run.sh BOW VGG
```
The results will probably not be identical to the ones reported in the paper, because we are constantly fixing mislabeled videos that were missed during the annotation process.
See the Updates section for the date of the last update of the dataset annotations.
If you find a mislabeled video, please submit it to the following form here.
- Update 21/1/21: add `DA` label for audio-based annotations of duplicate audio videos
- Update 29/5/19: fix labels for 373 videos
If you use the FIVR-200K dataset for your research, please consider citing our paper:
```bibtex
@article{kordopatis2019fivr,
  title={{FIVR}: Fine-grained Incident Video Retrieval},
  author={Kordopatis-Zilos, Giorgos and Papadopoulos, Symeon and Patras, Ioannis and Kompatsiaris, Ioannis},
  journal={IEEE Transactions on Multimedia},
  year={2019}
}
```
If you use the audio-based annotations, please also consider citing our paper:
```bibtex
@inproceedings{avgoustinakis2020ausil,
  title={Audio-based Near-Duplicate Video Retrieval with Audio Similarity Learning},
  author={Avgoustinakis, Pavlos and Kordopatis-Zilos, Giorgos and Papadopoulos, Symeon and Symeonidis, Andreas L and Kompatsiaris, Ioannis},
  booktitle={Proceedings of the IEEE International Conference on Pattern Recognition},
  year={2020}
}
```
- Intermediate-CNN-Features - this repo was used to extract our CNN features
- NDVR-DML - one of the methods benchmarked on the FIVR-200K dataset
- ViSiL - video similarity learning for fine-grained similarity calculation
- AuSiL - audio similarity learning for audio-based similarity calculation
This project is licensed under the Apache License 2.0 - see the LICENSE file for details
Giorgos Kordopatis-Zilos (georgekordopatis@iti.gr)
Symeon Papadopoulos (papadop@iti.gr)