Video Dataset Maker

A pipeline for downloading videos from YouTube with yt-dlp and extracting frames with FFmpeg.

Our goal is to build a fast pipeline for producing new datasets for deep learning practitioners in the field of video understanding. We have already used this tool to build several action recognition and video object segmentation datasets.

DEPENDENCY

Python Environment

We suggest Conda. Please follow its installation guide.

yt-dlp

yt-dlp is a youtube-dl fork based on the now inactive youtube-dlc. The main focus of this project is adding new features and patches while also keeping up to date with the original project.

FFmpeg

FFmpeg is a complete, cross-platform solution to record, convert and stream audio and video.

INSTALLATION

We suggest creating a virtual environment. The exact Python 3 version is not important.

conda create -n name python=3.x

Activate the virtual environment before installing the packages below and before running the script.

Install yt-dlp with pip. If you run into problems, follow the yt-dlp installation guide.

python3 -m pip install -U yt-dlp

Install other packages with pip by using the command:

pip install -r requirements.txt

Download the FFmpeg build for your platform. You can also retrieve the source code through Git by using the command:

git clone https://git.ffmpeg.org/ffmpeg.git ffmpeg
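Once the installation steps are done, a quick check can confirm that both tools are reachable from the active environment. The sketch below is not part of this repo; the function name `check_dependencies` is made up for illustration, and it only tests that `ffmpeg` is on `PATH` and that the `yt_dlp` package can be imported.

```python
import importlib.util
import shutil


def check_dependencies():
    """Report whether the tools this README depends on can be found."""
    return {
        # FFmpeg is a command-line binary, so look for it on PATH.
        "ffmpeg": shutil.which("ffmpeg") is not None,
        # yt-dlp installs an importable Python package named yt_dlp.
        "yt_dlp": importlib.util.find_spec("yt_dlp") is not None,
    }


if __name__ == "__main__":
    for name, found in check_dependencies().items():
        print(f"{name}: {'found' if found else 'MISSING'}")
```

If either entry is reported as missing, revisit the corresponding installation step above before running the script.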

GET STARTED

Store the list of target URLs in a .txt file, as in example.txt.

Run the script main.py with the following arguments:
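As an illustration of how such a list might be read, here is a minimal sketch. It assumes one URL per line and additionally skips blank lines and lines starting with `#`; both are assumptions, since the exact format of example.txt is not described here. The helper name `read_url_list` is hypothetical.

```python
from pathlib import Path


def read_url_list(path):
    """Return the non-empty, non-comment lines of a URL list file."""
    urls = []
    for line in Path(path).read_text(encoding="utf-8").splitlines():
        line = line.strip()
        # Assumption: blank lines and lines starting with '#' are ignored.
        if line and not line.startswith("#"):
            urls.append(line)
    return urls
```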

python main.py --url ${URL_FILE} \
               --output ${OUTPUT_PATH} \
               --suffix ${SUFFIX} \
               --h ${H} \
               --w ${W}

python main.py --url ./example.txt --output ./output --suffix .png --h 360 --w 640
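To make the two stages concrete, the following is a rough sketch of what a download-then-extract loop could look like. It is not the repo's actual implementation: the function names are hypothetical, the `"format": "mp4"` selector and the `scale` filter are our assumptions about how the `--suffix`, `--h` and `--w` arguments might be applied, and the `%04d` pattern is chosen only to match the four-digit frame names shown in the layout below.

```python
import subprocess
from pathlib import Path


def build_ffmpeg_command(video_path, frame_dir, suffix=".png", h=360, w=640):
    """Build an ffmpeg call that writes resized frames as 0001<suffix>, 0002<suffix>, ..."""
    pattern = Path(frame_dir) / f"%04d{suffix}"
    return [
        "ffmpeg",
        "-i", str(video_path),
        "-vf", f"scale={w}:{h}",  # ffmpeg's scale filter takes width:height
        str(pattern),
    ]


def download_video(url, video_path):
    """Download one URL to video_path with yt-dlp's Python API."""
    import yt_dlp  # imported lazily so the command builder works without it

    # Assumption: request an mp4 file and save it under the given name.
    options = {"outtmpl": str(video_path), "format": "mp4"}
    with yt_dlp.YoutubeDL(options) as ydl:
        ydl.download([url])


def process(urls, output_dir, suffix=".png", h=360, w=640):
    """Download each URL to <index>.mp4 and extract its frames into <index>/."""
    output_dir = Path(output_dir)
    for index, url in enumerate(urls):
        video_path = output_dir / f"{index}.mp4"
        frame_dir = output_dir / str(index)
        frame_dir.mkdir(parents=True, exist_ok=True)
        download_video(url, video_path)
        command = build_ffmpeg_command(video_path, frame_dir, suffix, h, w)
        subprocess.run(command, check=True)
```

Keeping the command construction in its own function makes it possible to inspect the exact ffmpeg call without downloading anything.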

The dataset will be organized as:

├── output
   ├── 0
   │   ├── 0001.png
   │   ├── 0002.png
   │   ├── 0003.png
   │   ├── 0004.png
   │   ├── ...
   │   └── 0116.png
   ├── 1
   │   ├── 0001.png
   │   ├── 0002.png
   │   ├── 0003.png
   │   ├── 0004.png
   │   ├── ...
   │   └── 0116.png
   ├── 0.mp4
   └── 1.mp4
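A layout like this is easy to inspect programmatically. The sketch below, with the hypothetical helper `summarize_dataset`, walks the output folder and lists the frame files found in each numbered video folder; the .mp4 files next to the folders are ignored.

```python
from pathlib import Path


def summarize_dataset(output_dir, suffix=".png"):
    """Map each video folder name to the sorted list of its frame file names."""
    summary = {}
    for folder in sorted(Path(output_dir).iterdir()):
        # Only folders hold frames; the downloaded videos are plain files.
        if folder.is_dir():
            summary[folder.name] = sorted(p.name for p in folder.glob(f"*{suffix}"))
    return summary
```

For example, `len(summarize_dataset("./output")["0"])` gives the number of frames extracted from the first video.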

YouTube-VOS Annotation Format

For the YouTube-VOS dataset, this repo provides a tool to convert the frames and masks to the required format, which is

train.zip
    |- JPEGImages
        |- <video_id>
            |- <frame_id>.jpg
            |- <frame_id>.jpg
        |- <video_id>
            |- <frame_id>.jpg
            |- <frame_id>.jpg
    |- Annotations
        |- <video_id>
            |- <frame_id>.png
            |- <frame_id>.png
        |- <video_id>
            |- <frame_id>.png
            |- <frame_id>.png
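The folder structure above pairs every frame with a mask of the same `<video_id>` and `<frame_id>`. As a small illustration (the helper name `vos_paths` is hypothetical and not taken from this repo), the target paths for one frame can be derived like this:

```python
from pathlib import Path


def vos_paths(root, video_id, frame_id):
    """Return the (image, mask) paths for one frame in the layout shown above."""
    root = Path(root)
    image_path = root / "JPEGImages" / video_id / f"{frame_id}.jpg"
    mask_path = root / "Annotations" / video_id / f"{frame_id}.png"
    return image_path, mask_path
```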

meta.json
    {
        "videos": {
            "<video_id>": {
                "objects": {
                    "<object_id>": {
                        "category": "<category>", 
                        "frames": [
                            "<frame_id>", 
                            "<frame_id>", 
                        ]
                    }
                }
            }
        }
    }
# <object_id> matches the pixel value of the object in the annotated segmentation PNG files.
# <frame_id> does not need to start from 0.
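The meta.json structure can be assembled from a plain Python mapping. The sketch below is only an illustration of the schema above, not the repo's conversion tool: `build_meta` and its input shape (video id to object id to a `(category, frames)` pair) are assumptions made for this example, and the categories and frame ids must come from your own annotations.

```python
import json


def build_meta(video_objects):
    """Build the meta.json structure.

    video_objects maps video_id -> object_id -> (category, list of frame_ids).
    """
    videos = {}
    for video_id, objects in video_objects.items():
        videos[video_id] = {
            "objects": {
                # JSON object keys are strings, so object ids are converted.
                str(object_id): {"category": category, "frames": list(frames)}
                for object_id, (category, frames) in objects.items()
            }
        }
    return {"videos": videos}


def write_meta(video_objects, path):
    """Write the structure to disk as indented JSON."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump(build_meta(video_objects), f, indent=4)
```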
