
Training time with Raccoon dataset and training on Google Colab #442

Open
phatsp opened this issue Dec 26, 2019 · 1 comment

Comments


phatsp commented Dec 26, 2019

Hi, I am a newbie and I am trying to practice YOLOv2 from your repo. As I am not familiar with the command line, I did everything on Google Colab. I have some problems and want to ask you guys for help.

  1. In the outline of steps there is a part:
    "Run this script to convert annotations in COCO format to VOC format: https://gist.github.com/chicham/6ed3842d0d2014987186#file-coco2pascal-py"
    I really don't know how to do this properly with so many .py files and configurations, so I tried the Raccoon dataset (which is already in VOC format) instead. However, that GitHub repo contains image and annotation folders along with many .py files, and again I don't know how to use them. So I downloaded the Raccoon dataset, split it manually, and then uploaded it back to my Colab directory. It took a lot of time; where can I learn about this?

  2. About the Raccoon dataset: there are only 200 images in total, but when I copy YOLO-step-by-step.ipynb, fix the classes and the labels, and train for 20 epochs, it takes about 7m30s per epoch. I would like to ask: is this normal?

  3. Some of the .py files seem to have a main(args) function. I don't really know how to use them or their arguments. If possible, can someone explain how to run this code in Colab?

If anyone has a Colab version of the implementation, I would be very grateful for a copy.
These may seem like basic questions, but I have spent almost a week on them and couldn't find the answers.

@ngrayluna

  1. I trained on my own image dataset, and as such I manually labeled all my images using labelImg. After I labeled them, I wrote a script to split the images and their associated annotations into a training set and a validation set. It creates and stores the images and their associated annotations in separate folders (four in total). You can find my script here:

To run it, you'll need to call it from a terminal like this:

python3 02_split_dataset.py ./path_to_boundary_box_annotations

e.g.

python3 02_split_dataset.py ./images/annotations/
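
In case that link breaks, here is a rough sketch of what such a split script can look like. The 80/20 ratio, the folder names, and the .jpg extension are illustrative assumptions on my part, not necessarily what the actual 02_split_dataset.py does:

# sketch of a train/validation split for a VOC-style dataset;
# assumes each image (.jpg) has a same-named annotation file (.xml)
import argparse
import random
import shutil
from pathlib import Path

def split_dataset(ann_dir, train_frac=0.8, seed=42):
    ann_dir = Path(ann_dir)
    root = ann_dir.parent  # assumes images live next to the annotation folder
    xmls = sorted(ann_dir.glob("*.xml"))
    random.Random(seed).shuffle(xmls)
    n_train = int(len(xmls) * train_frac)
    for split, files in (("train", xmls[:n_train]), ("valid", xmls[n_train:])):
        img_out = root / (split + "_image")
        ann_out = root / (split + "_annotation")
        img_out.mkdir(exist_ok=True)
        ann_out.mkdir(exist_ok=True)
        for xml in files:
            # copy the annotation, then its matching image if one exists
            shutil.copy(xml, ann_out / xml.name)
            img = root / (xml.stem + ".jpg")
            if img.exists():
                shutil.copy(img, img_out / img.name)

if __name__ == "__main__":
    parser = argparse.ArgumentParser(description="Split a VOC dataset into train/valid folders.")
    parser.add_argument("annotation_dir", help="path to the bounding-box annotation folder")
    split_dataset(parser.parse_args().annotation_dir)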

  2. Were you using a GPU on Google Colab? If it was running on CPUs alone, then yes, this could be normal. Deep learning networks can be slow and/or nearly impossible to train if your dataset is large and/or you have a lot of weights to train. You can check what hardware you were given with the snippet below.
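In Colab, the GPU is enabled under Runtime -> Change runtime type -> GPU. A quick way to confirm from inside the notebook is to run this in a cell:

# run in a Colab cell; errors out if no GPU is attached to the runtime
!nvidia-smi

If it prints a GPU table, you're set; if the command is not found, you're on a CPU-only runtime.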

  3. There are two things going on with 'main(args)': the execution of the main function, and the parameter 'args'. There is a solid article about the main function you can find here:. The 'args' parameter comes from the argparse module, which is Python's go-to module for making a Python script command-line friendly. An example of using argparse can be seen above, where my script asks the user to specify the path to the annotations folder.

Long story short: if you are using Jupyter Notebooks or Google Colab, you do not need the argparse module in your notebook. You'll just need to make sure that the original script you take the code from does not require external input from a config file and/or the command line.
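
To make the main(args) pattern concrete, here is a minimal, made-up example (the script name and the -c flag are illustrative, not necessarily what this repo's scripts use):

# train_example.py -- minimal illustration of the main(args) pattern
import argparse

def main(args):
    # the real work of the script would happen here
    print("using config file:", args.conf)

if __name__ == "__main__":
    parser = argparse.ArgumentParser(description="example of the main(args) pattern")
    parser.add_argument("-c", "--conf", default="config.json",
                        help="path to the configuration file")
    main(parser.parse_args())

From a terminal you would run 'python3 train_example.py -c config.json'. In a notebook you can bypass argparse and call the function directly, e.g. main(argparse.Namespace(conf="config.json")), or run the whole script in a cell with a shell escape: !python3 train_example.py -c config.json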

Hope this helps! I'll follow up with a Google Colab notebook if I write one.
