
What is the meaning of split #32

Open
Felix1014 opened this issue Feb 21, 2021 · 4 comments

Comments

@Felix1014

Dear authors,

What is the meaning of split? There are many .bundle files in /data/split; could you explain what they are for?

Thank you!

@yabufarha
Owner

Hi @Felix1014 ,

Split refers to the way the videos from the datasets are divided into training and testing sets. For a relatively small dataset, more than one split is usually used to evaluate the model, and the final result is the average of the performance over all splits.

I hope this helps.
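
The averaging described above can be sketched as follows. This is only an illustration: the function name `average_over_splits`, the metric names, and the numbers are placeholders assumed for the example, not values or code from this repository.

```python
from statistics import mean


def average_over_splits(per_split_scores):
    """Average each metric over all splits.

    per_split_scores maps a split number to a dict of metric name -> score.
    """
    # Take the metric names from the first split; all splits are assumed
    # to report the same metrics.
    metrics = next(iter(per_split_scores.values())).keys()
    return {m: mean(scores[m] for scores in per_split_scores.values())
            for m in metrics}


# Placeholder numbers purely for illustration, not real results.
scores = {
    1: {"acc": 80.0, "edit": 70.0},
    2: {"acc": 82.0, "edit": 66.0},
}
averaged = average_over_splits(scores)
print(averaged)
```

Each split is trained and evaluated separately; only the final scores are combined.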

@Felix1014
Author

Thank you for your reply, but I am still confused. Specifically, when I run main.py --action=train --dataset=DS --split=SP with SP=1 or 2, how is the dataset divided into the training set and the test set, and what do these .bundle files mean? What is the relationship between SP and the .bundle files in data/50salads/splits? I really do not understand this.

Besides, how can I generate these .bundle files if I want to use another dataset?

Finally, if a dataset has standard training and test sets and we do not need cross-validation, how should I revise the code?


@yabufarha
Owner

The .bundle files contain the list of examples for each split. We do not generate those files; they are the standard training and testing sets for the datasets used.
If you want to test the code on split 1 of 50salads, for example, then you need to run
python main.py --action=train --dataset=50salads --split=1
The code uses these parameters to look up the corresponding training and testing examples as listed in the .bundle files.
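
A minimal sketch of that lookup, under stated assumptions: the file names `train.split<SP>.bundle` and `test.split<SP>.bundle`, the one-name-per-line layout, and the helpers `split_files` and `read_bundle` are hypothetical here, so check them against the files in data/50salads/splits and against main.py before relying on them. The video names are made up for the demo.

```python
import tempfile
from pathlib import Path


def split_files(data_root, dataset, split):
    """Build the paths of the train/test lists for one split.

    The naming pattern is an assumption for this sketch.
    """
    splits_dir = Path(data_root) / dataset / "splits"
    return (splits_dir / f"train.split{split}.bundle",
            splits_dir / f"test.split{split}.bundle")


def read_bundle(path):
    """Return the non-empty lines of a .bundle file, one example per line."""
    lines = Path(path).read_text().splitlines()
    return [line.strip() for line in lines if line.strip()]


# Demo on a temporary directory with made-up video names.
with tempfile.TemporaryDirectory() as root:
    train_file, test_file = split_files(root, "50salads", 1)
    train_file.parent.mkdir(parents=True)
    train_file.write_text("video_a.txt\nvideo_b.txt\n")
    test_file.write_text("video_c.txt\n")
    train_videos = read_bundle(train_file)
    test_videos = read_bundle(test_file)

print(train_videos, test_videos)
```

Under these assumptions, --split=SP only selects which pair of list files is read; the division into training and testing videos is fixed by the contents of those files.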

If you want to use another dataset, you only need to pass the corresponding lists of training and testing examples (and maybe update the feature dimension if you are using different features).
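
For a dataset that already has one standard training/testing division, one possible approach is to write a single pair of list files and treat it as split 1, which would avoid changing the code. The sketch below assumes the same hypothetical naming pattern (`train.split1.bundle`, `test.split1.bundle`) and one example name per line; the dataset name `mydataset`, the video names, and the helper functions are made up for illustration, so mirror the format of the existing .bundle files in practice.

```python
import tempfile
from pathlib import Path


def write_bundle(path, video_names):
    """Write one example name per line to a .bundle file."""
    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text("".join(f"{name}\n" for name in video_names))


def write_single_split(data_root, dataset, train_videos, test_videos, split=1):
    """Store a standard train/test division as one numbered split."""
    overlap = set(train_videos) & set(test_videos)
    if overlap:
        # A video listed in both sets would leak test data into training.
        raise ValueError(f"videos in both sets: {sorted(overlap)}")
    splits_dir = Path(data_root) / dataset / "splits"
    train_file = splits_dir / f"train.split{split}.bundle"
    test_file = splits_dir / f"test.split{split}.bundle"
    write_bundle(train_file, train_videos)
    write_bundle(test_file, test_videos)
    return train_file, test_file


# Demo on a temporary directory with made-up names.
with tempfile.TemporaryDirectory() as root:
    train_file, test_file = write_single_split(
        root, "mydataset", ["video_a.txt", "video_b.txt"], ["video_c.txt"])
    train_text = train_file.read_text()
    test_text = test_file.read_text()

print(train_text, test_text)
```

With files like these in place, training would be run with --dataset=mydataset --split=1, provided the rest of the code accepts the new dataset name and feature dimension.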

@Felix1014
Author

Got it. Thank you so much, @yabufarha.
