number of train caption is < 10000 #37

Open
piperino11 opened this issue Aug 31, 2019 · 7 comments

Comments

@piperino11

The MSR-VTT dataset has 10,000 videos with 20 captions per video, but this implementation appears to use only one video-caption pair per video in the training phase. That gives at most 10,000 training examples in total.
Has anyone else noticed this?
Has anyone changed the code?

@chongkewu

The training caption changes from epoch to epoch. Every time an item is fetched from the video dataset, 1 of the 20 captions is sampled. You can check the dataloader.py file.
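To illustrate the idea of sampling one caption per item access, here is a minimal sketch in plain Python. It is not the code from dataloader.py: the class name `CaptionSamplingDataset`, the toy captions, and the `seed` parameter are all assumptions made up for this example. It only shows why a dataset whose length equals the number of videos can still expose every caption over many epochs.

```python
import random


class CaptionSamplingDataset:
    """Hypothetical stand-in for a video dataset, not the repository's class."""

    def __init__(self, captions_per_video, seed=None):
        # captions_per_video: dict mapping a video id to its reference captions
        self.video_ids = sorted(captions_per_video)
        self.captions = captions_per_video
        self.rng = random.Random(seed)

    def __len__(self):
        # One item per video, so the length counts videos, not captions.
        return len(self.video_ids)

    def __getitem__(self, index):
        video_id = self.video_ids[index]
        # A caption is drawn again on every access, so the same video can be
        # paired with a different caption in a later epoch.
        caption = self.rng.choice(self.captions[video_id])
        return video_id, caption


# Toy data (made up): 2 videos with 3 captions each.
toy = {
    "video0": ["a man is talking", "a person speaks", "someone gives a talk"],
    "video1": ["a dog runs", "a puppy is running", "a dog plays outside"],
}
dataset = CaptionSamplingDataset(toy, seed=0)

# Track which captions each video was paired with across 50 epochs.
seen = {video_id: set() for video_id in toy}
for epoch in range(50):
    for i in range(len(dataset)):
        video_id, caption = dataset[i]
        seen[video_id].add(caption)
```

With this pattern, one epoch yields only as many examples as there are videos, but over many epochs the sampled captions spread across the available references.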

@alokssingh

alokssingh commented Mar 24, 2020

Hey @chongkewu, hope you are doing well.
I have a question that I hope you can answer.
Each video has 20 reference captions. From your answer above, what I understand is that in every epoch one caption is selected at random from the 20 available captions. Is that right?

@chongkewu

chongkewu commented Mar 24, 2020 via email

@alokssingh

Thank you, @chongkewu.
Do you think the model will be trained sufficiently this way?

@chongkewu

chongkewu commented Mar 24, 2020 via email

@alokssingh

alokssingh commented Mar 24, 2020

@chongkewu Thank you so much for your quick replies.
I will try some new approaches and let you know about the performance.

@alokssingh

@chongkewu After selecting the caption randomly, do we train the model in the following way,

    X1      X2 (text sequence)                             y (word)
    -----------------------------------------------------------------
    image   startseq                                       little
    image   startseq, little                               girl
    image   startseq, little, girl                         running
    image   startseq, little, girl, running                in
    image   startseq, little, girl, running, in            field
    image   startseq, little, girl, running, in, field     endseq

or do we just pass the image and the whole caption to the model directly?
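For anyone comparing the two options in this question, here is a small sketch in plain Python. Which one this repository uses is not answered in this thread, so the sketch takes no position on that. The function names `prefix_pairs` and `shifted_pair` are made up for illustration. The first expands a caption into one (prefix, next word) example per step, as in the table above. The second keeps the caption whole and shifts it by one token, which is the usual way to pass a full sequence with teacher forcing.

```python
def prefix_pairs(tokens):
    """Expand one caption into (prefix, next_word) pairs, one per step."""
    return [(tokens[:i], tokens[i]) for i in range(1, len(tokens))]


def shifted_pair(tokens):
    """Keep the caption whole: inputs drop the last token, targets drop the first."""
    return tokens[:-1], tokens[1:]


caption = ["startseq", "little", "girl", "running", "in", "field", "endseq"]

# Option 1: six separate training examples, matching the six table rows.
pairs = prefix_pairs(caption)

# Option 2: a single training example covering the whole caption.
inputs, targets = shifted_pair(caption)
```

Both forms carry the same supervision: the target words of the prefix pairs, read in order, equal `targets`, and the longest prefix equals `inputs`. The difference is that the shifted form needs one forward pass per caption, while the prefix form needs one per word.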


3 participants