
why the batch_size is a parameter of the layer #13

Open
menglin0320 opened this issue Jul 11, 2016 · 3 comments

Comments


menglin0320 commented Jul 11, 2016

I was trying to use your data layer to write an automatic text generator, but then I realized it isn't possible: the input has to be divisible by batch_size, while at test time I can only feed inputs one at a time.
At training time I know the whole text, so I can use a batch size larger than one. For example, for the word "hello", my data are [none h e l l] and my labels are [h e l l o].
But at test time, I give the net 0 (for "none") at the beginning; ideally the net predicts "h", and then I use "h" as the next input. Since each input is produced by the net itself, I can't use a batch size larger than one at test time, and your implementation doesn't allow changing the batch size.
Am I right?
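The test-time loop described above can be sketched as follows; `predict_next` is a toy stand-in for a forward pass through the trained net (here it simply completes "hello"), not code from this repo:

```python
# Hedged sketch of autoregressive test-time generation.
# `predict_next` is a placeholder for the net's forward pass;
# this toy stub always completes the word "hello".
completions = {"": "h", "h": "e", "he": "l", "hel": "l", "hell": "o"}

def predict_next(prefix):
    return completions[prefix]

generated = ""
for _ in range(5):
    # Each step feeds only ONE symbol produced by the net itself,
    # so the effective batch size at test time is 1.
    generated += predict_next(generated)

print(generated)  # -> hello
```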

@junhyukoh (Owner)

The LSTM layer assumes that the input (the 0-th bottom) has shape [N x T][...], where N is batch_size, T is the sequence length, and [...] is the feature dimension.
By default batch_size = 1, so the layer assumes the bottom shape is [T][...].
If you want a batch size larger than 1, you should 1) provide input in [N x T][...] shape and 2) specify "batch_size: N" in your prototxt. Otherwise, the LSTM layer will assume your input is a single sequence of length N x T with batch size 1.
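As a standalone illustration (a plain-Python sketch, not the layer's actual code): flattening an assumed [N][T][...] array into the [N x T][...] layout places time step t of sequence n at row n*T + t:

```python
# Assumed toy dimensions: batch_size N, sequence length T.
N, T = 2, 3
# data[n][t] = feature vector for time step t of sequence n
# (toy (n, t) tuples stand in for real feature vectors).
data = [[(n, t) for t in range(T)] for n in range(N)]

# The [N x T][...] layout the layer expects: sequences concatenated,
# so time step t of sequence n lands at row n*T + t.
flat = [data[n][t] for n in range(N) for t in range(T)]

assert flat[1 * T + 2] == data[1][2]
print(len(flat))  # N*T rows
```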

Providing input in [NxT][...] shape is a little confusing.
My suggestion is to fill your input data as [N][T][...] and use a "Reshape" layer to flatten it to [NxT][...] before the LSTM layer.
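In prototxt, that suggestion might look like the following sketch; the layer and blob names and the feature dimension (128) are illustrative assumptions, not taken from this repo:

```protobuf
# Illustrative Reshape placed before the LSTM layer; names and the
# feature dim (128) are assumptions, not part of this repo.
layer {
  name: "reshape_for_lstm"
  type: "Reshape"
  bottom: "data"       # assumed incoming shape: N x T x 128
  top: "data_flat"     # becomes (N*T) x 128, as the LSTM layer expects
  reshape_param {
    shape { dim: -1 dim: 128 }   # dim: -1 lets Caffe infer N*T
  }
}
```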

I hope this is clear to you.

@menglin0320 (Author)

I changed my question. Can you please help? @junhyukoh


menglin0320 commented Jul 11, 2016

Also, what is clip_threshold used for? And I'm confused about clip: isn't it similar to splitting into different batches? What is the difference?
