Use of default clip markers as [0, 1, 1, ..., 1] #14

Closed
aurotripathy opened this issue Jul 25, 2016 · 7 comments

aurotripathy commented Jul 25, 2016

This seems like the right place to get answers to Caffe LSTM questions :-), so I'm counting on an answer.

I'm comparing the LSTM layer implementation here with the officially merged one in Caffe. They are different.

Are they conceptually the same with respect to the clip_marker implementation?

My question is: if the sequence lengths in the input are all the same (i.e., they don't vary) and they match the number of time steps, do we still need to provide the clip_marker input (in the official Caffe version)?

Can the network assume it to be [0, 1, 1, ..., 1]?

I ask because I'm debugging the network, and my own markers may be in error and confusing it.

Thank you.

@junhyukoh (Owner)

Yes.
My code assumes that the input batch consists of complete sequences (from end to end).
If this is the case, you don't have to provide clip markers.
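
For concreteness, here is a minimal standalone C++ sketch of the pattern this implies (not Caffe code; make_default_clip_markers and the shapes are made up for illustration): one indicator per timestep per stream, 0 at the first timestep and 1 afterwards, flattened in the time-major T x N layout that Caffe's recurrent layers use.

#include <iostream>
#include <vector>

// Hypothetical helper: the default clip markers [0, 1, 1, ..., 1] for
// T timesteps and N parallel streams, flattened time-major (T x N).
std::vector<float> make_default_clip_markers(int T, int N) {
  std::vector<float> cont(T * N, 1.0f);
  for (int n = 0; n < N; ++n)
    cont[n] = 0.0f;  // t = 0 row: every sequence starts here
  return cont;
}

int main() {
  const int T = 4, N = 2;
  const std::vector<float> cont = make_default_clip_markers(T, N);
  for (int t = 0; t < T; ++t) {
    for (int n = 0; n < N; ++n)
      std::cout << cont[t * N + n] << ' ';
    std::cout << '\n';  // prints: 0 0 / 1 1 / 1 1 / 1 1
  }
  return 0;
}

In the degenerate case N = 1, this is exactly the [0, 1, 1, ..., 1] vector asked about above.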


aurotripathy commented Jul 27, 2016

Thank you. What about the official Caffe LSTM implementation (BVLC/caffe#2033)?
I'm asking here because it's unlikely I will get a response there.

@junhyukoh (Owner)

As far as I know, they have the same protocol.


aurotripathy commented Jul 27, 2016

Thank you. One last question.

The Caffe code below is the LSTM unit layer implementation. I'm unable to determine whether the cont variable has a default value of zero or whether it must always be supplied as a bottom. Can you please help?

// Forward pass for one LSTM timestep. Bottoms: previous cell state C_prev
// (bottom[0]), gate pre-activations X (bottom[1]), and the sequence
// continuation indicator cont (bottom[2]) -- one value per stream.
template <typename Dtype>
void LSTMUnitLayer<Dtype>::Forward_cpu(const vector<Blob<Dtype>*>& bottom,
    const vector<Blob<Dtype>*>& top) {
  const int num = bottom[0]->shape(1);
  const int x_dim = hidden_dim_ * 4;  // X packs four gates: i, f, o, g
  const Dtype* C_prev = bottom[0]->cpu_data();
  const Dtype* X = bottom[1]->cpu_data();
  const Dtype* cont = bottom[2]->cpu_data();
  Dtype* C = top[0]->mutable_cpu_data();
  Dtype* H = top[1]->mutable_cpu_data();
  for (int n = 0; n < num; ++n) {
    for (int d = 0; d < hidden_dim_; ++d) {
      const Dtype i = sigmoid(X[d]);  // input gate
      // Forget gate: forced to 0 when cont == 0 (a sequence boundary),
      // which discards the previous cell state.
      const Dtype f = (*cont == 0) ? 0 :
          (*cont * sigmoid(X[1 * hidden_dim_ + d]));
      const Dtype o = sigmoid(X[2 * hidden_dim_ + d]);  // output gate
      const Dtype g = tanh(X[3 * hidden_dim_ + d]);     // candidate values
      const Dtype c_prev = C_prev[d];
      const Dtype c = f * c_prev + i * g;  // new cell state
      C[d] = c;
      const Dtype tanh_c = tanh(c);
      H[d] = o * tanh_c;  // new hidden state
    }
    // Advance to the next stream in the batch.
    C_prev += hidden_dim_;
    X += x_dim;
    C += hidden_dim_;
    H += hidden_dim_;
    ++cont;
  }
}
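
To see what cont does in the update above, here is a small standalone sketch (not Caffe code; the input values are made up) of the per-element cell update, evaluated once with cont = 1 and once with cont = 0. With cont = 0 the forget gate f is forced to 0, so c_prev is discarded and the new cell state reduces to i * g.

#include <cmath>
#include <iostream>

// Standalone mirror of the per-element cell update from LSTMUnitLayer.
// x[0], x[1], x[3] are the pre-activations of the i, f, g gates
// (the output gate o only affects H, not C, so it is omitted here).
float sigmoid(float v) { return 1.0f / (1.0f + std::exp(-v)); }

float cell_update(float c_prev, const float x[4], float cont) {
  const float i = sigmoid(x[0]);
  const float f = (cont == 0.0f) ? 0.0f : cont * sigmoid(x[1]);
  const float g = std::tanh(x[3]);
  return f * c_prev + i * g;
}

int main() {
  const float x[4] = {0.5f, 0.5f, 0.5f, 0.5f};
  const float c_prev = 2.0f;
  std::cout << cell_update(c_prev, x, 1.0f) << '\n';  // ~1.533: carries c_prev
  std::cout << cell_update(c_prev, x, 0.0f) << '\n';  // ~0.288: i * g only
  return 0;
}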

@junhyukoh (Owner)

It seems like there is no default value in this code unless they provide a virtual bottom[2].

@aurotripathy (Author)

OK, thank you very much.

From BVLC/caffe#2033, it appears that providing the clip_markers is "required":

"RecurrentLayer requires 2 input (bottom) Blobs."
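
So the marker blob has to be wired in explicitly as the second bottom. A minimal hedged prototxt sketch of what that could look like (the blob names data and clip_markers, the shapes, and num_output are hypothetical, not taken from this thread):

layer {
  name: "lstm"
  type: "LSTM"
  bottom: "data"          # shape: T x N x input_dim
  bottom: "clip_markers"  # shape: T x N; 0 at sequence starts, 1 elsewhere
  top: "lstm_out"
  recurrent_param { num_output: 256 }
}

The layer feeding clip_markers would then emit 0 at the first timestep of each sequence and 1 elsewhere, i.e. the default pattern discussed above.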

@ayushchopra96

Hi @junhyukoh @aurotripathy. Is there support for accessing the hidden state at each timestep?
I need it to simulate an attention mechanism. If not, I would need to implement it myself.

Thanks
