
GRU fix #250

Open · wants to merge 1 commit into master
Conversation

Jerry-Master

Looking at the formulas in your article, I see that your GRU implementation does not match the code you provide. I don't intend for you to merge this fork, since it would break compatibility, but I'm leaving it here in case you want to discuss the performance of this fixed ConvGRU implementation. It seems you are reusing the hidden state as the forward activation. That is a valid approach, but it seems more reasonable to me to keep the hidden state and the forward activation separate.
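For concreteness, here is a minimal PyTorch sketch of the separation I mean. This is not the code in this PR; the extra `to_output` convolution and all names here are hypothetical:

```python
import torch
from torch import nn

class SplitConvGRUCell(nn.Module):
    """GRU-like cell that keeps the recurrent hidden state separate from the
    forward activation sent to the next layer (hypothetical variant)."""

    def __init__(self, channels: int, kernel_size: int = 3):
        super().__init__()
        padding = kernel_size // 2
        # Update (z) and reset (r) gates, computed jointly from [x, h].
        self.gates = nn.Conv2d(2 * channels, 2 * channels, kernel_size, padding=padding)
        # Candidate activation, computed from [x, r * h].
        self.candidate = nn.Conv2d(2 * channels, channels, kernel_size, padding=padding)
        # Extra projection producing the forward activation from the hidden state.
        self.to_output = nn.Conv2d(channels, channels, kernel_size, padding=padding)

    def forward(self, x: torch.Tensor, h: torch.Tensor):
        z, r = self.gates(torch.cat([x, h], dim=1)).sigmoid().chunk(2, dim=1)
        c = self.candidate(torch.cat([x, r * h], dim=1)).tanh()
        h_new = z * h + (1 - z) * c       # hidden state for the next time step
        o = self.to_output(h_new).tanh()  # separate activation for the next layer
        return o, h_new
```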

@PeterL1n
Owner

Unlike the LSTM, the GRU by design does not have a separate hidden state and forward output; they are one and the same. See this diagram.
[GRU cell diagram]

My use of (1 - z) is swapped relative to the paper's notation, but the two forms are equivalent.
So I believe my original implementation was correct.
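For reference, a minimal sketch of a textbook ConvGRU cell (not the ConvGRU in this repository, just the standard formulation), in which the new hidden state doubles as the layer output:

```python
import torch
from torch import nn

class ConvGRUCell(nn.Module):
    """Textbook ConvGRU cell: the new hidden state is also the layer output."""

    def __init__(self, channels: int, kernel_size: int = 3):
        super().__init__()
        padding = kernel_size // 2
        # Update (z) and reset (r) gates, computed jointly from [x, h].
        self.gates = nn.Conv2d(2 * channels, 2 * channels, kernel_size, padding=padding)
        # Candidate activation, computed from [x, r * h].
        self.candidate = nn.Conv2d(2 * channels, channels, kernel_size, padding=padding)

    def forward(self, x: torch.Tensor, h: torch.Tensor):
        z, r = self.gates(torch.cat([x, h], dim=1)).sigmoid().chunk(2, dim=1)
        c = self.candidate(torch.cat([x, r * h], dim=1)).tanh()
        # Convex combination of previous state and candidate. Writing
        # z * h + (1 - z) * c instead of (1 - z) * h + z * c only relabels
        # the gate, so the two conventions are equivalent.
        h_new = z * h + (1 - z) * c
        # The same tensor is both the output to the next layer and the
        # recurrent state for the next time step.
        return h_new, h_new
```

In this form, `output, state = cell(x, h_prev)` returns the same tensor twice, which is what "they share the same" means in practice.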

@Jerry-Master
Author

I mean, in the article you say that o_t is the output of the layer and h is the hidden state, so it would make sense to pass the output to the next layer and the hidden state to the next time step. I was wondering whether you have tried both, or have any intuition about which option performs better, since computationally they are very similar.
