Yes, I agree with the proposal. A consistent API would be great, and technically the RNN API is experimental, so we should be able to change it in v0.4 itself. Just one modification: the return would be |
Currently `LSTMCell` returns the new hidden state in a tuple `(hidden_state_new, memory_new)`, while `RNN`- or `GRUCell` just return the new hidden state `hidden_state_new`.

To write code that is generic over these different cells, AFAICT one needs to either dispatch on the shape of the hidden state (`Tuple` vs `AbstractMatrix`) or use a trick such as `batchmemaybe` (from Flux) combined with splatting.

There's a similar problem for getting the output of a recurrent cell. For `RNN`, `GRU` and similar cells, the output is the same thing as the hidden state; for `LSTMCell`, however, usually the output is only the first element of the returned tuple.

Would it make sense to have recurrent cells return both an output (array) as well as a tuple containing the new hidden state? E.g. a `GRUCell` would return `hidden_state_new, (hidden_state_new,), st`.

Or is there a simpler solution to these problems that I've missed?
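To make the proposed convention concrete, here is a minimal sketch in plain Julia. It does not use the actual cell implementations: `toy_gru_step`, `toy_lstm_step` and `unroll` are hypothetical names, the argument order `(x, carry, st)` is an assumption, and the arithmetic is a placeholder chosen only to show the return shapes. Under those assumptions, every cell returns `output, carry_tuple, st`, so the driver loop needs neither dispatch on the hidden-state type nor splatting.

```julia
# Toy stand-ins, NOT the real cell implementations: the math is a placeholder
# and only the shape of the return values matters here.

# GRU-like cell: the carry tuple holds one array, which is also the output.
function toy_gru_step(x, carry, st)
    (h,) = carry
    h_new = tanh.(x .+ h)
    return h_new, (h_new,), st
end

# LSTM-like cell: the carry tuple holds hidden state and memory,
# and the output is the hidden state only.
function toy_lstm_step(x, carry, st)
    h, c = carry
    c_new = c .+ x
    h_new = tanh.(c_new .+ h)
    return h_new, (h_new, c_new), st
end

# Generic driver: it threads the carry through without ever inspecting it,
# so the same loop works for both cells.
function unroll(step, xs, carry, st)
    outputs = Any[]
    for x in xs
        y, carry, st = step(x, carry, st)
        push!(outputs, y)
    end
    return outputs, carry, st
end

# Four time steps of a (features, batch) = (2, 3) input.
xs = [ones(Float32, 2, 3) for _ in 1:4]
h0 = zeros(Float32, 2, 3)

ys_gru, carry_gru, _ = unroll(toy_gru_step, xs, (h0,), NamedTuple())
ys_lstm, carry_lstm, _ = unroll(toy_lstm_step, xs, (h0, copy(h0)), NamedTuple())
```

The only cell-specific piece left to the caller is building the initial carry: a 1-tuple for the GRU-like cell, a 2-tuple for the LSTM-like one.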