Generic recurrent cells #135

jumerckx · 2022-08-16T15:21:02Z


Currently, LSTMCell returns its new state as a tuple (hidden_state_new, memory_new), while RNNCell and GRUCell return only the new hidden state hidden_state_new.
To write code that is generic over these cells, AFAICT one has to either dispatch on the type of the hidden state (Tuple vs. AbstractMatrix) or use a trick such as batchmemaybe (from Flux) combined with splatting:

batchmemaybe(x) = tuple(x)
batchmemaybe(x::Tuple) = x

(g::GenericCell)((x, hidden), ps, st) = g.cell((x, batchmemaybe(hidden)...), ps, st) # g.cell could be an RNNCell, LSTMCell...

There's a similar problem when getting the output of a recurrent cell. For RNN, GRU, and similar cells, the output is the same as the hidden state. For LSTMCell, however, the output is usually only the first element of the returned tuple:

hidden, st = genericCell((x, hidden), ps, st)
out = first(batchmemaybe(hidden))
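
A self-contained sketch of the workaround above. It uses toy stand-in cells rather than the real Lux cells: `toy_rnn`, `toy_lstm`, the `GenericCell` struct definition, and the parameter layout `ps.W` are all assumptions made for illustration.

```julia
# Toy stand-ins (hypothetical, not the Lux implementations).
# toy_rnn returns only the new hidden state; toy_lstm returns a (hidden, memory) tuple.
toy_rnn((x, h), ps, st) = (tanh.(ps.W * x .+ h), st)

function toy_lstm((x, h, c), ps, st)
    c_new = c .+ ps.W * x .+ h
    h_new = tanh.(c_new)
    return (h_new, c_new), st
end

# Hypothetical wrapper that is generic over both kinds of cell.
struct GenericCell{C}
    cell::C
end

batchmemaybe(x) = tuple(x)
batchmemaybe(x::Tuple) = x

# Splat the hidden state so the wrapped cell sees (x, h) or (x, h, c).
(g::GenericCell)((x, hidden), ps, st) = g.cell((x, batchmemaybe(hidden)...), ps, st)

x = ones(2, 1)
ps = (W = [1.0 0.0; 0.0 1.0],)
st = NamedTuple()

h_rnn, _ = GenericCell(toy_rnn)((x, zeros(2, 1)), ps, st)
h_lstm, _ = GenericCell(toy_lstm)((x, (zeros(2, 1), zeros(2, 1))), ps, st)

# The same expression extracts the output in both cases.
out_rnn = first(batchmemaybe(h_rnn))
out_lstm = first(batchmemaybe(h_lstm))
```

Both call sites look identical, but every consumer still has to remember to route the result through `batchmemaybe`.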

Would it make sense to have recurrent cells return both an output (array) and a tuple containing the new hidden state? For example, a GRUCell would return hidden_state_new, (hidden_state_new, ), st.
Or is there a simpler solution to these problems that I've missed?
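
To make the proposed shape concrete, here is a small sketch of an adapter that maps what the cells return today onto an output plus a tuple-valued hidden state. The function name `output_and_carry` is hypothetical and only illustrates the idea; it is not an existing Lux or Flux function.

```julia
# Hypothetical adapter: normalize a cell's current return value into
# (output, hidden_state_tuple).
output_and_carry(h) = (h, (h,))             # RNN/GRU style: hidden state only
output_and_carry(h::Tuple) = (first(h), h)  # LSTM style: (hidden_state, memory)

# RNN/GRU-like result: the output is the hidden state itself.
out_a, carry_a = output_and_carry([0.1 0.2])

# LSTM-like result: the output is the first element of the tuple.
out_b, carry_b = output_and_carry(([0.1 0.2], [0.3 0.4]))
```

With this shape, the output is always an array and the hidden state is always a tuple, so callers no longer need to branch on the cell type.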

avik-pal · 2022-08-17T05:06:13Z

Maintainer

Yes, I agree with the proposal. A consistent API would be great, and since the RNN API is technically experimental, we should be able to change it in v0.4 itself.

Just one modification: the return would be (hidden_state_new, (hidden_state_new, )), st, so basically ((output, carry), st), which is similar to what Flax does.
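
A minimal sketch of how generic code could look under the ((output, carry), st) convention. The cells `gru_like` and `lstm_like`, the helper `unroll`, and the parameter layout `ps.W` are hypothetical toys for illustration, not the Lux implementations.

```julia
# Toy cells that follow the ((output, carry), st) convention (hypothetical).
function gru_like((x, carry), ps, st)
    (h,) = carry                       # carry is a single-element tuple
    h_new = tanh.(ps.W * x .+ h)
    return (h_new, (h_new,)), st
end

function lstm_like((x, carry), ps, st)
    h, c = carry                       # carry is (hidden_state, memory)
    c_new = c .+ ps.W * x .+ h
    h_new = tanh.(c_new)
    return (h_new, (h_new, c_new)), st
end

# One loop serves every cell: the carry is passed through opaquely,
# so no dispatch on its shape is needed.
function unroll(cell, xs, carry, ps, st)
    outputs = []
    for x in xs
        (out, carry), st = cell((x, carry), ps, st)
        push!(outputs, out)
    end
    return outputs, carry, st
end

xs = [ones(2, 1) for _ in 1:3]
ps = (W = [1.0 0.0; 0.0 1.0],)
st = NamedTuple()

outs_gru, carry_gru, _ = unroll(gru_like, xs, (zeros(2, 1),), ps, st)
outs_lstm, carry_lstm, _ = unroll(lstm_like, xs, (zeros(2, 1), zeros(2, 1)), ps, st)
```

The caller only needs to supply an initial carry of the right shape; everything after that is identical for both cells.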

1 reply

jumerckx Aug 17, 2022
Author

I'm willing to write a PR for this. To be sure we're on the same page: carry for RNN and GRU would be a single-element tuple, right?
