
design of dynamic rnn #4401

Closed
Superjomn wants to merge 5 commits

Conversation

Superjomn
Contributor

No description provided.

Contributor

@lcy-seso lcy-seso left a comment


This design is clear to me, thank you.

with whl.loop(exec_immediately=True):
    time_ = whl.loop_var(time)
    to_stop = pd.less_than(time_, num_steps)
    whl.break_if(to_stop)

@lcy-seso lcy-seso Sep 30, 2017


The design doc (the dynamic RNN part; this alone still cannot support beam search) is clear to me. Thanks for the doc.

I just want to lay out some of my understanding; maybe you can help me check it.


The minimum features required for dynamic RNN training

For no-padding dynamic RNN training (generation will be more complicated), I think three components are the minimum requirements:

  1. TensorArray as input and output, which sorts the input batch and restores the sorted output to its original order. This helps achieve a no-padding RNN.

  2. A user-defined step function (a sub-graph) that describes the computation the RNN performs in a single time step.

    • the step function is a required parameter to the while_loop operator.
    • if the step function takes previous_state as its input, then it is a recurrent unit; otherwise it acts like the map function in Python.
  3. The while_loop operator.

    • I guess the while_loop is much like an executor?
    • if the step function does not take previous_state as its input, while_loop simply applies the function to every item in a TensorArray and returns a TensorArray.
    • For a dynamic RNN forward pass, the step function is executed iteratively over the entire input TensorArray, and at each step the condition check determines whether to stop expanding the step function (a graph).
    • The framework is responsible for constructing the backward graph based on how many steps the forward computation was expanded. (A minimal sketch of my understanding follows this list.)
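
To make my understanding concrete, here is a minimal pure-Python sketch of the three components above (this is only my mental model, not the proposed while_loop/TensorArray API; every name in it is illustrative):

# Mental-model sketch only: a "TensorArray" is modelled as a plain Python
# list of per-time-step inputs, and while_loop applies a user-defined step
# function over it, carrying previous_state when the step is recurrent.
def while_loop(step_fn, tensor_array, init_state=None):
    outputs = []
    state = init_state
    for x_t in tensor_array:                  # default condition: stop when the input is exhausted
        if state is None:
            y_t = step_fn(x_t)                # no previous_state: behaves like Python's map()
        else:
            y_t, state = step_fn(x_t, state)  # with previous_state: a recurrent unit
        outputs.append(y_t)
    # the framework would record how many steps were expanded here, so the
    # backward graph can be constructed to match the forward expansion.
    return outputs                            # the outputs form another "TensorArray"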

Something I haven't thought through very carefully yet:

  • As I understand it, while_loop is a dynamic operator, am I right?

    • This means while_loop accepts an iterable data input (the TensorArray) and dynamically iterates over it, rather than pre-expanding the entire graph (which would just be an expanded feed-forward network).
  • What will happen if the while_loop operator is nested twice, or even deeper than that (for short-term goals, like RecurrentLayerGroup for nested sequences)?

  • Can two TensorArrays work together? For example, one TensorArray packs a sequence and returns an index map, and another (or more than one other) TensorArray packs other sequences using this index map. This is useful for attention and NTM models. (A sketch of this idea follows this list.)
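
Here is a small pure-Python sketch of the shared index map idea I have in mind (again only an illustration; pack/pack_with/unpack are made-up names, not the proposed TensorArray interface):

# pack() sorts a batch of variable-length sequences (longest first, so the
# RNN can run without padding) and returns the sort order as an index map;
# pack_with() packs a second batch with the same order so the two stay
# aligned (useful for attention / NTM); unpack() restores the original order.
def pack(sequences):
    index_map = sorted(range(len(sequences)),
                       key=lambda i: len(sequences[i]), reverse=True)
    return [sequences[i] for i in index_map], index_map

def pack_with(sequences, index_map):
    return [sequences[i] for i in index_map]

def unpack(outputs, index_map):
    restored = [None] * len(outputs)
    for sorted_pos, original_pos in enumerate(index_map):
        restored[original_pos] = outputs[sorted_pos]
    return restored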


For beam generation, and even beam training

I think one of the most difficult things about beam search is that we have to dynamically construct the beam at every time step; this involves operators like scatter/gather/k-max-score/sequence-trim, and so on.
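
A rough pure-Python sketch of what one such step could look like (the names are made up for illustration; in the framework these would be the scatter/gather/k-max-score/sequence-trim operators):

# One beam-search step: expand every surviving prefix with the step net,
# then keep only the beam_size best-scoring candidates ("k max score").
def beam_step(beams, expand_fn, beam_size):
    # beams: list of (score, token_ids); expand_fn(token_ids) yields
    # (next_token, log_prob) candidates for that prefix.
    candidates = []
    for score, tokens in beams:
        for token, log_prob in expand_fn(tokens):
            candidates.append((score + log_prob, tokens + [token]))
    candidates.sort(key=lambda c: c[0], reverse=True)
    return candidates[:beam_size]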

# whl.loop(), whl() will be called
with whl.loop(exec_immediately=True):
    time_ = whl.loop_var(time)
    to_stop = pd.less_than(time_, num_steps)

Iterating over the input sequence can be the default condition check for dynamic RNN training/testing.

For beam training/generation, a more complicated condition is required.
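
For concreteness, something like this is what I have in mind (illustrative Python only; these condition functions and names are not part of the design):

def default_cond(step, input_tensor_array):
    # training/testing: stop once the (sorted) input TensorArray is exhausted
    return step < len(input_tensor_array)

def beam_cond(step, alive_beams, max_len):
    # generation: stop when every beam has finished or the length limit is hit
    return step < max_len and len(alive_beams) > 0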

@Superjomn Superjomn closed this May 12, 2018