
design of dynamic rnn #4401

Closed
Superjomn wants to merge 5 commits

Conversation

Superjomn
Contributor

No description provided.

Contributor

@lcy-seso lcy-seso left a comment


This design is clear to me, thank you.

with whl.loop(exec_immediately=True):
    time_ = whl.loop_var(time)
    to_stop = pd.less_than(time_, num_steps)
    whl.break_if(to_stop)

@lcy-seso lcy-seso Sep 30, 2017


The design doc (the dynamic RNN part; this alone still cannot support beam search) is clear to me. Thanks for the doc.

I just want to lay out some of my understanding; maybe you can help me check it.


The minimum features required for dynamic RNN training

For no-padding dynamic RNN training (generation will be more complicated), I think three components are the minimum requirements:

  1. TensorArray as input and output, which sorts the input batch and restores the sorted output to its original order. This helps achieve a no-padding RNN.

  2. A user-defined step function (a sub-graph) that describes the computation the RNN performs in a single time step.

    • the step function is a required parameter to the while_loop operator.
    • if the step function takes previous_state as its input, then it is a recurrent unit; otherwise it acts like the map function in Python.
  3. The while_loop operator.

    • I guess the while_loop is much like an executor?
    • if the step function does not take previous_state as its input, while_loop simply applies the function to every item in a TensorArray and returns a TensorArray.
    • For a dynamic RNN forward pass, the step function is executed iteratively over the entire input TensorArray, and at each step the condition check determines whether to stop expanding the step function (a graph).
    • The framework is responsible for constructing the backward graph based on how many steps the forward computation was expanded. (A minimal sketch of my understanding follows this list.)
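
To make my understanding concrete, here is a minimal pure-Python sketch of the three components above (this is only my mental model, not the proposed while_loop/TensorArray API; every name in it is illustrative):

# Mental-model sketch only: a "TensorArray" is modelled as a plain Python
# list of per-time-step inputs, and while_loop applies a user-defined step
# function over it, carrying previous_state when the step is recurrent.
def while_loop(step_fn, tensor_array, init_state=None):
    outputs = []
    state = init_state
    for x_t in tensor_array:                  # default condition: stop when the input is exhausted
        if state is None:
            y_t = step_fn(x_t)                # no previous_state: behaves like Python's map()
        else:
            y_t, state = step_fn(x_t, state)  # with previous_state: a recurrent unit
        outputs.append(y_t)
    # the framework would record how many steps were expanded here, so the
    # backward graph can be constructed to match the forward expansion.
    return outputs                            # the outputs form another "TensorArray"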

Something I haven't thought through very carefully yet:

  • As I understand it, while_loop is a dynamic operator, am I right?

    • This means while_loop accepts an iterable data input (the TensorArray) and dynamically iterates over it, rather than pre-expanding the entire graph (which would just be an expanded feed-forward network).
  • What will happen if the while_loop operator is nested twice, or even deeper than that (for short-term goals, like RecurrentLayerGroup for nested sequences)?

  • Can two TensorArrays work together? For example, one TensorArray packs a sequence and returns an index map, and another (or more than one other) TensorArray packs other sequences using this index map. This is useful for attention and NTM models. (A sketch of this idea follows this list.)
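
Here is a small pure-Python sketch of the shared index map idea I have in mind (again only an illustration; pack/pack_with/unpack are made-up names, not the proposed TensorArray interface):

# pack() sorts a batch of variable-length sequences (longest first, so the
# RNN can run without padding) and returns the sort order as an index map;
# pack_with() packs a second batch with the same order so the two stay
# aligned (useful for attention / NTM); unpack() restores the original order.
def pack(sequences):
    index_map = sorted(range(len(sequences)),
                       key=lambda i: len(sequences[i]), reverse=True)
    return [sequences[i] for i in index_map], index_map

def pack_with(sequences, index_map):
    return [sequences[i] for i in index_map]

def unpack(outputs, index_map):
    restored = [None] * len(outputs)
    for sorted_pos, original_pos in enumerate(index_map):
        restored[original_pos] = outputs[sorted_pos]
    return restored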


For beam generation, and even beam training

I think one of the most difficult things about beam search is that we have to dynamically construct the beam at every time step; this involves operators like scatter/gather/k-max-score/sequence-trim, and so on.
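
A rough pure-Python sketch of what one such step could look like (the names are made up for illustration; in the framework these would be the scatter/gather/k-max-score/sequence-trim operators):

# One beam-search step: expand every surviving prefix with the step net,
# then keep only the beam_size best-scoring candidates ("k max score").
def beam_step(beams, expand_fn, beam_size):
    # beams: list of (score, token_ids); expand_fn(token_ids) yields
    # (next_token, log_prob) candidates for that prefix.
    candidates = []
    for score, tokens in beams:
        for token, log_prob in expand_fn(tokens):
            candidates.append((score + log_prob, tokens + [token]))
    candidates.sort(key=lambda c: c[0], reverse=True)
    return candidates[:beam_size]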

# whl.loop(), whl() will be called
with whl.loop(exec_immediately=True):
    time_ = whl.loop_var(time)
    to_stop = pd.less_than(time_, num_steps)

Iterating over the input sequence can be the default condition check for dynamic RNN training/testing.

For beam training/generation, a more complicated condition is required.
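
For concreteness, something like this is what I have in mind (illustrative Python only; these condition functions and names are not part of the design):

def default_cond(step, input_tensor_array):
    # training/testing: stop once the (sorted) input TensorArray is exhausted
    return step < len(input_tensor_array)

def beam_cond(step, alive_beams, max_len):
    # generation: stop when every beam has finished or the length limit is hit
    return step < max_len and len(alive_beams) > 0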

@Superjomn Superjomn closed this May 12, 2018