
design of RNNOp #3727

Merged
merged 31 commits into PaddlePaddle:develop from rnn_design on Sep 14, 2017

Conversation

@Superjomn (Contributor) commented Aug 28, 2017

Fixes #3823

- init_memory, the variable to help initialize memory

### step scopes
Each RNN has more than one step times, and the stepnet will be executed in every step time.
Collaborator: The RNN might run one or more steps.

@lcy-seso changed the title from "design of RNN for fix-length-setence" to "design of RNN for fix-length-sentence" on Sep 1, 2017

<p aligh="center">
<img src="./images/rnn.png"/><br/>
fig 2 the RNN's data flow
Collaborator: fig 2 ==> Figure 2


There are several important concepts:

- stepnet, the network execute in every time step
Collaborator: stepnet => step-net


Collaborator: the network to be executed in each step

- init-memory, the variable to help initialize state in the first time step.

### step scopes

Collaborator: Step Scope

Collaborator: The step-net could have local variables defined. In each step of RNN execution, a scope is created to hold corresponding variables. Such a scope is known as a step scope.

$$
h_t = U h_{t-1} + W x_t
$$

Here, $h_t$ is time $t$'s state, $h_{t-1}$ is time $t-1$'s state, in implementation, we call the a variable that store a state memory.
Collaborator: ", in implementation, we call the a variable that store a state memory." can be deleted

In step time $t$, $h_t$ is memory, $h_{t-1}$ is pre-memory (short for previous memory).
Collaborator: "In step time $t$, $h_t$ is memory, $h_{t-1}$ is pre-memory (short for previous memory)." can be deleted

In each step scope
Collaborator:

+In each step scope
+- each memory variable has a corresponding pre-memory variable
+- before a time step executes, copy (or make a reference) the value of previous step scope's memory to the pre-memory variable in current step scope.

=>

In the implementation, we can make an ex-memory variable either "refers to" the memory variable of the previous step, or copy the value of the previous memory variable to the current ex-memory variable.
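
A minimal sketch of the two options (dicts stand in for step scopes; all names here are hypothetical, not the real framework API):

```python
import copy

def link_pre_memory(prev_scope, cur_scope, name, by_reference=True):
    """Expose the previous step's memory to the current step as its pre-memory."""
    if by_reference:
        # the pre-memory variable refers to the same underlying object
        cur_scope["pre_" + name] = prev_scope[name]
    else:
        # the value is copied into the current step scope
        cur_scope["pre_" + name] = copy.deepcopy(prev_scope[name])

prev_scope = {"h": [0.1, 0.2]}   # previous step scope holding memory "h"
cur_scope = {}
link_pre_memory(prev_scope, cur_scope, "h", by_reference=True)
```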

- each memory variable has a corresponding pre-memory variable
- before a time step executes, copy (or make a reference) the value of previous step scope's memory to the pre-memory variable in current step scope.

### C++ API
Collaborator: The C++ API

- void Run(const framework::Scope& scope, const platform::DeviceContext& dev_ctx) const;
- run all the time steps.

### User interface
Collaborator: The Python Interface

@Superjomn changed the title from "design of RNN for fix-length-sentence" to "design of RNNOp" on Sep 2, 2017

rnn = pd.create_rnn_op(output_num=1)
with rnn.stepnet():
    x = rnn.add_input(X)
Collaborator: This example uses rnn.add_input. But the next example uses rnn.segment_input. Are they the same?

Contributor Author: Yes, I will change all to rnn.add_input.

Contributor Author: done

Collaborator: We need to differentiate two types of input: sequence input and static input. Each instance has a different static input, but for one instance it is the same across all time steps.

Contributor Author: The static inputs will be treated as global variables and do not need to be passed as input.

The add_input statement only marks the sequence inputs that need to be segmented across the RNN's time steps. @emailweixu
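
For illustration, a plain-Python sketch of the difference (hypothetical shapes, not the RNNOp code): a sequence input is segmented into one slice per time step, while a static input is the same value at every step for a given instance.

```python
import numpy as np

seq_len, dim = 4, 3
sequence_input = np.random.randn(seq_len, dim)  # segmented: one slice per step
static_input = np.random.randn(dim)             # shared across all steps

for t in range(seq_len):
    x_t = sequence_input[t]  # this step's segment of the sequence input
    s_t = static_input       # identical at every time step for this instance
```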

Collaborator: Static inputs are different from parameters. They still need to be split according to whether that instance participates at a time step, whereas parameters do not need to be split.


We can define an RNN's step-net using Block:

```python
Contributor: Does this API work with the attention model?

Contributor Author: This syntax should be compatible with Paddle V1, but without support for beam search.

# update current memory
h.update(new_state)
# indicate that h variables in all step scopes should be merged
rnn.set_output(0, h)
Contributor: What does "0" mean in set_output? Every set_output in this PR uses "0" as the argument.

Contributor Author: 0 means the "0-th" argument.

h.update(
    pd.matmul(W, sentence) + pd.matmul(U, h.pre_state()))
# get the last state as sentence's info
rnn.set_output(0, h)
Collaborator: Is the 0 here indicating the first output?

How can we specify that an RNN should return just the output from the last step?

Contributor Author:

rnn = pd.create_rnn_op()
with rnn.stepnet():
    x = rnn.set_inputs(X)
    # declare a memory (rnn's step)
    h = rnn.add_memory(init=a)
    # h.pre_state() means previous memory of rnn
    new_state = pd.add_two( pd.matmul(W, x) + pd.matmul(U, h.pre_state()))
    # update current memory
    h.update(new_state)
    # indicate that h variables in all step scopes should be merged
    rnn.set_outputs(h)

# output last step
out = rnn(output_all_steps=False)

Can we use the argument output_all_steps to output all steps or just the last step?

@Superjomn merged commit b3f6b5a into PaddlePaddle:develop on Sep 14, 2017
@Superjomn deleted the rnn_design branch on September 14, 2017, 01:09