Why not use updated parent beam's embedding as next input when inferring? #8

Open
nkqiaolin opened this issue Oct 24, 2017 · 1 comment


nkqiaolin commented Oct 24, 2017

First of all, nice implementation!

While reading ConvDecoderFairseqBS, I noticed that in each step(), beam_search selects the new top K beams, but the input to the next step is built by just replacing this time step's zero padding with the embeddings of those top K words. The decoded sequence from 0 to time-1 is kept as before. This confuses me, because the new top K words may come from different parent beams than the ones stored in those earlier positions. Is this by design, or is it a mistake? :)

The relevant code is:

cur_inputs = inputs[:,0:time+1,:] 
zeros_padding = inputs[:,time+2:,:] 
cur_inputs_pos = self.add_position_embedding(cur_inputs, time)

enc_output, beam_state = state 
logits = self.infer_conv_block(enc_output, cur_inputs_pos)
bs_output, beam_state = beam_search.beam_search_step(.....

finished, next_inputs = self.next_inputs(sample_ids=bs_output.predicted_ids)
next_inputs = tf.reshape(next_inputs, [self.config.beam_width, 1, inputs.get_shape().as_list()[-1]])
next_inputs = tf.concat([cur_inputs, next_inputs], axis=1)  # *** why isn't cur_inputs reordered to match the new beams' parents first? ***
next_inputs = tf.concat([next_inputs, zeros_padding], axis=1)
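
To make the concern concrete, here is a minimal pure-Python sketch (not the repository's code). The names `extend_naive`, `extend_reordered`, `histories`, `parent_ids` and `new_tokens` are hypothetical, and it assumes the beam search step reports, for each new hypothesis, the index of the beam it was extended from:

```python
def extend_naive(histories, new_tokens):
    """Append each selected token to whatever history sits in the same slot."""
    return [h + [t] for h, t in zip(histories, new_tokens)]


def extend_reordered(histories, parent_ids, new_tokens):
    """Gather each hypothesis's parent history first, then append its token."""
    return [histories[p] + [t] for p, t in zip(parent_ids, new_tokens)]


# Three beams after two decoding steps (toy tokens instead of embeddings).
histories = [["a", "b"], ["c", "d"], ["e", "f"]]

# Assumed beam search result: slots 0 and 1 both extend old beam 0,
# slot 2 extends old beam 2, and old beam 1 is dropped.
parent_ids = [0, 0, 2]
new_tokens = ["x", "y", "z"]

naive = extend_naive(histories, new_tokens)
reordered = extend_reordered(histories, parent_ids, new_tokens)

print(naive)      # slot 1 keeps the prefix of the dropped beam
print(reordered)  # slot 1 carries the prefix of its real parent
```

In the naive version, slot 1 pairs token "y" with the prefix ["c", "d"], even though "y" was scored as a continuation of ["a", "b"]. If that is what happens in the snippet above, the next step would condition on a sequence the search never actually selected.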
nkqiaolin changed the title from "Why not use updated parent beam's embedding as input when inferring?" to "Why not use updated parent beam's embedding as next input when inferring?" on Oct 24, 2017

nkqiaolin commented Oct 24, 2017

I checked fairseq-py, which reorders the input buffer at every beam search step:

def reorder_buffer(self, new_order):
    if self.input_buffer is not None:
        self.input_buffer = self.input_buffer.index_select(0, new_order)
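
For comparison with the fixed-length, zero-padded inputs used here, the same idea can be sketched in plain Python. This is only an illustration under assumptions: `index_select_rows` is a hypothetical stand-in for selecting rows along dimension 0, `step_buffer`, `PAD` and the token ids are made up, and ids are used in place of embeddings:

```python
PAD = 0  # assumed padding id for positions that are not decoded yet


def index_select_rows(buffer, new_order):
    """Stand-in for selecting rows along dim 0: row i becomes buffer[new_order[i]]."""
    return [list(buffer[i]) for i in new_order]


def step_buffer(buffer, new_order, new_ids, time):
    """Reorder rows by parent beam, then fill position time + 1 with the new ids."""
    reordered = index_select_rows(buffer, new_order)
    for row, token_id in zip(reordered, new_ids):
        row[time + 1] = token_id
    return reordered


# Two beams, positions 0..1 already decoded, the rest still padded.
buffer = [
    [5, 7, PAD, PAD],
    [6, 8, PAD, PAD],
]

# Assumed beam search result: both new hypotheses extend old beam 1.
result = step_buffer(buffer, new_order=[1, 1], new_ids=[9, 3], time=1)
print(result)
```

Both output rows start with the prefix of old beam 1 before the new id is written, which is the effect I would expect from reordering cur_inputs by the parent beam ids before the concat.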
