Why not use updated parent beam's embedding as next input when inferring? #8

nkqiaolin · 2017-10-24T11:31:17Z

First of all, nice implementation!

While reading your code for ConvDecoderFairseqBS, I noticed that in each step(), beam_search selects the new top K beams, but the inputs to the next step are built by just replacing this time step's zero padding with the embeddings of the new top K words. The decoded sequence from 0 to time-1 is kept as before. I'm confused here, since the new top K words may come from different parent beams than the ones stored at those row positions in previous steps. Is this by design or just a mistake? :)

Specific code is here:

cur_inputs = inputs[:,0:time+1,:] 
zeros_padding = inputs[:,time+2:,:] 
cur_inputs_pos = self.add_position_embedding(cur_inputs, time)

enc_output, beam_state = state 
logits = self.infer_conv_block(enc_output, cur_inputs_pos)
bs_output, beam_state = beam_search.beam_search_step(.....

finished, next_inputs = self.next_inputs(sample_ids=bs_output.predicted_ids)
next_inputs = tf.reshape(next_inputs, [self.config.beam_width, 1, inputs.get_shape().as_list()[-1]])
next_inputs = tf.concat([cur_inputs, next_inputs], axis=1) ## *** why not update cur_inputs to the new beams? **
next_inputs = tf.concat([next_inputs, zeros_padding], axis=1)
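
To make the concern concrete, here is a toy sketch in plain Python. It uses token ids instead of embeddings, and `append_without_reorder` plus all sample values are made up for illustration, not taken from the repo. It mimics the concat above: each new word is appended to the row with the same index, regardless of which parent beam it was expanded from.

```python
def append_without_reorder(histories, predicted_ids):
    """Mimic the concat above: row i keeps its old prefix and gets word i."""
    return [prefix + [word] for prefix, word in zip(histories, predicted_ids)]


# Hypothetical state after two steps with beam_width = 3.
histories = [[5, 9], [5, 2], [7, 4]]

# Hypothetical beam search result: all three survivors extend parent beam 0.
parent_ids = [0, 0, 0]
predicted_ids = [11, 12, 13]

# What the decoder is fed at the next step if rows are not re-ordered.
fed_to_decoder = append_without_reorder(histories, predicted_ids)

# What the surviving hypotheses actually are, following the parent pointers.
true_hypotheses = [histories[p] + [w] for p, w in zip(parent_ids, predicted_ids)]

print(fed_to_decoder)
print(true_hypotheses)
```

In this toy case rows 1 and 2 still carry the prefixes of beams that were pruned, so the convolution would condition on a history that no surviving hypothesis has.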

nkqiaolin · 2017-10-24T12:06:10Z

I checked fairseq-py, which re-orders the input buffer at every beam search step:

def reorder_buffer(self, new_order):
    if self.input_buffer is not None:
        self.input_buffer = self.input_buffer.index_select(0, new_order)
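
If I read this correctly, the equivalent here would be to select the rows of cur_inputs by parent beam index before concatenating the new step. Below is a minimal sketch with nested lists standing in for a [beam_width, time, dim] tensor. `reorder_beams` and `next_decoder_inputs` are hypothetical helpers, the sample values are invented, and I am assuming the beam search step can expose the parent index of each selected beam. The zero padding is left out for brevity.

```python
def reorder_beams(buffer, new_order):
    """Select rows along the beam axis, like index_select(0, new_order)."""
    return [list(buffer[i]) for i in new_order]


def next_decoder_inputs(cur_inputs, parent_ids, new_embeddings):
    """Re-order the decoded prefixes by parent beam, then append the new step."""
    reordered = reorder_beams(cur_inputs, parent_ids)
    return [prefix + [emb] for prefix, emb in zip(reordered, new_embeddings)]


# Hypothetical inputs: beam_width = 2, two decoded steps, embedding dim = 2.
cur_inputs = [
    [[0.1, 0.1], [0.2, 0.2]],  # prefix held in row 0
    [[0.3, 0.3], [0.4, 0.4]],  # prefix held in row 1
]
parent_ids = [1, 1]  # both new beams were expanded from row 1
new_embeddings = [[0.5, 0.5], [0.6, 0.6]]

next_inputs = next_decoder_inputs(cur_inputs, parent_ids, new_embeddings)
print(next_inputs)
```

With tensors, the same row selection could presumably be done with tf.gather(cur_inputs, parent_ids) before the first tf.concat.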

nkqiaolin changed the title ~~Why not use updated parent beam's embedding as input when inferring?~~ Why not use updated parent beam's embedding as next input when inferring? Oct 24, 2017
