Initializing segment embeddings on Transformer Encoder #3679

Closed
wade3han opened this issue May 28, 2021 · 1 comment
@wade3han
Contributor

wade3han commented May 28, 2021

Hello! I noticed that the code does not initialize the segment embeddings in the Transformer encoder: https://github.com/facebookresearch/ParlAI/blob/master/parlai/agents/transformer/modules/encoder.py#L201-L202

I found that initializing the segment embeddings makes the model converge faster when fine-tuning a seq2seq model (such as Blender) with segment embeddings enabled. (Note that the pre-trained model did not use segment embeddings.)
I think adding initialization code like the one below would help.

        if self.n_segments >= 1:
            self.segment_embeddings = nn.Embedding(self.n_segments, self.dim)
            # match the token embedding init: N(0, dim ** -0.5)
            nn.init.normal_(self.segment_embeddings.weight, 0, self.dim ** -0.5)

Anyway, if this was an intentional choice, is there a reason the initialization was not applied to the segment embeddings?
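
For reference, PyTorch's nn.Embedding draws its weights from N(0, 1) by default, while the snippet above scales the standard deviation to dim ** -0.5, matching the scale used for the token embeddings. A minimal standalone sketch (the dim and n_segments values are hypothetical, for illustration only) showing the difference in scale:

    import torch.nn as nn

    dim, n_segments = 512, 2  # hypothetical sizes, for illustration only

    # Default nn.Embedding init: weights ~ N(0, 1)
    default_seg = nn.Embedding(n_segments, dim)

    # Proposed init: weights ~ N(0, dim ** -0.5), like the token embeddings
    scaled_seg = nn.Embedding(n_segments, dim)
    nn.init.normal_(scaled_seg.weight, 0, dim ** -0.5)

    print(default_seg.weight.std().item())  # ~1.0
    print(scaled_seg.weight.std().item())   # ~0.044 (= 512 ** -0.5)

Without the rescaling, the segment embeddings start out roughly dim ** 0.5 times larger than the token embeddings they are added to, which can dominate the summed input early in fine-tuning.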

@stephenroller
Contributor

Nah that was overlooked. That's a good point. I'd happily accept a patch.
