
distillation test breakage demonstration #4526

Closed
spencerp wants to merge 1 commit from the dist-demo-not-to-merge branch

Conversation

@spencerp (Contributor) commented May 3, 2022

Minimum change required to break TestBartDistillation::test_narrow_distillation_losses.

@spencerp (Contributor, Author) commented May 3, 2022

This is a result of changing the order of random operations (the ones that happen during module initialization) given a fixed random seed: each submodule draws its initial weights from the global RNG at construction time, so swapping the construction order changes which random values each module receives. A standalone sketch of the effect follows the two snippets below. Thanks @EricMichaelSmith for helping debug!

Doesn't break test (ModuleList order different, initialization order preserved):

        self.norm2 = torch.nn.LayerNorm(embedding_size, eps=LAYER_NORM_EPS)
        self.norm3 = torch.nn.LayerNorm(embedding_size, eps=LAYER_NORM_EPS)

        # encoder_attention is still constructed first, so it still consumes
        # the same RNG draws as before; only the assignment order below changes.
        encoder_attention = self.swappables.encoder_attention(
            opt=self.opt, n_heads=n_heads, dim=embedding_size, dropout=attention_dropout
        )  # type: ignore
        ffn = self.swappables.feedforward(
            opt=self.opt,
            dim=embedding_size,
            dim_hidden=ffn_size,
            relu_dropout=relu_dropout,
            activation=activation,
        )  # type: ignore

        # Swapped attribute assignments: this reorders module registration
        # (and hence the ModuleList) without touching the initialized weights.
        self.ffn = ffn
        self.encoder_attention = encoder_attention

Breaks test (ModuleList order preserved, initialization order different):

        # ffn is now constructed first, so it consumes the RNG draws that
        # encoder_attention previously received; both modules' weights change.
        ffn = self.swappables.feedforward(
            opt=self.opt,
            dim=embedding_size,
            dim_hidden=ffn_size,
            relu_dropout=relu_dropout,
            activation=activation,
        )  # type: ignore
        encoder_attention = self.swappables.encoder_attention(
            opt=self.opt, n_heads=n_heads, dim=embedding_size, dropout=attention_dropout
        )  # type: ignore

        # Assignment order matches the original, so module registration
        # (the ModuleList order) is preserved.
        self.encoder_attention = encoder_attention
        self.ffn = ffn
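
To make the mechanism concrete, here is a minimal standalone sketch (not part of the PR; the module choice and names are illustrative) using plain torch.nn.Linear layers in place of ParlAI's attention/FFN swappables:

    import torch

    def init_pair(attn_first: bool):
        # Fixed seed, mirroring the test's deterministic model setup.
        torch.manual_seed(42)
        if attn_first:
            attn = torch.nn.Linear(4, 4)  # constructed first: consumes the first RNG draws
            ffn = torch.nn.Linear(4, 4)
        else:
            ffn = torch.nn.Linear(4, 4)   # now ffn consumes the first RNG draws instead
            attn = torch.nn.Linear(4, 4)
        return attn.weight.detach().clone(), ffn.weight.detach().clone()

    attn_a, ffn_a = init_pair(attn_first=True)
    attn_b, ffn_b = init_pair(attn_first=False)

    print(torch.equal(attn_a, attn_b))  # False: same seed, same module, different weights
    print(torch.equal(attn_a, ffn_b))   # True: whichever module is built first gets the same draws

Registration order never enters the picture here; only the order in which constructors run against the shared global RNG determines the weights, which is why the second diff above breaks the fixed-seed distillation test while the first does not.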

@spencerp spencerp closed this May 3, 2022
@spencerp spencerp deleted the dist-demo-not-to-merge branch May 3, 2022 19:13