
Hotfixes for DistilBERT adapter & AdapterFusion implementations #102

Merged · 1 commit merged into adapter-hub:master on Dec 10, 2020

Conversation

@calpt (Member) commented on Dec 7, 2020

This PR fixes the following for DistilBERT:

  • AdapterFusion regularization: the fusion regularization loss implementation is moved into the model classes.
  • The layer norm in the Transformer block: unlike in the BERT implementation, the layer norm module is part of the TransformerBlock class, but it has to be accessed from the adapter module, which is a submodule of that block. The adapter therefore needs a reference to the block's layer norm; the previous implementation didn't work because it copied the layer norm weights instead (see the sketch below).
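
For illustration, here is a minimal PyTorch sketch of the difference between copying the layer norm weights and holding a reference to the block's layer norm module. The class and method names are hypothetical and are not taken from the actual adapter-transformers code.

```python
import torch.nn as nn

# Hypothetical illustration: the adapter submodule needs the parent
# TransformerBlock's layer norm. Copying its weights creates a detached
# duplicate; storing a reference keeps both modules in sync.

class ToyAdapter(nn.Module):
    def __init__(self, hidden_size: int):
        super().__init__()
        self.down = nn.Linear(hidden_size, hidden_size // 2)
        self.up = nn.Linear(hidden_size // 2, hidden_size)
        self.layer_norm = None  # set by the parent block

    def set_layer_norm_by_copy(self, block_layer_norm: nn.LayerNorm):
        # Broken: the copy never sees later updates to the block's layer norm
        # (e.g. weights loaded from a checkpoint or changed during training).
        copied = nn.LayerNorm(block_layer_norm.normalized_shape)
        copied.load_state_dict(block_layer_norm.state_dict())
        self.layer_norm = copied

    def set_layer_norm_by_reference(self, block_layer_norm: nn.LayerNorm):
        # Fixed: both the block and the adapter point at the same module
        # object, so they always share identical parameters.
        self.layer_norm = block_layer_norm

    def forward(self, hidden_states, residual):
        hidden_states = self.up(self.down(hidden_states).relu())
        return self.layer_norm(hidden_states + residual)
```

Sharing one module between two parents is supported by PyTorch: the shared parameters simply appear under both names, which is what makes the reference approach work where the copy did not.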

@calpt added the bug label (Something isn't working) on Dec 7, 2020
@calpt marked this pull request as ready for review on Dec 7, 2020, 17:16
@calpt requested a review from arueckle on Dec 10, 2020, 09:27
@calpt merged commit 243ebda into adapter-hub:master on Dec 10, 2020
@calpt deleted the fix/distilbert_adapters branch on Dec 10, 2020, 09:38