Language modeling flex heads #210

Merged · calpt merged 14 commits from dev/lm_heads into adapter-hub:master on Aug 24, 2021

Conversation

calpt (Member) commented on Jul 23, 2021

Closes #53.

Waiting for #208.


This PR adds three different prediction heads for XModelWithHeads classes, depending on the model architecture (a usage sketch follows the list):

  • add_causal_lm_head() adds a causal LM head for classes that support this type of head in transformers, e.g. GPT-2, BERT, ...
  • add_masked_lm_head() adds a masked LM head for models with MLM, e.g. BERT, RoBERTa, ...
  • add_seq2seq_lm_head() adds a sequence-to-sequence LM head for encoder-decoder models, e.g. BART
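A minimal usage sketch of the three methods. The method names are taken from this PR; the checkpoint identifiers, head names, and the assumption that the ModelWithHeads classes are importable from the adapter-transformers `transformers` namespace are illustrative:

```python
# Minimal sketch, assuming adapter-transformers exposes the flex-head classes
# under the transformers namespace; checkpoints and head names are illustrative.
from transformers import (
    BartModelWithHeads,
    BertModelWithHeads,
    GPT2ModelWithHeads,
)

# Causal LM head on a decoder-only model
gpt2 = GPT2ModelWithHeads.from_pretrained("gpt2")
gpt2.add_causal_lm_head("clm_head")

# Masked LM head on an encoder model
bert = BertModelWithHeads.from_pretrained("bert-base-uncased")
bert.add_masked_lm_head("mlm_head")

# Sequence-to-sequence LM head on an encoder-decoder model
bart = BartModelWithHeads.from_pretrained("facebook/bart-base")
bart.add_seq2seq_lm_head("s2s_head")
```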

All heads can be automatically converted from their respective static-head counterparts (e.g. seq2seqlm from BartForConditionalGeneration).
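A sketch of how this conversion could be exercised; the save/load round trip below is an assumed workflow, not code from the PR:

```python
# Sketch under the assumption that loading a static-head checkpoint into a
# flex-head class performs the head conversion; paths are illustrative.
from transformers import BartForConditionalGeneration, BartModelWithHeads

static_model = BartForConditionalGeneration.from_pretrained("facebook/bart-base")
static_model.save_pretrained("./bart-static")

# The static seq2seq LM head should be converted into a flex prediction head.
flex_model = BartModelWithHeads.from_pretrained("./bart-static")
print(flex_model.heads.keys())  # expect the converted head to be listed here
```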

To ensure that all conversions work as expected, a new test module was added in test_adapter_conversion.py.
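For illustration only, a hypothetical equivalence check in the spirit of that module might look as follows (the actual tests live in test_adapter_conversion.py; function name, checkpoint, and tolerance are assumptions):

```python
# Hypothetical conversion test, not the PR's actual code: the converted
# flex head should reproduce the static head's logits on the same input.
import torch
from transformers import BartForConditionalGeneration, BartModelWithHeads


def test_seq2seq_lm_head_conversion(tmp_path):
    static_model = BartForConditionalGeneration.from_pretrained("facebook/bart-base")
    static_model.save_pretrained(str(tmp_path))
    flex_model = BartModelWithHeads.from_pretrained(str(tmp_path))

    input_ids = torch.tensor([[0, 100, 200, 2]])
    with torch.no_grad():
        expected = static_model(input_ids).logits
        actual = flex_model(input_ids).logits
    assert torch.allclose(expected, actual, atol=1e-5)
```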

calpt marked this pull request as ready for review on July 27, 2021, 13:27
calpt requested a review from hSterz on August 16, 2021, 16:12
hSterz (Member) left a comment:

Looks good

calpt merged commit 84289df into adapter-hub:master on Aug 24, 2021
calpt deleted the dev/lm_heads branch on August 24, 2021, 08:43
Development

Successfully merging this pull request may close these issues.

Language modeling head for flexible head classes (#53)