Add DeBERTa #1295

Merged
jeswan merged 15 commits into js/feature/easy_add_model from js/feature/add_deberta on Apr 23, 2021

Conversation

@jeswan (Collaborator) commented Mar 19, 2021

This PR adds DeBERTa V2 support to jiant. It also includes documentation for how to add a model with the latest changes in js/feature/easy_add_model.
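For context, here is a minimal sketch of how a DeBERTa V2 checkpoint might be run through jiant's simple API once this support is in place. The checkpoint name and the configuration field names below are illustrative assumptions, not something specified in this PR:

```python
# Illustrative sketch only: the parameter names and checkpoint identifier are
# assumptions; consult the jiant documentation for the exact simple API.
import jiant.proj.simple.runscript as simple_run

args = simple_run.RunConfiguration(
    run_name="deberta_v2_mrpc",
    exp_dir="/tmp/exp",
    data_dir="/tmp/exp/tasks",
    hf_pretrained_model_name_or_path="microsoft/deberta-v2-xlarge",  # assumed checkpoint name
    tasks="mrpc",
    train_batch_size=16,
    num_train_epochs=3,
)
simple_run.run_simple(args)
```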

Resolved review threads (outdated): jiant/shared/model_resolution.py, jiant/proj/main/modeling/primary.py
@jeswan force-pushed the js/feature/add_deberta branch from 659405c to 9f587ce on April 8, 2021 15:48

codecov bot commented Apr 8, 2021

Codecov Report

Merging #1295 (046e4bb) into js/feature/easy_add_model (4ddc5ac) will decrease coverage by 0.17%.
The diff coverage is 48.27%.

❗ Current head 046e4bb differs from pull request most recent head d2d4894. Consider uploading reports for the commit d2d4894 to get more accurate results
Impacted file tree graph

@@                      Coverage Diff                      @@
##           js/feature/easy_add_model    #1295      +/-   ##
=============================================================
- Coverage                      49.83%   49.66%   -0.18%     
=============================================================
  Files                            162      162              
  Lines                          11170    11191      +21     
=============================================================
- Hits                            5567     5558       -9     
- Misses                          5603     5633      +30     
| Impacted Files | Coverage Δ |
|---|---|
| jiant/proj/main/modeling/model_setup.py | 22.76% <0.00%> (-0.38%) ⬇️ |
| jiant/proj/main/modeling/primary.py | 52.25% <40.90%> (-1.18%) ⬇️ |
| jiant/shared/model_resolution.py | 78.37% <100.00%> (+0.60%) ⬆️ |
| jiant/proj/main/export_model.py | 41.37% <0.00%> (-48.28%) ⬇️ |
| jiant/utils/python/io.py | 52.72% <0.00%> (-5.46%) ⬇️ |

Continue to review full report at Codecov.

Legend:
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 4ddc5ac...d2d4894. Read the comment docs.

@jeswan force-pushed the js/feature/add_deberta branch from d9f8cb4 to 046e4bb on April 8, 2021 17:58
@jeswan marked this pull request as ready for review on April 8, 2021 18:00
@jeswan requested a review from HaokunLiu as a code owner on April 8, 2021 18:00
@zphang (Collaborator) left a comment

Minor documentation comments.

@@ -160,6 +160,9 @@ def load_encoder_from_transformers_weights(
    for k, v in weights_dict.items():
        if k.startswith(encoder_prefix):
            load_weights_dict[strings.remove_prefix(k, encoder_prefix)] = v
        elif k.startswith(encoder_prefix.split("-")[0]):
            # workaround for deberta-v2

Can you add more detail for this comment?
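For readers of this thread, here is a small illustration of what the fallback branch appears to handle; the key and prefix values are assumptions based on the surrounding diff, not confirmed in this PR. The jiant-side prefix seems to be derived from the model-type string ("deberta-v2."), while the Hugging Face checkpoint keys use the shorter base prefix ("deberta."), so the exact-prefix check misses them:

```python
# Hypothetical example: the key and prefix strings are assumptions for illustration.
weights_dict = {
    "deberta.embeddings.word_embeddings.weight": None,  # assumed HF-style key
}
encoder_prefix = "deberta-v2."  # assumed jiant-side prefix derived from the model type

for k in weights_dict:
    print(k.startswith(encoder_prefix))                 # False: exact prefix misses the key
    print(k.startswith(encoder_prefix.split("-")[0]))   # True: "deberta" matches, so the workaround branch fires
```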

@@ -0,0 +1,70 @@
# Adding a model

`jiant` supports or can easily be exteneded to support Hugging Face Hugging Face's [Transformer models](https://huggingface.co/transformers/viewer/) since `jiant` utilizes [Auto Classes](https://huggingface.co/transformers/model_doc/auto.html) to determine the architecture of the model used based on the name of the [pretrained model](https://huggingface.co/models). To add a model not currently supported in `jiant`, follow the following steps:

Typos: "exteneded", "Hugging Face Hugging Face's"

Maybe add a clarifying explanation (and let me know if this is wrong): We do use AutoModels to resolve the model in jiant, but in order to ensure the jiant pipeline works correctly (e.g. matching the correct tokenizer) and to deal with some subtle differences between models, jiant needs to know some additional information/supports specific handling for specific models, so some additional steps are needed to set up a new model from HF in jiant.
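As a concrete illustration of the Auto Class resolution described above (the checkpoint name is just an example, not prescribed by the guide):

```python
# Example only: any Hugging Face checkpoint name could be used here.
from transformers import AutoConfig, AutoModel, AutoTokenizer

model_name = "microsoft/deberta-v2-xlarge"
config = AutoConfig.from_pretrained(model_name)       # resolves to DebertaV2Config
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModel.from_pretrained(model_name)         # resolves to DebertaV2Model
print(type(model).__name__)
```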

```
class DebertaV2MLMHead(BaseMLMHead):
    ...
```


Conclude with "you should now be able to ..."
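To make the quoted `DebertaV2MLMHead` snippet above more concrete, here is a minimal standalone sketch. `BaseMLMHead`'s real interface lives in jiant, so this version subclasses `torch.nn.Module`, and the layer layout is an assumption rather than the code merged in this PR:

```python
import torch
import torch.nn as nn

class DebertaV2MLMHead(nn.Module):
    """Illustrative MLM head: projects hidden states back to vocabulary logits."""

    def __init__(self, hidden_size: int, vocab_size: int, layer_norm_eps: float = 1e-7):
        super().__init__()
        self.dense = nn.Linear(hidden_size, hidden_size)
        self.layer_norm = nn.LayerNorm(hidden_size, eps=layer_norm_eps)
        self.decoder = nn.Linear(hidden_size, vocab_size)

    def forward(self, unpooled: torch.Tensor) -> torch.Tensor:
        x = nn.functional.gelu(self.dense(unpooled))
        x = self.layer_norm(x)
        return self.decoder(x)  # [batch, seq_len, vocab_size] logits
```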

@jeswan merged commit e8536a9 into js/feature/easy_add_model on Apr 23, 2021
@jeswan deleted the js/feature/add_deberta branch on April 23, 2021 17:12
jeswan added a commit that referenced this pull request May 4, 2021
* Update to Transformers v4.3.3 (#1266)

* use default return_dict in taskmodels and remove hidden state context manager in models.

* return hidden states in output of model wrapper

* Switch to task model/head factories instead of embedded if-else statements (#1268)

* Use jiant transformers model wrapper instead of if-else. Use taskmodel and head factory instead of if-else.

* switch to ModelArchitectures enum instead of strings

* Refactor get_output_from_encoder() to be member of JiantTaskModel (#1283)

* refactor getting output from encoder to be member function of jiant model

* switch to explicit encode() in jiant transformers model

* fix simple runscript test

* update to tokenizer 0.10.1

* Add tests for flat_strip() (#1289)

* add flat_strip test

* add list to test cases flat_strip

* mlm_weights(), feat_spec(), flat_strip() if-else refactors (#1288)

* moves remaining if-else statements to jiant model or replaces with model agnostic method

* switch from jiant_transformers_model to encoder

* fix bug in flat_strip()

* Move tokenization logic to central JiantModelTransformers method (#1290)

* move model specific tokenization logic to JiantTransformerModels

* implement abstract methods for JiantTransformerModels

* fix tasks circular import (#1296)

* Add DeBERTa (#1295)

* Add DeBERTa with sanity test

* fix tasks circular import

* [WIP] add deberta tests

* Revert "fix tasks circular import"

This reverts commit f924640.

* deberta tests passing with transformers 6472d8

* switch to deberta-v2

* fix get_mlm_weights_dict() for deberta-v2

* update to transformers 4.5.0

* mark deberta test_export as slow

* Update test_tokenization_normalization.py

* add guide to add a model

* fix test_export_model tests

* minor pytest fixes (add num_labels for rte, overnight flag fix)

* bugfix for simple api notebook

* bugfix for #1310

* bugfix for #1306: simple api notebook path name

* squad running

* 2nd bugfix for #1310: not all tasks have num_labels property

* simple api notebook back to roberta-base

* run test matrix for more steps to compare to master

* save last/best model test fix

Co-authored-by: Jesse Swanson <js11133@nyu.edu>