Add DeBERTa #1295
Conversation
Force-pushed from 659405c to 9f587ce
Codecov Report
```diff
@@            Coverage Diff                         @@
##    js/feature/easy_add_model    #1295      +/-   ##
=============================================================
- Coverage        49.83%    49.66%    -0.18%
=============================================================
  Files              162       162
  Lines            11170     11191       +21
=============================================================
- Hits              5567      5558        -9
- Misses            5603      5633       +30
```
Continue to review full report at Codecov.
Force-pushed from d9f8cb4 to 046e4bb
Minor documentation comments.
```diff
@@ -160,6 +160,9 @@ def load_encoder_from_transformers_weights(
     for k, v in weights_dict.items():
         if k.startswith(encoder_prefix):
             load_weights_dict[strings.remove_prefix(k, encoder_prefix)] = v
+        elif k.startswith(encoder_prefix.split("-")[0]):
+            # workaround for deberta-v2
```
Can you add more detail for this comment?
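For context, a minimal standalone sketch of the fallback being discussed. The underlying assumption (plausible but not confirmed in this thread) is that the encoder prefix jiant derives (e.g. `"deberta-v2."`) does not match the attribute prefix in Hugging Face DeBERTa V2 checkpoints (`"deberta."`), so a second `startswith()` check on `encoder_prefix.split("-")[0]` catches those keys:

```python
# Sketch of the deberta-v2 prefix fallback; not jiant's actual implementation.
# remove_prefix() stands in for jiant's strings.remove_prefix helper.

def remove_prefix(s: str, prefix: str) -> str:
    return s[len(prefix):] if s.startswith(prefix) else s


def select_encoder_weights(weights_dict: dict, encoder_prefix: str) -> dict:
    load_weights_dict = {}
    # "deberta-v2." -> "deberta." (the prefix HF checkpoint keys actually use)
    base_prefix = encoder_prefix.split("-")[0] + "."
    for k, v in weights_dict.items():
        if k.startswith(encoder_prefix):
            load_weights_dict[remove_prefix(k, encoder_prefix)] = v
        elif k.startswith(base_prefix):
            # workaround for deberta-v2: strip the shorter checkpoint prefix
            load_weights_dict[remove_prefix(k, base_prefix)] = v
    return load_weights_dict


weights = {"deberta.embeddings.word_embeddings.weight": 0}
print(select_encoder_weights(weights, "deberta-v2."))
# {'embeddings.word_embeddings.weight': 0}
```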
guides/models/adding_models.md (Outdated)
```diff
@@ -0,0 +1,70 @@
+# Adding a model
+
+`jiant` supports or can easily be exteneded to support Hugging Face Hugging Face's [Transformer models](https://huggingface.co/transformers/viewer/) since `jiant` utilizes [Auto Classes](https://huggingface.co/transformers/model_doc/auto.html) to determine the architecture of the model used based on the name of the [pretrained model](https://huggingface.co/models). To add a model not currently supported in `jiant`, follow the following steps:
```
Typos: "exteneded", "Hugging Face Hugging Face's"
Maybe add a clarifying explanation (and let me know if this is wrong): we do use AutoModels to resolve the model in jiant, but to ensure the jiant pipeline works correctly (e.g., matching the correct tokenizer) and to handle some subtle differences between models, jiant needs additional model-specific information and handling. Some additional steps are therefore needed to set up a new model from HF in jiant.
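For reference, a short sketch of the Auto Classes resolution the guide paragraph describes: Transformers picks the concrete config, tokenizer, and architecture from the pretrained model name alone (the model name here is just an example):

```python
from transformers import AutoConfig, AutoModel, AutoTokenizer

model_name = "microsoft/deberta-v2-xlarge"
config = AutoConfig.from_pretrained(model_name)        # resolves to DebertaV2Config
tokenizer = AutoTokenizer.from_pretrained(model_name)  # resolves to a DeBERTa V2 tokenizer
model = AutoModel.from_pretrained(model_name)          # resolves to DebertaV2Model
```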
```python
class DebertaV2MLMHead(BaseMLMHead):
    ...
```
Conclude with "you should now be able to ..."
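For illustration, a hypothetical fleshed-out version of the head stubbed above, following the usual BERT-style MLM head pattern (transform, LayerNorm, vocabulary projection). `BaseMLMHead` is not defined here, and all constructor details are assumptions rather than the PR's actual code:

```python
import torch
import torch.nn as nn

# Hypothetical sketch only: subclasses nn.Module in place of BaseMLMHead.
class DebertaV2MLMHead(nn.Module):
    def __init__(self, hidden_size: int, vocab_size: int, layer_norm_eps: float = 1e-7):
        super().__init__()
        self.dense = nn.Linear(hidden_size, hidden_size)
        self.activation = nn.GELU()
        self.layer_norm = nn.LayerNorm(hidden_size, eps=layer_norm_eps)
        self.decoder = nn.Linear(hidden_size, vocab_size)

    def forward(self, unpooled: torch.Tensor) -> torch.Tensor:
        # Transform the encoder's unpooled hidden states, then project to vocab.
        hidden = self.layer_norm(self.activation(self.dense(unpooled)))
        return self.decoder(hidden)  # MLM logits over the vocabulary
```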
* Update to Transformers v4.3.3 (#1266)
  * use default return_dict in taskmodels and remove hidden state context manager in models
  * return hidden states in output of model wrapper
* Switch to task model/head factories instead of embedded if-else statements (#1268)
  * use jiant transformers model wrapper instead of if-else; use taskmodel and head factory instead of if-else
  * switch to ModelArchitectures enum instead of strings
* Refactor get_output_from_encoder() to be member of JiantTaskModel (#1283)
  * refactor getting output from encoder to be member function of jiant model
  * switch to explicit encode() in jiant transformers model
  * fix simple runscript test
  * update to tokenizer 0.10.1
* Add tests for flat_strip() (#1289)
  * add flat_strip test
  * add list to test cases flat_strip
* mlm_weights(), feat_spec(), flat_strip() if-else refactors (#1288)
  * moves remaining if-else statements to jiant model or replaces with model agnostic method
  * switch from jiant_transformers_model to encoder
  * fix bug in flat_strip()
* Move tokenization logic to central JiantModelTransformers method (#1290)
  * move model specific tokenization logic to JiantTransformerModels
  * implement abstract methods for JiantTransformerModels
* fix tasks circular import (#1296)
* Add DeBERTa (#1295)
  * Add DeBERTa with sanity test
  * fix tasks circular import
  * [WIP] add deberta tests
  * Revert "fix tasks circular import" (reverts commit f924640)
  * deberta tests passing with transformers 6472d8
  * switch to deberta-v2
  * fix get_mlm_weights_dict() for deberta-v2
  * update to transformers 4.5.0
  * mark deberta test_export as slow
  * Update test_tokenization_normalization.py
  * add guide to add a model
  * fix test_export_model tests
  * minor pytest fixes (add num_labels for rte, overnight flag fix)
  * bugfix for simple api notebook
  * bugfix for #1310
  * bugfix for #1306: simple api notebook path name
  * squad running
  * 2nd bugfix for #1310: not all tasks have num_labels property
  * simple api notebook back to roberta-base
  * run test matrix for more steps to compare to master
  * save last/best model test fix

Co-authored-by: Jesse Swanson <js11133@nyu.edu>
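The factory commits above (#1268, #1288) replace embedded if-else chains with a registry lookup. A minimal illustrative sketch of that pattern follows; the decorator and all names are assumptions, not jiant's actual API:

```python
# Hypothetical registry-based factory, sketching the refactor's general shape.
_TASKMODEL_REGISTRY = {}


def register_taskmodel(model_arch: str):
    """Register a taskmodel class under an architecture key."""
    def decorator(cls):
        _TASKMODEL_REGISTRY[model_arch] = cls
        return cls
    return decorator


@register_taskmodel("deberta-v2")
class DebertaV2TaskModel:
    def __init__(self, **kwargs):
        self.config = kwargs


def create_taskmodel(model_arch: str, **kwargs):
    # Lookup replaces a chain of `if model_arch == ...` branches.
    if model_arch not in _TASKMODEL_REGISTRY:
        raise KeyError(f"No taskmodel registered for: {model_arch}")
    return _TASKMODEL_REGISTRY[model_arch](**kwargs)
```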
The PR adds DeBERTa V2 support to `jiant`. The PR also includes documentation for how to add a model with the latest changes in `js/feature/easy_add_model`.