PretrainedModel: made `_load_pretrained_model_low_mem` static + bug fix #16548

FrancescoSaverioZuppichini · 2022-04-01T14:13:19Z

What does this PR do?

This PR makes `PretrainedModel._load_pretrained_model_low_mem` a `staticmethod`, since it does not return an instance of `PretrainedModel`. I've also fixed a bug inside the method itself: the main `for` loop was wrong.
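To illustrate the distinction, here is a minimal sketch using a hypothetical `ToyModel` class. The class, method names, and signatures are assumptions made for illustration and are not the actual transformers code. A `classmethod` is the natural fit for a constructor that builds and returns a new instance through `cls`, while a helper that only fills in a model passed to it needs neither `cls` nor `self` and can be a `staticmethod`.

```python
class ToyModel:
    """Hypothetical stand-in for a model class (not the transformers implementation)."""

    def __init__(self):
        self.weights = {}

    @classmethod
    def from_state(cls, state):
        # A classmethod fits here: it uses `cls` to build and return a new instance.
        model = cls()
        cls._load_state_low_mem(model, list(state.keys()), state)
        return model

    @staticmethod
    def _load_state_low_mem(model, keys, state):
        # A staticmethod fits here: the model to fill is passed in explicitly,
        # so neither `cls` nor `self` is needed, and nothing is returned.
        for key in keys:
            model.weights[key] = state[key]


model = ToyModel.from_state({"layer.weight": [1.0, 2.0], "layer.bias": [0.0]})
print(sorted(model.weights))  # ['layer.bias', 'layer.weight']
```

The helper can also be called directly on the class with an existing model, for example `ToyModel._load_state_low_mem(existing_model, keys, state)`, without creating a new instance.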

HuggingFaceDocBuilderDev · 2022-04-01T14:27:18Z

The documentation is not available anymore as the PR was closed or merged.

patil-suraj

Great catch, thanks for the fix!

sgugger

Thanks for fixing! Cc-ing @stas00 who wrote the original method to double-check the classmethod here was just a typo (the error in the loop is mine, this is definitely the right fix ;-) )

patil-suraj · 2022-04-05T09:55:46Z

Merging this, since `low_cpu_mem_usage` is not working on master right now and is required to test the addition of some new models.

@stas feel free to still leave a comment about the classmethod change :)


made _load_pretrained_model_low_mem static + bug fix

26b8634

FrancescoSaverioZuppichini requested review from sgugger and patil-suraj April 1, 2022 14:13

FrancescoSaverioZuppichini changed the title ~~PretrainedModel: made _load_pretrained_model_low_mem static + bug fix~~ PretrainedModel: made _load_pretrained_model_low_mem static + bug fix Apr 1, 2022

patil-suraj approved these changes Apr 4, 2022

sgugger approved these changes Apr 4, 2022

patil-suraj mentioned this pull request Apr 4, 2022

handle torch_dtype in low cpu mem usage #16580

Merged

patil-suraj merged commit 8bf6d28 into main Apr 5, 2022

patil-suraj deleted the _load_pretrained_model_low_mem branch April 5, 2022 09:56

stevhliu pushed a commit to stevhliu/transformers that referenced this pull request Apr 5, 2022

made _load_pretrained_model_low_mem static + bug fix (huggingface#16548)

e98825c
