forked from huggingface/transformers
Latest 2605 #2
Merged
Conversation
Fix GroundingDINO, DPR after BERT SDPA update
* Fixed SegGptImageProcessor to handle 2D and 3D prompt mask inputs
* Added new test to check prompt mask equivalence
* New proposal
* Better proposal
* Removed unnecessary method
* Updated seggpt docs
* Introduced do_convert_rgb
* nits
* Allow boolean FSDP options in fsdp_config * Use lower() to be safe
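A minimal sketch of what this enables, assuming the usual `fsdp_config` keys (the output directory and chosen options are illustrative):

```python
from transformers import TrainingArguments

# After this fix, fsdp_config values can be real Python booleans rather than
# the "true"/"false" strings the parser previously required (hence the added
# .lower() safety net for strings).
args = TrainingArguments(
    output_dir="out",                # illustrative path
    fsdp="full_shard auto_wrap",
    fsdp_config={
        "forward_prefetch": True,    # boolean accepted directly
        "use_orig_params": True,
        "sync_module_states": True,
    },
)
```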
…30507) * Pass attn_implementation when using AutoXXX.from_config * Fix
Co-authored-by: Clint Adams <clint@debian.org>
fix Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
fix Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
…0442)
* Reenable SDPA's FA2 during training with torch.compile
* fix Olmo's SDPA FA2 dispatching too
* update formatting
* improved SDPA comment
* formatting and explanatory comment
* is_causal if statement to one-liner
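The dispatch pattern at stake, as a standalone sketch (names and shapes are illustrative, not the exact modeling code):

```python
import torch
import torch.nn.functional as F

def sdpa_attention(query, key, value, attention_mask=None):
    # query/key/value: (batch, heads, seq_len, head_dim).
    # With no explicit mask and more than one query position, pass
    # is_causal=True so SDPA can select its flash-attention kernel; the
    # one-liner form keeps the branch friendly to torch.compile.
    is_causal = attention_mask is None and query.shape[2] > 1
    return F.scaled_dot_product_attention(
        query, key, value, attn_mask=attention_mask, is_causal=is_causal
    )
```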
* Include safetensors * Cleanup
pass use_cache in kwargs
* feat: support for dinov2
* feat: support for depth_anything
* feat: support for efficientformer
* feat: support for bert (is this right?)
* update: embedding split
* remove: empty string
* feat: support for align
* fix: copies
* fix: QAQBertEmbeddings
* fix: more consistency issues
* revert: support for efficientformer
* feat: support for altclip
* feat: support for blip_text
* support for ChineseCLIP
* feat: support for depth anything
* feat: support for dpt
* feat: support for dpt
* feat: support for git
* feat: support for groupvit
* update: format
* fix: support for clip
* fix: consistency
* feat: support for pvt
* feat: support for vit_msn
* fix: consistency
* fix: other copies
* remove: device transfer
* revert: in-place add
* update: support for align
* update: support for bert
* update: support for Chinese CLIP
* revert: changes to efficientformer
* update: support for dpt
* update: support for efficientformer
* revert: changes to git
* revert: changes to groupvit
* revert: changes to roc_bert
* update: support for vit_msn
* revert: changes to dpt
* remove: extra space
* style: extra space
* fix seq2seq data collator to respect the given padding strategy; also adds tests for the seq2seq data collator in the style of `data_collator_for_token_classification` (pt, tf, np)
* formatting and change bool equals "==" to "is"
* add missed return types in tests
* update numpy test as it can handle unequal shapes, unlike pt or tf
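A usage sketch of the now-respected padding strategy (checkpoint and lengths are illustrative):

```python
from transformers import AutoTokenizer, DataCollatorForSeq2Seq

tokenizer = AutoTokenizer.from_pretrained("t5-small")
# After the fix, "max_length" is honored instead of silently falling back to
# dynamic padding to the longest sequence in the batch.
collator = DataCollatorForSeq2Seq(tokenizer, padding="max_length", max_length=16)
batch = collator([
    {"input_ids": [1, 2, 3], "labels": [4, 5]},
    {"input_ids": [1, 2], "labels": [6]},
])
print(batch["input_ids"].shape)  # torch.Size([2, 16])
```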
* add BLIP get_multimodal_features
* Fix docstring error
* reimplement get_multimodal_features
* fix error
* recheck code quality
* add new necessary tests
huggingface#30558)
* added chat templating support for KeyDataset in the generation pipeline
* fixed and improved test
* fix formatting test failures
* Fix tests
* Fix tests
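Roughly what this enables (the model and the tiny dataset are placeholders):

```python
from datasets import Dataset
from transformers import pipeline
from transformers.pipelines.pt_utils import KeyDataset

pipe = pipeline("text-generation", model="HuggingFaceH4/zephyr-7b-beta")
ds = Dataset.from_list([
    {"chat": [{"role": "user", "content": "Tell me a joke."}]},
])
# Each "chat" entry is a list of messages; with this PR the pipeline applies
# the tokenizer's chat template to them instead of choking on non-string input.
for out in pipe(KeyDataset(ds, "chat"), max_new_tokens=32):
    print(out)
```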
* fix doctest
* fix torch doctest
* make CI happy
* raise error
* make fixup
* More general PR slow CI
* Update utils/pr_slow_ci_models.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* fix
* add test
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
use text config's vocab size
…e#30410)
* move scaling to nn.Module
* let the test be here for now (need to fix)
* failing tests
* last failing models
* Revert commit 4c14817
* clean-up
* oops forgot
* codestyle
* raise NotImplemented when possible
* Update tests/test_modeling_common.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* skip tests in respective modeling files
---------
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* fix Marian model conversion
* uncomment that line
* remove unnecessary code
* revert tie_weights, doesn't hurt
* Temporarily silence warnings in apply_chat_template until we can properly deprecate default chat templates
* make fixup
* Move the default chat template warning into apply_chat_template itself
* make fixup
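For reference, the call that carries the (temporarily silenced) warning when no template is set on the tokenizer; the checkpoint is illustrative:

```python
from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("HuggingFaceH4/zephyr-7b-beta")
messages = [{"role": "user", "content": "Hello!"}]
# Tokenizers that define their own chat_template never hit the default-template
# path, so they are unaffected by the deprecation this commit prepares.
prompt = tok.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
print(prompt)
```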
* Handle cases when CLS token is absent * Use BOS token as a fallback
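A hypothetical helper illustrating the fallback (the function name is ours, not from the patch):

```python
def sequence_start_token_id(tokenizer):
    # Prefer CLS when the tokenizer defines one; otherwise fall back to BOS.
    if tokenizer.cls_token_id is not None:
        return tokenizer.cls_token_id
    return tokenizer.bos_token_id
```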
remove example
Fix --model_type in examples
* Gemma: only display the activation warning when necessary
  This is a nit PR, but I was confused: I got the warning even after I had changed `hidden_act` to `gelu_pytorch_tanh`, telling me that I was using the "legacy" `gelu_pytorch_tanh`. Another option is to keep the warning but change the message to something like "`hidden_act` is ignored, please use `hidden_activation` instead. Setting Gemma's activation function to `gelu_pytorch_tanh`."
* Change message, and set `config.hidden_activation`
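The resulting configuration usage, sketched (field name per this PR; assumes a recent GemmaConfig):

```python
from transformers import GemmaConfig

# After this change, set the activation via `hidden_activation`; the legacy
# `hidden_act` field is ignored for Gemma's MLP and only triggers the warning.
config = GemmaConfig(hidden_activation="gelu_pytorch_tanh")
print(config.hidden_activation)  # "gelu_pytorch_tanh"
```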
* clean-up
* Update src/transformers/cache_utils.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* Update src/transformers/cache_utils.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* Update src/transformers/cache_utils.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* fixup
* Update tests/quantization/quanto_integration/test_quanto.py
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
* Update src/transformers/generation/configuration_utils.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* more suggestions
* mapping if torch available
* run tests & add 'support_quantized' flag
* fix jamba test
* revert, will be fixed by another PR
* codestyle
* HQQ and versatile cache classes
* final update
* typo
* make tests happy
---------
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
…#30942)
* refactor quant docs
* delete file
* rename to overview
* fix
* fix table
* fix
* add content
* fix library versions
* fix table
* fix table
* fix table
* fix table
* Apply suggestions from code review
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* replace to quantization_config
* fix aqlm snippet
* add DLAI courses
* fix
* fix table
* fix bullet points
---------
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
…0919) add torch.compile dynamic support
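What this unlocks, as a short sketch (checkpoint is illustrative):

```python
import torch
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("gpt2")
# dynamic=True requests shape-polymorphic graphs, so generation does not
# recompile for every new sequence length.
compiled_model = torch.compile(model, dynamic=True)
```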
* Change in quantization docs
* Update overview.md
* Update docs/source/en/quantization/overview.md
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
---------
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
* Fix accelerate tests
* fix clip
* skip dbrx tests
* fix GPTSan
* fix M2M100Model
* same fix as jamba
* fix mt5
* Fix T5Model
* Fix umt5 model
* fix switch_transformers
* fix whisper
* fix gptsan again
* fix siglip recent test
* skip siglip tests
* wrong place fixed
…#30774)
* add xpu check
* add marker
* add documentation
* update doc
* fix ci
* remove from global init
* fix
* Add a check that warmup_steps is either 0 or >= 1
  Update training_args.py to add a check that warmup_steps is either 0 or >= 1; otherwise, raise an error.
* Update src/transformers/training_args.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
---------
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
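A standalone sketch of the guard (the real check lives in training_args.py; this free function is hypothetical):

```python
def validate_warmup_steps(warmup_steps):
    # Values strictly between 0 and 1 almost always mean the caller wanted
    # `warmup_ratio`; only 0 (disabled) or a whole step count >= 1 is valid.
    if warmup_steps != 0 and warmup_steps < 1:
        raise ValueError(
            "warmup_steps must be either 0 or >= 1; "
            "use warmup_ratio for a fractional warmup."
        )
```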
* fix
* fix
* fix
* fix
* fix
* [run-slow] mpt
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
* chore: initial commit
* chore: adding imports and inits
* chore: adding the causal and classification code
* chore: adding names to the layers
* chore: using single self attn layer
* chore: built the model and layers
* chore: start with testing
* chore: docstring change, transpose fix
* fix: rotary embedding
* chore: adding cache implementation
* remove unused torch
* chore: fixing the indexing issue
* make fix-copies
* Use modeling_tf_utils.keras
* make fixup
* chore: fixing tests
* chore: adding past key value logic
* chore: adding multi-label classification test
* fix: switching on the built parameters in the layers
* fixing repo consistency
* ruff formats
* style changes
* fix: tf and pt equivalence
* removing returns from docstrings
* fix docstrings
* fix docstrings
* removing todos
* fix copies
* fix docstring
* fix docstring
* chore: using easier rotate_half
* adding integration tests
* chore: addressing review related to rotary embedding layer
* review changes
* [run-slow] mistral
* skip: test save load after resize token embedding
* style
---------
Co-authored-by: Matt <rocketknight1@gmail.com>
…_nllb_fast.py (huggingface#29834)
* Fix typo in tokenization_nllb.py
  Change `adder_tokens_decoder` into `added_tokens_decoder` and improve the warning's readability.
* Fix typo in tokenization_nllb_fast.py
  Change `adder_tokens_decoder` into `added_tokens_decoder` and improve the warning's readability.
* Remove deprecated attributes in tokenization_nllb.py
  Remove deprecated attributes: `lang_code_to_id`, `fairseq_tokens_to_ids`, `id_to_lang_code`, and `fairseq_ids_to_tokens`
* Remove deprecated attribute in tokenization_nllb_fast.py
  Remove deprecated attribute `lang_code_to_id`
* Remove deprecated properties in tokenization_nllb.py
  Remove deprecated properties - fix format
* Remove deprecated properties in tokenization_nllb_fast.py
  Remove deprecated properties - fix format
* Update test_tokenization_nllb.py
* update test_tokenization_nllb.py
* Update tokenization_nllb.py
* Update test_tokenization_seamless_m4t.py
* Update test_tokenization_seamless_m4t.py
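With `lang_code_to_id` removed, language-code ids come from the regular vocabulary lookup, e.g.:

```python
from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("facebook/nllb-200-distilled-600M")
# Replacement for the removed tok.lang_code_to_id["fra_Latn"]:
fra_id = tok.convert_tokens_to_ids("fra_Latn")
```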
…0897)
* fix wandb always uploading initial model
* Update comment.
* Optionally log initial model
* Revert "Optionally log initial model"
  This reverts commit 9602cc1fad3feaf218f82a7339a194d3d2fbb946.
* add prefix space ignored in llama huggingface#29625
* adding test with add_prefix_space=False
* ruff
---------
Co-authored-by: Ita Zaporozhets <itazaporozhets@Itas-MBP.localdomain>
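A usage sketch of the honored flag (the checkpoint is illustrative):

```python
from transformers import AutoTokenizer

# Sketch: with the fix, add_prefix_space=False is respected instead of being
# silently ignored, so "Hello" is not tokenized as if it were " Hello".
tok = AutoTokenizer.from_pretrained(
    "meta-llama/Llama-2-7b-hf",  # illustrative Llama checkpoint
    add_prefix_space=False,
)
print(tok.tokenize("Hello"))  # no leading "▁" expected
```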
…ating pos_bias in LayoutLM v2, v3 (huggingface#26139)" (huggingface#30988)
* Revert "optimize VRAM for calculating pos_bias in LayoutLM v2, v3 (huggingface#26139)"
  This reverts commit a7e0ed8.
* Instead of reverting commit, wrap indexing in torch.no_grad context
* Apply wrapping in LayoutLMv2
* Add comments explaining reason for no_grad
* Fix code format
---------
Co-authored-by: Kevin Koehncke <kevin.koehncke@uipath.com>
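The wrapping pattern, in a simplified sketch (not the full LayoutLM bucketing logic):

```python
import torch

def position_bias_indices(relative_positions, num_buckets=32):
    # Bucket-index computation is integer bookkeeping with no learnable part,
    # so running it under no_grad skips autograd bookkeeping and saves VRAM
    # while leaving the learned bias table untouched.
    with torch.no_grad():
        half = num_buckets // 2
        buckets = relative_positions.clamp(-half, half - 1) + half
    return buckets
```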
* fix * [push-ci-image] * run with latest --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
* add test that currently fails
* test passed
* all perceiver passed
* fixup, style, quality, repo-consistency, all passed
* Apply suggestions from code review: default to False + compute sqrt once only
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* fix a minor bracket
* replace dim with self._num_channels
* add arguments to the rest preprocessors
---------
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
fix awq mistral test
* allow multi-gpu
* allow multi-gpu
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
* Fix resume_download future warning
* better like this
* Add regression test
* Fix remaining quant tests * Update test_quanto.py
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
…ngface#30732)
* added interpolation for the ViTMAE model in PyTorch as well as TF
* Update modeling_vit_mae.py: irregular import fixed
* small changes and proper formatting
* changes suggested in review
* modified decoder interpolate_func
* arguments and docstring fix
* Apply suggestions from code review: doc fixes
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
---------
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
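A usage sketch of the new flag (input resolution is illustrative):

```python
import torch
from transformers import ViTMAEModel

model = ViTMAEModel.from_pretrained("facebook/vit-mae-base")
pixel_values = torch.randn(1, 3, 384, 384)  # larger than the 224px pretraining size
# The flag interpolates the position embeddings to the larger patch grid.
outputs = model(pixel_values, interpolate_pos_encoding=True)
```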
* seems like `split_special_tokens` is used here
* split special token
* add new line at end of file
* moving split special token test to common tests
* added assertions
* test
* fixup
* add co-author
* passing rest of args to gptsan_japanese, fixing tests
* removing direct comparison of fast and slow models
* adding test support for UDOP and LayoutXLM
* ruff fix
* readd check if slow tokenizer
* modify test to handle bos tokens
* removing commented function
* trigger build
* applying review feedback - updated docstrings, var names, and simplified tests
* ruff fixes
* Update tests/test_tokenization_common.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* applying feedback, comments
* shutil temp directory fix
---------
Co-authored-by: Arthur Zucker <arthur.zucker@gmail.com>
Co-authored-by: Ita Zaporozhets <itazaporozhets@Itas-MBP.localdomain>
Co-authored-by: itazap <itazap@users.noreply.github.com>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
Co-authored-by: Ita Zaporozhets <itazaporozhets@Itas-MacBook-Pro.local>
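The behavior under test, roughly (BERT checkpoint is illustrative):

```python
from transformers import AutoTokenizer

# Default behavior: "[CLS]" in raw text collapses to the special token.
tok = AutoTokenizer.from_pretrained("bert-base-uncased")
print(tok.tokenize("[CLS] hello"))

# With split_special_tokens=True, it is tokenized like ordinary text instead.
tok_split = AutoTokenizer.from_pretrained("bert-base-uncased", split_special_tokens=True)
print(tok_split.tokenize("[CLS] hello"))
```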
* fix devices and dtype assignments * [run-slow]paligemma
Biswajit2902 pushed a commit that referenced this pull request on May 26, 2024
* Cohere Model Release (#1)
  Cohere Model Release
* Remove unnecessary files and code (#2)
  Some cleanup
* Delete cohere-model directory (#3)
* Make Fix (huggingface#5)
* Pr fixes (huggingface#6)
* fixes for pr
* pr fixes for the format
* pr fixes for the format
* src/transformers/models/auto/tokenization_auto.py
* Tokenizer test (huggingface#8)
* tokenizer test
* format fix
* Adding Docs and other minor changes (huggingface#7)
* Add modeling tests (huggingface#9)
* Smol Fix (huggingface#11)
* tokenization tests are fixed
* format fixes
* fix pr doc tests
* fix pr doc tests
* fix pr doc tests
* fix pr style check
* small changes in cohere.md
* FIX: Address final comments for transformers integration (huggingface#13)
* fix modeling final nits and add proper test file
* for now leave empty tests
* add integration test
* push new test
* fix modeling cohere (huggingface#14)
* Update chat templates to use the new API (huggingface#15)
---------
Co-authored-by: ahmetustun <ahmetustun89@gmail.com>
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>
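A usage sketch for the release, using the checkpoint name as published by Cohere:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "CohereForAI/c4ai-command-r-v01"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

messages = [{"role": "user", "content": "Hello, how are you?"}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
)
tokens = model.generate(input_ids, max_new_tokens=40)
print(tokenizer.decode(tokens[0]))
```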
What does this PR do?
Fixes # (issue)
Before submitting
- This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
- Did you read the contributor guideline, Pull Request section?
- Was this discussed/approved via a GitHub issue or the forum? Please add a link to it if that's the case.
- Did you make sure to update the documentation with your changes? Here are the documentation guidelines, and here are tips on formatting docstrings.
- Did you write any new necessary tests?
Who can review?
Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.