ValueError: OLMoForCausalLM does not support Flash Attention 2.0 yet #29145 #1
Commits on Jan 17, 2024
- Add Qwen2: config, modeling, and tokenization, plus auto classes, tokenizer docs, and tests, squashed from many intermediate fix-up and debugging commits. Co-authored-by: Ren Xuancheng, renxuancheng.rxc, Arthur. SHA: d6ffe74
- Skip the bf16 test if not supported by the device: use `is_torch_bf16_available_on_device` and `is_torch_fp16_available_on_device`, switch to a public 1B Llama model, and fix a flaky test. SHA: 2c1eebc
- Allow training dinov2 with different dtypes like bf16 (#28504). Training dinov2 in bf16 previously failed in `modeling_dinov2.py` with `RuntimeError: Input type (float) and bias type (c10::BFloat16) should be the same`, because the input stayed in torch.float32 while the parameters were cast. The CLIP vision encoder already performs an automatic dtype conversion in `modeling_clip.py`, so the same conversion is added to `modeling_dinov2.py`. SHA: fa6d12f
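The casting pattern borrowed from CLIP for the dinov2 fix can be sketched as follows. This is a minimal, hypothetical module (not the actual Dinov2 code): cast the input to the projection weight's dtype before applying the layer.

```python
import torch
from torch import nn

class PatchProjection(nn.Module):
    # Hypothetical sketch of the fix: mirror CLIP's automatic dtype
    # conversion by casting pixel values to the conv weight's dtype.
    def __init__(self):
        super().__init__()
        self.projection = nn.Conv2d(3, 8, kernel_size=4, stride=4)

    def forward(self, pixel_values: torch.Tensor) -> torch.Tensor:
        target_dtype = self.projection.weight.dtype
        return self.projection(pixel_values.to(dtype=target_dtype))
```

Without the `.to(dtype=...)` cast, feeding a float32 input into bf16 weights raises exactly the `RuntimeError` quoted above.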
- Fix Switch Transformers when sparse_step = 1 (#28564): in the sparse_step = 1 case, the existing code did not work. SHA: 98dda8e
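One way such a bug can arise (a hypothetical sketch, not the exact upstream code): if sparse layers are selected with a plain `layer_index % sparse_step == 1` test, then `sparse_step == 1` selects nothing, because `i % 1` is always 0, so that case needs an explicit branch.

```python
def sparse_layer_indices(num_layers: int, sparse_step: int) -> list:
    # Hypothetical selection logic: with sparse_step == 1 every layer
    # should be sparse, but `i % 1 == 1` is never true, so guard it.
    if sparse_step == 1:
        return list(range(num_layers))
    return [i for i in range(num_layers) if i % sparse_step == 1]
```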
Commits on Jan 18, 2024
- Save processor: update `tests/models/auto/test_processor_auto.py` and `tests/test_processing_common.py`, plus a follow-up fix. Co-authored-by: ydshieh, Arthur. SHA: 3005f96
- Use `weights_only` only if torch >= 1.13 (#28506). Co-authored-by: ydshieh. SHA: a1668cc
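The version gate for this change can be sketched like this (hypothetical helper name, not the actual transformers code): only pass `weights_only=True` to `torch.load` when the installed torch is at least 1.13, since older versions do not accept the argument.

```python
def torch_load_kwargs(torch_version: str) -> dict:
    # `torch.load(..., weights_only=True)` exists only from torch 1.13 on;
    # passing it to older versions raises a TypeError.
    major, minor = (int(part) for part in torch_version.split(".")[:2])
    return {"weights_only": True} if (major, minor) >= (1, 13) else {}
```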
- [Core Tokenization] Support a fix for spm fast models (#26678): save all special tokens, keep added tokens un-normalized (normalizing them triggers stripping, which is unwanted for normal functioning), fix type-field and forward-compatibility issues, revert unrelated Llama changes, add tests for T5 and more, and update `src/transformers/convert_slow_tokenizer.py`. SHA: 8189977
- SHA: 5d8eb93
- Add new meta w2v2-conformer BERT-like model (#28165): add Wav2Vec2-BERT with new config, modeling code, feature extractor, processor, and conversion script; refactor `input_values` to `input_features`; remove the pretraining model and the Gumbel class; add `add_adapter` to the ASR training example; plus extensive test, docstring, and copied-from cleanup. Co-authored-by: Sanchit Gandhi, amyeroberts. SHA: d2cdefb
- Use `LoggingLevel` context manager in 3 tests (#28575): move assertions inside `with LoggingLevel(...)` and remove `is_flaky`. Co-authored-by: ydshieh. SHA: 0754217
- Fix the documentation checkpoint for xlm-roberta-xl (#28567) and improve docstring consistency. SHA: c662c78
- [ASR Pipe] Update init to set model type and subsequently call parent init method (#28486): add image processor arg, call super, remove args. SHA: 0eaa5ea
- [Whisper Tok] Move token ids to CPU when computing offsets (#28485), checking for the torch attribute first. SHA: 619ecfe
- [Whisper] Fix audio classification with weighted layer sum (#28563), with tests. SHA: 186aa6b
- Making CTC training example more general (#28582): add w2v2bert compatibility to `examples/pytorch/speech-recognition/run_speech_recognition_ctc.py`. Co-authored-by: amyeroberts. SHA: 772307b
Commits on Jan 19, 2024
- Don't save `processor_config.json` if a processor has no extra attribute (#28584). Co-authored-by: ydshieh. SHA: db9a7e9
- SHA: b2748a6
- Add w2v2bert to pipeline (#28585): generalize the ASR pipeline to fbank models and change the w2v2 pipeline output. SHA: 268fc1f
- SHA: d4fc1eb
- [Whisper] Finalize batched SOTA long-form generation (#27658): a long series of iterations covering VAD for large-v2, decoder-mask handling, edge cases, `return_dict_in_generate`, a logit-processor fix, docstrings, and slow/pipeline/tokenizer tests. Co-authored-by: Sanchit Gandhi, Joao Gante, Arthur. SHA: 690fe73
- Fix wrong xpu device in DistributedType.MULTI_XPU mode (#28386): remove the `elif xpu` branch and redundant code. SHA: 8db6436
- SHA: faf0354
- [Llava] Fix convert_llava_weights_to_hf.py script (#28570): fix the call to `tokenizer.add_tokens`, and pass `special_tokens` to `tokenizer.add_tokens` in `convert_vipllava_weights_to_hf.py` as well. SHA: 5b7f4bc
- Allow add_tokens for ESM (#28535): allow non-special tokens to be added, standardize the ESM tokenizer a bit, revert changes to `id_to_token` and `token_to_id`, and add a test. Co-authored-by: Arthur. SHA: d157815
- SHA: 9efec11
- SHA: 948ffff
- Fix auxiliary loss related code in transformers (#28406): fix the DETA freeze/unfreeze functions, enable auxiliary and encoder losses in the DETA training pipeline, split configuration between DetaModel and DetaForObjectDetection, move distributed code to accelerate, fix `aux_loss` in Conditional DETR, Table Transformer, YOLOS, and MaskFormer, and add tests. Co-authored-by: Arthur, amyeroberts. SHA: 3f69f41
Commits on Jan 21, 2024
- [GPTNeoX] Fix BC issue with 4.36 (#28602): fix a dtype issue, add a test, and update copied-from mentions. SHA: 83f9196
Commits on Jan 22, 2024
- SHA: f0acf7b
- Add missing key to TFLayoutLM signature (#28640): fix missing `bbox` in the LayoutLM signature. SHA: bf67415
- Avoid root logger's level being changed (#28638). Co-authored-by: ydshieh. SHA: d336c56
- Add config tip to custom model docs (#28601). SHA: 692c3c6
- Fix lr_scheduler in no_trainer training scripts (#27872). SHA: deb2b59
- [Llava] Update convert_llava_weights_to_hf.py script (#28617): remove the config update that added padding to `vocab_size` and `text_config.vocab_size` (it caused a `ValueError`), remove state-dict keys ending with `inv_freq`, and add examples and instructions for creating the `model_state_dict.bin` used by the script; also update `convert_vipllava_weights_to_hf.py`. SHA: dafd595
- [GPTNeoX] Fix GPTNeoX + Flash Attention 2 issue (#28645): update `modeling_gpt_neox.py`. SHA: e201864
- Update image_processing_deformable_detr.py (#28561), with changes after running `make fix-copies`. SHA: a35ea57
- [SigLIP] Only import tokenizer if sentencepiece available (#28636). SHA: 590be77
- Fix phi model doc checkpoint (#28581). Co-authored-by: Pashmina Cameron. SHA: e547458
Commits on Jan 23, 2024
- Get default device through `PartialState().default_device` as it has been officially released (#27256). SHA: 1fc1296
- integrations: fix DVCLiveCallback model logging (#28653). Authored by Dave Berenbaum, Jan 23, 2024. SHA: 0398660
- Enable safetensors conversion from PyTorch to other frameworks without the torch requirement (#27599): shield the torch import, handle bfloat16, and add requirements and tests. Co-authored-by: Nicolas Patry, sanchit-gandhi. SHA: 008a6a2
- Enable instantiating model with pretrained backbone weights (#28214): update configs, docs, and validation checks, clarify the exception message, and add a test for `use_timm_backbone=True`. Co-authored-by: Arthur. SHA: 27c79a0
- `tensor_size` - fix copy/paste error msg typo (#28660). SHA: c475eca
- Fix windows err with checkpoint race conditions (#28637). SHA: 582d104
- Add dataloader prefetch factor in training args and trainer (#28498): `dataloader_prefetch_factor` only works when data is loaded in a process other than the main one, so the change also prevents `dataloader_num_workers == 0` combined with a non-None `dataloader_prefetch_factor`. Co-authored-by: amyeroberts. SHA: 5b5e71d
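The check described above can be sketched as a small validation helper (hypothetical name and message, not the actual `TrainingArguments` code): prefetching only happens in worker processes, so a non-None prefetch factor with zero workers is rejected.

```python
def validate_dataloader_args(num_workers: int, prefetch_factor) -> None:
    # Prefetching happens in worker processes; with num_workers == 0 the
    # data is loaded in the main process and prefetch_factor is meaningless.
    if num_workers == 0 and prefetch_factor is not None:
        raise ValueError(
            "--dataloader_prefetch_factor requires --dataloader_num_workers > 0"
        )
```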
- Support single token decode for `CodeGenTokenizer` (#28628): convert a bare token id to a list in `.decode()`. SHA: 9a4521d
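The wrapping step amounts to a tiny normalization (a sketch with a hypothetical helper name, not the tokenizer's actual method): if `.decode()` receives a single int id, wrap it in a list so the rest of the decoding path can iterate.

```python
def as_id_list(token_ids):
    # `.decode()` may be called with a single int id; wrap it so
    # downstream code can always iterate over a list of ids.
    if isinstance(token_ids, int):
        return [token_ids]
    return list(token_ids)
```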
- Remove deprecated eager_serving fn (#28665), and fix the `input_signature` docstring. SHA: ebc8f47
- Fix a hidden bug of `GenerationConfig` so that `generation_config.json` can be loaded successfully (#28604): keep `sort_keys=True` to maintain visibility and, in `configuration_utils.py`, check the items when `obj` is a list. Co-authored-by: amyeroberts. SHA: 39c3c0a
- SHA: 5f81266
Commits on Jan 24, 2024
- Exclude the load balancing loss of padding tokens in Mixtral-8x7B (#28517): make `load_balancing_loss_func` take the attention mask into account (skipping the mask computation when `attention_mask=None`) and add tests asserting the loss differs with and without padding. SHA: c5c6909
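The core idea, sketched with a hypothetical helper (not the actual `load_balancing_loss_func` signature): when averaging router statistics, weight each token by the attention mask so padding positions contribute nothing.

```python
import torch

def masked_router_mean(router_probs: torch.Tensor,
                       attention_mask: torch.Tensor) -> torch.Tensor:
    # router_probs: (batch, seq_len, num_experts); attention_mask: (batch, seq_len)
    # Per-expert mean routing probability over real (non-padded) tokens only.
    mask = attention_mask.to(router_probs.dtype).unsqueeze(-1)
    return (router_probs * mask).sum(dim=(0, 1)) / mask.sum(dim=(0, 1)).clamp(min=1)
```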
- Use save_safetensor to disable safe serialization for XLA (#28669), with a style fixup. SHA: 0549000
- SHA: bb6aa8b
- config, optim, pre deploy, deploy, save weights, memory, troubleshoot, non-Trainer. SHA: 738ec75
- Improved type hinting for all attention parameters (#28479): change attention inputs and hidden states to `Optional[Tuple[torch.FloatTensor, ...]] = None` across dozens of modeling files. SHA: 5d29530
- Improve efficient training on CPU documentation (#28646): refine wording, add dtypes, and use avx512-vnni. Co-authored-by: Steven Liu. SHA: 8278b15
- [docs] Fix doc format (#28684): fix `hfoptions`. SHA: f40b87d
Commits on Jan 25, 2024
-
* First draft * More improvements * More improvements * More improvements * More improvements * Add docs * Remove file * Add copied from * Address comments * Address comments * Address comments * Fix style * Update docs * Convert all checkpoints, add integration test * Rename checkpoints * Add pretrained backbone attributes * Fix default config * Address comment * Add figure to docs * Fix bug thanks to @xenova * Update conversion script * Fix integration test
Configuration menu - View commit details
-
Copy full SHA for 963db81 - Browse repository at this point
Copy the full SHA 963db81View commit details -
[chore] Add missing space in warning (#28695)
(commit 7fa4b36)
(commit 2000095)
Update question_answering.md (#28694)
Fix typo: `TFAutoModelForQuestionAnswering("distilbert-base-uncased")` should be `TFAutoModelForQuestionAnswering.from_pretrained("distilbert-base-uncased")`.
(commit 24f1a00)
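The typo fix above matters because Auto classes in transformers are not meant to be constructed directly; only the `from_pretrained` classmethod works. A minimal sketch of that pattern (the class here is a hypothetical stand-in, not the library's code; real Auto classes raise `EnvironmentError` on direct instantiation):

```python
class AutoModelSketch:
    """Stand-in for transformers' Auto classes, which cannot be built directly."""

    def __init__(self):
        raise EnvironmentError(
            "AutoModelSketch is designed to be instantiated using "
            "AutoModelSketch.from_pretrained(pretrained_model_name_or_path)"
        )

    @classmethod
    def from_pretrained(cls, name_or_path):
        # The real method resolves config + weights; here we just record the name.
        model = object.__new__(cls)  # bypass __init__
        model.name_or_path = name_or_path
        return model


model = AutoModelSketch.from_pretrained("distilbert-base-uncased")
print(model.name_or_path)  # distilbert-base-uncased
```

Calling `AutoModelSketch(...)` instead of `from_pretrained(...)` fails immediately, which is exactly the error the docs typo would have produced.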
[Vilt] Align input and model dtype in the ViltPatchEmbeddings forward pass (#28633)
align dtype
(commit 4cbd876)
[docs] Improve visualization for vertical parallelism (#28583)
The documentation says "We refer to this Model parallelism as 'Vertical' because of how models are typically visualized," but then visualized the model horizontally. This change indeed visualizes the model vertically.
(commit 2875195)
Commits on Jan 26, 2024
-
Don't fail on `LocalEntryNotFoundError` during `processor_config.json` loading (#28709)
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
(commit 142ce68)
Fix duplicate & unnecessary flash attention warnings (#28557)
* fix duplicate & unnecessary flash warnings * trigger ci * warning_once * if/else order --------- Co-authored-by: Your Name <you@example.com>
(commit 8eb74c1)
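The deduplication above relies on a `warning_once`-style helper so identical warnings from every layer fire only once. The idea can be sketched with `functools.lru_cache`, which remembers each distinct message (a simplification of the library's actual logger method, not its exact code):

```python
import functools

emitted = []  # stand-in for a real logger sink


@functools.lru_cache(maxsize=None)
def warning_once(message: str) -> None:
    # lru_cache caches per distinct message, so repeated calls become no-ops
    emitted.append(message)


# e.g. every decoder layer warns about the same flash-attention condition
for _ in range(8):
    warning_once("Flash Attention 2 called with padding; unpadding inputs")

print(len(emitted))  # 1
```

A new, different message string would still be emitted once, so genuinely distinct warnings are not lost.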
support PeftMixedModel signature inspect (#28321)
* support PeftMixedModel signature inspect * import PeftMixedModel only peft>=0.7.0 * Update src/transformers/trainer.py Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> * Update src/transformers/trainer.py Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> * Update src/transformers/trainer.py Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> * Update src/transformers/trainer.py Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> * Update src/transformers/trainer.py Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> * Update src/transformers/trainer.py Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> * fix styling * Update src/transformers/trainer.py Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> * Update src/transformers/trainer.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * style fixup * fix note --------- Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
(commit bbe30c6)
(commit 1f47a24)
[docs] Update preprocessing.md (#28719)
Adjust the ImageProcessor link to a working target (the same one used in the lower section of the file).
(commit 3a46e30)
Initialize `_tqdm_active` with `hf_hub_utils.are_progress_bars_disabled()` to respect `HF_HUB_DISABLE_PROGRESS_BARS` (#28717)
`enable_progress_bar()` and `disable_progress_bar()` sync up with huggingface_hub, but the initial value was always `True`. This change makes sure the user's preference is respected implicitly on initialization.
(commit d6ac8f4)
fix Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
(commit a638de1)
Stop confusing the TF compiler with ModelOutput objects (#28712)
* Stop confusing the TF compiler with ModelOutput objects * Stop confusing the TF compiler with ModelOutput objects
(commit 708b19e)
fix: suppress `GatedRepoError` to use cache file (fix #28558) (#28566)
* suppress `GatedRepoError` to use cache file * move the condition_to_return parameter back outside
(commit 3aea38c)
* try pydantic v2 * try pydantic v2 --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
(commit f8b7c43)
(commit abe0289)
(commit de13a95)
Commits on Jan 27, 2024
-
(commit a28a769)
(commit 03cc177)
Commits on Jan 28, 2024
-
[Siglip] Protect from imports if sentencepiece is not installed (#28737)
(commit f1cc615)
Commits on Jan 29, 2024
-
Add serialization logic to pytree types (#27871)
* Add serialized type name to pytrees * Modify context * add serde test
(commit 243e186)
Fix `DepthEstimationPipeline`'s docstring (#28733)
(commit 5649c0c)
(commit 39fa400)
[Docs] Fix typo in English & Japanese CLIP model documentation (TMBD -> TMDB) (#28751)
* [Docs] Fix typo in English CLIP model_doc * [Docs] Fix typo in Japanese CLIP model_doc
(commit 3a08cc4)
PatchTST and PatchTSMixer fixes (#28083)
* 🐛 fix .max bug * remove prediction_length from regression output dimensions * fix parameter names, fix output names, update tests * ensure shape for PatchTST * ensure output shape for PatchTSMixer * update model, batch, and expected for regression distribution test * update test expected Signed-off-by: Wesley M. Gifford <wmgifford@us.ibm.com> * Update tests/models/patchtst/test_modeling_patchtst.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update tests/models/patchtst/test_modeling_patchtst.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update tests/models/patchtst/test_modeling_patchtst.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/patchtsmixer/modeling_patchtsmixer.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update tests/models/patchtsmixer/test_modeling_patchtsmixer.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update tests/models/patchtsmixer/test_modeling_patchtsmixer.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * standardize on patch_length Signed-off-by: Wesley M. Gifford <wmgifford@us.ibm.com> * Update tests/models/patchtsmixer/test_modeling_patchtsmixer.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update tests/models/patchtsmixer/test_modeling_patchtsmixer.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Make arguments more explicit Signed-off-by: Wesley M. Gifford <wmgifford@us.ibm.com> * adjust prepared inputs Signed-off-by: Wesley M. Gifford <wmgifford@us.ibm.com> --------- Signed-off-by: Wesley M. Gifford <wmgifford@us.ibm.com> Co-authored-by: Wesley M. Gifford <wmgifford@us.ibm.com> Co-authored-by: Kashif Rasul <kashif.rasul@gmail.com> Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
(commit f72c7c2)
Enable Gradient Checkpointing in Deformable DETR (#28686)
* Enabled gradient checkpointing in Deformable DETR * Enabled gradient checkpointing in Deformable DETR encoder * Removed # Copied from headers in modeling_deta.py to break dependence on Deformable DETR code
(commit 0548af5)
(commit 26aa03a)
Pin pytest version <8.0.0 (#28758)
* Pin pytest version <8.0.0 * Update setup.py * make deps_table_update
(commit 0f8d015)
Mark test_constrained_beam_search_generate as flaky (#28757)
* Make test_constrained_beam_search_generate as flaky * Update tests/generation/test_utils.py
(commit 9e8f35f)
(commit e694e98)
[Whisper] Make tokenizer normalization public (#28136)
* [Whisper] Make tokenizer normalization public * add to docs
(commit da3c79b)
Support saving only PEFT adapter in checkpoints when using PEFT + FSDP (#28297)
* Update trainer.py * Revert "Update trainer.py" This reverts commit 0557e2c. * Make trainer.py use adapter_only=True when using FSDP + PEFT * Support load_best_model with adapter_only=True * Ruff format * Inspect function args for save_/load_ fsdp utility functions and only pass adapter_only=True if they support it
(commit a055d09)
Add French translation: french README.md (#28696)
* doc: french README Signed-off-by: ThibaultLengagne <thibaultl@padok.fr> * doc: Add Depth Anything Signed-off-by: ThibaultLengagne <thibaultl@padok.fr> * doc: Add french link in other docs Signed-off-by: ThibaultLengagne <thibaultl@padok.fr> * doc: Add missing links in fr docs * doc: fix several mistakes in translation Signed-off-by: ThibaultLengagne <thibaultl@padok.fr> --------- Signed-off-by: ThibaultLengagne <thibaultl@padok.fr> Co-authored-by: Sarapuce <alexandreh@padok.fr>
(commit cd2eb8c)
Commits on Jan 30, 2024
-
Don't allow passing `load_in_8bit` and `load_in_4bit` at the same time (#28266)
* Update quantization_config.py * Style * Protect from setting directly * add tests * Update tests/quantization/bnb/test_4bit.py Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
(commit a989c6c)
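"Protect from setting directly" means the check has to live in the property setters as well as the constructor, so the invalid combination cannot be reached by mutation after construction. A hypothetical sketch of that validation (class name and messages are illustrative, not the library's exact code):

```python
class QuantConfigSketch:
    """Toy config where 8-bit and 4-bit loading are mutually exclusive."""

    def __init__(self, load_in_8bit: bool = False, load_in_4bit: bool = False):
        self._load_in_8bit = False
        self._load_in_4bit = False
        # route through the setters so the guard applies everywhere
        self.load_in_8bit = load_in_8bit
        self.load_in_4bit = load_in_4bit

    @property
    def load_in_8bit(self):
        return self._load_in_8bit

    @load_in_8bit.setter
    def load_in_8bit(self, value: bool):
        if value and self._load_in_4bit:
            raise ValueError("load_in_4bit and load_in_8bit are both True, but only one can be used at the same time")
        self._load_in_8bit = value

    @property
    def load_in_4bit(self):
        return self._load_in_4bit

    @load_in_4bit.setter
    def load_in_4bit(self, value: bool):
        if value and self._load_in_8bit:
            raise ValueError("load_in_4bit and load_in_8bit are both True, but only one can be used at the same time")
        self._load_in_4bit = value
```

Passing both flags raises `ValueError` at construction time, and flipping one flag to `True` while the other is set raises the same error.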
Move CLIP _no_split_modules to CLIPPreTrainedModel (#27841)
Add _no_split_modules to CLIPModel
(commit 1f5590d)
`HfQuantizer` class for quantization-related stuff in `modeling_utils.py` (#26610)
Squashed refactor that moves quantization handling out of `modeling_utils.py` into dedicated `HfQuantizer` classes (bitsandbytes 4/8-bit, GPTQ, AWQ): 4-bit saving enabled, `QuantizationConfigParser` reworked, `create_quantized_param` replacing per-method branches, plus docs, tests, and review fixes. Co-authored-by: younesbelkada, Younes Belkada, Marc Sun, Steven Liu, Arthur.
(commit d78e78a)
[HfQuantizer] Move it to "Developer guides" (#28768)
Update _toctree.yml
(commit 866253f)
* use conv for tdnn * run make fixup * update TDNN * add PEFT LoRA check * propagate tdnn warnings to others * add missing imports * update TDNN in wav2vec2_bert * add missing imports
(commit 5c8d941)
Fix transformers.utils.fx compatibility with torch<2.0 (#28774)
guard sdpa on torch>=2.0
(commit 6f7d5db)
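The SDPA guard above exists because `torch.nn.functional.scaled_dot_product_attention` was only introduced in PyTorch 2.0, so any code path using it must be gated on the installed torch version. A minimal sketch of such a guard (naive string parsing for illustration; real code would use a proper version-parsing library such as `packaging`):

```python
def supports_sdpa(torch_version: str) -> bool:
    """Return True if this torch version ships scaled_dot_product_attention."""
    base = torch_version.split("+")[0]            # drop local tags like "+cu118"
    major, minor = (int(p) for p in base.split(".")[:2])
    return (major, minor) >= (2, 0)               # SDPA landed in torch 2.0


print(supports_sdpa("1.13.1"), supports_sdpa("2.1.0+cu118"))  # False True
```

Callers would fall back to the eager attention implementation whenever the guard returns `False`.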
Further pin pytest version (in a temporary way) (#28780)
fix Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
(commit c24c524)
[Backbone] Use `load_backbone` instead of `AutoBackbone.from_config` (#28661)
* Enable instantiating model with pretrained backbone weights * Remove doc updates until changes made in modeling code * Use load_backbone instead * Add use_timm_backbone to the model configs * Add missing imports and arguments * Update docstrings * Make sure test is properly configured * Include recent DPT updates
(commit 2fa1c80)
Task-specific pipeline init args (#28439)
* Abstract out pipeline init args * Address PR comments * Reword * BC PIPELINE_INIT_ARGS * Remove old arguments * Small fix
(commit 1d489b3)
Add tf_keras imports to prepare for Keras 3 (#28588)
* Port core files + ESM (because ESM code is odd) * Search-replace in modelling code * Fix up transfo_xl as well * Fix other core files + tests (still need to add correct import to tests) * Fix cookiecutter * make fixup, fix imports in some more core files * Auto-add imports to tests * Cleanup, add imports to sagemaker tests * Use correct exception for importing tf_keras * Fixes in modeling_tf_utils * make fixup * Correct version parsing code * Ensure the pipeline tests correctly revert to float32 after each test * Ensure the pipeline tests correctly revert to float32 after each test * More tf.keras -> keras * Add dtype cast * Better imports of tf_keras * Add a cast for tf.assign, just in case * Fix callback imports
(commit 415e9a0)
* Pin torch to <2.2.0 * Pin torchvision and torchaudio as well * Playing around with versions to see if this helps * twiddle something to restart the CI * twiddle it back * Try changing the natten version * make fixup * Revert "Try changing the natten version" This reverts commit de0d659. * make fixup * fix fix fix * fix fix fix --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
(commit 74c9cfe)
Commits on Jan 31, 2024
-
(commit d703eaa)
Prevent MLflow exception from disrupting training (#28779)
Modified MLflow logging metrics from synchronous to asynchronous Co-authored-by: codiceSpaghetti <alessio.ser@hotmail.it>
(commit a937425)
Don't initialize the output embeddings if we're going to tie them to input embeddings (#28192)
* test that tied output embeddings aren't initialized on load * don't initialize the output embeddings if we're going to tie them to the input embeddings
(commit ae0c27a)
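Weight tying means the LM head reuses the input-embedding matrix, so materializing and initializing a separate output matrix on load is wasted work. A toy sketch of the skip (plain Python lists stand in for real tensors; the class is hypothetical):

```python
def expensive_init(vocab_size: int, hidden: int):
    # stands in for a real (slow) random weight initialization
    return [[0.0] * hidden for _ in range(vocab_size)]


class TiedHeadSketch:
    def __init__(self, vocab_size: int, hidden: int, tie_word_embeddings: bool = True):
        self.embed_tokens = expensive_init(vocab_size, hidden)
        if tie_word_embeddings:
            # share storage: no second allocation, no second init pass
            self.lm_head = self.embed_tokens
        else:
            self.lm_head = expensive_init(vocab_size, hidden)


m = TiedHeadSketch(100, 16)
print(m.lm_head is m.embed_tokens)  # True
```

The accompanying test in the PR checks exactly this: tied output embeddings are never separately initialized on load.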
[HFQuantizer] Remove `check_packages_compatibility` logic (#28789)
(commit f9f1f2a)
[Whisper] Refactor forced_decoder_ids & prompt ids (#28687)
* up * Fix more * Correct more * Fix more tests * fix fast tests * Fix more * fix more * push all files * finish all * make style * Fix timestamp wrap * make style * make style * up * up * up * Fix lang detection behavior * Fix lang detection behavior * Add lang detection test * Fix lang detection behavior * make style * Update src/transformers/models/whisper/generation_whisper.py Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com> * better error message * make style tests * add warning --------- Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>
(commit 65a926e)
Resolve DeepSpeed cannot resume training with PeftModel (#28746)
* fix: resolve deepspeed resume peft model issues * chore: update something * chore: update model instance pass into is peft model checks * chore: remove hard code value to tests * fix: format code
(commit bebeeee)
canonical repos moves (#28795)
* canonical repos moves * Style --------- Co-authored-by: Lysandre <lysandre@huggingface.co>
(commit 721e2d9)
Wrap Keras methods to support BatchEncoding (#28734)
* Shim the Keras methods to support BatchEncoding * Extract everything to a convert_batch_encoding function * Convert BatchFeature too (thanks Amy) * tf.keras -> keras
(commit 7a49610)
Add Flax Mistral (squashed)
Direct port from the Flax Llama work: Mistral modules' forward pass with a sliding-window causal mask and cache support, plus tests, docs, copied-from annotations, weight-dtype and sharded-weight conversion fixes, and styling; the FlaxMistralForSequenceClassification variant was removed before merge.
(commit f7076cd)
DeepSpeed: hardcode `torch.arange` dtype on `float` usage to avoid incorrect initialization (#28760)
(commit beb2a09)
Add artifact name in job step to maintain job / artifact correspondence (#28682)
* avoid using job name * apply to other files Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
(commit 95346e9)
Split daily CI using 2 level matrix (#28773)
* update / add new workflow files * Add comment * Use env.NUM_SLICES * use scripts * use scripts * use scripts * Fix * using one script * Fix * remove unused file * update * fail-fast: false * remove unused file * fix * fix * use matrix * inputs * style * update * fix * fix * no model name * add doc * allow args * style * pass argument --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
(commit 4735866)
[docs] Correct the statement in the docstring of compute_transition_scores in generation/utils.py (#28786)
(commit 7b2bd1f)
Commits on Feb 1, 2024
-
Adding [T5/MT5/UMT5]ForTokenClassification (#28443)
* Adding [T5/MT5/UMT5]ForTokenClassification * Add auto mappings for T5ForTokenClassification and variants * Adding ForTokenClassification to the list of models * Adding attention_mask param to the T5ForTokenClassification test * Remove outdated comment in test * Adding EncoderOnly and Token Classification tests for MT5 and UMT5 * Fix typo in umt5 string * Add tests for all the existing MT5 models * Fix wrong comment in dependency_versions_table * Reverting change to common test for _keys_to_ignore_on_load_missing The test is correctly picking up redundant keys in _keys_to_ignore_on_load_missing. * Removing _keys_to_ignore_on_missing from MT5 since the key is not used in the model * Add fix-copies to MT5ModelTest
(commit 0d26abd)
Make `is_torch_bf16_available_on_device` more strict (#28796)
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
(commit eb8e7a0)
Fix symbolic_trace with kv cache (#28724)
* fix symbolic_trace with kv cache * comment & better test
(commit 709dc43)
Add tip on setting tokenizer attributes (#28764)
* Add tip on setting tokenizer attributes * Grammar * Remove the bit that was causing doc builds to fail
(commit 7bc6d76)
Enable gradient checkpointing in DetaObjectDetection and add tests in Swin/Donut_Swin (#28615)
* enable gradient checkpointing in DetaObjectDetection * fix missing part in original DETA * make style * make fix-copies * Revert "make fix-copies" This reverts commit 4041c86. * remove fix-copies of DetaDecoder * enable swin gradient checkpointing * fix gradient checkpointing in donut_swin * add tests for deta/swin/donut * Revert "fix gradient checkpointing in donut_swin" This reverts commit 1cf345e. * change supports_gradient_checkpointing pipeline to PreTrainedModel * Revert "add tests for deta/swin/donut" This reverts commit 6056ffb. * Revert "Revert "fix gradient checkpointing in donut_swin"" This reverts commit 24e25d0. * Simple revert * enable deformable detr gradient checkpointing * add gradient in encoder
(commit e19c12e)
[docs] fix some bugs about parameter description (#28806)
Co-authored-by: p_spozzhang <p_spozzhang@tencent.com>
(commit d98591a)
* Add modelss * Add 2 more models * add models to tocrree * Add modles * Update docs/source/ja/model_doc/detr.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/ja/model_doc/deit.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/ja/model_doc/deplot.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * fix bugs --------- Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
(commit 23ea674)
* backbones * fix path * fix paths * fix code snippet * fix links
(commit abbffc4)
Commits on Feb 2, 2024
-
(commit 2418c64)
[Docs] Fix spelling and grammar mistakes (#28825)
* Fix typos and grammar mistakes in docs and examples * Fix typos in docstrings and comments * Fix spelling of `tokenizer` in model tests * Remove erroneous spaces in decorators * Remove extra spaces in Markdown link texts
(commit 721ee78)
Explicitly check if token IDs are None in TFBertTokenizer constructor (#28824) Add an explicit None check, since token ids can be 0
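The bug here is the classic Python truthiness trap: a token id of 0 is falsy, so a bare `if token_id:` silently treats it like a missing value. A minimal sketch of the difference — the function name and arguments are illustrative, not the actual TFBertTokenizer internals:

```python
def collect_special_ids(cls_token_id, sep_token_id):
    """Keep only the special token ids that are actually set."""
    # Buggy: `if t` drops a legitimate id of 0 along with None.
    buggy = [t for t in (cls_token_id, sep_token_id) if t]
    # Fixed: an explicit None check keeps id 0 and drops only missing ids.
    fixed = [t for t in (cls_token_id, sep_token_id) if t is not None]
    return buggy, fixed

# A vocabulary where the CLS token happens to map to id 0:
buggy, fixed = collect_special_ids(0, 102)
```

With id 0 in play, the truthiness version loses a real token while the explicit check keeps it.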
Commit: 1efb21c
-
Add missing None check for hf_quantizer (#28804)
* Add missing None check for hf_quantizer * Add test, fix logic. * make style * Switch test model to Mistral * Comment * Update tests/test_modeling_utils.py --------- Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
Commit: ec29d25
-
Fix issues caused by natten (#28834)
try Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
Commit: 0e75aee
-
fix / skip (for now) some tests before switch to torch 2.2 (#28838)
* fix / skip some tests before we can switch to torch 2.2 * style --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
Commit: a7cb92a
-
Use `-v` for `pytest` on CircleCI (#28840) use -v in pytest Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
Commit: f497795
-
Reduce GPU memory usage when using FSDP+PEFT (#28830)
support FSDP+PEFT
Commit: 80d5007
-
Mark `test_encoder_decoder_model_generate` for `vision_encoder_decoder` as flaky (#28842) Mark test as flaky
Commit: 3d2900e
Commits on Feb 5, 2024
-
Bump dash from 2.3.0 to 2.15.0 in /examples/research_projects/decision_transformer (#28845) Bump dash in /examples/research_projects/decision_transformer Bumps [dash](https://github.com/plotly/dash) from 2.3.0 to 2.15.0. - [Release notes](https://github.com/plotly/dash/releases) - [Changelog](https://github.com/plotly/dash/blob/dev/CHANGELOG.md) - [Commits](plotly/dash@v2.3.0...v2.15.0) --- updated-dependencies: - dependency-name: dash dependency-type: direct:production ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Commit: ca8944c
-
Support custom scheduler in deepspeed training (#26831)
Reuse trainer.create_scheduler to create scheduler for deepspeed
Commit: 7b70283
-
[Docs] Fix bad doc: replace save with logging (#28855)
Fix bad doc: replace save with logging
Commit: c430d6e
-
Ability to override clean_code_for_run (#28783)
* Add clean_code_for_run function * Call clean_code_for_run from agent method
Commit: 0466fd5
-
[WIP] Hard error when ignoring tensors. (#27484)
* [WIP] Hard error when ignoring tensors. * Better selection/error when saving a checkpoint. - Find all names we should normally drop (those are in the transformers config) - Find all disjoint tensors (for those we can safely trigger a copy to get rid of the sharing before saving) - Clone those disjoint tensors getting rid of the issue - Find all identical names (those should be declared in the config but we try to find them all anyway.) - For all identical names: - If they are in the config, just ignore them everything is fine - If they are not, warn about them. - For all remainder tensors which are shared yet neither identical NOR disjoint. raise a hard error. * Adding a failing test on `main` that passes here. * We don't need to keep the subfolder logic in this test. * Apply suggestions from code review Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> --------- Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
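The selection logic described above hinges on detecting which entries of a state dict alias the same storage. A simplified sketch of the grouping step — for real torch tensors you would group by storage pointer (e.g. `tensor.untyped_storage().data_ptr()`) rather than Python object identity; plain objects are used here so the idea runs anywhere:

```python
from collections import defaultdict

def find_shared_groups(state_dict):
    # Group parameter names by the identity of their backing object.
    # Groups with more than one name share memory and must either be
    # declared as tied weights or trigger an error before saving,
    # since formats like safetensors refuse aliased tensors.
    groups = defaultdict(list)
    for name, tensor in state_dict.items():
        groups[id(tensor)].append(name)
    return [sorted(names) for names in groups.values() if len(names) > 1]

weight = [0.1, 0.2]               # stand-in for a shared weight tensor
state_dict = {
    "embed.weight": weight,       # input embedding
    "lm_head.weight": weight,     # tied output head -> same object
    "lm_head.bias": [0.0],        # not shared
}
shared = find_shared_groups(state_dict)
```

The real PR additionally distinguishes disjoint views (safe to clone) from fully identical tensors (expected ties) before raising.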
Commit: 2da28c4
-
Commit: 3f9f749
-
Correct wav2vec2-bert inputs_to_logits_ratio (#28821)
* Correct wav2vec2-bert inputs_to_logits_ratio * correct ratio * correct ratio, clean asr pipeline * refactor on one line
Commit: 7addc93
-
Image Feature Extraction pipeline (#28216)
* Draft pipeline * Fixup * Fix docstrings * Update doctest * Update pipeline_model_mapping * Update docstring * Update tests * Update src/transformers/pipelines/image_feature_extraction.py Co-authored-by: Omar Sanseviero <osanseviero@gmail.com> * Fix docstrings - review comments * Remove pipeline mapping for composite vision models * Add to pipeline tests * Remove for flava (multimodal) * safe pil import * Add requirements for pipeline run * Account for super slow efficientnet * Review comments * Fix tests * Swap order of kwargs * Use build_pipeline_init_args * Add back FE pipeline for Vilt * Include image_processor_kwargs in docstring * Mark test as flaky * Update TODO * Update tests/pipelines/test_pipelines_image_feature_extraction.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Add license header --------- Co-authored-by: Omar Sanseviero <osanseviero@gmail.com> Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
Commit: ba3264b
-
ClearMLCallback enhancements: support multiple runs and handle logging better (#28559) * add clearml tracker * support multiple train runs * remove bad code * add UI entries for config/hparams overrides * handle models in different tasks * run ruff format * tidy code based on code review --------- Co-authored-by: Eugen Ajechiloae <eugenajechiloae@gmail.com>
Commit: 0690116
Commits on Feb 6, 2024
-
Commit: ac51e59
-
Adds LlamaForQuestionAnswering class in modeling_llama.py along with AutoModel Support (#28777) * This is a test commit * testing commit * final commit with some changes * Removed copy statement * Fixed formatting issues * Fixed error added past_key_values in the forward method * Fixed a trailing whitespace. Damn the formatting rules are strict * Added the copy statement
Commit: 2e7c942
-
Bump cryptography from 41.0.2 to 42.0.0 in /examples/research_projects/decision_transformer (#28879) Bump cryptography in /examples/research_projects/decision_transformer Bumps [cryptography](https://github.com/pyca/cryptography) from 41.0.2 to 42.0.0. - [Changelog](https://github.com/pyca/cryptography/blob/main/CHANGELOG.rst) - [Commits](pyca/cryptography@41.0.2...42.0.0) --- updated-dependencies: - dependency-name: cryptography dependency-type: direct:production ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Commit: e83227d
-
[Docs] Update project names and links in awesome-transformers (#28878)
Update project names and repository links in awesome-transformers
Commit: 1ea0bbd
-
Commit: ee2a340
-
Raise error when using `save_only_model` with `load_best_model_at_end` for DeepSpeed/FSDP (#28866) * Raise error when using `save_only_model` with `load_best_model_at_end` for DeepSpeed/FSDP * Update trainer.py
Commit: 5346db1
-
Fix `FastSpeech2ConformerModelTest` and skip it on CPU (#28888) * fix * fix --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
Commit: 6529a5b
-
Commit: 76b4f66
-
* unpin torch * check * check * check --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
Commit: 89439fe
-
Commit: a1afec9
-
[Docs] Fix backticks in inline code and documentation links (#28875)
Fix backticks in code blocks and documentation links
Commit: 4830f26
-
Hotfix - make `torchaudio` get the correct version in `torch_and_flax_job` (#28899) * check * check * check --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
Commit: 40658be
-
[Docs] Add missing language options and fix broken links (#28852)
* Add missing entries to the language selector * Add links to the Colab and AWS Studio notebooks for ONNX * Use anchor links in CONTRIBUTING.md * Fix broken hyperlinks due to spaces * Fix links to OpenAI research articles * Remove confusing footnote symbols from author names, as they are also considered invalid markup
Commit: 1c31b7a
Commits on Feb 7, 2024
-
fix: Fixed the documentation for `logging_first_step` by removing "evaluate" (#28884) Fixed the documentation for logging_first_step by removing evaluate.
Commit: 64d1518
-
Commit: d9deddb
-
Fix Keras scheduler import so it works for older versions of Keras (#28895) Fix our schedule import so it works for older versions of Keras
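Fixes like this usually follow the standard pattern for surviving a module reorganization between library versions: try the newer import path first and fall back to the old one. A generic, runnable sketch with stdlib modules standing in for the Keras paths (`json_v2` is a deliberately nonexistent module name used here only to trigger the fallback branch):

```python
try:
    # Newer location (does not exist in this sketch, so the import fails)...
    from json_v2 import loads  # hypothetical new module path
except ImportError:
    # ...fall back to the long-standing location for older versions.
    from json import loads

# Either branch yields a working `loads`:
config = loads('{"lr": 0.001}')
```

The same shape works for Keras scheduler classes: attempt the Keras 3 path, and on `ImportError` fall back to the older `tf.keras` path.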
Commit: 349a6e8
-
⚠️ Raise `Exception` when trying to generate 0 tokens ⚠️ (#28621) * change warning to exception * Update src/transformers/generation/utils.py Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com> * validate `max_new_tokens` > 0 in `GenerationConfig` * fix truncation test parameterization in `TextGenerationPipelineTests` --------- Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
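Turning the warning into an exception means the misconfiguration fails loudly at validation time instead of silently producing an empty generation. A minimal sketch of the check — the real validation lives in `GenerationConfig`; this standalone function only illustrates the shape:

```python
def validate_max_new_tokens(max_new_tokens):
    # Generating zero (or a negative number of) new tokens is almost
    # always a bug in the caller's length arithmetic, so raise instead
    # of warning and returning an empty sequence.
    if max_new_tokens is not None and max_new_tokens <= 0:
        raise ValueError(
            f"`max_new_tokens` must be greater than 0, got {max_new_tokens}."
        )
    return max_new_tokens
```

`None` is still allowed, since the effective length may be derived from other settings such as `max_length`.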
Commit: abf8f54
-
Update the cache number (#28905)
* fix * fix * fix --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
Commit: 308d2b9
-
Add npu device for pipeline (#28885)
add npu device for pipeline Co-authored-by: unit_test <test@unit.com>
Commit: 5f96855
Commits on Feb 8, 2024
-
[Docs] Fix placement of tilde character (#28913)
Fix placement of tilde character
Commit: 328ade8
-
Commit: 33df036
-
Fix utf-8 yaml load for marian conversion to pytorch in Windows (#28618)
Fix utf-8 yaml in marian conversion
Commit: 4b236ae
-
[`Core generation`] Adds support for static KV cache (#27931) Co-authored-by: fxmarty <9808326+fxmarty@users.noreply.github.com> Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
Commit: 115ac94
-
Commit: 693667b
-
Commit: 0b693e9
-
Commit: cc309fd
-
Support batched input for decoder start ids (#28887)
* support batched input for decoder start ids * Fix typos Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com> * minor changes * fix: decoder_start_id as list * empty commit * empty commit * empty commit * empty commit * empty commit * empty commit * empty commit * empty commit * empty commit --------- Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
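Accepting a per-sequence list of start ids means the handling has to branch on the input shape: broadcast a scalar across the batch, or validate a list against the batch size. A rough sketch of that expansion logic, using plain lists instead of tensors (names are illustrative, not the generation-utils API):

```python
def expand_decoder_start_ids(decoder_start_token_id, batch_size):
    # Scalar id: every sequence in the batch starts with the same token.
    if isinstance(decoder_start_token_id, int):
        return [[decoder_start_token_id] for _ in range(batch_size)]
    # Per-sequence list: must line up with the batch dimension.
    if len(decoder_start_token_id) != batch_size:
        raise ValueError(
            f"Expected {batch_size} decoder start ids, "
            f"got {len(decoder_start_token_id)}."
        )
    return [[t] for t in decoder_start_token_id]

broadcast = expand_decoder_start_ids(2, 3)      # same start token per row
per_row = expand_decoder_start_ids([2, 5, 7], 3)  # one start token per row
```

In the real implementation the result would be a `(batch_size, 1)` tensor fed as `decoder_input_ids`.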
Commit: d628664
-
[Docs] Fix broken links and syntax issues (#28918)
* Fix model documentation links in attention.md * Fix external link syntax * Fix target anchor names of section links * Fix copyright statement comments * Fix documentation headings
Commit: 2749e47
Commits on Feb 9, 2024
-
Fix max_position_embeddings default value for llama2 to 4096 #28241 (#28754) * Changed max_position_embeddings default value from 2048 to 4096 * force push * Fixed formatting issues. Fixed missing argument in write_model. * Reverted to the default value 2048 in the Llama config. Added comments for the llama_version argument. * Fixed issue with default value of max_position_embeddings in docstring * Updated help message for llama versions Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> --------- Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
Commit: de11e65
-
Commit: ebf3ea2
-
Fix type annotations on neftune_noise_alpha and fsdp_config TrainingArguments parameters (#28942)
Commit: d123e66
-
[i18n-de] Translate README.md to German (#28933)
* Translate README.md to German * Add links to README_de.md * Remove invisible characters in README * Change to a formal tone and fix punctuation marks
Commit: 58e3d23
Commits on Feb 12, 2024
-
[Nougat] Fix pipeline (#28242)
* Fix pipeline * Remove print statements * Address comments * Address issue * Remove unused imports
Commit: f278ef2
-
[Docs] Update README and default pipelines (#28864)
* Update README and docs * Update README * Update README
Commit: ef5ab72
-
Convert `torch_dtype` as `str` to actual torch data type (i.e. "float16" to `torch.float16`) (#28208) * Convert torch_dtype as str to actual torch data type (i.e. "float16" to torch.float16) * Check if passed torch_dtype is an attribute in torch * Update src/transformers/pipelines/__init__.py Check type via isinstance Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> --------- Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
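The conversion boils down to looking the string up as an attribute of the `torch` module and verifying the result is really a dtype. A sketch of the pattern using a stand-in namespace so it runs without torch installed — in the real pipeline code the lookup is `getattr(torch, torch_dtype)` guarded by an `isinstance(..., torch.dtype)` / attribute check:

```python
from types import SimpleNamespace

# Stand-in for the torch module: dtype names map to dtype objects.
fake_torch = SimpleNamespace(float16="dtype:float16", float32="dtype:float32")

def resolve_dtype(module, torch_dtype):
    """Accept either a dtype object or its string name, e.g. "float16"."""
    if isinstance(torch_dtype, str):
        if not hasattr(module, torch_dtype):
            raise ValueError(f"Unknown dtype string: {torch_dtype!r}")
        return getattr(module, torch_dtype)
    # Already a dtype object (or anything non-str): pass through unchanged.
    return torch_dtype

resolve_dtype(fake_torch, "float16")  # -> the module's float16 dtype
```

This lets users write `pipeline(..., torch_dtype="float16")` in configs and CLIs where only strings are available.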
Commit: cf4c20b
-
[`pipelines`] updated docstring with vqa alias (#28951) updated docstring with vqa alias
Commit: 1709886
-
Commit: e30bbb2
-
Updated requirements for image-classification samples: datasets>=2.14.0 (#28974) Updated datasets requirements. Need a package version >= 2.14.0
Commit: 792819f
-
Always initialize tied output_embeddings if it has a bias term (#28947)
Continue to initialize tied output_embeddings if it has a bias term The bias term is not tied, and so will need to be initialized accordingly.
Commit: 136cd89
-
Clean up staging tmp checkpoint directory (#28848)
clean up remaining tmp checkpoint dir Signed-off-by: woshiyyya <xiaoyunxuan1998@gmail.com>
Commit: c617f98
-
[Docs] Add language identifiers to fenced code blocks (#28955)
Add language identifiers to code blocks
Commit: fe3df9d
-
Commit: 78ba9f4
-
[i18n-de] Translate CONTRIBUTING.md to German (#28954)
* Translate contributing.md to German * Fix formatting issues in contributing.md * Address review comments * Fix capitalization
Commit: d90acc1
Commits on Feb 13, 2024
-
[`NllbTokenizer`] refactor with added tokens decoder (#27717) * refactor with addedtokens decoder * style * get rid of lang code to id * style * keep some things for BC * update tests * add the mask token at the end of the vocab * nits * nits * fix final tests * style * nits * Update src/transformers/models/nllb/tokenization_nllb_fast.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * nits * style? * Update src/transformers/convert_slow_tokenizer.py * make it a tad bit more custom * ruff please stop Co-Authored by avidale <dale.david@mail.ru> * Update Co-authored-by: avidale <dale.david@mail.ru> * Update Co-authored-by: avidale <dale.david@mail.ru> * oupts * ouft * nites * test * fix the remaining failing tests * style * fix failing test * ficx other test * temp dir + test the raw init * update test * style --------- Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
Commit: b445675
-
Add sudachi_projection option to BertJapaneseTokenizer (#28503)
* add sudachi_projection option * Upgrade sudachipy>=0.6.8 * add a test case for sudachi_projection * Compatible with older versions of SudachiPy * make fixup * make style * error message for unidic download * revert jumanpp test cases * format options for sudachi_projection Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * format options for sudachi_split_mode and sudachi_dict_type * comment * add tests for full_tokenizer kwargs * pass projection arg directly * require_sudachi_projection * make style * revert upgrade sudachipy * check is_sudachi_projection_available() * revert dependency_version_table and bugfix * style format * simply raise ImportError Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * simply raise ImportError --------- Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
Commit: da20209
-
Commit: 3e70a20
-
Update configuration_llama.py: fixed broken link (#28946)
* Update configuration_llama.py: fix broken link * [Nit] Explicit redirection not required Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> --------- Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
Commit: 3de6a6b
-
[`DETR`] Update the processing to adapt masks & bboxes to reflect padding (#28363) * Update the processing so bbox coords are adjusted for padding * Just pad masks * Tidy up, add tests * Better tests * Fix yolos and mark as slow for pycocotools * Fix yolos - return_tensors * Clarify padding and normalization behaviour
Commit: bd4b83e
Commits on Feb 14, 2024
-
ENH: Do not pass warning message in case `quantization_config` is in config but not passed as an arg (#28988) * Update auto.py * Update auto.py * Update src/transformers/quantizers/auto.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/quantizers/auto.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> --------- Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
Commit: 1d12b8b
-
ENH [`AutoQuantizer`]: enhance trainer + not supported quant methods (#28991) * enhance trainer + not support quant methods * remove all old logic * add version
Commit: 164bdef
-
* Add `StableLM` * fix(model): re-create from `huggingface-cli add-new-model-like persimmon` * fix: re-add changes to address comments * fix(readme): add links to paper * fix(tokenization_auto): remove `GPTNeoXTokenizerFastFast` ref * fix(tests): re-add `@slow` decorator to integration tests * fix(tests): import slow... * fix(readme_hd): remove whitespace edit * fix(tokenizer): auto tokenizer tuple * skip doctests for `modeling_stablelm`
Commit: de6029a
-
Add SiglipForImageClassification and CLIPForImageClassification (#28952)
* First draft * Add CLIPForImageClassification * Remove scripts * Fix doctests
Commit: 63ffd56
-
AQLM quantizer support (#28928)
* aqlm init * calibration and dtypes * docs * Readme update * is_aqlm_available * Simpler link in docs * Test TODO real reference * init _import_structure fix * AqlmConfig autodoc * integration aqlm * integrations in tests * docstring fix * legacy typing * Less typings * More kernels information * Performance -> Accuracy * correct tests * remoced multi-gpu test * Update docs/source/en/quantization.md Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> * Update src/transformers/utils/quantization_config.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Brought back multi-gpu tests * Update src/transformers/integrations/aqlm.py Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com> * Update tests/quantization/aqlm_integration/test_aqlm.py Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com> --------- Co-authored-by: Andrei Panferov <blacksamorez@yandex-team.ru> Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
Commit: 1ecf5f7
-
[`Doc`] Fix docbuilder - make `BackboneMixin` and `BackboneConfigMixin` importable from `utils`. (#29002) * Trigger doc build * Test removing references * Importable from utils * Trigger another run on a new commit for testing
Commit: 7252e8d
-
Set the dataset format used by `test_trainer` to float32 (#28920) Co-authored-by: unit_test <test@unit.com>
Commit: 69ca640
-
Introduce AcceleratorConfig dataclass (#28664)
* Introduce acceleratorconfig dataclass * Extra second warn * Move import * Try moving import under is_accelerate_available * Quality * Apply suggestions from code review Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Clean * Remove to_kwargs * Change version * Improve tests by including dispatch and split batches * Improve reliability * Update tests/trainer/test_trainer.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Fixup tests and review nits * Make tests pass * protect import * Protect import * Empty-Commit * Make training_args.to_dict handle the AcceleratorConfig --------- Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
Commit: 0507e69
-
Commit: 354775b
-
Mask Generation Task Guide (#28897)
* Create mask_generation.md * add h1 * add to toctree * Update docs/source/en/tasks/mask_generation.md Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update docs/source/en/tasks/mask_generation.md Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update docs/source/en/tasks/mask_generation.md Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update docs/source/en/tasks/mask_generation.md Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update docs/source/en/tasks/mask_generation.md Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update mask_generation.md * Update docs/source/en/tasks/mask_generation.md Co-authored-by: Maria Khalusova <kafooster@gmail.com> * Update docs/source/en/tasks/mask_generation.md Co-authored-by: Maria Khalusova <kafooster@gmail.com> * Update docs/source/en/tasks/mask_generation.md Co-authored-by: Maria Khalusova <kafooster@gmail.com> * Update docs/source/en/tasks/mask_generation.md Co-authored-by: Maria Khalusova <kafooster@gmail.com> * Update docs/source/en/tasks/mask_generation.md Co-authored-by: Maria Khalusova <kafooster@gmail.com> * Update docs/source/en/tasks/mask_generation.md Co-authored-by: Maria Khalusova <kafooster@gmail.com> * Update docs/source/en/tasks/mask_generation.md Co-authored-by: Maria Khalusova <kafooster@gmail.com> * Update docs/source/en/tasks/mask_generation.md Co-authored-by: Maria Khalusova <kafooster@gmail.com> * Update docs/source/en/tasks/mask_generation.md Co-authored-by: Maria Khalusova <kafooster@gmail.com> * Update docs/source/en/tasks/mask_generation.md Co-authored-by: Maria Khalusova <kafooster@gmail.com> * Update mask_generation.md * Update docs/source/en/tasks/mask_generation.md Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Update docs/source/en/tasks/mask_generation.md Co-authored-by: Klaus Hipp <khipp@users.noreply.github.com> * Update docs/source/en/tasks/mask_generation.md Co-authored-by: Klaus Hipp <khipp@users.noreply.github.com> * Update docs/source/en/tasks/mask_generation.md Co-authored-by: Klaus Hipp <khipp@users.noreply.github.com> * Update docs/source/en/tasks/mask_generation.md Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Update docs/source/en/tasks/mask_generation.md * Update mask_generation.md * Update mask_generation.md --------- Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> Co-authored-by: Maria Khalusova <kafooster@gmail.com> Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> Co-authored-by: Klaus Hipp <khipp@users.noreply.github.com>
Commit: 3f4e79d
-
Add tie_weights() to LM heads and set bias in set_output_embeddings() (#28948) * Add tie_weights() to LM heads and set bias in set_output_embeddings() The bias was not tied correctly in some LM heads, and this change should fix that. * Moving test_save_and_load_low_cpu_mem_usage to ModelTesterMixin * Adding _tie_weights() to MPNet and Vilt * Skip test for low cpu mem usage for Deta/DeformableDetr since they cannot init on meta device * Rename the test name to save_load to match the convention
Commit: 725f4ad
-
Backbone kwargs in config (#28784)
* Enable instantiating model with pretrained backbone weights * Clarify pretrained import * Use load_backbone instead * Add backbone_kwargs to config * Pass kwargs to constructors * Fix up * Input verification * Add tests * Tidy up * Update tests/utils/test_backbone_utils.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> --------- Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
Commit: 0199a48
[TPU] Support PyTorch/XLA FSDP via SPMD (#28949)
* Initial commit * Add guards for the global mesh * Address more comments * Move the dataloader into integrations/tpu.py * Fix linters * Make karg more explicitly * Remove the move device logic * Fix the CI * Fix linters * Re-enable checkpointing
Commit: 5f06053
FIX [Trainer / tags]: Fix trainer + tags when users do not pass `"tags"` to `trainer.push_to_hub()` (#29009) * fix trainer tags * add test
Commit: 7a0fccc
[Cleanup] Revert SDPA attention changes that got in the static kv cache PR (#29027) * revert unrelated changes that got in * style
Commit: 609a176
Commits on Feb 15, 2024
-
Fix static generation when compiling! (#28937)
* wow I was scared! * fix everything * nits * make it BC? * add todo * nits * is_tracing should still be used to pass tracing tests * nits * some nits to make sure genration works with static cache uncompiled * fix sdpa * fix FA2 for both static and dynamic in a better way? * style * fix-copies * fix fix copies * fix sequential beam searcg * style * use `keys_to_ignore` * nit * correct dtype inference when init * :( the fix for FA2 is still not optimal to investigate! * styling * nits * nit * this might work better * add comment * Update src/transformers/models/llama/modeling_llama.py * "position_ids" -> "cache_position" * style * nit * Remove changes that should no be propagatted just yet * Apply suggestions from code review * Styling * make sure we raise an errir for static cache with FA2 enabled * move to the bottom of the signature * style * Update src/transformers/models/llama/modeling_llama.py Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> * Update src/transformers/models/llama/modeling_llama.py * nit in the name --------- Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
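The static-cache fix above hinges on keeping tensor shapes fixed so `torch.compile` never has to re-trace the decode step (hence the `"position_ids" -> "cache_position"` rename in the commit). A toy, framework-free sketch of the idea follows; the class and attribute names are illustrative, not the actual `transformers.StaticCache` API:

```python
class ToyStaticCache:
    """Illustrative fixed-shape KV cache: slots are preallocated up to
    max_len, and update() writes in place at cache_position instead of
    concatenating, so the container's length is constant across steps."""

    def __init__(self, max_len):
        self.max_len = max_len
        self.keys = [None] * max_len    # stands in for a preallocated tensor
        self.values = [None] * max_len
        self.seen_tokens = 0

    def update(self, cache_position, key, value):
        # In-place write: the cache never grows, unlike a dynamic cache
        # that concatenates new keys/values every decoding step.
        if cache_position >= self.max_len:
            raise ValueError("static cache is full; cannot grow")
        self.keys[cache_position] = key
        self.values[cache_position] = value
        self.seen_tokens = max(self.seen_tokens, cache_position + 1)

cache = ToyStaticCache(max_len=8)
cache.update(0, "k0", "v0")
cache.update(1, "k1", "v1")
```

Note how `len(cache.keys)` stays at 8 throughout generation; that shape stability is what makes the decode graph compilable.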
Commit: f3788b0
Add cuda_custom_kernel in DETA (#28989)
* enable gradient checkpointing in DetaObjectDetection * fix missing part in original DETA * make style * make fix-copies * Revert "make fix-copies" This reverts commit 4041c86. * remove fix-copies of DetaDecoder * enable swin gradient checkpointing * fix gradient checkpointing in donut_swin * add tests for deta/swin/donut * Revert "fix gradient checkpointing in donut_swin" This reverts commit 1cf345e. * change supports_gradient_checkpointing pipeline to PreTrainedModel * Revert "add tests for deta/swin/donut" This reverts commit 6056ffb. * Revert "Revert "fix gradient checkpointing in donut_swin"" This reverts commit 24e25d0. * Simple revert * enable deformable detr gradient checkpointing * add gradient in encoder * add cuda_custom_kernel function in MSDA * make style and fix input of DetaMSDA * make fix-copies * remove n_levels in input of DetaMSDA * minor changes * refactor custom_cuda_kernel like yoso format https://github.com/huggingface/transformers/blob/0507e69d34f8902422eb4977ec066dd6bef179a0/src/transformers/models/yoso/modeling_yoso.py#L53
Commit: 83e96dc
DeformableDetrModel support fp16 (#29013)
* Update ms_deform_attn_cuda.cu * Update ms_deform_attn_cuda.cuh * Update modeling_deformable_detr.py * Update src/transformers/models/deformable_detr/modeling_deformable_detr.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update modeling_deformable_detr.py * python utils/check_copies.py --fix_and_overwrite * Fix dtype missmatch error * Update test_modeling_deformable_detr.py * Update test_modeling_deformable_detr.py * Update modeling_deformable_detr.py * Update modeling_deformable_detr.py --------- Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
Commit: 5b6fa23
Commit: 8a0ed0a
FIX: Fix error with `logger.warning` + inline with recent refactor (#29039) Update modeling_utils.py
Commit: 6d1f545
Patch to skip failing `test_save_load_low_cpu_mem_usage` tests (#29043) * Patch to skip currently failing tests * Whoops - wrong place
Commit: 4156f51
Removed obsolete attribute setting for AQLM quantization. (#29034)
removed redundant field
Andrei Panferov authored Feb 15, 2024
Commit: b0a7f44
Fix a tiny typo in `generation/utils.py::GenerateEncoderDecoderOutput`'s docstring (#29044) Update utils.py
Commit: f3aa7db
Commits on Feb 16, 2024
-
Commit: 1e402b9
Update all references to canonical models (#29001)
* Script & Manual edition * Update
Commit: f497f56
Commit: 8876ce8
Fix max_length criteria when using inputs_embeds (#28994)
* fix max_length for inputs_embeds * make style * Update src/transformers/generation/utils.py Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com> * Static Cache: load models with MQA or GQA (#28975) * fix * fix tests * fix tests * Update src/transformers/generation/utils.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * more fixes * make style --------- Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com> Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
Commit: aee11fe
Support: Leverage Accelerate for object detection/segmentation models (#28312)
* made changes for object detection models * added support for segmentation models. * Made changes for segmentation models * Changed import statements * solving conflicts * removed conflicts * Resolving commits * Removed conflicts * Fix: Pixel_mask_value set to False
Commit: 0eb4085
fix num_assistant_tokens with heuristic schedule (#28759)
* fix heuristic num_assistant_tokens_schedule * Update src/transformers/generation/configuration_utils.py Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com> * Update src/transformers/generation/candidate_generator.py Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com> * Update utils.py check that candidate_generator.assistant_model exists since some speculations (like ngram and PLD) don't have assistant_model attribute * Update src/transformers/generation/candidate_generator.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Update tests/generation/test_utils.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * make fixup * merge conflict * fix docstring * make fixup --------- Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com> Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
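The "heuristic" schedule being fixed can be sketched as follows. This is a simplified illustration of speculative decoding's window adjustment, and the function name and constants here are assumptions, not the exact `transformers` implementation: when every candidate token proposed by the assistant model is accepted, the speculation window grows; on any rejection it shrinks, never below 1.

```python
def update_num_assistant_tokens(current, num_matches, num_candidates):
    """Illustrative 'heuristic' schedule for speculative decoding:
    grow the speculation window when all candidate tokens were accepted,
    shrink it (with a floor of 1) when any were rejected."""
    if num_candidates > 0 and num_matches == num_candidates:
        return current + 2       # every token accepted: speculate further ahead
    return max(1, current - 1)   # a rejection occurred: be more conservative

window = 5
window = update_num_assistant_tokens(window, num_matches=5, num_candidates=5)  # grows to 7
```

Schedulers like ngram or prompt-lookup decoding have no `assistant_model` at all, which is why the commit adds an existence check before touching that attribute.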
Commit: 258da40
Commit: b262808
`auto_find_batch_size` isn't yet supported with DeepSpeed/FSDP. Raise error accordingly. (#29058) Update trainer.py
Commit: 4c18ddb
Honor trust_remote_code for custom tokenizers (#28854)
* pass through trust_remote_code for dynamically loading unregistered tokenizers specified by config add test * change directories back to previous directory after test * fix ruff check * Add a note to that block for future in case we want to remove it later --------- Co-authored-by: Matt <rocketknight1@gmail.com>
Commit: be42c24
Feature: Option to set the tracking URI for MLflowCallback. (#29032)
* Added option to set tracking URI for MLflowCallback. * Added option to set tracking URI for MLflowCallback. * Changed to in docstring.
Commit: 161fe42
Fix trainer test wrt DeepSpeed + auto_find_bs (#29061)
* Fix trainer test * Update tests/trainer/test_trainer.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> --------- Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
Commit: 636b032
Add chat support to text generation pipeline (#28945)
* Add chat support to text generation pipeline * Better handling of single elements * Deprecate ConversationalPipeline * stash commit * Add missing add_special_tokens kwarg * Update chat templating docs to refer to TextGenerationPipeline instead of ConversationalPipeline * Add ✨TF✨ tests * @require_tf * Add type hint * Add specific deprecation version * Remove unnecessary do_sample * Remove todo - the discrepancy has been resolved * Update src/transformers/tokenization_utils_base.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/pipelines/text_generation.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> --------- Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
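Under the hood, chat support means the pipeline can accept a list of role/content messages and render them into a single prompt string. A minimal, self-contained sketch of that rendering step is shown below; the ChatML-style markers are a hypothetical template chosen for illustration, and real pipelines delegate this to `tokenizer.apply_chat_template`:

```python
def render_chat(messages, add_generation_prompt=True):
    """Render role/content message dicts into one prompt string using a
    hypothetical ChatML-style template (illustration only)."""
    parts = [f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>" for m in messages]
    if add_generation_prompt:
        # Trailing open assistant turn cues the model to produce the reply.
        parts.append("<|im_start|>assistant\n")
    return "\n".join(parts)

prompt = render_chat([
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Hi!"},
])
```

The pipeline-level change is essentially dispatch: plain strings keep the old completion behavior, while message lists are templated like this before tokenization.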
Commit: 2f1003b
[Docs] Spanish translation of task_summary.md (#28844)
* Add task_summary to es/_toctree.yml * Add task_summary.md to docs/es * Change title of task_summary.md * Translate first paragraphs * Translate middle paragraphs * Translate the rest of the doc * Edit first paragraph
Commit: ce4fff0
Commits on Feb 19, 2024
-
[Awq] Add peft support for AWQ (#28987) * add peft support for AWQ * Update src/transformers/quantizers/quantizer_awq.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * fix --------- Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
Commit: 864c8e6
Commit: a75a6c9
fix the post-processing link (#29091)
The link in evaluation was missing a hyphen between post and processing. I fixed this, for English only. Someone with the ability to do a global search/replace should fix the other languages (if indeed they have this issue).
Commit: 593230f
Commit: 9830858
Commit: 79132d4
* change version * nuke * this doesn't make sense * update some requirements.py * revert + no main * nits * change cache number * more pin * revert --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
Commit: b2724d7
* Add resource * Add more resources * Add resources * Apply suggestions from code review Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Remove mention * Remove pipeline tags --------- Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
Commit: 07e3454
ENH: added new output_logits option to generate function (#28667)
output_logits option behaves like output_scores, but returns the raw, unprocessed prediction logit scores, i.e. the values before they undergo logit processing and/or warping. The latter happens by default for the regular output scores. It's useful to have the unprocessed logit scores in certain circumstances. For example, unprocessed logit scores are very useful with causal LM models when one wants to determine the probability of a certain answer, e.g. when asking a question with a yes/no answer. In that case getting the next-token probabilities of both "yes" and "no" (and/or their relative ratio) is of interest for classification. The reason for getting these _before_ logit processing and/or warping is because a) that can change the probabilities or b) reject the tokens of interest / reduce the number of tokens to just 1. For an example use-case see the paper TabLLM: Few-shot Classification of Tabular Data with Large Language Models by Stefan Hegselmann, Alejandro Buendia, Hunter Lang, Monica Agrawal, Xiaoyi Jiang, and David Sontag. https://arxiv.org/abs/2210.10723 In addition: - added dedicated unit test: tests/generation/test_utils/test_return_unprocessed_logit_scores which tests return of logits with output_logits=True in generation. - set output_logits=True in all other generation unit tests, that also have output_scores=True. Implemented @gante's and @amyeroberts review feedback Co-authored-by: kx79wq <max.baak@ing.com>
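The yes/no use case described above can be made concrete with a small, dependency-free sketch (toy numbers, not transformers code): a temperature warper rescales the raw logits, so probabilities computed from the processed scores differ from those computed from the raw values that `output_logits` exposes.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

raw_logits = [2.0, 1.0]                  # hypothetical raw logits for "yes", "no"
warped = [x / 0.7 for x in raw_logits]   # what a temperature=0.7 warper would emit

p_raw = softmax(raw_logits)    # probabilities the model actually assigned
p_warped = softmax(warped)     # sharper distribution: warping changed the values
```

For the classification use case in the commit message, the ratio `p_raw[0] / p_raw[1]` is the quantity of interest, and it is only recoverable from the unprocessed logits; a top-k warper could even have removed the "no" token entirely.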
Commit: 08cd694
Bnb test fix for different hardwares (#29066)
* generated text on A10G * generated text in CI * Apply suggestions from code review add explanatory comments Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> --------- Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
Commit: 5ce90f3
Fix two tiny typos in `pipelines/base.py::Pipeline::_sanitize_parameters()`'s docstring (#29102) * Update base.py * Fix a typo
Commit: a4851d9
storing & logging gradient norm in trainer (#27326)
* report grad_norm during training * support getting grad_norm from deepspeed
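The grad_norm reported here is the global L2 norm over all parameter gradients. A minimal pure-Python stand-in for the tensor computation is below; this is an illustration of the quantity being logged, not the Trainer's actual code path (which, per the commit, can also read the norm back from DeepSpeed after clipping):

```python
import math

def global_grad_norm(param_grads):
    """Global L2 norm across all parameters' gradients: the square root of
    the sum of squares of every gradient element, flattened together."""
    return math.sqrt(sum(g * g for grad in param_grads for g in grad))

# two parameters with gradients [3.0] and [4.0]: 3-4-5 triangle
norm = global_grad_norm([[3.0], [4.0]])
```

Logging this per step is what makes gradient spikes visible in training curves, which is the motivation for the change.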
Commit: 4f09d0f
Commits on Feb 20, 2024
-
Fixed nll with label_smoothing to just nll (#28708)
* Fixed nll with label_smoothing to nll * Resolved conflict by rebase * Fixed nll with label_smoothing to nll * Resolved conflict by rebase * Added label_smoothing to config file * Fixed nits
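For reference, the distinction this fix restores: plain NLL takes only the target token's negative log-probability, while the label-smoothed loss blends it with a uniform term over all classes. A small illustrative implementation follows (simplified scalar version; the actual Trainer uses a vectorized `LabelSmoother` over tensors):

```python
import math

def nll(log_probs, target):
    """Plain negative log-likelihood of the target class."""
    return -log_probs[target]

def label_smoothed_nll(log_probs, target, epsilon=0.1):
    """Blend NLL with a uniform smoothing term over all classes:
    (1 - eps) * NLL(target) + eps * mean of -log p over the vocabulary."""
    smooth = -sum(log_probs) / len(log_probs)
    return (1.0 - epsilon) * nll(log_probs, target) + epsilon * smooth

log_probs = [math.log(0.7), math.log(0.2), math.log(0.1)]
plain = nll(log_probs, 0)
smoothed = label_smoothed_nll(log_probs, 0, epsilon=0.1)
```

With `epsilon=0` the two coincide, which is the behavior the commit makes explicit: when no smoothing is configured, the loss should be exactly the NLL.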
Commit: 49c0b29
[`gradient_checkpointing`] default to use it for torch 2.3 (#28538) * default to use it * style
Commit: 9094abe
Commit: a7ff2f2
FEAT [Trainer / bnb]: Add RMSProp from `bitsandbytes` to HF `Trainer` (#29082) * add RMSProp to Trainer * revert some change * Update src/transformers/trainer.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> --------- Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
Commit: f7ef7ce
Abstract image processor arg checks. (#28843)
* abstract image processor arg checks. * fix signatures and quality * add validate_ method to rescale-prone processors * add more validations * quality * quality * fix formatting Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * fix formatting Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * fix formatting Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Fix formatting mishap Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * fix crop_size compatibility * fix default mutable arg * fix segmentation map + image arg validity * remove segmentation check from arg validation * fix quality * fix missing segmap * protect PILImageResampling type * Apply suggestions from code review Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * add back segmentation maps check --------- Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
Commit: 1c9134f
FIX [bnb / tests] Propagate the changes from #29092 to 4-bit tests (#29122) * forgot to push the changes for 4bit .. * trigger CI
Commit: ff76e7c
Commit: 7d312ad
Commit: a7755d2
[cuda kernels] only compile them when initializing (#29133) * only compile when needed * fix mra as well * fix yoso as well * update * remove comment * Update src/transformers/models/deformable_detr/modeling_deformable_detr.py * Update src/transformers/models/deformable_detr/modeling_deformable_detr.py * oops * Update src/transformers/models/deta/modeling_deta.py * nit
Commit: 5e95dca
FIX [PEFT / Trainer] Handle better peft + quantized compiled models (#29055) * handle peft + compiled models * add tests * fixup * adapt from suggestions * clarify comment
Commit: efdd436
[Core tokenization] `add_dummy_prefix_space` option to help with latest issues (#28010) * add add_dummy_prefix_space option to slow * checking kwargs might be better. Should be there for all spm tokenizer IMO * nits * fix copies * more copied * nits * add prefix space * nit * nits * Update src/transformers/convert_slow_tokenizer.py * fix inti * revert wrong styling * fix * nits * style * updates * make sure we use slow tokenizer for conversion instead of looking for the decoder * support llama ast well * update llama tokenizer fast * nits * nits nits nits * update the doc * update * update to fix tests * skip unrelated tailing test * Update src/transformers/convert_slow_tokenizer.py * add proper testing * test decode as well * more testing * format * fix llama test * Apply suggestions from code review
Commit: 15cfe38
Commit: 0996a10
Add support for fine-tuning CLIP-like models using contrastive-image-text example (#29070) * add support for siglip and chinese-clip model training with contrastive-image-text example * codebase fixups
Commit: ee3af60
Save (circleci) cache at the end of a job (#29141)
nice job Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
Commit: 7688d8d
Commit: b8b1647