
ValueError: OLMoForCausalLM does not support Flash Attention 2.0 yet #29145 #1

Merged: 3,084 commits on Feb 20, 2024
This pull request is big! Only the most recent 250 commits are shown.

Commits on Jan 17, 2024

  1. Add qwen2 (#28436)

    * add config, modeling, and tokenization
    
    * add auto and init
    
    * update readme
    
    * update readme
    
    * update team name
    
    * fixup
    
    * fixup
    
    * update config
    
    * update code style
    
    * update for fixup
    
    * update for fixup
    
    * update for fixup
    
    * update for testing
    
    * update for testing
    
    * fix bug for config and tokenization
    
    * fix bug for bos token
    
    * not doctest
    
    * debug tokenizer
    
    * not doctest
    
    * debug tokenization
    
    * debug init for tokenizer
    
    * fix style
    
    * update init
    
    * delete if in token auto
    
    * add tokenizer doc
    
    * add tokenizer in init
    
    * Update dummy_tokenizers_objects.py
    
    * update
    
    * update
    
    * debug
    
    * Update tokenization_qwen2.py
    
    * debug
    
    * Update convert_slow_tokenizer.py
    
    * add copies
    
    * add copied from and make style
    
    * update files map
    
    * update test
    
    * fix style
    
    * fix merge reading and update tests
    
    * fix tests
    
    * fix tests
    
    * fix style
    
    * debug a variable in readme
    
    * Update src/transformers/models/qwen2/configuration_qwen2.py
    
    Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
    
    * update test and copied from
    
    * fix style
    
    * update qwen2 tokenization  and tests
    
    * Update tokenization_qwen2.py
    
    * delete the copied from after property
    
    * fix style
    
    * update tests
    
    * update tests
    
    * add copied from
    
    * fix bugs
    
    * update doc
    
    * add warning for sliding window attention
    
    * update qwen2 tokenization
    
    * fix style
    
    * Update src/transformers/models/qwen2/modeling_qwen2.py
    
    Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
    
    * fix tokenizer fast
    
    ---------
    
    Co-authored-by: Ren Xuancheng <jklj077@users.noreply.github.com>
    Co-authored-by: renxuancheng.rxc <renxuancheng.rxc@alibaba-inc.com>
    Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
    4 people authored Jan 17, 2024
    Commit: d6ffe74
  2. Fix SDPA tests (#28552)

    * skip bf16 test if not supported by device
    
    * fix
    
    * fix bis
    
    * use is_torch_bf16_available_on_device
    
    * use is_torch_fp16_available_on_device
    
    * fix & use public llama
    
    * use 1b model
    
    * fix flaky test
    
    ---------
    
    Co-authored-by: Your Name <you@example.com>
    fxmarty and Your Name authored Jan 17, 2024
    Commit: 2c1eebc
  3. Allow to train dinov2 with different dtypes like bf16 (#28504)

    I want to train dinov2 with bf16 but I get the following error in https://github.com/huggingface/transformers/blob/bc72b4e2cdcbc80d5f56731f35dbc9c18b4c8de6/src/transformers/models/dinov2/modeling_dinov2.py#L635:
    
    ```
    RuntimeError: Input type (float) and bias type (c10::BFloat16) should be the same
    ```
    
    Since the input dtype is torch.float32, the parameter dtype has to be torch.float32...
    
    @LZHgrla and I checked the code of the CLIP vision encoder and found there is an automatic dtype transformation (https://github.com/huggingface/transformers/blob/bc72b4e2cdcbc80d5f56731f35dbc9c18b4c8de6/src/transformers/models/clip/modeling_clip.py#L181-L182).
    
    So I added a similar automatic dtype transformation to modeling_dinov2.py.
    StarCycle authored Jan 17, 2024
    Commit: fa6d12f
  4. Fix Switch Transformers When sparse_step = 1 (#28564)

    Fix sparse_step = 1
    
    In case sparse_step = 1, the current code will not work.
    agemagician authored Jan 17, 2024
    Commit: 98dda8e
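The failure mode above can be sketched in isolation (the selection rule and names here are illustrative, not the exact transformers code): a modulo test of the form `i % sparse_step == 1` matches no layer at all when `sparse_step = 1`, while a rule based on `(i + 1) % sparse_step` selects every layer as intended.

```python
def sparse_layer_indices(num_layers: int, sparse_step: int) -> list:
    """Return indices of layers that should use a sparse MLP.

    Illustrative rule: every sparse_step-th layer (1-based) is sparse,
    so sparse_step == 1 makes every layer sparse instead of none.
    """
    if sparse_step <= 0:
        return []  # no sparse layers at all
    return [i for i in range(num_layers) if (i + 1) % sparse_step == 0]

# The buggy pattern `i % sparse_step == 1` is never true when
# sparse_step == 1, because i % 1 is always 0:
buggy = [i for i in range(4) if i % 1 == 1]
```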

Commits on Jan 18, 2024

  1. Save Processor (#27761)

    * save processor
    
    * Update tests/models/auto/test_processor_auto.py
    
    Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
    
    * Update tests/test_processing_common.py
    
    Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
    
    * fix
    
    ---------
    
    Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
    Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
    3 people authored Jan 18, 2024
    Commit: 3005f96
  2. Use weights_only only if torch >= 1.13 (#28506)

    * fix
    
    * fix
    
    * fix
    
    ---------
    
    Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
    ydshieh and ydshieh authored Jan 18, 2024
    Commit: a1668cc
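The version gate can be sketched as a small helper that only passes `weights_only` to `torch.load` on PyTorch 1.13 or newer (the helper name and the naive version parsing are illustrative; real code would use `packaging.version`):

```python
def torch_load_kwargs(torch_version: str) -> dict:
    """Return extra kwargs for torch.load based on the torch version.

    The `weights_only` argument first appeared in PyTorch 1.13, so
    older versions must not receive it.
    """
    major, minor = (int(p) for p in torch_version.split(".")[:2])
    return {"weights_only": True} if (major, minor) >= (1, 13) else {}
```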
  3. [Core Tokenization] Support a fix for spm fast models (#26678)

    * fix
    
    * last attempt
    
    * current work
    
    * fix forward compatibility
    
    * save all special tokens
    
    * current state
    
    * revert additional changes
    
    * updates
    
    * remove tokenizer.model
    
    * add a test and the fix
    
    * nit
    
    * revert one more break
    
    * fix typefield issue
    
    * quality
    
    * more tests
    
    * fix fields for FC
    
    * more nits?
    
    * new additional changes
    
    * how
    
    * some updates
    
    * the fix
    
    * where do we stand
    
    * nits
    
    * nits
    
    * revert unrelated changes
    
    * nits nits nits
    
    * styling
    
    * don't break llama just yet
    
    * revert llama changes
    
    * safe arg check
    
    * fixup
    
    * Add a test for T5
    
    * Necessary changes
    
    * Tests passing; added tokens need to not be normalized. If the added tokens are normalized, stripping is triggered, which is unwanted for normal functioning
    
    * Add even more tests, when normalization is set to True (which does not work 😓 )
    
    * Add even more tests, when normalization is set to True (which does not work 😓 )
    
    * Update to main
    
    * nits
    
    * fmt
    
    * more and more test
    
    * comments
    
    * revert change as tests are failing
    
    * make the test more readable
    
    * nits
    
    * refactor the test
    
    * nit
    
    * updates
    
    * simplify
    
    * style
    
    * style
    
    * style convert slow
    
    * Update src/transformers/convert_slow_tokenizer.py
    ArthurZucker authored Jan 18, 2024
    Commit: 8189977
  4. Commit: 5d8eb93
  5. Add new meta w2v2-conformer BERT-like model (#28165)

    * first commit
    
    * correct default value non causal
    
    * update config and modeling code
    
    * update converting checkpoint
    
    * clean modeling and fix tests
    
    * make style
    
    * add new config parameters to docstring
    
    * fix copied from statements
    
    * Apply suggestions from code review
    
    Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>
    
    * make position_embeddings_type docstrings clearer
    
    * clean converting script
    
    * remove function not used
    
    * clean modeling file
    
    * apply suggestion for test file + add convert script to not_doctested
    
    * modify tests according to review - cleaner logic and more tests
    
    * Apply nit suggestions from code review
    
    Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
    
    * add checker of valid position embeddings type
    
    * instantiate new layer norm layer with the right eps
    
    * fix freeze_feature_encoder since it can be None in some cases
    
    * add test same output in convert script
    
    * restore wav2vec2conformer and add new model
    
    * create processor and FE + clean
    
    * add new model code
    
    * fix convert script and set default config parameters
    
    * correct model id paths
    
    * make style
    
    * make fix-copies and cleaning files
    
    * fix copied from statements
    
    * complete .md and fix copies
    
    * clean convert script argument defaults
    
    * fix config parameters docstrings
    
    * fix config docstring
    
    * add copied from and enrich FE tests
    
    * fix copied from and repo-consistency
    
    * add autotokenizer
    
    * make test input length shorter and change docstring code
    
    * fix docstrings and copied from
    
    * add add_adapter to ASR training example
    
    * make testing of adapters more robust
    
    * adapt to multi adapter layers
    
    * refactor input_values->input_features and remove w2v2-bert feature extractor
    
    * remove pretraining model
    
    * remove deprecated features and useless lines
    
    * add copied from and ignore statements to modeling tests
    
    * remove pretraining model #2
    
    * change import in convert script
    
    * change default in convert script
    
    * update readme and remove useless line
    
    * Update tests/models/wav2vec2_bert/test_processor_wav2vec2_bert.py
    
    Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
    
    * refactor BERT to Bert for consistency
    
    * remove useless ignore copy statement
    
    * add persistent to buffer in rotary
    
    * add eps in LayerNorm init and remove copied from
    
    * add adapter activation parameters and add copied from statements
    
    * Fix copied statements and add unitest.skip reasons
    
    * add copied statement in test_processor
    
    * refactor processor
    
    * make style
    
    * replace numpy random by torch rand
    
    * remove expected output CTC
    
    * improve converting script with processor class
    
    * Apply suggestions from code review
    
    Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
    
    * remove gumbel class
    
    * remove tests related to previously deleted class
    
    * Update src/transformers/models/wav2vec2_bert/configuration_wav2vec2_bert.py
    
    Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
    
    * correct typos
    
    * remove unused parameters
    
    * update processor to take both text and audio
    
    * update checkpoints
    
    * update expected output and add ctc expected output
    
    * add label_attention_mask
    
    * replace pt with np in processor tests
    
    * fix typo
    
    * revert to behaviour with labels_attention_mask
    
    ---------
    
    Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>
    Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
    3 people authored Jan 18, 2024
    Commit: d2cdefb
  6. Use LoggingLevel context manager in 3 tests (#28575)

    * inside with LoggingLevel
    
    * remove is_flaky
    
    ---------
    
    Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
    ydshieh and ydshieh authored Jan 18, 2024
    Commit: 0754217
  7. Fix the documentation checkpoint for xlm-roberta-xl (#28567)

    * Fix the documentation checkpoint for xlm-roberta-xl
    
    * Improve docstring consistency
    jeremyfowers authored Jan 18, 2024
    Commit: c662c78
  8. [ASR Pipe] Update init to set model type and subsequently call parent init method (#28486)
    
    * add image processor arg
    
    * super
    
    * rm args
    sanchit-gandhi authored Jan 18, 2024
    Commit: 0eaa5ea
  9. [Whisper Tok] Move token ids to CPU when computing offsets (#28485)

    * move token ids to cpu
    
    * check for torch attr
    sanchit-gandhi authored Jan 18, 2024
    Commit: 619ecfe
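The two bullets amount to a small guard: if the token-ids object looks like a torch tensor (it has a `.cpu` attribute), move it to CPU before offsets are computed with numpy; plain lists pass through untouched. A framework-free sketch (the helper name is illustrative):

```python
def ids_for_offsets(token_ids):
    """Move accelerator tensors to CPU before computing offsets.

    Duck-typed on `.cpu` so the tokenizer keeps working when torch is
    not installed at all.
    """
    if hasattr(token_ids, "cpu"):
        token_ids = token_ids.cpu()
    return list(token_ids)

class FakeGpuTensor:
    """Stand-in for a torch tensor living on an accelerator."""
    def __init__(self, data):
        self.data = data
    def cpu(self):
        return self.data  # pretend the copy back to host happened

moved = ids_for_offsets(FakeGpuTensor([50257, 50362]))
```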
  10. Commit: 186aa6b
  11. Making CTC training example more general (#28582)

    * add w2v2bert compatibility
    
    * Update examples/pytorch/speech-recognition/run_speech_recognition_ctc.py
    
    Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
    
    ---------
    
    Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
    ylacombe and amyeroberts authored Jan 18, 2024
    Commit: 772307b

Commits on Jan 19, 2024

  1. Don't save processor_config.json if a processor has no extra attribute (#28584)
    
    * not save if empty
    
    * fix
    
    * fix
    
    * fix
    
    * fix
    
    * fix
    
    ---------
    
    Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
    ydshieh and ydshieh authored Jan 19, 2024
    Commit: db9a7e9
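The skip-if-empty rule can be sketched as: collect the processor's extra attributes, and only write `processor_config.json` when that dict is non-empty (the function name and attribute key are illustrative):

```python
import json
import os
import tempfile

def maybe_save_processor_config(save_dir: str, extra_attributes: dict):
    """Write processor_config.json only when there is something to save."""
    if not extra_attributes:
        return None  # nothing beyond the sub-components; skip the file
    path = os.path.join(save_dir, "processor_config.json")
    with open(path, "w") as f:
        json.dump(extra_attributes, f, sort_keys=True)
    return path

with tempfile.TemporaryDirectory() as d:
    skipped = maybe_save_processor_config(d, {})
    written = maybe_save_processor_config(d, {"feature_size": 80})
    written_exists = written is not None and os.path.exists(written)
```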
  2. v4.38.dev.0

    amyeroberts committed Jan 19, 2024
    Commit: b2748a6
  3. Add w2v2bert to pipeline (#28585)

    * generalize asr pipeline to fbank models
    
    * change w2v2 pipeline output
    
    * Update test_pipelines_automatic_speech_recognition.py
    ylacombe authored Jan 19, 2024
    Commit: 268fc1f
  4. Commit: d4fc1eb
  5. [Whisper] Finalize batched SOTA long-form generation (#27658)

    * finalize
    
    * make fix copies whisper
    
    * [Tests] Make sure that we don't run tests multiple times
    
    * Update src/transformers/models/whisper/modeling_whisper.py
    
    * [Tests] Make sure that we don't run tests multiple times
    
    * fix more
    
    * improve
    
    * improve
    
    * improve further
    
    * improve more
    
    * improve
    
    * fix more
    
    * git commit and git push
    
    * fix more
    
    * fix more
    
    * fix more
    
    * New try
    
    * Fix more whisper stuff
    
    * Improve
    
    * correct more
    
    * correct more
    
    * correct more
    
    * Fix some tests
    
    * Add more tests
    
    * correct more
    
    * correct more
    
    * correct more
    
    * push
    
    * correct more
    
    * Fix more
    
    * Better
    
    * without dec mask
    
    * correct more
    
    * clean
    
    * save intermediate
    
    * Fix more
    
    * Fix VAD for large-v2
    
    * Save new
    
    * Correct more
    
    * make cleaner
    
    * correct tests
    
    * correct src
    
    * Finish
    
    * Fix more
    
    * Fix more
    
    * finish
    
    * Fix edge cases
    
    * fix return_dict_in_generate
    
    * fix all tests
    
    * make style
    
    * add docstrings
    
    * add docstrings
    
    * Fix logit processor
    
    * make style
    
    * fix pipeline test
    
    * fix more style
    
    * Apply suggestions from code review
    
    * apply feedback Sanchit
    
    * correct more
    
    * Apply suggestions from code review
    
    Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>
    
    * Apply suggestions from code review
    
    Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
    Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>
    
    * correct more
    
    * correct more
    
    * correct more
    
    * Fix staticmethod
    
    * correct more
    
    * fix
    
    * fix slow tests
    
    * make style
    
    * fix tokenizer test
    
    * fix tokenizer test
    
    * Apply suggestions from code review
    
    Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
    
    * finish
    
    * finish
    
    * revert kwargs change
    
    ---------
    
    Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>
    Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
    Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
    4 people authored Jan 19, 2024
    Commit: 690fe73
  6. Fix wrong xpu device in DistributedType.MULTI_XPU mode (#28386)

    * remove elif xpu
    
    * remove redundant code
    faaany authored Jan 19, 2024
    Commit: 8db6436
  7. [SigLIP] Don't pad by default (#28578)

    First draft
    NielsRogge authored Jan 19, 2024
    Commit: faf0354
  8. [Llava] Fix convert_llava_weights_to_hf.py script (#28570)

    * Update convert_llava_weights_to_hf.py
    
    Fix call to `tokenizer.add_tokens`
    
    * Add special_tokens to tokenizer.add_tokens in convert_vipllava_weights_to_hf.py
    isaac-vidas authored Jan 19, 2024
    Commit: 5b7f4bc
  9. Allow add_tokens for ESM (#28535)

    * Allow non-special tokens to be added
    
    * Add test, fix token adding code
    
    * Revert changes to id_to_token and token_to_id
    
    * Update the ESM tokenizer to be a bit more standardized
    
    * Update src/transformers/models/esm/tokenization_esm.py
    
    Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
    
    ---------
    
    Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
    Rocketknight1 and ArthurZucker authored Jan 19, 2024
    Commit: d157815
  10. Commit: 9efec11
  11. Commit: 948ffff
  12. Fix auxiliary loss related code in transformers (#28406)

    * [DETA] fix freeze/unfreeze function
    
    * Update src/transformers/models/deta/modeling_deta.py
    
    Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
    
    * Update src/transformers/models/deta/modeling_deta.py
    
    Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
    
    * add freeze/unfreeze test case in DETA
    
    * fix type
    
    * fix typo 2
    
    * fix: enable aux and enc loss in training pipeline
    
    * Add unsynced variables from original DETA for training
    
    * modification for passing CI test
    
    * make style
    
    * make fix
    
    * manual make fix
    
    * change deta_modeling_test of configuration 'two_stage' default to TRUE and minor change of dist checking
    
    * remove print
    
    * divide configuration in DetaModel and DetaForObjectDetection
    
    * image smaller size than 224 will give topk error
    
    * pred_boxes and logits should be equivalent to two_stage_num_proposals
    
    * add missing part in DetaConfig
    
    * Update src/transformers/models/deta/modeling_deta.py
    
    Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
    
    * add docstring in configure and prettify TO DO part
    
    * change distribute related code to accelerate
    
    * Update src/transformers/models/deta/configuration_deta.py
    
    Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
    
    * Update tests/models/deta/test_modeling_deta.py
    
    Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
    
    * protect importing accelerate
    
    * change variable name to specific value
    
    * wrong import
    
    * fix aux_loss in conditional_detr
    
    * add test aux_loss
    
    * add aux_loss test in deta and table_transformer
    
    * fix yolos since it doesn't have auxiliary function
    
    * fix maskformer auxiliary_loss related code
    
    * make style
    
    * change param 'auxiliary_loss' to 'use_auxiliary_loss'
    
    * change param 'auxiliary_loss' to 'use_auxiliary_loss' in tests
    
    * make style & fix-copies, also revert yolos related parameter
    
    * revert variable name 'use_auxiliary_loss' to 'auxiliary_loss' due to DetrConfig
    
    * revert variable name in yolos
    
    * revert maskformer
    
    * add aux_loss test in maskformer
    
    * make style
    
    * Update src/transformers/models/yolos/configuration_yolos.py
    
    Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
    
    ---------
    
    Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
    Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
    3 people authored Jan 19, 2024
    Commit: 3f69f41

Commits on Jan 21, 2024

  1. [GPTNeoX] Fix BC issue with 4.36 (#28602)

    * fix dtype issue
    
    * add a test
    
    * update copied from mentions
    
    * nits
    
    * fixup
    
    * fix copies
    
    * Apply suggestions from code review
    ArthurZucker authored Jan 21, 2024
    Commit: 83f9196

Commits on Jan 22, 2024

  1. Commit: f0acf7b
  2. Add missing key to TFLayoutLM signature (#28640)

    Fix missing bbox in LayoutLM signature
    Rocketknight1 authored Jan 22, 2024
    Commit: bf67415
  3. Avoid root logger's level being changed (#28638)

    * avoid root logger's level being changed
    
    ---------
    
    Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
    ydshieh and ydshieh authored Jan 22, 2024
    Commit: d336c56
  4. Add config tip to custom model docs (#28601)

    Add tip to custom model docs
    Rocketknight1 authored Jan 22, 2024
    Commit: 692c3c6
  5. Fix lr_scheduler in no_trainer training scripts (#27872)

    * Fix lr_scheduler
    
    * Fix lr scheduler
    bofenghuang authored Jan 22, 2024
    Commit: deb2b59
  6. [Llava] Update convert_llava_weights_to_hf.py script (#28617)

    * Update convert_llava_weights_to_hf.py script
    
    * Remove config update of adding padding to `vocab_size` and `text_config.vocab_size`, which causes a `ValueError` exception.
    * Remove keys that ends with `inv_freq` from the state dict.
    * Add examples and instructions for creating `model_state_dict.bin` that can be used by the script.
    
    * Update convert_llava_weights_to_hf.py
    
    * Update convert_vipllava_weights_to_hf.py
    isaac-vidas authored Jan 22, 2024
    Commit: dafd595
  7. [GPTNeoX] Fix GPTNeoX + Flash Attention 2 issue (#28645)

    Update modeling_gpt_neox.py
    younesbelkada authored Jan 22, 2024
    Commit: e201864
  8. Update image_processing_deformable_detr.py (#28561)

    * Update image_processing_deformable_detr.py
    
    * Changes after running make fix-copies
    sounakdey authored Jan 22, 2024
    Commit: a35ea57
  9. [SigLIP] Only import tokenizer if sentencepiece available (#28636)

    Only import class if sp available
    amyeroberts authored Jan 22, 2024
    Commit: 590be77
  10. Fix phi model doc checkpoint (#28581)

    Co-authored-by: Pashmina Cameron <11311835+pashminacameron@users.noreply.github.com>
    amyeroberts and pashminacameron authored Jan 22, 2024
    Commit: e547458

Commits on Jan 23, 2024

  1. get default device through PartialState().default_device as it has been officially released (#27256)
    
    get default device through `PartialState().default_device` as it has been officially released
    statelesshz authored Jan 23, 2024
    Commit: 1fc1296
  2. integrations: fix DVCLiveCallback model logging (#28653)

    Dave Berenbaum authored Jan 23, 2024
    Commit: 0398660
  3. Enable safetensors conversion from PyTorch to other frameworks without the torch requirement (#27599)
    
    * Initial commit
    
    * Requirements & tests
    
    * Tests
    
    * Tests
    
    * Rogue import
    
    * Rogue torch import
    
    * Cleanup
    
    * Apply suggestions from code review
    
    Co-authored-by: Nicolas Patry <patry.nicolas@protonmail.com>
    
    * bfloat16 management
    
    * Sanchit's comments
    
    * Import shield
    
    * apply suggestions from code review
    
    * correct bf16
    
    * rebase
    
    ---------
    
    Co-authored-by: Nicolas Patry <patry.nicolas@protonmail.com>
    Co-authored-by: sanchit-gandhi <sanchit@huggingface.co>
    3 people authored Jan 23, 2024
    Commit: 008a6a2
  4. Enable instantiating model with pretrained backbone weights (#28214)

    * Enable instantiating model with pretrained backbone weights
    
    * Update tests so backbone checkpoint isn't passed in
    
    * Remove doc updates until changes made in modeling code
    
    * Clarify pretrained import
    
    * Update configs - docs and validation check
    
    * Update src/transformers/utils/backbone_utils.py
    
    Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
    
    * Clarify exception message
    
    * Update config init in tests
    
    * Add test for when use_timm_backbone=True
    
    * Small test updates
    
    ---------
    
    Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
    amyeroberts and ArthurZucker authored Jan 23, 2024
    Commit: 27c79a0
  5. tensor_size - fix copy/paste error msg typo (#28660)

    Fix copy/paste error msg typo
    scruel authored Jan 23, 2024
    Commit: c475eca
  6. Fix windows err with checkpoint race conditions (#28637)

    Fix windows err
    muellerzr authored Jan 23, 2024
    Commit: 582d104
  7. add dataloader prefetch factor in training args and trainer (#28498)

    * add dataloader prefetch factor in training args and trainer
    
    * remove trailing spaces
    
    * prevent dataloader_num_workers == 0 and dataloader_prefetch_factor != None
    
    dataloader_prefetch_factor works only when data is loaded in a different process than the main one. This commit adds the necessary checks to avoid having prefetch_factor set when there is no such process.
    
    * Remove whitespaces in empty line
    
    * Update src/transformers/training_args.py
    
    Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
    
    * Update src/transformers/training_args.py
    
    Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
    
    * Update src/transformers/training_args.py
    
    Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
    
    * Update src/transformers/training_args.py
    
    Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
    
    ---------
    
    Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
    qmeeus and amyeroberts authored Jan 23, 2024
    Commit: 5b5e71d
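The check described above can be sketched as a plain validation function: PyTorch DataLoaders only prefetch in worker processes, so an explicit prefetch factor is meaningless (and rejected by torch) when there are no workers. Names mirror the TrainingArguments fields; the check itself is a sketch, not the actual implementation:

```python
def check_prefetch_args(dataloader_num_workers: int, dataloader_prefetch_factor):
    """Reject prefetch_factor when no worker processes exist."""
    if dataloader_num_workers == 0 and dataloader_prefetch_factor is not None:
        raise ValueError(
            "--dataloader_prefetch_factor requires --dataloader_num_workers > 0"
        )

check_prefetch_args(2, 4)      # fine: workers exist
try:
    check_prefetch_args(0, 4)  # invalid combination
    rejected = False
except ValueError:
    rejected = True
```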
  8. Support single token decode for CodeGenTokenizer (#28628)

    convert token id to list in .decode()
    cmathw authored Jan 23, 2024
    Commit: 9a4521d
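The one-line fix amounts to normalizing the argument: `.decode()` expects a sequence of ids, so a bare integer id gets wrapped in a list first. A minimal sketch (the helper name is illustrative):

```python
def ensure_id_list(token_ids):
    """Wrap a bare integer id so downstream decoding sees a sequence."""
    if isinstance(token_ids, int):
        token_ids = [token_ids]
    return token_ids
```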
  9. Remove deprecated eager_serving fn (#28665)

    * Remove deprecated eager_serving fn
    
    * Fix the input_signature docstring while I'm here
    Rocketknight1 authored Jan 23, 2024
    Commit: ebc8f47
  10. fix a hidden bug of GenerationConfig, now the `generation_config.json` can be loaded successfully (#28604)
    
    * fix a hidden bug of GenerationConfig
    
    * keep `sort_keys=True` to maintain visibility
    
    * Update src/transformers/generation/configuration_utils.py
    
    Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
    
    * Update configuration_utils.py
    
    in case `obj` is a list, check the items in the list
    
    ---------
    
    Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
    ParadoxZW and amyeroberts authored Jan 23, 2024
    39c3c0a
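"in case `obj` is a list, check the items in the list" suggests a serializability check that recurses into lists (e.g. a list of token-id lists in a generation config). A hedged guess at the shape of such a check, not the actual `configuration_utils.py` code:

```python
def is_json_serializable(obj):
    # Scalars serialize directly; lists and dicts are checked item by item.
    if obj is None or isinstance(obj, (str, int, float, bool)):
        return True
    if isinstance(obj, list):
        return all(is_json_serializable(item) for item in obj)
    if isinstance(obj, dict):
        return all(isinstance(k, str) and is_json_serializable(v)
                   for k, v in obj.items())
    return False
```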
  11. Update README_es.md (#28612)

    Fixing grammatical errors in the text
    vladydev3 authored Jan 23, 2024
    5f81266

Commits on Jan 24, 2024

  1. Exclude the load balancing loss of padding tokens in Mixtral-8x7B (#28517)
    
    * fix the function load_balancing_loss_func in Mixtral_Moe to include attention_mask
    
    * format code using black and ruff
    
    * skip computing mask if attention_mask=None
    
    * add tests for load balancing loss Mixtral-Moe
    
    * fix assert loss is different in mixtral_test
    
    * fix pad_leng
    
    * use assertNotAlmostEqual and print to debug
    
    * remove print for debug
    
    * minor updates
    
    * reduce rtol and atol
    khaimt authored Jan 24, 2024
    c5c6909
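The core idea of this fix is that per-expert usage statistics for the auxiliary (load-balancing) loss should be computed over real tokens only, with padding positions masked out. A deliberately simplified sketch of that idea (illustrative, not the actual Mixtral `load_balancing_loss_func`):

```python
def load_balancing_loss(expert_choices, attention_mask, num_experts):
    # expert_choices: chosen expert index per token.
    # attention_mask: 1 for real tokens, 0 for padding.
    # Measure how far expert usage (over real tokens only) is from uniform.
    counts = [0] * num_experts
    total = 0
    for choice, mask in zip(expert_choices, attention_mask):
        if mask:
            counts[choice] += 1
            total += 1
    fractions = [c / total for c in counts]
    uniform = 1.0 / num_experts
    return sum((f - uniform) ** 2 for f in fractions)
```

Without the mask, padded positions (which are routed arbitrarily) would skew the usage fractions and hence the loss.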
  2. Use save_safetensor to disable safe serialization for XLA (#28669)

    * Use save_safetensor to disable safe serialization for XLA
    
    #28438
    
    * Style fixup
    jeffhataws authored Jan 24, 2024
    0549000
  3. bb6aa8b
  4. [docs] DeepSpeed (#28542)

    * config
    
    * optim
    
    * pre deploy
    
    * deploy
    
    * save weights, memory, troubleshoot, non-Trainer
    
    * done
    stevhliu authored Jan 24, 2024
    738ec75
  5. Improved type hinting for all attention parameters (#28479)

    * Changed type hinting for all attention inputs to 'Optional[Tuple[torch.FloatTensor,...]] = None'
    
    * Fixed the ruff formatting issue
    
    * fixed type hinting for all hidden_states to 'Optional[Tuple[torch.FloatTensor, ...]] = None'
    
    * Changed type hinting in these 12 scripts modeling_dpr.py,modeling_nat.py,idefics/vision.py,modeling_tf_dpr.py,modeling_luke.py,modeling_swin.py,modeling_tf_swin.py,modeling_blip.py,modeling_tf_blip.py,modeling_donut_swin.py,modeling_dinat.py,modeling_swinv2.py
    
    * test fail update
    
    * fixed type hinting for these 15 scripts modeling_xlnet.py,modeling_tf_xlnet.py,modeling_led.py,modeling_tf_led.py,modleing_rwkv.py,modeling_dpt.py,modeling_tf_cvt.py,modeling_clip.py,modeling_flax_clip.py,modeling_tf_clip.py,modeling_longformer.py,modeling_tf_longformer.py,modeling_siglip.py,modeling_clap.py,modeling_git.py
    
    * Changed type hinting in these 12 scripts modeling_dpr.py,modeling_nat.py,idefics/vision.py,modeling_tf_dpr.py,modeling_luke.py,modeling_swin.py,modeling_tf_swin.py,modeling_blip.py,modeling_tf_blip.py,modeling_donut_swin.py,modeling_dinat.py,modeling_swinv2.py
    
    * test fail update
    
    * Removed the myvenv file
    
    * Fixed type hinting for these 8 scripts modeling_tvlt.py,modeling_sam.py,modeling_tf_sam.py,modeling_tvp.py,modeling_rag.py,modeling_tf_rag.py,modeling_tf_xlm.py,modeling_xlm.py
    nakranivaibhav authored Jan 24, 2024
    5d29530
  6. improve efficient training on CPU documentation (#28646)

    * update doc
    
    * revert
    
    * typo fix
    
    * refine
    
    * add dtypes
    
    * Update docs/source/en/perf_train_cpu.md
    
    Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
    
    * Update docs/source/en/perf_train_cpu.md
    
    Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
    
    * Update docs/source/en/perf_train_cpu.md
    
    Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
    
    * no comma
    
    * use avx512-vnni
    
    ---------
    
    Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
    faaany and stevhliu authored Jan 24, 2024
    8278b15
  7. [docs] Fix doc format (#28684)

    * fix hfoptions
    
    * revert changes to other files
    
    * fix
    stevhliu authored Jan 24, 2024
    f40b87d

Commits on Jan 25, 2024

  1. Add Depth Anything (#28654)

    * First draft
    
    * More improvements
    
    * More improvements
    
    * More improvements
    
    * More improvements
    
    * Add docs
    
    * Remove file
    
    * Add copied from
    
    * Address comments
    
    * Address comments
    
    * Address comments
    
    * Fix style
    
    * Update docs
    
    * Convert all checkpoints, add integration test
    
    * Rename checkpoints
    
    * Add pretrained backbone attributes
    
    * Fix default config
    
    * Address comment
    
    * Add figure to docs
    
    * Fix bug thanks to @xenova
    
    * Update conversion script
    
    * Fix integration test
    NielsRogge authored Jan 25, 2024
    963db81
  2. [chore] Add missing space in warning (#28695)

    Add missing space in warning
    tomaarsen authored Jan 25, 2024
    7fa4b36
  3. Improve Backbone API docs (#28666)

    Update backbones.md
    merveenoyan authored Jan 25, 2024
    2000095
  4. Update question_answering.md (#28694)

    fix typo:
    
    from:
    
     "model = TFAutoModelForQuestionAnswering("distilbert-base-uncased")"
    
    to:
    model = TFAutoModelForQuestionAnswering.from_pretrained("distilbert-base-uncased")
    yusyel authored Jan 25, 2024
    24f1a00
  5. 4cbd876
  6. [docs] Improve visualization for vertical parallelism (#28583)

    The documentation says "We refer to this Model parallelism as “Vertical” because of how models are typically visualized.", but then visualizes the model horizontally. This change makes the visualization vertical, matching the text.
    petergtz authored Jan 25, 2024
    2875195

Commits on Jan 26, 2024

  1. Don't fail when LocalEntryNotFoundError during `processor_config.json` loading (#28709)
    
    * fix
    
    ---------
    
    Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
    ydshieh and ydshieh authored Jan 26, 2024
    142ce68
  2. Fix duplicate & unnecessary flash attention warnings (#28557)

    * fix duplicate & unnecessary flash warnings
    
    * trigger ci
    
    * warning_once
    
    * if/else order
    
    ---------
    
    Co-authored-by: Your Name <you@example.com>
    fxmarty and Your Name authored Jan 26, 2024
    8eb74c1
  3. support PeftMixedModel signature inspect (#28321)

    * support PeftMixedModel signature inspect
    
    * import PeftMixedModel only peft>=0.7.0
    
    * Update src/transformers/trainer.py
    
    Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
    
    * Update src/transformers/trainer.py
    
    Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
    
    * Update src/transformers/trainer.py
    
    Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
    
    * Update src/transformers/trainer.py
    
    Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
    
    * Update src/transformers/trainer.py
    
    Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
    
    * Update src/transformers/trainer.py
    
    Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
    
    * fix styling
    
    * Update src/transformers/trainer.py
    
    Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
    
    * Update src/transformers/trainer.py
    
    Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
    
    * style fixup
    
    * fix note
    
    ---------
    
    Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
    Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
    3 people authored Jan 26, 2024
    bbe30c6
  4. 1f47a24
  5. [docs] Update preprocessing.md (#28719)

    * Update preprocessing.md
    
    adjust ImageProcessor link to working target (same as in lower section of file)
    
    * Update preprocessing.md
    velaia authored Jan 26, 2024
    3a46e30
  6. Initialize _tqdm_active with hf_hub_utils.are_progress_bars_disabled() to respect HF_HUB_DISABLE_PROGRESS_BARS (#28717)
    
    Initialize _tqdm_active with hf_hub_utils.are_progress_bars_disabled() to respect HF_HUB_DISABLE_PROGRESS_BARS
    
    It seems like enable_progress_bar() and disable_progress_bar() sync up with huggingface_hub, but the initial value is always True. This change makes sure the user's preference is respected implicitly on initialization.
    ShukantPal authored Jan 26, 2024
    d6ac8f4
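The change amounts to deriving the initial progress-bar flag from the environment instead of hardcoding `True`. An illustrative sketch, with a stand-in for `hf_hub_utils.are_progress_bars_disabled()` (not the actual transformers code):

```python
import os

def are_progress_bars_disabled() -> bool:
    # Stand-in for huggingface_hub's helper: treat HF_HUB_DISABLE_PROGRESS_BARS=1
    # as "disabled".
    return os.environ.get("HF_HUB_DISABLE_PROGRESS_BARS", "0") == "1"

# The module-level flag now respects the user's environment at import time
# instead of always starting as True.
_tqdm_active = not are_progress_bars_disabled()
```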
  7. Fix weights_only (#28725)

    fix
    
    Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
    ydshieh and ydshieh authored Jan 26, 2024
    a638de1
  8. Stop confusing the TF compiler with ModelOutput objects (#28712)

    * Stop confusing the TF compiler with ModelOutput objects
    
    * Stop confusing the TF compiler with ModelOutput objects
    Rocketknight1 authored Jan 26, 2024
    708b19e
  9. fix: suppress GatedRepoError to use cache file (fix #28558). (#28566)

    * fix: suppress `GatedRepoError` to use cache file (fix #28558).
    
    * move condition_to_return parameter back to outside.
    scruel authored Jan 26, 2024
    3aea38c
  10. Unpin pydantic (#28728)

    * try pydantic v2
    
    * try pydantic v2
    
    ---------
    
    Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
    ydshieh and ydshieh authored Jan 26, 2024
    f8b7c43
  11. [docs] Fix datasets in guides (#28715)

    * change datasets
    
    * fix
    stevhliu authored Jan 26, 2024
    abe0289
  12. de13a95

Commits on Jan 27, 2024

  1. a28a769
  2. 03cc177

Commits on Jan 28, 2024

  1. [Siglip] protect from imports if sentencepiece not installed (#28737)

    [Siglip] protect from imports if sentencepiece not installed
    amyeroberts authored Jan 28, 2024
    f1cc615

Commits on Jan 29, 2024

  1. Add serialization logic to pytree types (#27871)

    * Add serialized type name to pytrees
    
    * Modify context
    
    * add serde test
    angelayi authored Jan 29, 2024
    243e186
  2. Fix DepthEstimationPipeline's docstring (#28733)

    * fix
    
    * fix
    
    * Fix
    
    ---------
    
    Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
    ydshieh and ydshieh authored Jan 29, 2024
    5649c0c
  3. 39fa400
  4. [Docs] Fix Typo in English & Japanese CLIP Model Documentation (TMBD -> TMDB) (#28751)
    
    * [Docs] Fix Typo in English CLIP model_doc
    
    * [Docs] Fix Typo in Japanese CLIP model_doc
    Vinyzu authored Jan 29, 2024
    3a08cc4
  5. PatchtTST and PatchTSMixer fixes (#28083)

    * 🐛 fix .max bug
    
    * remove prediction_length from regression output dimensions
    
    * fix parameter names, fix output names, update tests
    
    * ensure shape for PatchTST
    
    * ensure output shape for PatchTSMixer
    
    * update model, batch, and expected for regression distribution test
    
    * update test expected
    
    Signed-off-by: Wesley M. Gifford <wmgifford@us.ibm.com>
    
    * Update tests/models/patchtst/test_modeling_patchtst.py
    
    Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
    
    * Update tests/models/patchtst/test_modeling_patchtst.py
    
    Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
    
    * Update tests/models/patchtst/test_modeling_patchtst.py
    
    Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
    
    * Update src/transformers/models/patchtsmixer/modeling_patchtsmixer.py
    
    Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
    
    * Update tests/models/patchtsmixer/test_modeling_patchtsmixer.py
    
    Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
    
    * Update tests/models/patchtsmixer/test_modeling_patchtsmixer.py
    
    Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
    
    * standardize on patch_length
    
    Signed-off-by: Wesley M. Gifford <wmgifford@us.ibm.com>
    
    * Update tests/models/patchtsmixer/test_modeling_patchtsmixer.py
    
    Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
    
    * Update tests/models/patchtsmixer/test_modeling_patchtsmixer.py
    
    Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
    
    * Make arguments more explicit
    
    Signed-off-by: Wesley M. Gifford <wmgifford@us.ibm.com>
    
    * adjust prepared inputs
    
    Signed-off-by: Wesley M. Gifford <wmgifford@us.ibm.com>
    
    ---------
    
    Signed-off-by: Wesley M. Gifford <wmgifford@us.ibm.com>
    Co-authored-by: Wesley M. Gifford <wmgifford@us.ibm.com>
    Co-authored-by: Kashif Rasul <kashif.rasul@gmail.com>
    Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
    4 people authored Jan 29, 2024
    f72c7c2
  6. Enable Gradient Checkpointing in Deformable DETR (#28686)

    * Enabled gradient checkpointing in Deformable DETR
    
    * Enabled gradient checkpointing in Deformable DETR encoder
    
    * Removed # Copied from headers in modeling_deta.py to break dependence on Deformable DETR code
    FoamoftheSea authored Jan 29, 2024
    0548af5
  7. 26aa03a
  8. Pin pytest version <8.0.0 (#28758)

    * Pin pytest version <8.0.0
    
    * Update setup.py
    
    * make deps_table_update
    amyeroberts authored Jan 29, 2024
    0f8d015
  9. Mark test_constrained_beam_search_generate as flaky (#28757)

    * Make test_constrained_beam_search_generate as flaky
    
    * Update tests/generation/test_utils.py
    amyeroberts authored Jan 29, 2024
    9e8f35f
  10. Fix typo of Block. (#28727)

    xkszltl authored Jan 29, 2024
    e694e98
  11. [Whisper] Make tokenizer normalization public (#28136)

    * [Whisper] Make tokenizer normalization public
    
    * add to docs
    sanchit-gandhi authored Jan 29, 2024
    da3c79b
  12. Support saving only PEFT adapter in checkpoints when using PEFT + FSDP (#28297)
    
    * Update trainer.py
    
    * Revert "Update trainer.py"
    
    This reverts commit 0557e2c.
    
    * Make trainer.py use adapter_only=True when using FSDP + PEFT
    
    * Support load_best_model with adapter_only=True
    
    * Ruff format
    
    * Inspect function args for save_ load_ fsdp utility functions and only pass adapter_only=True if they support it
    AjayP13 authored Jan 29, 2024
    a055d09
  13. Add French translation: french README.md (#28696)

    * doc: french README
    
    Signed-off-by: ThibaultLengagne <thibaultl@padok.fr>
    
    * doc: Add Depth Anything
    
    Signed-off-by: ThibaultLengagne <thibaultl@padok.fr>
    
    * doc: Add french link in other docs
    
    Signed-off-by: ThibaultLengagne <thibaultl@padok.fr>
    
    * doc: Add missing links in fr docs
    
    * doc: fix several mistakes in translation
    
    Signed-off-by: ThibaultLengagne <thibaultl@padok.fr>
    
    ---------
    
    Signed-off-by: ThibaultLengagne <thibaultl@padok.fr>
    Co-authored-by: Sarapuce <alexandreh@padok.fr>
    ThibaultLengagne and Sarapuce authored Jan 29, 2024
    cd2eb8c

Commits on Jan 30, 2024

  1. Don't allow passing load_in_8bit and load_in_4bit at the same time (#28266)
    
    * Update quantization_config.py
    
    * Style
    
    * Protect from setting directly
    
    * add tests
    
    * Update tests/quantization/bnb/test_4bit.py
    
    Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
    
    ---------
    
    Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
    osanseviero and younesbelkada authored Jan 30, 2024
    a989c6c
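The "protect from setting directly" validation amounts to making the two flags mutually exclusive at construction time. A minimal sketch with an illustrative class (not the real BitsAndBytesConfig):

```python
class QuantConfig:
    """Toy stand-in showing the mutual-exclusion check described above."""

    def __init__(self, load_in_8bit: bool = False, load_in_4bit: bool = False):
        # 8-bit and 4-bit loading cannot both be enabled at once.
        if load_in_8bit and load_in_4bit:
            raise ValueError(
                "load_in_4bit and load_in_8bit are both set, "
                "but only one can be used at the same time."
            )
        self.load_in_8bit = load_in_8bit
        self.load_in_4bit = load_in_4bit
```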
  2. Move CLIP _no_split_modules to CLIPPreTrainedModel (#27841)

    Add _no_split_modules to CLIPModel
    lz1oceani authored Jan 30, 2024
    1f5590d
  3. HfQuantizer class for quantization-related stuff in `modeling_utils.py` (#26610)
    
    * squashed earlier commits for easier rebase
    
    * rm rebase leftovers
    
    * 4bit save enabled @quantizers
    
    * TMP gptq test use exllama
    
    * fix AwqConfigTest::test_wrong_backend for A100
    
    * quantizers AWQ fixes
    
    * _load_pretrained_model low_cpu_mem_usage branch
    
    * quantizers style
    
    * remove require_low_cpu_mem_usage attr
    
    * rm dtype arg from process_model_before_weight_loading
    
    * rm config_origin from Q-config
    
    * rm inspect from q_config
    
    * fixed docstrings in QuantizationConfigParser
    
    * logger.warning fix
    
    * mv is_loaded_in_4(8)bit to BnbHFQuantizer
    
    * is_accelerate_available error msg fix in quantizer
    
    * split is_model_trainable in bnb quantizer class
    
    * rm llm_int8_skip_modules as separate var in Q
    
    * Q rm todo
    
    * fwd ref to HFQuantizer in type hint
    
    * rm note re optimum.gptq.GPTQQuantizer
    
    * quantization_config in __init__ simplified
    
    * replaced NonImplemented with  create_quantized_param
    
    * rm load_in_4/8_bit deprecation warning
    
    * QuantizationConfigParser refactoring
    
    * awq-related minor changes
    
    * awq-related changes
    
    * awq config.modules_to_not_convert
    
    * raise error if no q-method in q-config in args
    
    * minor cleanup
    
    * awq quantizer docstring
    
    * combine common parts in bnb process_model_before_weight_loading
    
    * revert test_gptq
    
    * .process_model_ cleanup
    
    * restore dict config warning
    
    * removed typevars in quantizers.py
    
    * cleanup post-rebase 16 jan
    
    * QuantizationConfigParser classmethod refactor
    
    * rework of handling of unexpected aux elements of bnb weights
    
    * moved q-related stuff from save_pretrained to quantizers
    
    * refactor v1
    
    * more changes
    
    * fix some tests
    
    * remove it from main init
    
    * ooops
    
    * Apply suggestions from code review
    
    Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
    
    * fix awq issues
    
    * fix
    
    * fix
    
    * fix
    
    * fix
    
    * fix
    
    * fix
    
    * add docs
    
    * Apply suggestions from code review
    
    Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
    Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
    
    * Apply suggestions from code review
    
    Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
    
    * Update docs/source/en/hf_quantizer.md
    
    * address comments
    
    * fix
    
    * fixup
    
    * Update src/transformers/modeling_utils.py
    
    Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
    
    * Update src/transformers/modeling_utils.py
    
    Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
    
    * address final comment
    
    * update
    
    * Update src/transformers/quantizers/base.py
    
    Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
    
    * Update src/transformers/quantizers/auto.py
    
    Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
    
    * fix
    
    * add kwargs update
    
    * fixup
    
    * add `optimum_quantizer` attribute
    
    * oops
    
    * rm unneeded file
    
    * fix doctests
    
    ---------
    
    Co-authored-by: younesbelkada <younesbelkada@gmail.com>
    Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
    Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
    Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
    Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
    6 people authored Jan 30, 2024
    d78e78a
  4. [HfQuantizer] Move it to "Developper guides" (#28768)

    Update _toctree.yml
    younesbelkada authored Jan 30, 2024
    866253f
  5. Use Conv1d for TDNN (#25728)

    * use conv for tdnn
    
    * run make fixup
    
    * update TDNN
    
    * add PEFT LoRA check
    
    * propagate tdnn warnings to others
    
    * add missing imports
    
    * update TDNN in wav2vec2_bert
    
    * add missing imports
    gau-nernst authored Jan 30, 2024
    5c8d941
  6. Fix transformers.utils.fx compatibility with torch<2.0 (#28774)

    guard sdpa on torch>=2.0
    fxmarty authored Jan 30, 2024
    6f7d5db
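"guard sdpa on torch>=2.0" means gating SDPA-related code behind a version check so torch<2.0 still imports cleanly. A rough sketch of such a guard with a hand-rolled version comparison (illustrative; the real code may use `packaging.version` instead):

```python
def is_torch_at_least(version_str: str, minimum=(2, 0)) -> bool:
    # Compare only the (major, minor) components; strip local/build suffixes
    # such as "2.1.0+cu118" before parsing.
    parts = version_str.split("+")[0].split(".")
    major, minor = int(parts[0]), int(parts[1])
    return (major, minor) >= minimum

# Usage sketch: only wire up SDPA when the installed torch supports it.
# if is_torch_at_least(torch.__version__):
#     ...register scaled_dot_product_attention handling...
```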
  7. Further pin pytest version (in a temporary way) (#28780)

    fix
    
    Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
    ydshieh and ydshieh authored Jan 30, 2024
    c24c524
  8. [Backbone] Use load_backbone instead of AutoBackbone.from_config (#28661)
    
    * Enable instantiating model with pretrained backbone weights
    
    * Remove doc updates until changes made in modeling code
    
    * Use load_backbone instead
    
    * Add use_timm_backbone to the model configs
    
    * Add missing imports and arguments
    
    * Update docstrings
    
    * Make sure test is properly configured
    
    * Include recent DPT updates
    amyeroberts authored Jan 30, 2024
    2fa1c80
  9. Task-specific pipeline init args (#28439)

    * Abstract out pipeline init args
    
    * Address PR comments
    
    * Reword
    
    * BC PIPELINE_INIT_ARGS
    
    * Remove old arguments
    
    * Small fix
    amyeroberts authored Jan 30, 2024
    1d489b3
  10. Add tf_keras imports to prepare for Keras 3 (#28588)

    * Port core files + ESM (because ESM code is odd)
    
    * Search-replace in modelling code
    
    * Fix up transfo_xl as well
    
    * Fix other core files + tests (still need to add correct import to tests)
    
    * Fix cookiecutter
    
    * make fixup, fix imports in some more core files
    
    * Auto-add imports to tests
    
    * Cleanup, add imports to sagemaker tests
    
    * Use correct exception for importing tf_keras
    
    * Fixes in modeling_tf_utils
    
    * make fixup
    
    * Correct version parsing code
    
    * Ensure the pipeline tests correctly revert to float32 after each test
    
    * Ensure the pipeline tests correctly revert to float32 after each test
    
    * More tf.keras -> keras
    
    * Add dtype cast
    
    * Better imports of tf_keras
    
    * Add a cast for tf.assign, just in case
    
    * Fix callback imports
    Rocketknight1 authored Jan 30, 2024
    415e9a0
  11. Pin Torch to <2.2.0 (#28785)

    * Pin torch to <2.2.0
    
    * Pin torchvision and torchaudio as well
    
    * Playing around with versions to see if this helps
    
    * twiddle something to restart the CI
    
    * twiddle it back
    
    * Try changing the natten version
    
    * make fixup
    
    * Revert "Try changing the natten version"
    
    This reverts commit de0d659.
    
    * make fixup
    
    * fix fix fix
    
    * fix fix fix
    
    ---------
    
    Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
    Rocketknight1 and ydshieh authored Jan 30, 2024
    74c9cfe

Commits on Jan 31, 2024

  1. [bnb] Fix bnb slow tests (#28788)

    fix bnb slow tests
    younesbelkada authored Jan 31, 2024
    d703eaa
  2. Prevent MLflow exception from disrupting training (#28779)

    Modified MLflow logging metrics from synchronous to asynchronous
    
    Co-authored-by: codiceSpaghetti <alessio.ser@hotmail.it>
    codiceSpaghetti and codiceSpaghetti authored Jan 31, 2024
    a937425
  3. don't initialize the output embeddings if we're going to tie them to input embeddings (#28192)
    
    * test that tied output embeddings aren't initialized on load
    
    * don't initialize the output embeddings if we're going to tie them to the input embeddings
    tom-p-reichel authored Jan 31, 2024
    ae0c27a
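Weight tying means the output projection shares storage with the input embedding matrix, so initializing the output weights separately is wasted work (and the tested behavior is that no separate initialization happens on load). A toy sketch of the sharing, not the actual transformers implementation:

```python
class TinyModel:
    """Illustrative model showing tied vs. untied output embeddings."""

    def __init__(self, vocab_size: int, dim: int, tie_word_embeddings: bool = True):
        self.input_embeddings = [[0.0] * dim for _ in range(vocab_size)]
        if tie_word_embeddings:
            # Share storage: no separate initialization of the output weights.
            self.output_embeddings = self.input_embeddings
        else:
            self.output_embeddings = [[0.0] * dim for _ in range(vocab_size)]
```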
  4. [HFQuantizer] Remove check_packages_compatibility logic (#28789)

    remove `check_packages_compatibility` logic
    younesbelkada authored Jan 31, 2024
    f9f1f2a
  5. [Whisper] Refactor forced_decoder_ids & prompt ids (#28687)

    * up
    
    * Fix more
    
    * Correct more
    
    * Fix more tests
    
    * fix fast tests
    
    * Fix more
    
    * fix more
    
    * push all files
    
    * finish all
    
    * make style
    
    * Fix timestamp wrap
    
    * make style
    
    * make style
    
    * up
    
    * up
    
    * up
    
    * Fix lang detection behavior
    
    * Fix lang detection behavior
    
    * Add lang detection test
    
    * Fix lang detection behavior
    
    * make style
    
    * Update src/transformers/models/whisper/generation_whisper.py
    
    Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>
    
    * better error message
    
    * make style tests
    
    * add warning
    
    ---------
    
    Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>
    patrickvonplaten and sanchit-gandhi authored Jan 31, 2024
    65a926e
  6. Resolve DeepSpeed cannot resume training with PeftModel (#28746)

    * fix: resolve deepspeed resume peft model issues
    
    * chore: update something
    
    * chore: update model instance pass into is peft model checks
    
    * chore: remove hard code value to tests
    
    * fix: format code
    lh0x00 authored Jan 31, 2024
    bebeeee
  7. canonical repos moves (#28795)

    * canonical repos moves
    
    * Style
    
    ---------
    
    Co-authored-by: Lysandre <lysandre@huggingface.co>
    julien-c and LysandreJik authored Jan 31, 2024
    721e2d9
  8. Wrap Keras methods to support BatchEncoding (#28734)

    * Shim the Keras methods to support BatchEncoding
    
    * Extract everything to a convert_batch_encoding function
    
    * Convert BatchFeature too (thanks Amy)
    
    * tf.keras -> keras
    Rocketknight1 authored Jan 31, 2024
    7a49610
  9. Flax mistral (#26943)

    * direct copy from llama work
    
    * mistral modules forward pass working
    
    * flax mistral forward pass with sliding window
    
    * added tests
    
    * added layer collection approach
    
    * Revert "added layer collection approach"
    
    This reverts commit 0e2905b.
    
    * Revert "Revert "added layer collection approach""
    
    This reverts commit fb17b61.
    
    * fixed attention outputs
    
    * added mistral to init and auto
    
    * fixed import name
    
    * fixed layernorm weight dtype
    
    * freeze initialized weights
    
    * make sure conversion consideres bfloat16
    
    * added backend
    
    * added docstrings
    
    * added cache
    
    * fixed sliding window causal mask
    
    * passes cache tests
    
    * passed all tests
    
    * applied make style
    
    * removed commented out code
    
    * applied fix-copies ignored other model changes
    
    * applied make fix-copies
    
    * removed unused functions
    
    * passed generation integration test
    
    * slow tests pass
    
    * fixed slow tests
    
    * changed default dtype from jax.numpy.float32 to float32 for docstring check
    
    * skip cache test  for FlaxMistralForSequenceClassification since if pad_token_id in input_ids it doesn't score previous input_ids
    
    * updated checkpoint since from_pt not included
    
    * applied black style
    
    * removed unused args
    
    * Applied styling and fixup
    
    * changed checkpoint for doc back
    
    * fixed rf after adding it to hf hub
    
    * Add dummy ckpt
    
    * applied styling
    
    * added tokenizer to new ckpt
    
    * fixed slice format
    
    * fix init and slice
    
    * changed ref for placeholder TODO
    
    * added copies from Llama
    
    * applied styling
    
    * applied fix-copies
    
    * fixed docs
    
    * update weight dtype reconversion for sharded weights
    
    * removed Nullable input ids
    
    * Removed unnecessary output attentions in Module
    
    * added embedding weight initialziation
    
    * removed unused past_key_values
    
    * fixed deterministic
    
    * Fixed RMS Norm and added copied from
    
    * removed input_embeds
    
    * applied make style
    
    * removed nullable input ids from sequence classification model
    
    * added copied from GPTJ
    
    * added copied from Llama on FlaxMistralDecoderLayer
    
    * added copied from to FlaxMistralPreTrainedModel methods
    
    * fix test deprecation warning
    
    * freeze gpt neox random_params and fix copies
    
    * applied make style
    
    * fixed doc issue
    
    * skipped docstring test to allign # copied from
    
    * applied make style
    
    * removed FlaxMistralForSequenceClassification
    
    * removed unused padding_idx
    
    * removed more sequence classification
    
    * removed sequence classification
    
    * applied styling and consistency
    
    * added copied from in tests
    
    * removed sequence classification test logic
    
    * applied styling
    
    * applied make style
    
    * removed freeze and fixed copies
    
    * undo test change
    
    * changed repeat_kv to tile
    
    * fixed to key value groups
    
    * updated copyright year
    
    * split casual_mask
    
    * empty to rerun failed pt_flax_equivalence test FlaxWav2Vec2ModelTest
    
    * went back to 2023 for tests_pr_documentation_tests
    
    * went back to 2024
    
    * changed tile to repeat
    
    * applied make style
    
    * empty for retry on Wav2Vec2
    kiansierra authored Jan 31, 2024
    f7076cd
  10. DeepSpeed: hardcode torch.arange dtype on float usage to avoid incorrect initialization (#28760)
    gante authored Jan 31, 2024
    beb2a09
  11. Add artifact name in job step to maintain job / artifact correspondence (#28682)
    
    * avoid using job name
    
    * apply to other files
    
    ---------
    
    Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
    ydshieh and ydshieh authored Jan 31, 2024
    95346e9
  12. Split daily CI using 2 level matrix (#28773)

    * update / add new workflow files
    
    * Add comment
    
    * Use env.NUM_SLICES
    
    * use scripts
    
    * use scripts
    
    * use scripts
    
    * Fix
    
    * using one script
    
    * Fix
    
    * remove unused file
    
    * update
    
    * fail-fast: false
    
    * remove unused file
    
    * fix
    
    * fix
    
    * use matrix
    
    * inputs
    
    * style
    
    * update
    
    * fix
    
    * fix
    
    * no model name
    
    * add doc
    
    * allow args
    
    * style
    
    * pass argument
    
    ---------
    
    Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
    ydshieh and ydshieh authored Jan 31, 2024
    4735866
  13. [docs] Correct the statement in the docstring of compute_transition_scores in generation/utils.py (#28786)
    Ki-Seki authored Jan 31, 2024
    7b2bd1f

Commits on Feb 1, 2024

  1. Adding [T5/MT5/UMT5]ForTokenClassification (#28443)

    * Adding [T5/MT5/UMT5]ForTokenClassification
    
    * Add auto mappings for T5ForTokenClassification and variants
    
    * Adding ForTokenClassification to the list of models
    
    * Adding attention_mask param to the T5ForTokenClassification test
    
    * Remove outdated comment in test
    
    * Adding EncoderOnly and Token Classification tests for MT5 and UMT5
    
    * Fix typo in umt5 string
    
    * Add tests for all the existing MT5 models
    
    * Fix wrong comment in dependency_versions_table
    
    * Reverting change to common test for _keys_to_ignore_on_load_missing
    
    The test is correctly picking up redundant keys in _keys_to_ignore_on_load_missing.
    
    * Removing _keys_to_ignore_on_missing from MT5 since the key is not used in the model
    
    * Add fix-copies to MT5ModelTest
    hackyon authored Feb 1, 2024
    0d26abd
  2. Make is_torch_bf16_available_on_device more strict (#28796)

    fix
    
    Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
    ydshieh and ydshieh authored Feb 1, 2024
    eb8e7a0
  3. Fix symbolic_trace with kv cache (#28724)

    * fix symbolic_trace with kv cache
    
    * comment & better test
    fxmarty authored Feb 1, 2024
    709dc43
  4. Add tip on setting tokenizer attributes (#28764)

    * Add tip on setting tokenizer attributes
    
    * Grammar
    
    * Remove the bit that was causing doc builds to fail
    Rocketknight1 authored Feb 1, 2024
    7bc6d76
  5. enable gradient checkpointing in DetaObjectDetection and add tests in Swin/Donut_Swin (#28615)
    
    * enable graident checkpointing in DetaObjectDetection
    
    * fix missing part in original DETA
    
    * make style
    
    * make fix-copies
    
    * Revert "make fix-copies"
    
    This reverts commit 4041c86.
    
    * remove fix-copies of DetaDecoder
    
    * enable swin gradient checkpointing
    
    * fix gradient checkpointing in donut_swin
    
    * add tests for deta/swin/donut
    
    * Revert "fix gradient checkpointing in donut_swin"
    
    This reverts commit 1cf345e.
    
    * change supports_gradient_checkpointing pipeline to PreTrainedModel
    
    * Revert "add tests for deta/swin/donut"
    
    This reverts commit 6056ffb.
    
    * Revert "Revert "fix gradient checkpointing in donut_swin""
    
    This reverts commit 24e25d0.
    
    * Simple revert
    
    * enable deformable detr gradient checkpointing
    
    * add gradient in encoder
    SangbumChoi authored Feb 1, 2024
    e19c12e
  6. [docs] fix some bugs about parameter description (#28806)

    Co-authored-by: p_spozzhang <p_spozzhang@tencent.com>
    zspo and p_spozzhang authored Feb 1, 2024
    d98591a
  7. Add models from deit (#28302)

    * Add modelss
    
    * Add 2 more models
    
    * add models to tocrree
    
    * Add modles
    
    * Update docs/source/ja/model_doc/detr.md
    
    Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
    
    * Update docs/source/ja/model_doc/deit.md
    
    Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
    
    * Update docs/source/ja/model_doc/deplot.md
    
    Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
    
    * fix bugs
    
    ---------
    
    Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
    rajveer43 and stevhliu authored Feb 1, 2024
    23ea674
  8. [docs] Backbone (#28739)

    * backbones
    
    * fix path
    
    * fix paths
    
    * fix code snippet
    
    * fix links
    stevhliu authored Feb 1, 2024
    abbffc4

Commits on Feb 2, 2024

  1. [docs] HfQuantizer (#28820)

    * tidy
    
    * fix path
    stevhliu authored Feb 2, 2024
    2418c64
  2. [Docs] Fix spelling and grammar mistakes (#28825)

    * Fix typos and grammar mistakes in docs and examples
    
    * Fix typos in docstrings and comments
    
    * Fix spelling of `tokenizer` in model tests
    
    * Remove erroneous spaces in decorators
    
    * Remove extra spaces in Markdown link texts
    khipp authored Feb 2, 2024
    721ee78
  3. Explicitly check if token IDs are None in TFBertTokenizer constructor (#28824)
    
    Add an explicit none-check, since token ids can be 0
    skumar951 authored Feb 2, 2024
    1efb21c
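The fix in #28824 exists because `0` is falsy in Python yet is a perfectly valid token ID (pad tokens are often ID 0), so a truthiness check silently treats it as "unset". A minimal illustration of the pitfall and the explicit `is None` check (variable names are illustrative, not the TFBertTokenizer code):

```python
# 0 is a legitimate token ID, but `if not pad_token_id:` treats it
# exactly like None.
pad_token_id = 0

# Buggy pattern: a real ID of 0 is mistaken for a missing value.
if not pad_token_id:
    resolved_buggy = "missing"
else:
    resolved_buggy = pad_token_id

# Fixed pattern: only an actual None means "unset".
if pad_token_id is None:
    resolved_fixed = "missing"
else:
    resolved_fixed = pad_token_id
```

With `pad_token_id = 0`, the buggy branch yields `"missing"` while the fixed branch correctly keeps `0`.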
  4. Add missing None check for hf_quantizer (#28804)

    * Add missing None check for hf_quantizer
    
    * Add test, fix logic.
    
    * make style
    
    * Switch test model to Mistral
    
    * Comment
    
    * Update tests/test_modeling_utils.py
    
    ---------
    
    Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
    jganitkevitch and younesbelkada authored Feb 2, 2024
    ec29d25
  5. Fix issues caused by natten (#28834)

    try
    
    Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
    ydshieh and ydshieh authored Feb 2, 2024
    0e75aee
  6. fix / skip (for now) some tests before switch to torch 2.2 (#28838)

    * fix / skip some tests before we can switch to torch 2.2
    
    * style
    
    ---------
    
    Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
    ydshieh and ydshieh authored Feb 2, 2024
    a7cb92a
  7. Use -v for pytest on CircleCI (#28840)

    use -v in pytest
    
    Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
    ydshieh and ydshieh authored Feb 2, 2024
    f497795
  8. Reduce GPU memory usage when using FSDP+PEFT (#28830)

    support FSDP+PEFT
    pacman100 authored Feb 2, 2024
    80d5007
  9. Mark test_encoder_decoder_model_generate for `vision_encoder_decoder` as flaky (#28842)
    
    Mark test as flaky
    amyeroberts authored Feb 2, 2024
    3d2900e

Commits on Feb 5, 2024

  1. Bump dash from 2.3.0 to 2.15.0 in /examples/research_projects/decision_transformer (#28845)
    
    Bump dash in /examples/research_projects/decision_transformer
    
    Bumps [dash](https://github.com/plotly/dash) from 2.3.0 to 2.15.0.
    - [Release notes](https://github.com/plotly/dash/releases)
    - [Changelog](https://github.com/plotly/dash/blob/dev/CHANGELOG.md)
    - [Commits](plotly/dash@v2.3.0...v2.15.0)
    
    ---
    updated-dependencies:
    - dependency-name: dash
      dependency-type: direct:production
    ...
    
    Signed-off-by: dependabot[bot] <support@github.com>
    Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
    dependabot[bot] authored Feb 5, 2024
    ca8944c
  2. Support custom scheduler in deepspeed training (#26831)

    Reuse trainer.create_scheduler to create scheduler for deepspeed
    VeryLazyBoy authored Feb 5, 2024
    7b70283
  3. [Docs] Fix bad doc: replace save with logging (#28855)

    Fix bad doc: replace save with logging
    chenzizhao authored Feb 5, 2024
    c430d6e
  4. Ability to override clean_code_for_run (#28783)

    * Add clean_code_for_run function
    
    * Call clean_code_for_run from agent method
    w4ffl35 authored Feb 5, 2024
    0466fd5
  5. [WIP] Hard error when ignoring tensors. (#27484)

    * [WIP] Hard error when ignoring tensors.
    
    * Better selection/error when saving a checkpoint.
    
    - Find all names we should normally drop (those are in the transformers
      config)
    - Find all disjoint tensors (for those we can safely trigger a copy to
      get rid of the sharing before saving)
    - Clone those disjoint tensors getting rid of the issue
    - Find all identical names (those should be declared in the config
      but we try to find them all anyway.)
    - For all identical names:
      - If they are in the config, just ignore them everything is fine
      - If they are not, warn about them.
    - For all remainder tensors which are shared yet neither identical NOR
      disjoint. raise a hard error.
    
    * Adding a failing test on `main` that passes here.
    
    * We don't need to keep the subfolder logic in this test.
    
    * Apply suggestions from code review
    
    Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
    
    ---------
    
    Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
    Narsil and ArthurZucker authored Feb 5, 2024
    2da28c4
  6. [Doc] update contribution guidelines (#28858)

    update guidelines
    ArthurZucker authored Feb 5, 2024
    3f9f749
  7. Correct wav2vec2-bert inputs_to_logits_ratio (#28821)

    * Correct wav2vec2-bert inputs_to_logits_ratio
    
    * correct ratio
    
    * correct ratio, clean asr pipeline
    
    * refactor on one line
    ylacombe authored Feb 5, 2024
    7addc93
  8. Image Feature Extraction pipeline (#28216)

    * Draft pipeline
    
    * Fixup
    
    * Fix docstrings
    
    * Update doctest
    
    * Update pipeline_model_mapping
    
    * Update docstring
    
    * Update tests
    
    * Update src/transformers/pipelines/image_feature_extraction.py
    
    Co-authored-by: Omar Sanseviero <osanseviero@gmail.com>
    
    * Fix docstrings - review comments
    
    * Remove pipeline mapping for composite vision models
    
    * Add to pipeline tests
    
    * Remove for flava (multimodal)
    
    * safe pil import
    
    * Add requirements for pipeline run
    
    * Account for super slow efficientnet
    
    * Review comments
    
    * Fix tests
    
    * Swap order of kwargs
    
    * Use build_pipeline_init_args
    
    * Add back FE pipeline for Vilt
    
    * Include image_processor_kwargs in docstring
    
    * Mark test as flaky
    
    * Update TODO
    
    * Update tests/pipelines/test_pipelines_image_feature_extraction.py
    
    Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
    
    * Add license header
    
    ---------
    
    Co-authored-by: Omar Sanseviero <osanseviero@gmail.com>
    Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
    3 people authored Feb 5, 2024
    ba3264b
  9. ClearMLCallback enhancements: support multiple runs and handle logging better (#28559)
    
    * add clearml tracker
    
    * support multiple train runs
    
    * remove bad code
    
    * add UI entries for config/hparams overrides
    
    * handle models in different tasks
    
    * run ruff format
    
    * tidy code based on code review
    
    ---------
    
    Co-authored-by: Eugen Ajechiloae <eugenajechiloae@gmail.com>
    eugen-ajechiloae-clearml and ajecc authored Feb 5, 2024
    0690116

Commits on Feb 6, 2024

  1. ac51e59
  2. Adds LlamaForQuestionAnswering class in modeling_llama.py along with AutoModel Support (#28777)
    
    * This is a test commit
    
    * testing commit
    
    * final commit with some changes
    
    * Removed copy statement
    
    * Fixed formatting issues
    
    * Fixed error added past_key_values in the forward method
    
    * Fixed a trailing whitespace. Damn the formatting rules are strict
    
    * Added the copy statement
    nakranivaibhav authored Feb 6, 2024
    2e7c942
  3. Bump cryptography from 41.0.2 to 42.0.0 in /examples/research_projects/decision_transformer (#28879)
    
    Bump cryptography in /examples/research_projects/decision_transformer
    
    Bumps [cryptography](https://github.com/pyca/cryptography) from 41.0.2 to 42.0.0.
    - [Changelog](https://github.com/pyca/cryptography/blob/main/CHANGELOG.rst)
    - [Commits](pyca/cryptography@41.0.2...42.0.0)
    
    ---
    updated-dependencies:
    - dependency-name: cryptography
      dependency-type: direct:production
    ...
    
    Signed-off-by: dependabot[bot] <support@github.com>
    Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
    dependabot[bot] authored Feb 6, 2024
    e83227d
  4. [Docs] Update project names and links in awesome-transformers (#28878)

    Update project names and repository links in awesome-transformers
    khipp authored Feb 6, 2024
    1ea0bbd
  5. ee2a340
  6. Raise error when using save_only_model with `load_best_model_at_end` for DeepSpeed/FSDP (#28866)
    
    * Raise error when using `save_only_model` with `load_best_model_at_end` for DeepSpeed/FSDP
    
    * Update trainer.py
    pacman100 authored Feb 6, 2024
    5346db1
  7. Fix FastSpeech2ConformerModelTest and skip it on CPU (#28888)

    * fix
    
    * fix
    
    ---------
    
    Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
    ydshieh and ydshieh authored Feb 6, 2024
    6529a5b
  8. Revert "[WIP] Hard error when ignoring tensors." (#28898)

    Revert "[WIP] Hard error when ignoring tensors. (#27484)"
    
    This reverts commit 2da28c4.
    ydshieh authored Feb 6, 2024
    76b4f66
  9. unpin torch (#28892)

    * unpin torch
    
    * check
    
    * check
    
    * check
    
    ---------
    
    Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
    ydshieh and ydshieh authored Feb 6, 2024
    89439fe
  10. a1afec9
  11. [Docs] Fix backticks in inline code and documentation links (#28875)

    Fix backticks in code blocks and documentation links
    khipp authored Feb 6, 2024
    4830f26
  12. Hotfix - make torchaudio get the correct version in `torch_and_flax_job` (#28899)
    
    * check
    
    * check
    
    * check
    
    ---------
    
    Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
    ydshieh and ydshieh authored Feb 6, 2024
    40658be
  13. [Docs] Add missing language options and fix broken links (#28852)

    * Add missing entries to the language selector
    
    * Add links to the Colab and AWS Studio notebooks for ONNX
    
    * Use anchor links in CONTRIBUTING.md
    
    * Fix broken hyperlinks due to spaces
    
    * Fix links to OpenAI research articles
    
    * Remove confusing footnote symbols from author names, as they are also considered invalid markup
    khipp authored Feb 6, 2024
    1c31b7a

Commits on Feb 7, 2024

  1. fix: Fixed the documentation for logging_first_step by removing "evaluate" (#28884)
    
    Fixed the documentation for logging_first_step by removing evaluate.
    Sai-Suraj-27 authored Feb 7, 2024
    64d1518
  2. d9deddb
  3. Fix Keras scheduler import so it works for older versions of Keras (#28895)
    
    Fix our schedule import so it works for older versions of Keras
    Rocketknight1 authored Feb 7, 2024
    349a6e8
  4. ⚠️ Raise Exception when trying to generate 0 tokens ⚠️ (#28621)

    * change warning to exception
    
    * Update src/transformers/generation/utils.py
    
    Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
    
    * validate `max_new_tokens` > 0 in `GenerationConfig`
    
    * fix truncation test parameterization in `TextGenerationPipelineTests`
    
    ---------
    
    Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
    danielkorat and gante authored Feb 7, 2024
    abf8f54
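The change in #28621 above turns a silent warning into a hard error when a caller asks for zero new tokens. A rough sketch of the validation it describes — `validate_max_new_tokens` is a hypothetical stand-in for the check inside `GenerationConfig`, not the actual transformers code:

```python
def validate_max_new_tokens(max_new_tokens):
    """Hypothetical stand-in for the GenerationConfig validation:
    requesting zero (or negative) new tokens raises a ValueError
    instead of emitting a warning that is easy to miss."""
    if max_new_tokens is not None and max_new_tokens < 1:
        raise ValueError(
            f"`max_new_tokens` must be greater than 0, but is {max_new_tokens}."
        )
    return max_new_tokens

validate_max_new_tokens(16)   # a positive budget passes through unchanged
try:
    validate_max_new_tokens(0)  # a zero budget is now a hard error
except ValueError as err:
    print(err)
```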
  5. Update the cache number (#28905)

    * fix
    
    * fix
    
    * fix
    
    ---------
    
    Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
    ydshieh and ydshieh authored Feb 7, 2024
    308d2b9
  6. Add npu device for pipeline (#28885)

    add npu device for pipeline
    
    Co-authored-by: unit_test <test@unit.com>
    statelesshz and unit_test authored Feb 7, 2024
    5f96855

Commits on Feb 8, 2024

  1. [Docs] Fix placement of tilde character (#28913)

    Fix placement of tilde character
    khipp authored Feb 8, 2024
    328ade8
  2. 33df036
  3. Fix utf-8 yaml load for marian conversion to pytorch in Windows (#28618)

    Fix utf-8 yaml in marian conversion
    SystemPanic authored Feb 8, 2024
    4b236ae
  4. [Core generation] Adds support for static KV cache (#27931)

    Co-authored-by: fxmarty <9808326+fxmarty@users.noreply.github.com>
    Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
    Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
    4 people authored Feb 8, 2024
    115ac94
  5. Remove dead TF loading code (#28926)

    Remove dead code
    Rocketknight1 authored Feb 8, 2024
    693667b
  6. 0b693e9
  7. cc309fd
  8. Support batched input for decoder start ids (#28887)

    * support batched input for decoder start ids
    
    * Fix typos
    
    Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
    
    * minor changes
    
    * fix: decoder_start_id as list
    
    * empty commit
    
    * empty commit
    
    * empty commit
    
    * empty commit
    
    * empty commit
    
    * empty commit
    
    * empty commit
    
    * empty commit
    
    * empty commit
    
    ---------
    
    Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
    zucchini-nlp and gante authored Feb 8, 2024
    d628664
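#28887 above lets the decoder start ID be supplied per batch element instead of as a single integer. A hedged sketch of the normalization involved — `expand_decoder_start_ids` is a hypothetical helper for illustration; the real implementation operates on tensors rather than Python lists:

```python
def expand_decoder_start_ids(decoder_start_token_id, batch_size):
    """Hypothetical normalization: accept either a single token ID or
    one ID per batch element, and always return one ID per element."""
    if isinstance(decoder_start_token_id, int):
        # Broadcast a scalar ID across the whole batch.
        return [decoder_start_token_id] * batch_size
    if len(decoder_start_token_id) != batch_size:
        raise ValueError(
            f"Expected {batch_size} decoder start ids, "
            f"got {len(decoder_start_token_id)}."
        )
    return list(decoder_start_token_id)

print(expand_decoder_start_ids(2, 3))        # scalar broadcast: [2, 2, 2]
print(expand_decoder_start_ids([4, 5, 6], 3))  # per-example list kept as-is
```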
  9. [Docs] Fix broken links and syntax issues (#28918)

    * Fix model documentation links in attention.md
    
    * Fix external link syntax
    
    * Fix target anchor names of section links
    
    * Fix copyright statement comments
    
    * Fix documentation headings
    khipp authored Feb 8, 2024
    2749e47

Commits on Feb 9, 2024

  1. Fix max_position_embeddings default value for llama2 to 4096 #28241 (#28754)
    
    * Changed max_position_embeddings default value from 2048 to 4096
    
    * force push
    
    * Fixed formatting issues. Fixed missing argument in write_model.
    
    * Reverted to the default value 2048 in the Llama config. Added comments for the llama_version argument.
    
    * Fixed issue with default value value of max_position_embeddings in docstring
    
    * Updated help message for llama versions
    
    Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
    
    ---------
    
    Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
    karl-hajjar and amyeroberts authored Feb 9, 2024
    Configuration menu
    Copy the full SHA
    de11e65 View commit details
    Browse the repository at this point in the history
  2. Configuration menu
    Copy the full SHA
    ebf3ea2 View commit details
    Browse the repository at this point in the history
  3. Commit d123e66
  4. [i18n-de] Translate README.md to German (#28933)

    * Translate README.md to German
    
    * Add links to README_de.md
    
    * Remove invisible characters in README
    
    * Change to a formal tone and fix punctuation marks
    khipp authored Feb 9, 2024 (commit 58e3d23)

Commits on Feb 12, 2024

  1. [Nougat] Fix pipeline (#28242)

    * Fix pipeline
    
    * Remove print statements
    
    * Address comments
    
    * Address issue
    
    * Remove unused imports
    NielsRogge authored Feb 12, 2024 (commit f278ef2)
  2. [Docs] Update README and default pipelines (#28864)

    * Update README and docs
    
    * Update README
    
    * Update README
    NielsRogge authored Feb 12, 2024 (commit ef5ab72)
  3. Convert torch_dtype as str to actual torch data type (i.e. "float16" to `torch.float16`) (#28208)
    
    * Convert torch_dtype as str to actual torch data type (i.e. "float16" to torch.float16)
    
    * Check if passed torch_dtype is an attribute in torch
    
    * Update src/transformers/pipelines/__init__.py
    
    Check type via isinstance
    
    Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
    
    ---------
    
    Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
    KossaiSbai and amyeroberts authored Feb 12, 2024 (commit cf4c20b)
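The conversion this commit describes can be sketched roughly as follows. This is a hedged, framework-free illustration: `fake_torch` and `resolve_torch_dtype` are stand-ins for the real torch module and pipeline code, not the actual transformers implementation.

```python
import types

# Stand-in namespace so the sketch runs without torch installed; in the real
# code the attribute lookup would be against the torch module itself.
fake_torch = types.SimpleNamespace(float16="torch.float16", float32="torch.float32")

def resolve_torch_dtype(torch_dtype, torch_module=fake_torch):
    """Resolve a dtype passed as a string (e.g. "float16") to the dtype object."""
    if isinstance(torch_dtype, str) and hasattr(torch_module, torch_dtype):
        return getattr(torch_module, torch_dtype)
    return torch_dtype  # already a dtype object, or left for later validation

print(resolve_torch_dtype("float16"))  # torch.float16
```

The `hasattr` check mirrors the review suggestion above: only strings that name a real attribute on the module are converted; everything else is passed through untouched.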
  4. [pipelines] updated docstring with vqa alias (#28951)

    updated docstring with vqa alias
    cmahmut authored Feb 12, 2024 (commit 1709886)
  5. Commit e30bbb2
  6. Updated requirements for image-classification samples: datasets>=2.14.0 (#28974)
    
    Updated datasets requirements. Need a package version >= 2.14.0
    alekseyfa authored Feb 12, 2024 (commit 792819f)
  7. Always initialize tied output_embeddings if it has a bias term (#28947)

    Continue to initialize tied output_embeddings if it has a bias term
    
    The bias term is not tied, and so will need to be initialized accordingly.
    hackyon authored Feb 12, 2024 (commit 136cd89)
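A framework-free sketch of the tying behaviour this fix addresses. The class and function names below are illustrative only: the head's weight is shared with the input embedding, while the bias has no input-side counterpart and must therefore be initialized on its own.

```python
class InputEmbedding:
    def __init__(self, weight):
        self.weight = weight

class LMHead:
    def __init__(self, weight, bias):
        self.weight = weight
        self.bias = bias

def tie_weights(input_emb, head):
    # Tie: the head now shares the very same weight object as the embedding.
    head.weight = input_emb.weight
    # The bias is deliberately NOT tied -- it stays a separate per-head
    # parameter and needs its own initialization.

emb = InputEmbedding(weight=[1.0, 2.0, 3.0])
head = LMHead(weight=[0.0, 0.0, 0.0], bias=[0.5, 0.5, 0.5])
tie_weights(emb, head)
assert head.weight is emb.weight      # shared storage after tying
assert head.bias == [0.5, 0.5, 0.5]   # untouched by tying
```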
  8. Clean up staging tmp checkpoint directory (#28848)

    clean up remaining tmp checkpoint dir
    
    Signed-off-by: woshiyyya <xiaoyunxuan1998@gmail.com>
    woshiyyya authored Feb 12, 2024 (commit c617f98)
  9. [Docs] Add language identifiers to fenced code blocks (#28955)

    Add language identifiers to code blocks
    khipp authored Feb 12, 2024 (commit fe3df9d)
  10. [Docs] Add video section (#28958)

    Add video section
    NielsRogge authored Feb 12, 2024 (commit 78ba9f4)
  11. [i18n-de] Translate CONTRIBUTING.md to German (#28954)

    * Translate contributing.md to German
    
    * Fix formatting issues in contributing.md
    
    * Address review comments
    
    * Fix capitalization
    khipp authored Feb 12, 2024 (commit d90acc1)

Commits on Feb 13, 2024

  1. [NllbTokenizer] refactor with added tokens decoder (#27717)

    * refactor with addedtokens decoder
    
    * style
    
    * get rid of lang code to id
    
    * style
    
    * keep some things for BC
    
    * update tests
    
    * add the mask token at the end of the vocab
    
    * nits
    
    * nits
    
    * fix final tests
    
    * style
    
    * nits
    
    * Update src/transformers/models/nllb/tokenization_nllb_fast.py
    
    Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
    
    * nits
    
    * style?
    
    * Update src/transformers/convert_slow_tokenizer.py
    
    * make it a tad bit more custom
    
    * ruff please stop
    
    Co-authored-by: avidale <dale.david@mail.ru>
    
    * Update
    
    Co-authored-by: avidale <dale.david@mail.ru>
    
    * Update
    
    Co-authored-by: avidale <dale.david@mail.ru>
    
    * oupts
    
    * ouft
    
    * nites
    
    * test
    
    * fix the remaining failing tests
    
    * style
    
    * fix failing test
    
    * ficx other test
    
    * temp dir + test the raw init
    
    * update test
    
    * style
    
    ---------
    
    Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
    ArthurZucker and amyeroberts authored Feb 13, 2024 (commit b445675)
  2. Add sudachi_projection option to BertJapaneseTokenizer (#28503)

    * add sudachi_projection option
    
    * Upgrade sudachipy>=0.6.8
    
    * add a test case for sudachi_projection
    
    * Compatible with older versions of SudachiPy
    
    * make fixup
    
    * make style
    
    * error message for unidic download
    
    * revert jumanpp test cases
    
    * format options for sudachi_projection
    
    Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
    
    * format options for sudachi_split_mode and sudachi_dict_type
    
    * comment
    
    * add tests for full_tokenizer kwargs
    
    * pass projection arg directly
    
    * require_sudachi_projection
    
    * make style
    
    * revert upgrade sudachipy
    
    * check is_sudachi_projection_available()
    
    * revert dependency_version_table and bugfix
    
    * style format
    
    * simply raise ImportError
    
    Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
    
    * simply raise ImportError
    
    ---------
    
    Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
    hiroshi-matsuda-rit and ArthurZucker authored Feb 13, 2024 (commit da20209)
  3. Commit 3e70a20
  4. Update configuration_llama.py: fixed broken link (#28946)

    * Update configuration_llama.py: fix broken link
    
    * [Nit] Explicit redirection not required
    
    Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
    
    ---------
    
    Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
    AdityaKane2001 and amyeroberts authored Feb 13, 2024 (commit 3de6a6b)
  5. [DETR] Update the processing to adapt masks & bboxes to reflect padding (#28363)
    
    * Update the processing so bbox coords are adjusted for padding
    
    * Just pad masks
    
    * Tidy up, add tests
    
    * Better tests
    
    * Fix yolos and mark as slow for pycocotols
    
    * Fix yolos - return_tensors
    
    * Clarify padding and normalization behaviour
    amyeroberts authored Feb 13, 2024 (commit bd4b83e)

Commits on Feb 14, 2024

  1. ENH: Do not pass warning message in case quantization_config is in config but not passed as an arg (#28988)
    
    * Update auto.py
    
    * Update auto.py
    
    * Update src/transformers/quantizers/auto.py
    
    Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
    
    * Update src/transformers/quantizers/auto.py
    
    Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
    
    ---------
    
    Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
    younesbelkada and amyeroberts authored Feb 14, 2024 (commit 1d12b8b)
  2. ENH [AutoQuantizer]: enhance trainer + not supported quant methods (#28991)
    
    * enhance trainer + not support quant methods
    
    * remove all old logic
    
    * add version
    younesbelkada authored Feb 14, 2024 (commit 164bdef)
  3. Add StableLM (#28810)

    * Add `StableLM`
    
    * fix(model): re-create from `huggingface-cli add-new-model-like persimmon`
    
    * fix: re-add changes to address comments
    
    * fix(readme): add links to paper
    
    * fix(tokenization_auto): remove `GPTNeoXTokenizerFastFast` ref
    
    * fix(tests): re-add `@slow` decorator to integration tests
    
    * fix(tests): import slow...
    
    * fix(readme_hd): remove whitespace edit
    
    * fix(tokenizer): auto tokenizer tuple
    
    * skip doctests for `modeling_stablelm`
    jon-tow authored Feb 14, 2024 (commit de6029a)
  4. Add SiglipForImageClassification and CLIPForImageClassification (#28952)

    * First draft
    
    * Add CLIPForImageClassification
    
    * Remove scripts
    
    * Fix doctests
    NielsRogge authored Feb 14, 2024 (commit 63ffd56)
  5. AQLM quantizer support (#28928)

    * aqlm init
    
    * calibration and dtypes
    
    * docs
    
    * Readme update
    
    * is_aqlm_available
    
    * Simpler link in docs
    
    * Test TODO real reference
    
    * init _import_structure fix
    
    * AqlmConfig autodoc
    
    * integration aqlm
    
    * integrations in tests
    
    * docstring fix
    
    * legacy typing
    
    * Less typings
    
    * More kernels information
    
    * Performance -> Accuracy
    
    * correct tests
    
    * removed multi-gpu test
    
    * Update docs/source/en/quantization.md
    
    Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
    
    * Update src/transformers/utils/quantization_config.py
    
    Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
    
    * Brought back multi-gpu tests
    
    * Update src/transformers/integrations/aqlm.py
    
    Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
    
    * Update tests/quantization/aqlm_integration/test_aqlm.py
    
    Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
    
    ---------
    
    Co-authored-by: Andrei Panferov <blacksamorez@yandex-team.ru>
    Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
    Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
    Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
    5 people authored Feb 14, 2024 (commit 1ecf5f7)
  6. [Doc] Fix docbuilder - make `BackboneMixin` and `BackboneConfigMixin` importable from `utils`. (#29002)
    
    * Trigger doc build
    
    * Test removing references
    
    * Importable from utils
    
    * Trigger another run on a new commit for testing
    amyeroberts authored Feb 14, 2024 (commit 7252e8d)
  7. Set the dataset format used by test_trainer to float32 (#28920)

    Co-authored-by: unit_test <test@unit.com>
    statelesshz and unit_test authored Feb 14, 2024 (commit 69ca640)
  8. Introduce AcceleratorConfig dataclass (#28664)

    * Introduce acceleratorconfig dataclass
    
    * Extra second warn
    
    * Move import
    
    * Try moving import under is_accelerate_available
    
    * Quality
    
    * Apply suggestions from code review
    
    Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
    
    * Clean
    
    * Remove to_kwargs
    
    * Change version
    
    * Improve tests by including dispatch and split batches
    
    * Improve reliability
    
    * Update tests/trainer/test_trainer.py
    
    Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
    
    * Fixup tests and review nits
    
    * Make tests pass
    
    * protect import
    
    * Protect import
    
    * Empty-Commit
    
    * Make training_args.to_dict handle the AcceleratorConfig
    
    ---------
    
    Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
    muellerzr and amyeroberts authored Feb 14, 2024 (commit 0507e69)
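A minimal sketch of the dataclass pattern this commit introduces. The field names below mirror common Accelerate options but are illustrative, not the exact transformers API; the `to_dict` method corresponds to the "Make training_args.to_dict handle the AcceleratorConfig" step above.

```python
from dataclasses import dataclass, asdict

@dataclass
class AcceleratorConfig:
    # Illustrative fields; the real config exposes Accelerate-specific options.
    split_batches: bool = False
    dispatch_batches: bool = True

    def to_dict(self):
        # Lets a surrounding TrainingArguments-style object serialize the
        # nested config as a plain dict.
        return asdict(self)

cfg = AcceleratorConfig(split_batches=True)
print(cfg.to_dict())  # {'split_batches': True, 'dispatch_batches': True}
```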
  9. Commit 354775b
  10. Mask Generation Task Guide (#28897)

    * Create mask_generation.md
    
    * add h1
    
    * add to toctree
    
    * Update docs/source/en/tasks/mask_generation.md
    
    Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
    
    * Update docs/source/en/tasks/mask_generation.md
    
    Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
    
    * Update docs/source/en/tasks/mask_generation.md
    
    Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
    
    * Update docs/source/en/tasks/mask_generation.md
    
    Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
    
    * Update docs/source/en/tasks/mask_generation.md
    
    Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
    
    * Update mask_generation.md
    
    * Update docs/source/en/tasks/mask_generation.md
    
    Co-authored-by: Maria Khalusova <kafooster@gmail.com>
    
    * Update docs/source/en/tasks/mask_generation.md
    
    Co-authored-by: Maria Khalusova <kafooster@gmail.com>
    
    * Update docs/source/en/tasks/mask_generation.md
    
    Co-authored-by: Maria Khalusova <kafooster@gmail.com>
    
    * Update docs/source/en/tasks/mask_generation.md
    
    Co-authored-by: Maria Khalusova <kafooster@gmail.com>
    
    * Update docs/source/en/tasks/mask_generation.md
    
    Co-authored-by: Maria Khalusova <kafooster@gmail.com>
    
    * Update docs/source/en/tasks/mask_generation.md
    
    Co-authored-by: Maria Khalusova <kafooster@gmail.com>
    
    * Update docs/source/en/tasks/mask_generation.md
    
    Co-authored-by: Maria Khalusova <kafooster@gmail.com>
    
    * Update docs/source/en/tasks/mask_generation.md
    
    Co-authored-by: Maria Khalusova <kafooster@gmail.com>
    
    * Update docs/source/en/tasks/mask_generation.md
    
    Co-authored-by: Maria Khalusova <kafooster@gmail.com>
    
    * Update docs/source/en/tasks/mask_generation.md
    
    Co-authored-by: Maria Khalusova <kafooster@gmail.com>
    
    * Update mask_generation.md
    
    * Update docs/source/en/tasks/mask_generation.md
    
    Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
    
    * Update docs/source/en/tasks/mask_generation.md
    
    Co-authored-by: Klaus Hipp <khipp@users.noreply.github.com>
    
    * Update docs/source/en/tasks/mask_generation.md
    
    Co-authored-by: Klaus Hipp <khipp@users.noreply.github.com>
    
    * Update docs/source/en/tasks/mask_generation.md
    
    Co-authored-by: Klaus Hipp <khipp@users.noreply.github.com>
    
    * Update docs/source/en/tasks/mask_generation.md
    
    Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
    
    * Update docs/source/en/tasks/mask_generation.md
    
    * Update mask_generation.md
    
    * Update mask_generation.md
    
    ---------
    
    Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
    Co-authored-by: Maria Khalusova <kafooster@gmail.com>
    Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
    Co-authored-by: Klaus Hipp <khipp@users.noreply.github.com>
    5 people authored Feb 14, 2024 (commit 3f4e79d)
  11. Add tie_weights() to LM heads and set bias in set_output_embeddings() (#28948)
    
    * Add tie_weights() to LM heads and set bias in set_output_embeddings()
    
    The bias were not tied correctly in some LM heads, and this change should fix that.
    
    * Moving test_save_and_load_low_cpu_mem_usage to ModelTesterMixin
    
    * Adding _tie_weights() to MPNet and Vilt
    
    * Skip test for low cpu mem usage for Deta/DeformableDetr since they cannot init on meta device
    
    * Rename to test name to save_load to match the convention
    hackyon authored Feb 14, 2024 (commit 725f4ad)
  12. Backbone kwargs in config (#28784)

    * Enable instantiating model with pretrained backbone weights
    
    * Clarify pretrained import
    
    * Use load_backbone instead
    
    * Add backbone_kwargs to config
    
    * Pass kwargs to constructors
    
    * Fix up
    
    * Input verification
    
    * Add tests
    
    * Tidy up
    
    * Update tests/utils/test_backbone_utils.py
    
    Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
    
    ---------
    
    Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
    amyeroberts and ArthurZucker authored Feb 14, 2024 (commit 0199a48)
  13. [TPU] Support PyTorch/XLA FSDP via SPMD (#28949)

    * Initial commit
    
    * Add guards for the global mesh
    
    * Address more comments
    
    * Move the dataloader into integrations/tpu.py
    
    * Fix linters
    
    * Make karg more explicitly
    
    * Remove the move device logic
    
    * Fix the CI
    
    * Fix linters
    
    * Re-enable checkpointing
    alanwaketan authored Feb 14, 2024 (commit 5f06053)
  14. FIX [Trainer / tags]: Fix trainer + tags when users do not pass `"tags"` to `trainer.push_to_hub()` (#29009)
    
    * fix trainer tags
    
    * add test
    younesbelkada authored Feb 14, 2024 (commit 7a0fccc)
  15. [Cleanup] Revert SDPA attention changes that got in the static kv cache PR (#29027)
    
    * revert unrelated changes that got in
    
    * style
    ArthurZucker authored Feb 14, 2024 (commit 609a176)

Commits on Feb 15, 2024

  1. Fix static generation when compiling! (#28937)

    * wow I was scared!
    
    * fix everything
    
    * nits
    
    * make it BC?
    
    * add todo
    
    * nits
    
    * is_tracing should still be used to pass tracing tests
    
    * nits
    
    * some nits to make sure generation works with static cache uncompiled
    
    * fix sdpa
    
    * fix FA2 for both static and dynamic in a better way?
    
    * style
    
    * fix-copies
    
    * fix fix copies
    
    * fix sequential beam search
    
    * style
    
    * use `keys_to_ignore`
    
    * nit
    
    * correct dtype inference when init
    
    * :( the fix for FA2 is still not optimal to investigate!
    
    * styling
    
    * nits
    
    * nit
    
    * this might work better
    
    * add comment
    
    * Update src/transformers/models/llama/modeling_llama.py
    
    * "position_ids" -> "cache_position"
    
    * style
    
    * nit
    
    * Remove changes that should not be propagated just yet
    
    * Apply suggestions from code review
    
    * Styling
    
    * make sure we raise an error for static cache with FA2 enabled
    
    * move  to the bottom of the signature
    
    * style
    
    * Update src/transformers/models/llama/modeling_llama.py
    
    Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
    
    * Update src/transformers/models/llama/modeling_llama.py
    
    * nit in the name
    
    ---------
    
    Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
    ArthurZucker and younesbelkada authored Feb 15, 2024 (commit f3788b0)
  2. Add cuda_custom_kernel in DETA (#28989)

    * enable gradient checkpointing in DetaObjectDetection
    
    * fix missing part in original DETA
    
    * make style
    
    * make fix-copies
    
    * Revert "make fix-copies"
    
    This reverts commit 4041c86.
    
    * remove fix-copies of DetaDecoder
    
    * enable swin gradient checkpointing
    
    * fix gradient checkpointing in donut_swin
    
    * add tests for deta/swin/donut
    
    * Revert "fix gradient checkpointing in donut_swin"
    
    This reverts commit 1cf345e.
    
    * change supports_gradient_checkpointing pipeline to PreTrainedModel
    
    * Revert "add tests for deta/swin/donut"
    
    This reverts commit 6056ffb.
    
    * Revert "Revert "fix gradient checkpointing in donut_swin""
    
    This reverts commit 24e25d0.
    
    * Simple revert
    
    * enable deformable detr gradient checkpointing
    
    * add gradient in encoder
    
    * add cuda_custom_kernel function in MSDA
    
    * make style and fix input of DetaMSDA
    
    * make fix-copies
    
    * remove n_levels in input of DetaMSDA
    
    * minor changes
    
    * refactor custom_cuda_kernel like yoso format
    https://github.com/huggingface/transformers/blob/0507e69d34f8902422eb4977ec066dd6bef179a0/src/transformers/models/yoso/modeling_yoso.py#L53
    SangbumChoi authored Feb 15, 2024 (commit 83e96dc)
  3. DeformableDetrModel support fp16 (#29013)

    * Update ms_deform_attn_cuda.cu
    
    * Update ms_deform_attn_cuda.cuh
    
    * Update modeling_deformable_detr.py
    
    * Update src/transformers/models/deformable_detr/modeling_deformable_detr.py
    
    Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
    
    * Update modeling_deformable_detr.py
    
    * python utils/check_copies.py --fix_and_overwrite
    
    * Fix dtype mismatch error
    
    * Update test_modeling_deformable_detr.py
    
    * Update test_modeling_deformable_detr.py
    
    * Update modeling_deformable_detr.py
    
    * Update modeling_deformable_detr.py
    
    ---------
    
    Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
    DonggeunYu and amyeroberts authored Feb 15, 2024 (commit 5b6fa23)
  4. Commit 8a0ed0a
  5. Commit 6d1f545
  6. Patch to skip failing test_save_load_low_cpu_mem_usage tests (#29043)

    * Patch to skip currently failing tests
    
    * Whoops - wrong place
    amyeroberts authored Feb 15, 2024 (commit 4156f51)
  7. Removed obsolete attribute setting for AQLM quantization. (#29034)

    removed redundant field
    Andrei Panferov authored Feb 15, 2024 (commit b0a7f44)
  8. Fix a tiny typo in `generation/utils.py::GenerateEncoderDecoderOutput`'s docstring (#29044)
    
    Update utils.py
    sadra-barikbin authored Feb 15, 2024 (commit f3aa7db)

Commits on Feb 16, 2024

  1. Commit 1e402b9
  2. Update all references to canonical models (#29001)

    * Script & Manual edition
    
    * Update
    LysandreJik authored Feb 16, 2024 (commit f497f56)
  3. Commit 8876ce8
  4. Fix max_length criteria when using inputs_embeds (#28994)

    * fix max_length for inputs_embeds
    
    * make style
    
    * Update src/transformers/generation/utils.py
    
    Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
    
    * Static Cache: load models with MQA or GQA (#28975)
    
    * fix
    
    * fix tests
    
    * fix tests
    
    * Update src/transformers/generation/utils.py
    
    Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
    
    * more fixes
    
    * make style
    
    ---------
    
    Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
    Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
    3 people authored Feb 16, 2024 (commit aee11fe)
  5. Support: Leverage Accelerate for object detection/segmentation models (#28312)
    
    * made changes for object detection models
    
    * added support for segmentation models.
    
    * Made changes for segmentaion models
    
    * Changed import statements
    
    * solving conflicts
    
    * removed conflicts
    
    * Resolving commits
    
    * Removed conflicts
    
    * Fix : Pixel_mask_value set to False
    Tanmaypatil123 authored Feb 16, 2024 (commit 0eb4085)
  6. fix num_assistant_tokens with heuristic schedule (#28759)

    * fix heuristic num_assistant_tokens_schedule
    
    * Update src/transformers/generation/configuration_utils.py
    
    Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
    
    * Update src/transformers/generation/candidate_generator.py
    
    Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
    
    * Update utils.py
    
    check that candidate_generator.assistant_model exists since some speculations (like ngram and PLD) don't have assistant_model attribute
    
    * Update src/transformers/generation/candidate_generator.py
    
    Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
    
    * Update tests/generation/test_utils.py
    
    Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
    
    * make fixup
    
    * merge conflict
    
    * fix docstring
    
    * make fixup
    
    ---------
    
    Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
    Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
    Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
    4 people authored Feb 16, 2024 (commit 258da40)
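The "heuristic" schedule being fixed here can be sketched as below. The constants are illustrative of the grow-on-success / shrink-on-mismatch idea in assisted generation, not the exact implementation: when every speculated token is accepted, the assistant is allowed to speculate further ahead; otherwise the candidate count backs off, never below one.

```python
def update_num_assistant_tokens(current: float, num_matches: int, num_candidates: int) -> float:
    # All candidate tokens accepted: the assistant is doing well, so let it
    # speculate further ahead on the next round.
    if num_matches == num_candidates:
        return current + 2.0
    # Otherwise back off, but always keep at least one candidate token.
    return max(1.0, current - 1.0)

n = 5.0
n = update_num_assistant_tokens(n, num_matches=5, num_candidates=5)  # grows to 7.0
n = update_num_assistant_tokens(n, num_matches=3, num_candidates=7)  # shrinks to 6.0
```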
  7. Commit b262808
  8. auto_find_batch_size isn't yet supported with DeepSpeed/FSDP. Raise error accordingly. (#29058)
    
    Update trainer.py
    pacman100 authored Feb 16, 2024 (commit 4c18ddb)
  9. Honor trust_remote_code for custom tokenizers (#28854)

    * pass through trust_remote_code for dynamically loading unregistered tokenizers specified by config
    add test
    
    * change directories back to previous directory after test
    
    * fix ruff check
    
    * Add a note to that block for future in case we want to remove it later
    
    ---------
    
    Co-authored-by: Matt <rocketknight1@gmail.com>
    rl337 and Rocketknight1 authored Feb 16, 2024 (commit be42c24)
  10. Feature: Option to set the tracking URI for MLflowCallback. (#29032)

    * Added option to set tracking URI for MLflowCallback.
    
    * Added option to set tracking URI for MLflowCallback.
    
    * Changed  to  in docstring.
    seanswyi authored Feb 16, 2024 (commit 161fe42)
  11. Fix trainer test wrt DeepSpeed + auto_find_bs (#29061)

    * FIx trainer test
    
    * Update tests/trainer/test_trainer.py
    
    Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
    
    ---------
    
    Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
    muellerzr and amyeroberts authored Feb 16, 2024 (commit 636b032)
  12. Add chat support to text generation pipeline (#28945)

    * Add chat support to text generation pipeline
    
    * Better handling of single elements
    
    * Deprecate ConversationalPipeline
    
    * stash commit
    
    * Add missing add_special_tokens kwarg
    
    * Update chat templating docs to refer to TextGenerationPipeline instead of ConversationalPipeline
    
    * Add ✨TF✨ tests
    
    * @require_tf
    
    * Add type hint
    
    * Add specific deprecation version
    
    * Remove unnecessary do_sample
    
    * Remove todo - the discrepancy has been resolved
    
    * Update src/transformers/tokenization_utils_base.py
    
    Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
    
    * Update src/transformers/pipelines/text_generation.py
    
    Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
    
    ---------
    
    Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
    Rocketknight1 and amyeroberts authored Feb 16, 2024
    Commit 2f1003b
  13. [Docs] Spanish translation of task_summary.md (#28844)

    * Add task_summary to es/_toctree.yml
    
    * Add task_summary.md to docs/es
    
    * Change title of task_summary.md
    
    * Translate firsts paragraphs
    
    * Translate middle paragraphs
    
    * Translte the rest of the doc
    
    * Edit firts paragraph
    aaronjimv authored Feb 16, 2024
    Commit ce4fff0

Commits on Feb 19, 2024

  1. [Awq] Add peft support for AWQ (#28987)

    * add peft support for AWQ
    
    * Update src/transformers/quantizers/quantizer_awq.py
    
    Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
    
    * fix
    
    ---------
    
    Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
    younesbelkada and amyeroberts authored Feb 19, 2024
    Commit 864c8e6
  2. FIX [bnb / tests]: Fix currently failing bnb tests (#29092)

    Update test_mixed_int8.py
    younesbelkada authored Feb 19, 2024
    Commit a75a6c9
  3. fix the post-processing link (#29091)

    The link in evaluation was missing a hyphen between post and processing. I fixed this, for English only. Someone with the ability to do a global search/replace should fix the other languages (if indeed they have this issue).
    davies-w authored Feb 19, 2024
    Commit 593230f
  4. Commit 9830858
  5. Commit 79132d4
  6. change version (#29097)

    * change version
    
    * nuke
    
    * this doesn't make sense
    
    * update some requirements.py
    
    * revert + no main
    
    * nits
    
    * change cache number
    
    * more pin
    
    * revert
    
    ---------
    
    Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
    ArthurZucker and ydshieh authored Feb 19, 2024
    Commit b2724d7
  7. [Docs] Add resources (#28705)

    * Add resource
    
    * Add more resources
    
    * Add resources
    
    * Apply suggestions from code review
    
    Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
    
    * Remove mention
    
    * Remove pipeline tags
    
    ---------
    
    Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
    NielsRogge and amyeroberts authored Feb 19, 2024
    Commit 07e3454
  8. ENH: added new output_logits option to generate function (#28667)

    output_logits option behaves like output_scores, but returns the raw, unprocessed prediction logit scores,
    ie. the values before they undergo logit processing and/or warping. The latter happens by default for the
    regular output scores.
    
    It's useful to have the unprocessed logit scores in certain circumstances. For example, unprocessed logit scores
    are very useful with causallm models when one wants to determine the probability of a certain answer, e.g.
    when asking a question with a yes/no answer. In that case getting the next-token probabilities of both "yes" and
    "no" (and/or their relative ratio) is of interest for classification. The reason for getting these _before_ logit
    processing and/or warping is b/c a) that can change the probabilities or b) reject the tokens of interest / reduce
    the number of tokens to just 1.
    
    For an example use-case see paper TabLLM: Few-shot Classification of Tabular Data with Large Language Models
    by Stefan Hegselmann, Alejandro Buendia, Hunter Lang, Monica Agrawal, Xiaoyi Jiang, and David Sontag.
    https://arxiv.org/abs/2210.10723
    
    In addition:
    - added dedicated unit test: tests/generation/test_utils/test_return_unprocessed_logit_scores
      which tests return of logics with output_logits=True in generation.
    - set output_logits=True in all other generation unit tests, that also have output_scores=True.
    
    Implemented @gante's and @amyeroberts review feedback
    
    Co-authored-by: kx79wq <max.baak@ing.com>
    mbaak and kx79wq authored Feb 19, 2024
    Commit 08cd694
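The commit message above explains that warpers and processors can change or eliminate the probabilities one wants for yes/no classification. A minimal illustration of that point (toy logits and a simulated top-1 warper — not transformers internals):

```python
import torch

# Illustrative sketch: why unprocessed logits matter. A warper such as top-k
# filtering can drive the probability of a token of interest ("no" here) to
# zero, so yes/no classification needs the raw, pre-warping logits.
raw_logits = torch.tensor([4.0, 3.5, 1.0, 0.5])  # toy vocab: yes, no, maybe, ok
yes_id, no_id = 0, 1

raw_probs = torch.softmax(raw_logits, dim=-1)     # usable for yes/no ratios

# Simulate top-1 filtering: every logit except the argmax becomes -inf.
processed = torch.full_like(raw_logits, float("-inf"))
processed[raw_logits.argmax()] = raw_logits.max()
proc_probs = torch.softmax(processed, dim=-1)

print(raw_probs[no_id].item() > 0.0)    # True: "no" keeps probability mass
print(proc_probs[no_id].item() == 0.0)  # True: warping erased it
```

With `output_logits=True`, `generate` returns the pre-processing values, so the ratio `raw_probs[yes_id] / raw_probs[no_id]` stays meaningful.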
  9. Bnb test fix for different hardwares (#29066)

    * generated text on A10G
    
    * generated text in CI
    
    * Apply suggestions from code review
    
    add explanatory comments
    
    Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
    
    ---------
    
    Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
    Titus-von-Koeller and younesbelkada authored Feb 19, 2024
    Commit 5ce90f3
  10. Fix two tiny typos in `pipelines/base.py::Pipeline::_sanitize_parameters()`'s docstring (#29102)
    
    * Update base.py
    
    * Fix a typo
    sadra-barikbin authored Feb 19, 2024
    Commit a4851d9
  11. storing & logging gradient norm in trainer (#27326)

    * report grad_norm during training
    
    * support getting grad_norm from deepspeed
    shijie-wu authored Feb 19, 2024
    Commit 4f09d0f
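A sketch of the quantity this commit logs — the total gradient norm, as reported by `torch.nn.utils.clip_grad_norm_` (which returns the norm even when `max_norm` is too large to clip anything); values here are illustrative:

```python
import torch

# Hedged sketch: compute the total gradient norm the way a trainer would
# report it. Gradient of sum(p * [3, 4, 0]) w.r.t. p is [3, 4, 0].
p = torch.nn.Parameter(torch.zeros(3))
(p * torch.tensor([3.0, 4.0, 0.0])).sum().backward()

# clip_grad_norm_ returns the pre-clipping total norm; the huge max_norm
# makes this a pure measurement, no clipping happens.
grad_norm = torch.nn.utils.clip_grad_norm_([p], max_norm=1e9)
print(round(float(grad_norm), 4))  # 5.0 = sqrt(3**2 + 4**2)
```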

Commits on Feb 20, 2024

  1. Fixed nll with label_smoothing to just nll (#28708)

    * Fixed nll with label_smoothing to nll
    
    * Resolved conflict by rebase
    
    * Fixed nll with label_smoothing to nll
    
    * Resolved conflict by rebase
    
    * Added label_smoothing to config file
    
    * Fixed nits
    nileshkokane01 authored Feb 20, 2024
    Commit 49c0b29
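A minimal sketch of the behavioral difference this fix targets — plain NLL versus label-smoothed cross-entropy (the logits and smoothing value are illustrative, not taken from the patch):

```python
import torch
import torch.nn.functional as F

# Toy example: a confident, correct prediction for class 0.
logits = torch.tensor([[2.0, 0.5, -1.0]])
target = torch.tensor([0])

nll = F.cross_entropy(logits, target)                            # no smoothing
smoothed = F.cross_entropy(logits, target, label_smoothing=0.1)  # smoothed

# Label smoothing mixes mass onto the wrong classes, so for a confident
# correct prediction the smoothed loss exceeds the plain NLL.
print(smoothed.item() > nll.item())  # True
```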
  2. [gradient_checkpointing] default to use it for torch 2.3 (#28538)

    * default to use it
    
    * style
    ArthurZucker authored Feb 20, 2024
    Commit 9094abe
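For context, a hedged sketch of non-reentrant activation checkpointing (`use_reentrant=False`), the variant recent torch versions recommend; activations inside the checkpointed block are recomputed during backward instead of being stored:

```python
import torch
from torch.utils.checkpoint import checkpoint

# A stand-in for a transformer layer: any function of its inputs works.
def block(x):
    return torch.relu(x @ x.t())

x = torch.randn(4, 4, requires_grad=True)
out = checkpoint(block, x, use_reentrant=False)  # activations not kept
out.sum().backward()                             # block re-runs here
print(x.grad is not None)  # True: gradients flow through the checkpoint
```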
  3. Move misplaced line (#29117)

    Move misplaced line, improve code comment
    kno10 authored Feb 20, 2024
    Commit a7ff2f2
  4. FEAT [Trainer / bnb]: Add RMSProp from bitsandbytes to HF `Trainer` (#29082)
    
    * add RMSProp to Trainer
    
    * revert some change
    
    * Update src/transformers/trainer.py
    
    Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
    
    ---------
    
    Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
    younesbelkada and amyeroberts authored Feb 20, 2024
    Commit f7ef7ce
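The bitsandbytes RMSProp variants mirror the standard torch optimizer API, so the `Trainer` hookup amounts to an optimizer swap. A sketch of the step shape, shown with `torch.optim.RMSprop` since bitsandbytes may not be installed:

```python
import torch

# Toy parameter with a simple quadratic loss; gradient is 2 * param.
param = torch.nn.Parameter(torch.tensor([1.0, -2.0]))
opt = torch.optim.RMSprop([param], lr=0.01)

loss = (param ** 2).sum()
loss.backward()

before = param.detach().clone()
opt.step()  # RMSProp scales the update by a running RMS of gradients
print(not torch.equal(before, param.detach()))  # True: weights were updated
```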
  5. Abstract image processor arg checks. (#28843)

    * abstract image processor arg checks.
    
    * fix signatures and quality
    
    * add validate_ method to rescale-prone processors
    
    * add more validations
    
    * quality
    
    * quality
    
    * fix formatting
    
    Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
    
    * fix formatting
    
    Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
    
    * fix formatting
    
    Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
    
    * Fix formatting mishap
    
    Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
    
    * fix crop_size compatibility
    
    * fix default mutable arg
    
    * fix segmentation map + image arg validity
    
    * remove segmentation check from arg validation
    
    * fix quality
    
    * fix missing segmap
    
    * protect PILImageResampling type
    
    * Apply suggestions from code review
    
    Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
    
    * add back segmentation maps check
    
    ---------
    
    Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
    molbap and amyeroberts authored Feb 20, 2024
    Commit 1c9134f
  6. FIX [bnb / tests] Propagate the changes from #29092 to 4-bit tests (#29122)
    
    * forgot to push the changes for 4bit ..
    
    * trigger CI
    younesbelkada authored Feb 20, 2024
    Commit ff76e7c
  7. Commit 7d312ad
  8. Commit a7755d2
  9. [cuda kernels] only compile them when initializing (#29133)

    * only compile when needed
    
    * fix mra as well
    
    * fix yoso as well
    
    * update
    
    * rempve comment
    
    * Update src/transformers/models/deformable_detr/modeling_deformable_detr.py
    
    * Update src/transformers/models/deformable_detr/modeling_deformable_detr.py
    
    * opps
    
    * Update src/transformers/models/deta/modeling_deta.py
    
    * nit
    ArthurZucker authored Feb 20, 2024
    Commit 5e95dca
  10. FIX [PEFT / Trainer] Handle better peft + quantized compiled models (#29055)
    
    * handle peft + compiled models
    
    * add tests
    
    * fixup
    
    * adapt from suggestions
    
    * clarify comment
    younesbelkada authored Feb 20, 2024
    Commit efdd436
  11. [Core tokenization] `add_dummy_prefix_space` option to help with latest issues (#28010)
    
    * add add_dummy_prefix_space option to slow
    
    * checking kwargs might be better. Should be there for all spm tokenizer IMO
    
    * nits
    
    * fix copies
    
    * more copied
    
    * nits
    
    * add prefix space
    
    * nit
    
    * nits
    
    * Update src/transformers/convert_slow_tokenizer.py
    
    * fix inti
    
    * revert wrong styling
    
    * fix
    
    * nits
    
    * style
    
    * updates
    
    * make sure we use slow tokenizer for conversion instead of looking for the decoder
    
    * support llama ast well
    
    * update llama tokenizer fast
    
    * nits
    
    * nits nits nits
    
    * update the doc
    
    * update
    
    * update to fix tests
    
    * skip unrelated tailing test
    
    * Update src/transformers/convert_slow_tokenizer.py
    
    * add proper testing
    
    * test decode as well
    
    * more testing
    
    * format
    
    * fix llama test
    
    * Apply suggestions from code review
    ArthurZucker authored Feb 20, 2024
    Commit 15cfe38
  12. Revert low cpu mem tie weights (#29135)

    * Revert "Add tie_weights() to LM heads and set bias in set_output_embeddings() (#28948)"
    
    This reverts commit 725f4ad.
    
    * Revert "Patch to skip failing `test_save_load_low_cpu_mem_usage` tests (#29043)"
    
    This reverts commit 4156f51.
    amyeroberts authored Feb 20, 2024
    Commit 0996a10
  13. Add support for fine-tuning CLIP-like models using contrastive-image-text example (#29070)
    
    * add support for siglip and chinese-clip model training with contrastive-image-text example
    
    * codebase fixups
    tjs-intel authored Feb 20, 2024
    Commit ee3af60
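For context, a hedged sketch of the symmetric contrastive loss at the heart of CLIP-style image-text training (SigLIP's own loss is sigmoid-based; this shows the softmax variant). Embeddings and temperature are toy values, not from the example script:

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
img = F.normalize(torch.randn(4, 8), dim=-1)   # stand-in image embeddings
txt = F.normalize(torch.randn(4, 8), dim=-1)   # stand-in text embeddings

logits = img @ txt.t() / 0.07                  # pairwise similarity / temperature
labels = torch.arange(4)                       # matched pairs on the diagonal

# Cross-entropy in both directions: image-to-text and text-to-image.
loss = (F.cross_entropy(logits, labels) +
        F.cross_entropy(logits.t(), labels)) / 2
print(loss.item() > 0.0)  # True: untrained embeddings give positive loss
```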
  14. Save (circleci) cache at the end of a job (#29141)

    nice job
    
    Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
    ydshieh and ydshieh authored Feb 20, 2024
    Commit 7688d8d
  15. Commit b8b1647