
Update training #4

Merged: 76 commits, Apr 2, 2024

Commits on Mar 22, 2024

  1. [quality] update quality check to make sure we check imports 😈 (huggingface#29771)
    
    * update quality check
    
    * make it nice
    
    * update
    
    * let's make sure it runs and we have the logs actually
    
    * update workflow
    
    * nits
    ArthurZucker authored Mar 22, 2024
    e68ff30
  2. Fix type hint for train_dataset param of Trainer.__init__() to allow IterableDataset. Issue 29678 (huggingface#29738)
    
    * Fixed typehint for train_dataset param in Trainer.__init__().  Added IterableDataset option.
    
    * make fixup
    stevemadere authored Mar 22, 2024
    3479161
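    A minimal sketch of what the widened hint covers: handing a torch IterableDataset straight to Trainer. The checkpoint and synthetic stream below are illustrative only.

    ```python
    import torch
    from torch.utils.data import IterableDataset
    from transformers import AutoModelForSequenceClassification, Trainer, TrainingArguments

    class StreamingExamples(IterableDataset):
        # Yields already-tokenized examples one at a time (e.g. streamed from disk).
        def __iter__(self):
            for _ in range(100):
                yield {"input_ids": torch.randint(0, 30522, (16,)),
                       "attention_mask": torch.ones(16, dtype=torch.long),
                       "labels": torch.tensor(0)}

    model = AutoModelForSequenceClassification.from_pretrained("bert-base-uncased")
    args = TrainingArguments(output_dir="out", max_steps=20)  # max_steps is needed: the stream has no len()
    trainer = Trainer(model=model, args=args, train_dataset=StreamingExamples())  # now typed correctly
    ```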
  3. Enable AMD docker build CI (huggingface#29803)

    * enable amd ci
    
    * remove unnecessary clean up
    IlyasMoutawwakil authored Mar 22, 2024
    aa17cf9
  4. Correct llava mask & fix missing setter for vocab_size (huggingface#29389)
    
    * correct llava mask
    
    * fix vipllava as well
    
    * mask out embedding for padding tokens
    
    * add test
    
    * fix style
    
    * add setter
    
    * fix test on suggestion
    fxmarty authored Mar 22, 2024
    13b2370
  5. rm input dtype change in CPU (huggingface#28631)

    * rm input dtype change in CPU
    
    * add warning when use CPU low-precision
    
    * rm useless logging
    jiqing-feng authored Mar 22, 2024
    e85654f
  6. 34e07f4
  7. replaced concatenation to f-strings to improve readability and unify with the rest of the code (huggingface#29785)
    
    replaced concatenation to f-strings to improve readability and unify with the rest of the code
    igeni authored Mar 22, 2024
    884b221
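    For context, this is the kind of rewrite the commit applies (a generic example, not the actual diff):

    ```python
    name, count = "llama", 3

    # before: string concatenation
    msg = "Loaded " + str(count) + " checkpoints for " + name + "."

    # after: an f-string, matching the style used elsewhere in the codebase
    msg = f"Loaded {count} checkpoints for {name}."
    ```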
  8. 2e7cb46
  9. Complete security policy with mentions of remote code (huggingface#29707)
    
    * Security policy
    
    * Apply suggestions from code review
    
    Co-authored-by: Luc Georges <McPatate@users.noreply.github.com>
    Co-authored-by: Michelle Habonneau <83347449+Michellehbn@users.noreply.github.com>
    
    * Update SECURITY.md
    
    Co-authored-by: Diogo Teles Sant'Anna <diogoteles@google.com>
    
    ---------
    
    Co-authored-by: Luc Georges <McPatate@users.noreply.github.com>
    Co-authored-by: Michelle Habonneau <83347449+Michellehbn@users.noreply.github.com>
    Co-authored-by: Diogo Teles Sant'Anna <diogoteles@google.com>
    4 people authored Mar 22, 2024
    7e1413d
  10. [SuperPoint] Fix doc example (huggingface#29816)

    [SuperPoint] Fix doc example
    amyeroberts authored Mar 22, 2024
    c5f0288

Commits on Mar 23, 2024

  1. [DOCS] Fix typo for llava next docs (huggingface#29829)

    Fix typo for llava next docs
    aliencaocao authored Mar 23, 2024
    dafe370

Commits on Mar 24, 2024

  1. model_summary.md - Restore link to Harvard's Annotated Transformer. (huggingface#29702)
    
    * model_summary.md - Add link to Harvard's Annotated Transformer.
    
    * model_summary.md - slight wording change + capitalize name of the paper
    
    * model_summary.md - moves the Annotated Transformer link in a parenthesis next to the link to the original paper (great idea, stevhliu!)
    
    * model_summary.md - moves the Annotated Transformer link in a parenthesis next to the link to the original paper (commit pt. 2, accidentally removed "has" in pt. 1)
    gamepad-coder authored Mar 24, 2024
    76a33a1

Commits on Mar 25, 2024

  1. Remove static pretrained maps from the library's internals (huggingface#29112)
    
    * [test_all] Remove static pretrained maps from the library's internals
    
    * Deprecate archive maps instead of removing them
    
    * Revert init changes
    
    * [test_all] Deprecate instead of removing
    
    * [test_all] PVT v2 support
    
    * [test_all] Tests should all pass
    
    * [test_all] Style
    
    * Address review comments
    
    * Update src/transformers/models/deprecated/_archive_maps.py
    
    Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
    
    * Update src/transformers/models/deprecated/_archive_maps.py
    
    Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
    
    * [test_all] trigger tests
    
    * [test_all] LLAVA
    
    * [test_all] Bad rebase
    
    ---------
    
    Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
    LysandreJik and ArthurZucker authored Mar 25, 2024
    39114c0
  2. Fix the behavior of collecting 'num_input_tokens_seen' (huggingface#29099)
    
    fix the behavior of collecting 'num_input_tokens_seen'
    
    See huggingface#28791 for more details.
    youliangh authored Mar 25, 2024
    afe73ae
  3. Populate torch_dtype from model to pipeline (huggingface#28940)

    * Populate torch_dtype from model to pipeline
    
    Signed-off-by: B-Step62 <yuki.watanabe@databricks.com>
    
    * use property
    
    Signed-off-by: B-Step62 <yuki.watanabe@databricks.com>
    
    * lint
    
    Signed-off-by: B-Step62 <yuki.watanabe@databricks.com>
    
    * Remove default handling
    
    Signed-off-by: B-Step62 <yuki.watanabe@databricks.com>
    
    ---------
    
    Signed-off-by: B-Step62 <yuki.watanabe@databricks.com>
    B-Step62 authored Mar 25, 2024
    8e9a220
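    Rough illustration of the behavior this adds, assuming the `torch_dtype` property introduced here; the checkpoint is only an example.

    ```python
    import torch
    from transformers import AutoModelForCausalLM, pipeline

    # Load a model in half precision, then hand it to a pipeline.
    model = AutoModelForCausalLM.from_pretrained("gpt2", torch_dtype=torch.float16)
    pipe = pipeline("text-generation", model=model, tokenizer="gpt2")

    # The pipeline now reports the dtype it inherited from the model.
    print(pipe.torch_dtype)  # expected: torch.float16
    ```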
  4. fix 😭

    ArthurZucker committed Mar 25, 2024
    00a09ed
  5. e3e16dd
  6. remove quotes in code example (huggingface#29812)

    Co-authored-by: Johannes <johannes.kolbe@tech.better.team>
    johko and Johannes authored Mar 25, 2024
    7eb3ba8

Commits on Mar 26, 2024

  1. Add warnings if training args differ from checkpoint trainer state (huggingface#29255)
    
    * add warnings if training args differ from checkpoint args stored in trainer_state.json
    
    * run formatting and styling
    
    * add a test
    
    * format and styling
    
    ---------
    
    Co-authored-by: Jonathan Flynn <jonl.flynn@guardian.co.uk>
    jonflynng and Jonathan Flynn authored Mar 26, 2024
    b5a6d6e
  2. Replace 'decord' with 'av' in VideoClassificationPipeline (huggingface#29747)
    
    * replace the 'decord' with 'av' in VideoClassificationPipeline
    
    * fix the check of backend in VideoClassificationPipeline
    
    * adjust the order of imports
    
    * format 'video_classification.py'
    
    * format 'video_classification.py' with ruff
    
    ---------
    
    Co-authored-by: wanqiancheng <13541261013@163.com>
    Tyx-main and wanqiancheng authored Mar 26, 2024
    b32bf85
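    Caller-side usage is unchanged; only the decoding backend moves from decord to PyAV. A hedged sketch (the checkpoint and file path are placeholders, and `av` must be installed):

    ```python
    # pip install av   (decord is no longer required by this pipeline)
    from transformers import pipeline

    video_classifier = pipeline("video-classification",
                                model="MCG-NJU/videomae-base-finetuned-kinetics")
    predictions = video_classifier("path/to/clip.mp4")  # placeholder path
    print(predictions[:3])
    ```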
  3. Fix header in IFE task guide (huggingface#29859)

    Update image_feature_extraction.md
    merveenoyan authored Mar 26, 2024
    de81a67
  4. b9ceb03
  5. Allow bos_token_id to be None during generation with `inputs_embeds` (huggingface#29772)
    
    * update
    
    * add ut
    
    * update
    LZHgrla authored Mar 26, 2024
    998b5bb
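    The code path this touches, roughly: generating from `inputs_embeds` on a model whose config may have no `bos_token_id`. Sketch only; the checkpoint is a placeholder.

    ```python
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tok = AutoTokenizer.from_pretrained("gpt2")            # placeholder checkpoint
    model = AutoModelForCausalLM.from_pretrained("gpt2")

    input_ids = tok("Hello", return_tensors="pt").input_ids
    inputs_embeds = model.get_input_embeddings()(input_ids)

    # With this fix, configs whose bos_token_id is None can also take this path.
    out = model.generate(inputs_embeds=inputs_embeds, max_new_tokens=10)
    print(tok.decode(out[0], skip_special_tokens=True))
    ```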
  6. Add cosine_with_min_lr scheduler in Trainer (huggingface#29341)

    * Add cosine_with_min_lr scheduler
    
    * Update error message for missing min_lr or min_lr_rate
    liuyanyi authored Mar 26, 2024
    ef60995
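    A hedged configuration sketch: selecting the new scheduler via TrainingArguments and passing its floor through `lr_scheduler_kwargs` (one of `min_lr` or `min_lr_rate` is required, per the error message added here).

    ```python
    from transformers import TrainingArguments

    args = TrainingArguments(
        output_dir="out",
        learning_rate=5e-5,
        lr_scheduler_type="cosine_with_min_lr",
        lr_scheduler_kwargs={"min_lr_rate": 0.1},   # or {"min_lr": 5e-6}; min_lr_rate is a fraction of learning_rate
    )
    ```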
  7. Disable AMD memory benchmarks (huggingface#29871)

    * remove py3nvml to skip amd memory benchmarks
    
    * uninstall pynvml from docker images
    IlyasMoutawwakil authored Mar 26, 2024
    07d7952
  8. f01e160

Commits on Mar 27, 2024

  1. Support num_attention_heads != num_key_value_heads in Flax Llama Implementation (huggingface#29557)
    
    * fix tinyllama flax modelling
    
    * rename vars to minimize changes
    
    * move
    
    * formatting
    
    * remove unused var
    bminixhofer authored Mar 27, 2024
    8e08aca
  2. Add Qwen2MoE (huggingface#29377)

    * add support for qwen2 MoE models
    
    * update docs
    
    * add support for qwen2 MoE models
    
    * update docs
    
    * update model name & test
    
    * update readme
    
    * update class names & readme & model_doc of Qwen2MoE.
    
    * update architecture name
    
    * fix qwen2_moe tests
    
    * use Qwen2Tokenizer instead of Qwen2MoeTokenizer
    
    * update modeling_qwen2_moe.py
    
    * fix model architecture
    
    * fix qwen2_moe tests
    
    * use Qwen2Tokenizer instead of Qwen2MoeTokenizer
    
    * update modeling_qwen2_moe.py
    
    * fix model architecture
    
    * fix style
    
    * fix test when there are sparse and non sparse layers
    
    * fixup
    
    * Update README.md
    
    Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
    
    * fixup
    
    * fixup
    
    * add archive back
    
    * add support for qwen2 MoE models
    
    * update docs
    
    * update model name & test
    
    * update readme
    
    * update class names & readme & model_doc of Qwen2MoE.
    
    * update architecture name
    
    * fix qwen2_moe tests
    
    * use Qwen2Tokenizer instead of Qwen2MoeTokenizer
    
    * update modeling_qwen2_moe.py
    
    * fix model architecture
    
    * fixup
    
    * fix qwen2_moe tests
    
    * use Qwen2Tokenizer instead of Qwen2MoeTokenizer
    
    * fix style
    
    * fix test when there are sparse and non sparse layers
    
    * fixup
    
    * add archive back
    
    * fix integration test
    
    * fixup
    
    ---------
    
    Co-authored-by: bozheng-hit <dsoul0621@gmail.com>
    Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
    3 people authored Mar 27, 2024
    1c39974
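    A loading sketch for the new architecture. The checkpoint name is an assumption (the Qwen1.5 MoE release this targets); dtype and device placement are illustrative.

    ```python
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    ckpt = "Qwen/Qwen1.5-MoE-A2.7B"                      # assumed checkpoint name
    tokenizer = AutoTokenizer.from_pretrained(ckpt)      # resolves to Qwen2Tokenizer, as noted above
    model = AutoModelForCausalLM.from_pretrained(ckpt, torch_dtype=torch.bfloat16, device_map="auto")

    inputs = tokenizer("Mixture-of-experts models route each token to", return_tensors="pt").to(model.device)
    out = model.generate(**inputs, max_new_tokens=32)
    print(tokenizer.decode(out[0], skip_special_tokens=True))
    ```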
  3. Mamba slow_forward gradient fix (huggingface#29563)

    * FIX: Cached slow forward in mamba
    - additionally added mamba cached test
    - added unused test (mamba causal lm forward and backward)
    - fixed typo: "causl" --> "causal"
    
    * formatting
    
    * fix: use real `slow_forward` call instead of torch module's
    
    * add shape assertion for mixer block test
    
    * adjust shape assertion
    vasqu authored Mar 27, 2024
    cefb819
  4. Fix 29807, sinusoidal positional encodings overwritten by post_init() (huggingface#29813)
    
    * Check for requires_grad when initing weights
    
    * Add unit test
    
    * Move sinusoidal positional encoding generation after post_init()
    
    * Add modules to skip init list
    
    * Move create_sinusoidal_embeddings to _init_weights
    hovnatan authored Mar 27, 2024
    a81cf9e
  5. Reimplement "Automatic safetensors conversion when lacking these file…

    …s" (huggingface#29846)
    
    * Automatic safetensors conversion when lacking these files (huggingface#29390)
    
    * Automatic safetensors conversion when lacking these files
    
    * Remove debug
    
    * Thread name
    
    * Typo
    
    * Ensure that raises do not affect the main thread
    
    * Catch all errors
    LysandreJik authored Mar 27, 2024
    4d8427f
  6. 31c575b
  7. Move eos_token_id to stopping criteria (huggingface#29459)

    * add eos stopping criteria
    
    * minor fix
    
    * Update tests/generation/test_stopping_criteria.py
    
    Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
    
    * check eos is not None and fix tests
    
    * make style and fixup
    
    * Update src/transformers/generation/stopping_criteria.py
    
    Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
    
    * Update tests/generation/test_utils.py
    
    Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
    
    * Update tests/generation/test_utils.py
    
    Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
    
    * Update src/transformers/generation/__init__.py
    
    Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
    
    * Update src/transformers/generation/stopping_criteria.py
    
    Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
    
    * Update src/transformers/generation/stopping_criteria.py
    
    Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
    
    * Update src/transformers/generation/stopping_criteria.py
    
    Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
    
    * camel case everywhere
    
    * call stopping criteria list for candidate ids
    
    * make style  and fixup
    
    * Empty commit
    
    * Empty commit to pass flaky test
    
    * set max length in PromptLookupCandidateGenerator
    
    * Update src/transformers/generation/utils.py
    
    Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
    
    * lets fix this typo in docs
    
    * Update src/transformers/generation/utils.py
    
    Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
    
    * Update src/transformers/generation/utils.py
    
    Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
    
    * update PR
    
    * empty commit
    
    ---------
    
    Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
    Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
    3 people authored Mar 27, 2024
    0efcf32
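    The idea in hand-rolled form (not the exact class added here): EOS handling expressed as a stopping criterion instead of ad-hoc checks inside the generation loop.

    ```python
    import torch
    from transformers import StoppingCriteria, StoppingCriteriaList

    class StopOnEos(StoppingCriteria):
        # Illustrative re-implementation of the concept: stop once every sequence's
        # last generated token is the EOS id.
        def __init__(self, eos_token_id: int):
            self.eos_token_id = eos_token_id

        def __call__(self, input_ids: torch.LongTensor, scores: torch.FloatTensor, **kwargs) -> bool:
            return bool((input_ids[:, -1] == self.eos_token_id).all())

    # usage sketch: model.generate(**inputs, stopping_criteria=StoppingCriteriaList([StopOnEos(tok.eos_token_id)]))
    ```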
  8. add Cambricon MLUs support (huggingface#29627)

    * add Cambricon MLUs support
    
    * fix mlu device rng state
    
    * up for quality check
    
    * up mlu to support fp16
    
    * fix mlu device dependency error
    
    * fix mlu device dependency error
    
    * enable mlu device for bf16
    
    * fix mlu device memory tracker
    huismiling authored Mar 27, 2024
    7576974
  9. MixtralSparseMoeBlock: add gate jitter (huggingface#29865)

    This commit adds gate jitter to MixtralSparseMoeBlock's input data
    before passing it through the MoE layer, if turned on.
    lorenzoverardo authored Mar 27, 2024
    a25037b

Commits on Mar 28, 2024

  1. d9dc993
  2. [make fix-copies] update and help (huggingface#29924)

    * add some help
    
    * style
    ArthurZucker authored Mar 28, 2024
    b256516
  3. [GptNeox] don't gather on pkv when using the trainer (huggingface#29892)
    
    don't gather on pkv when using the trainer
    ArthurZucker authored Mar 28, 2024
    543889f
  4. [pipeline]. Zero shot add doc warning (huggingface#29845)

    * add doc warning
    
    * fix build pr
    ArthurZucker authored Mar 28, 2024
    3a7e683
  5. Adding Flash Attention 2 Support for GPT2 (huggingface#29226)

    * First commit to add flash attention 2 for GPT-2
    
    * more improvements
    
    * Make GPT2 pass tests and fixed Decision Transformers copies
    
    * Fixed missing arg
    
    * fix copies
    
    * Added expected speedup
    
    * Update src/transformers/models/gpt2/modeling_gpt2.py
    
    Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
    
    * Update src/transformers/models/gpt2/modeling_gpt2.py
    
    Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
    
    * Update src/transformers/models/gpt2/modeling_gpt2.py
    
    Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
    
    * Added test
    
    * Fixed attn attribute
    
    * Update docs/source/en/model_doc/gpt2.md
    
    Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
    
    * Update docs/source/en/model_doc/gpt2.md
    
    Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
    
    * Update Decision transformer attentions
    
    * More updates
    
    * Passing tests
    
    * Fix copies
    
    * Fix copies part 2
    
    * Decision transformer updates
    
    * Update src/transformers/models/gpt2/modeling_gpt2.py
    
    Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
    
    * Fix copies
    
    * Decision transformer not supporting flash attn
    
    * Addressed comments
    
    * Addressed comments
    
    * Addressed comments
    
    ---------
    
    Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
    Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
    3 people authored Mar 28, 2024
    22d159d
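    Opt-in sketch, assuming a CUDA device with the flash-attn package installed; the prompt is arbitrary.

    ```python
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained("gpt2")
    model = AutoModelForCausalLM.from_pretrained(
        "gpt2",
        torch_dtype=torch.float16,                  # FA2 needs fp16/bf16
        attn_implementation="flash_attention_2",    # the implementation enabled for GPT-2 here
    ).to("cuda")

    inputs = tokenizer("Flash attention makes long prompts", return_tensors="pt").to("cuda")
    print(tokenizer.decode(model.generate(**inputs, max_new_tokens=20)[0]))
    ```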
  6. 7c19faf
  7. Tests: replace torch.testing.assert_allclose by `torch.testing.assert_close` (huggingface#29915)
    
    * replace torch.testing.assert_allclose by torch.testing.assert_close
    
    * missing atol rtol
    gante authored Mar 28, 2024
    248d5d2
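    The replacement pattern, for reference (assert_allclose is deprecated upstream in favor of assert_close):

    ```python
    import torch

    actual = torch.tensor([1.0000, 2.0001])
    expected = torch.tensor([1.0, 2.0])

    # old, deprecated in PyTorch:
    # torch.testing.assert_allclose(actual, expected, atol=1e-3, rtol=1e-3)

    # new, as used across the test suite after this change:
    torch.testing.assert_close(actual, expected, atol=1e-3, rtol=1e-3)
    ```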
  8. c9d2e85
  9. Safe import of LRScheduler (huggingface#29919)

    * Safe import of LRScheduler
    
    * Update src/transformers/trainer_pt_utils.py
    
    Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
    
    * Update src/transformers/trainer_pt_utils.py
    
    Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
    
    * Fix up
    
    ---------
    
    Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
    amyeroberts and ArthurZucker authored Mar 28, 2024
    855b95c
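    The usual shape of such a guard, sketched (not the exact diff): PyTorch 2.0 made the base class public, so older versions need the underscored fallback.

    ```python
    try:
        from torch.optim.lr_scheduler import LRScheduler                   # public since PyTorch 2.0
    except ImportError:
        from torch.optim.lr_scheduler import _LRScheduler as LRScheduler   # older PyTorch
    ```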
  10. add functions to inspect model and optimizer status to trainer.py (huggingface#29838)
    
    * add functions to get number of params which require grad, get optimizer group for parameters and get learning rates of param groups to trainer.py
    
    * add tests and raise ValueError when optimizer is None
    
    * add second layer to test and freeze its weights
    
    * check if torch is available before running tests
    
    * use decorator to check if torch is available
    
    Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
    
    * fix test indentation
    
    Co-authored-by: Zach Mueller <muellerzr@gmail.com>
    
    ---------
    
    Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
    Co-authored-by: Zach Mueller <muellerzr@gmail.com>
    3 people authored Mar 28, 2024
    aac7099
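    What these helpers boil down to, sketched with plain PyTorch (the names below are local variables, not the new Trainer method names):

    ```python
    import torch

    model = torch.nn.Sequential(torch.nn.Linear(8, 8), torch.nn.Linear(8, 2))
    for p in model[0].parameters():                 # freeze the first layer, as in the test above
        p.requires_grad = False
    optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3)

    # number of parameters that still require gradients
    num_trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)

    # learning rate of each optimizer parameter group
    learning_rates = [group["lr"] for group in optimizer.param_groups]

    # the optimizer group a given parameter belongs to
    target = model[1].weight
    group = next(g for g in optimizer.param_groups if any(p is target for p in g["params"]))
    print(num_trainable, learning_rates, len(group["params"]))
    ```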
  11. RoPE models: add numerical sanity-check test for RoPE scaling (huggingface#29808)
    
    * add hard rope scaling test
    
    * make fixup
    
    * quick rope scaling tests
    
    * add copy statements
    gante authored Mar 28, 2024
    441de62
  12. [Mamba] from pretrained issue with self.embeddings (huggingface#29851)
    
    * nit
    
    * update
    
    * oups
    
    * Update src/transformers/models/mamba/modeling_mamba.py
    
    Co-authored-by: Lysandre Debut <hi@lysand.re>
    
    ---------
    
    Co-authored-by: Lysandre Debut <hi@lysand.re>
    ArthurZucker and LysandreJik authored Mar 28, 2024
    e677479
  13. [TokenizationLlama] fix the way we convert tokens to strings to keep leading spaces 🚨 breaking fix (huggingface#29453)
    
    * nit
    
    * update test and fix test
    
    * fixup
    ArthurZucker authored Mar 28, 2024
    a2a7f71
  14. Allow GradientAccumulationPlugin to be configured from AcceleratorConfig (huggingface#29589)
    
    * add gradient_accumulation_kwargs to AcceleratorConfig
    
    * add suggestions from @muellerzr to docstrings, new behavior and tests
    
    * Documentation suggestions from @muellerz
    
    Co-authored-by: Zach Mueller <muellerzr@gmail.com>
    
    * addressed @muellerzr comments regarding tests and test utils
    
    * moved accelerate version to top of file.
    
    * @muellerzr's variable fix
    
    Co-authored-by: Zach Mueller <muellerzr@gmail.com>
    
    * address @amyeroberts. fix tests and docstrings
    
    * address @amyeroberts additional suggestions
    
    ---------
    
    Co-authored-by: Yu Chin Fabian Lim <flim@sg.ibm.com>
    Co-authored-by: Zach Mueller <muellerzr@gmail.com>
    3 people authored Mar 28, 2024
    4df5b9b
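    A hedged configuration sketch based on the description above: forwarding GradientAccumulationPlugin options through `accelerator_config`. The `sync_each_batch` flag is an Accelerate plugin option used purely as an example.

    ```python
    from transformers import TrainingArguments

    args = TrainingArguments(
        output_dir="out",
        gradient_accumulation_steps=4,
        # forwarded by the Trainer to accelerate's GradientAccumulationPlugin
        accelerator_config={
            "gradient_accumulation_kwargs": {"sync_each_batch": True},
        },
    )
    ```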
  15. [BC] Fix BC for other libraries (huggingface#29934)

    * fix bc?
    
    * nit
    ArthurZucker authored Mar 28, 2024
    2bbbf1b
  16. Fix doc issue huggingface#29758 in DebertaV2Config class (huggingface#29842)
    
    Fix doc issue in DebertaV2Config class
    
    Co-authored-by: Vinayakk Garg <vigar@akamai.com>
    vinayakkgarg and Vinayakk Garg authored Mar 28, 2024
    e203646
  17. [LlamaSlowConverter] Slow to Fast better support (huggingface#29797)

    * fix
    
    * fix test
    
    * style
    
    * nit
    
    * rather rely on convert token to id
    
    * fix quality
    
    * Update src/transformers/convert_slow_tokenizer.py
    ArthurZucker authored Mar 28, 2024
    536ea2a
  18. Update installs in image classification doc (huggingface#29947)

    Trainer with PyTorch now requires accelerate to be installed.
    
    Partly resolves huggingface#29174
    MariaHei authored Mar 28, 2024
    ba56ed0

Commits on Mar 29, 2024

  1. Mark test_eager_matches_sdpa_generate flaky for some models (huggingface#29479)
    
    * fix
    
    * revert for qwen2
    
    * revert for qwen2
    
    * update
    
    * update
    
    ---------
    
    Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
    ydshieh authored Mar 29, 2024
    43d17c1
  2. Super tiny fix 12 typos about "with with" (huggingface#29926)

    * with with
    
    * style
    fzyzcjy authored Mar 29, 2024
    5ad7f17

Commits on Mar 30, 2024

  1. Fix rope theta for OpenLlama (huggingface#29893)

    fix: rope_theta for open llama
    jla524 authored Mar 30, 2024
    6fd93fe
  2. Add warning message for run_qa.py (huggingface#29867)

    * improve: error message for best model metric
    
    * update: raise warning instead of error
    jla524 authored Mar 30, 2024
    156d30d
  3. fix: get mlflow version from mlflow-skinny (huggingface#29918)

    Co-authored-by: Alexander Jipa <azzhipa@amazon.com>
    Alexander Jipa and azzhipa authored Mar 30, 2024
    e644b60
  4. f6701bc
  5. Update model card and link of blog post. (huggingface#29928)

    * Update qwen2_moe.md
    
    * update link of blogpost.
    
    * fixup
    
    ---------
    
    Co-authored-by: bozheng-hit <dsoul0621@gmail.com>
    bozheng-hit authored Mar 30, 2024
    46d6368
  6. [BC] Fix BC for AWQ quant (huggingface#29965)

    fix awq quant
    TechxGenus authored Mar 30, 2024
    6e58407

Commits on Mar 31, 2024

  1. Rework tests to compare trainer checkpoint args (huggingface#29883)

    * Start rework
    
    * Fix failing test
    
    * Include max
    
    * Update src/transformers/trainer.py
    
    Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
    
    ---------
    
    Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
    muellerzr and ArthurZucker authored Mar 31, 2024
    3b8e293

Commits on Apr 1, 2024

  1. Fix FA2 tests (huggingface#29909)

    * fix FA2 tests
    
    * refactor inference test name
    ylacombe authored Apr 1, 2024
    569f6c7
  2. Fix copies main ci (huggingface#29979)

    * fix copies
    
    * nit
    
    * style
    
    * Update utils/check_copies.py
    ArthurZucker authored Apr 1, 2024
    fa2c49b
  3. [tests] fix the wrong output in `ImageToTextPipelineTests.test_conditional_generation_llava` (huggingface#29975)
    
    bug fix
    faaany authored Apr 1, 2024
    e4f5b57
  4. c9f6e5e

Commits on Apr 2, 2024

  1. [docs] Big model loading (huggingface#29920)

    * update
    
    * feedback
    stevhliu authored Apr 2, 2024
    096f304
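    The core recipe that guide covers, as a sketch (requires accelerate; the checkpoint is just an example):

    ```python
    import torch
    from transformers import AutoModelForCausalLM

    # Shard the checkpoint across available devices and skip the full fp32 copy in RAM.
    model = AutoModelForCausalLM.from_pretrained(
        "mistralai/Mistral-7B-v0.1",     # example checkpoint
        torch_dtype=torch.bfloat16,
        device_map="auto",
        low_cpu_mem_usage=True,
    )
    print(model.hf_device_map)
    ```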
  2. [generate] fix breaking change for patch (huggingface#29976)

    * fix bug and add tests
    
    * nit
    
    * other way to get the cur len instead of attention mask
    
    * more places where this might have been broken
    
    * nit
    
    * oups
    
    * inputs_embeds vs input_embeds
    
    * test generated outputs
    
    * style
    
    * nit
    
    * fix
    
    * skip failing biogpt
    ArthurZucker authored Apr 2, 2024
    83b26dd
  3. Fix 29807 sinusoidal positional encodings in Flaubert, Informer and XLM (huggingface#29904)
    
    * Fix sinusoidal_embeddings in FlaubertModel
    
    * Fix for Informer
    
    * Fix for XLM
    
    * Move sinusoidal emb for XLM
    
    * Move sinusoidal emb for Flaubert
    
    * Small cleanup
    
    * Add comments on tests code copied from
    
    * Add with Distilbert->
    hovnatan authored Apr 2, 2024
    416711c
  4. 33288ff
  5. Adding FlaxNoRepeatNGramLogitsProcessor (huggingface#29677)

    * fix issue with logit processor in beam search in Flax
    
    * adding FlaxNoRepeatNGramLogitsProcessor class + unit test
    
    * style correction and code verification
    
    * add FlaxNoRepeatNGramLogitsProcessor to the test_processor_list and test_processor_list_jitted tests
    
    * fix an issue where ngrams are banned only if they appear ==1 time + update description of get_previous_ngrams
    
    * replace non-jit compatible masking of ngrams that are not yet generated with jittable version
    
    * Revert "fix issue with logit processor in beam search in Flax"
    
    This reverts commit 09b70d7.
    
    * add FlaxNoRepeatNGramLogitsProcessor to _get_logits_processor
    
    * change the method of casting to boolean of banned tokens indices
    
    * fix code style
    
    * remove some useless operations + significantly faster computation of update indices using jax.lax.fori_loop
    
    * remove useless loop iterations
    
    * set some variables that were calculated and used multiple times
    
    * fix format
    giganttheo authored Apr 2, 2024
    fed27ff
  6. Add Flash Attention 2 support to Musicgen and Musicgen Melody (huggingface#29939)
    
    * add FA2 to o.g Musicgen
    
    * make style
    
    * add FA2 support to Musicgen Melody
    
    * add generation FA2 tests to o.g Musicgen
    
    * make style and fix copies
    
    * add Musicgen to FA2 docs + deprecate list
    
    * add sdpa supports to Musicgen's
    
    * make style and fix copies
    
    * refactor attention implementation arguments
    
    * add Copied from to sdpa tests
    
    * add copied form in sdpa tests melody
    
    * add copied for FA2 generation tests
    
    * add FA2 inference copied from
    
    * make style
    ylacombe authored Apr 2, 2024
    0d04b1e
  7. cb5927c
  8. Fix skip_special_tokens for Wav2Vec2CTCTokenizer._decode (huggingface#29311)
    
    * Fix skip_special_tokens process for Wav2Vec2CTCTokenizer._decode
    
    * Fix skip_special_tokens for Wav2Vec2CTCTokenizer._decode
    
    * Exclude pad_token filtering since it is used as CTC-blank token
    
    * Add small test for skip_special_tokens
    
    * Update decoding test for added new token
    msublee authored Apr 2, 2024
    15cd687
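    The call being fixed, sketched (the checkpoint is the usual CTC example; note the pad token doubles as the CTC blank and is excluded from filtering, per the commit above):

    ```python
    from transformers import Wav2Vec2CTCTokenizer

    tokenizer = Wav2Vec2CTCTokenizer.from_pretrained("facebook/wav2vec2-base-960h")
    ids = tokenizer("A QUICK BROWN FOX").input_ids

    kept = tokenizer.decode(ids, skip_special_tokens=False)
    stripped = tokenizer.decode(ids, skip_special_tokens=True)   # the path this fix touches
    print(kept, "|", stripped)
    ```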
  9. Hard error when ignoring tensors. (huggingface#27484) (huggingface#29906)
    
    * Hard error when ignoring tensors. (huggingface#27484)
    
    * [WIP] Hard error when ignoring tensors.
    
    * Better selection/error when saving a checkpoint.
    
    - Find all names we should normally drop (those are in the transformers
      config)
    - Find all disjoint tensors (for those we can safely trigger a copy to
      get rid of the sharing before saving)
    - Clone those disjoint tensors getting rid of the issue
    - Find all identical names (those should be declared in the config
      but we try to find them all anyway.)
    - For all identical names:
      - If they are in the config, just ignore them everything is fine
      - If they are not, warn about them.
    - For all remaining tensors which are shared yet neither identical NOR
      disjoint, raise a hard error.
    
    * Adding a failing test on `main` that passes here.
    
    * We don't need to keep the subfolder logic in this test.
    
    * Apply suggestions from code review
    
    Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
    
    ---------
    
    Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
    
    * Add small tests.
    
    * Dead variable.
    
    * Fixup.
    
    * Fixing tied_Weights_keys on generic models.
    
    * Fixup + T5 encoder/decoder tying (with different layers)
    
    * Code quality.
    
    * Dynamic member.
    
    * trigger
    
    * Fixing encoder name for other types of encoder/decoder combos.
    
    * Fix scoping.
    
    * Update .github/workflows/self-scheduled.yml
    
    Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
    
    * Fixing the tied_weights after the call.
    
    ---------
    
    Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
    Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
    3 people authored Apr 2, 2024
    9b0a8ea
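    The underlying failure mode, illustrated with safetensors directly (a sketch; the change above decides when to clone disjoint slices, accept declared tied weights, or hard-error before ever reaching this point):

    ```python
    import torch
    from safetensors.torch import save_file

    weight = torch.zeros(4, 4)
    state = {"encoder.weight": weight, "decoder.weight": weight}   # aliased (tied) tensors

    try:
        save_file(state, "model.safetensors")
    except RuntimeError as err:
        # safetensors refuses to serialize tensors that share memory unless the
        # duplicates are removed or declared by the caller.
        print("shared-tensor error:", err)
    ```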
  10. Generate: fix logits processors doctests (huggingface#29718)

    * fix norm
    
    * fix logits processors doctests
    gante authored Apr 2, 2024
    5080ab1