Update training #4
Commits on Mar 22, 2024
- e68ff30: [quality] update quality check to make sure we check imports 😈 (huggingface#29771)
  * update quality check
  * make it nice
  * let's make sure it runs and we have the logs actually
  * update workflow
  * nits
- 3479161: Fix type hint for train_dataset param of Trainer.__init__() to allow IterableDataset. Issue 29678 (huggingface#29738)
  * Fixed type hint for train_dataset param in Trainer.__init__(); added IterableDataset option
  * make fixup
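For context, a minimal sketch of what the widened annotation now permits: a torch IterableDataset passed straight to Trainer. The toy dataset, gpt2 checkpoint, and step count below are illustrative, not taken from the PR.

```python
from torch.utils.data import IterableDataset
from transformers import AutoModelForCausalLM, AutoTokenizer, Trainer, TrainingArguments

class TokenStream(IterableDataset):
    """Toy streaming dataset yielding pre-tokenized causal-LM examples."""
    def __init__(self, texts, tokenizer):
        self.texts, self.tokenizer = texts, tokenizer
    def __iter__(self):
        for text in self.texts:
            enc = self.tokenizer(text, truncation=True)
            enc["labels"] = enc["input_ids"].copy()
            yield enc

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

trainer = Trainer(
    model=model,
    # streaming datasets have no length, so max_steps must bound training
    args=TrainingArguments(output_dir="out", max_steps=10),
    train_dataset=TokenStream(["hello world"] * 100, tokenizer),  # now matches the type hint
)
```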
- aa17cf9: Enable AMD docker build CI (huggingface#29803)
  * enable amd ci
  * remove unnecessary clean up
- 13b2370: Correct llava mask & fix missing setter for vocab_size (huggingface#29389)
  * correct llava mask
  * fix vipllava as well
  * mask out embedding for padding tokens
  * add test
  * fix style
  * add setter
  * fix test on suggestion
- e85654f: rm input dtype change in CPU (huggingface#28631)
  * rm input dtype change in CPU
  * add warning when using CPU low-precision
  * rm useless logging
- 34e07f4: Generate: remove unused attributes in AssistedCandidateGenerator (huggingface#29787)
  * remove unused attrs
- 884b221: Replaced concatenation with f-strings to improve readability and unify with the rest of the code (huggingface#29785)
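An illustrative before/after for this style change (not code lifted from the PR):

```python
name, rank = "llama", 3

# before: plain concatenation
message = "model " + name + " is ranked " + str(rank)

# after: equivalent f-string
message = f"model {name} is ranked {rank}"
```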
- 2e7cb46
- 7e1413d: Complete security policy with mentions of remote code (huggingface#29707)
  * Security policy
  * Apply suggestions from code review
  * Update SECURITY.md
  Co-authored-by: Luc Georges <McPatate@users.noreply.github.com>
  Co-authored-by: Michelle Habonneau <83347449+Michellehbn@users.noreply.github.com>
  Co-authored-by: Diogo Teles Sant'Anna <diogoteles@google.com>
- c5f0288: [SuperPoint] Fix doc example (huggingface#29816)
Commits on Mar 23, 2024
- dafe370: [DOCS] Fix typo for llava next docs (huggingface#29829)
Commits on Mar 24, 2024
- 76a33a1: model_summary.md - Restore link to Harvard's Annotated Transformer (huggingface#29702)
  * model_summary.md - Add link to Harvard's Annotated Transformer
  * model_summary.md - slight wording change + capitalize name of the paper
  * model_summary.md - move the Annotated Transformer link into a parenthesis next to the link to the original paper (great idea, stevhliu!)
  * model_summary.md - same move, commit pt. 2 (accidentally removed "has" in pt. 1)
Commits on Mar 25, 2024
- 39114c0: Remove static pretrained maps from the library's internals (huggingface#29112)
  * [test_all] Remove static pretrained maps from the library's internals
  * Deprecate archive maps instead of removing them
  * Revert init changes
  * [test_all] PVT v2 support
  * [test_all] Tests should all pass
  * [test_all] Style
  * Address review comments
  * Update src/transformers/models/deprecated/_archive_maps.py
  * [test_all] trigger tests; LLAVA; bad rebase
  Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
- afe73ae: Fix the behavior of collecting 'num_input_tokens_seen' (huggingface#29099)
  * fix the behavior of collecting 'num_input_tokens_seen'; see huggingface#28791 for more details
- 8e9a220: Populate torch_dtype from model to pipeline (huggingface#28940)
  * Populate torch_dtype from model to pipeline
  * use property
  * lint
  * Remove default handling
  Signed-off-by: B-Step62 <yuki.watanabe@databricks.com>
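A sketch of the behavior this targets: a pipeline built around an already-loaded model should report that model's dtype (checkpoint name illustrative):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline

model = AutoModelForCausalLM.from_pretrained("gpt2", torch_dtype=torch.float16)
tokenizer = AutoTokenizer.from_pretrained("gpt2")

pipe = pipeline("text-generation", model=model, tokenizer=tokenizer)
print(pipe.torch_dtype)  # expected: torch.float16, read back from the model
```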
- 00a09ed
- e3e16dd
- 7eb3ba8: remove quotes in code example (huggingface#29812)
  Co-authored-by: Johannes <johannes.kolbe@tech.better.team>
Commits on Mar 26, 2024
- b5a6d6e: Add warnings if training args differ from checkpoint trainer state (huggingface#29255)
  * add warnings if training args differ from checkpoint args stored in trainer_state.json
  * run formatting and styling
  * add a test
  Co-authored-by: Jonathan Flynn <jonl.flynn@guardian.co.uk>
- b32bf85: Replace 'decord' with 'av' in VideoClassificationPipeline (huggingface#29747)
  * replace 'decord' with 'av' in VideoClassificationPipeline
  * fix the backend check in VideoClassificationPipeline
  * adjust the order of imports
  * format 'video_classification.py' with ruff
  Co-authored-by: wanqiancheng <13541261013@163.com>
- de81a67: Fix header in IFE task guide (huggingface#29859)
  * Update image_feature_extraction.md
- b9ceb03
- 998b5bb: Allow bos_token_id to be None during generation with inputs_embeds (huggingface#29772)
  * update
  * add ut
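A minimal sketch of the path being fixed: generating from inputs_embeds when no bos_token_id is available (checkpoint illustrative):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

input_ids = tok("Hello", return_tensors="pt").input_ids
embeds = model.get_input_embeddings()(input_ids)

# with this fix, a None bos_token_id no longer breaks embedding-driven generation
out = model.generate(inputs_embeds=embeds, max_new_tokens=5, bos_token_id=None)
```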
- ef60995: Add cosine_with_min_lr scheduler in Trainer (huggingface#29341)
  * Add cosine_with_min_lr scheduler
  * Update error message for missing min_lr or min_lr_rate
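A hedged sketch of how the new scheduler is selected; per the commit message, min_lr or min_lr_rate must be supplied, here via lr_scheduler_kwargs, and the values are arbitrary examples:

```python
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="out",
    learning_rate=2e-5,
    lr_scheduler_type="cosine_with_min_lr",
    lr_scheduler_kwargs={"min_lr": 1e-6},  # or {"min_lr_rate": 0.05}
)
```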
- 07d7952: Disable AMD memory benchmarks (huggingface#29871)
  * remove py3nvml to skip amd memory benchmarks
  * uninstall pynvml from docker images
- f01e160
Commits on Mar 27, 2024
- 8e08aca: Support num_attention_heads != num_key_value_heads in Flax Llama implementation (huggingface#29557)
  * fix tinyllama flax modelling
  * rename vars to minimize changes
  * move
  * formatting
  * remove unused var
- 1c39974: Add Qwen2MoE (huggingface#29377)
  * add support for Qwen2 MoE models
  * update docs, readme, model name & tests
  * update class names & model_doc of Qwen2MoE; update architecture name
  * use Qwen2Tokenizer instead of Qwen2MoeTokenizer
  * fix model architecture and qwen2_moe tests; fix style
  * fix test when there are sparse and non-sparse layers
  * add archive back
  * fix integration test
  Co-authored-by: bozheng-hit <dsoul0621@gmail.com>
  Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
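A hedged usage sketch for the new architecture; the checkpoint id below is the publicly released Qwen1.5 MoE model and is an assumption, not something stated in this commit:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Qwen/Qwen1.5-MoE-A2.7B"
tok = AutoTokenizer.from_pretrained(model_id)  # per the PR, the plain Qwen2Tokenizer is reused
model = AutoModelForCausalLM.from_pretrained(model_id)

inputs = tok("Briefly introduce mixture-of-experts models.", return_tensors="pt")
print(tok.decode(model.generate(**inputs, max_new_tokens=40)[0]))
```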
- cefb819: Mamba slow_forward gradient fix (huggingface#29563)
  * FIX: cached slow forward in mamba; added mamba cached test; added unused test (mamba causal lm forward and backward); fixed typo "causl" -> "causal"
  * use real slow_forward call instead of torch module's
  * add shape assertion for mixer block test
  * adjust shape assertion
- a81cf9e: Fix huggingface#29807, sinusoidal positional encodings overwritten by post_init() (huggingface#29813)
  * Check for requires_grad when initing weights
  * Add unit test
  * Move sinusoidal positional encoding generation after post_init()
  * Add modules to skip init list
  * Move create_sinusoidal_embeddings to _init_weights
Reimplement "Automatic safetensors conversion when lacking these file…
…s" (huggingface#29846) * Automatic safetensors conversion when lacking these files (huggingface#29390) * Automatic safetensors conversion when lacking these files * Remove debug * Thread name * Typo * Ensure that raises do not affect the main thread * Catch all errors
Configuration menu - View commit details
-
Copy full SHA for 4d8427f - Browse repository at this point
Copy the full SHA 4d8427fView commit details -
- 31c575b
- 0efcf32: Move eos_token_id to stopping criteria (huggingface#29459)
  * add eos stopping criteria
  * check eos is not None and fix tests
  * camel case everywhere
  * call stopping criteria list for candidate ids
  * set max length in PromptLookupCandidateGenerator
  * fix a typo in docs
  * make style and fixup
  Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
  Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
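A hedged sketch of the refactor's effect: EOS handling now lives in a stopping criterion instead of the generation loop. The class name and import path are as understood from this PR:

```python
from transformers.generation.stopping_criteria import (
    EosTokenCriteria,
    StoppingCriteriaList,
)

# generate() builds this internally from generation_config.eos_token_id,
# but it can also be passed explicitly via stopping_criteria=...
criteria = StoppingCriteriaList([EosTokenCriteria(eos_token_id=2)])
```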
- 7576974: add Cambricon MLUs support (huggingface#29627)
  * add Cambricon MLUs support
  * fix mlu device rng state
  * up for quality check
  * up mlu to support fp16
  * fix mlu device dependency error
  * enable mlu device for bf16
  * fix mlu device memory tracker
- a25037b: MixtralSparseMoeBlock: add gate jitter (huggingface#29865)
  This commit adds gate jitter to MixtralSparseMoeBlock's input data before passing it through the MoE layer, if turned on.
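A hedged sketch of turning the jitter on; the config field name is inferred from the PR, and the tiny dimensions exist only so the model is cheap to instantiate:

```python
from transformers import MixtralConfig, MixtralForCausalLM

config = MixtralConfig(
    hidden_size=64, intermediate_size=128, num_hidden_layers=2,
    num_attention_heads=4, num_key_value_heads=2,
    router_jitter_noise=0.1,  # assumed name of the new gate-jitter knob; 0.0 = off
)
model = MixtralForCausalLM(config)  # jitter is applied to MoE inputs during training
```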
Commits on Mar 28, 2024
- d9dc993
- b256516: [make fix-copies] update and help (huggingface#29924)
  * add some help
  * style
- 543889f: [GptNeox] don't gather on pkv when using the trainer (huggingface#29892)
- 3a7e683: [pipeline] Zero shot: add doc warning (huggingface#29845)
  * add doc warning
  * fix build pr
- 22d159d: Adding Flash Attention 2 Support for GPT2 (huggingface#29226)
  * First commit to add flash attention 2 for GPT-2
  * Make GPT2 pass tests and fix Decision Transformer copies
  * Fixed missing arg; added expected speedup; added test; fixed attn attribute
  * Update Decision Transformer attentions (Decision Transformer does not support flash attn)
  * Addressed comments
  Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
  Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
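A usage sketch of the new attention path; FA2 needs a CUDA device, the flash-attn package, and half precision:

```python
import torch
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "gpt2",
    torch_dtype=torch.float16,
    attn_implementation="flash_attention_2",
).to("cuda")
```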
- 7c19faf
- 248d5d2: Tests: replace torch.testing.assert_allclose with torch.testing.assert_close (huggingface#29915)
  * replace torch.testing.assert_allclose by torch.testing.assert_close
  * missing atol rtol
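The migration in miniature; assert_allclose is deprecated in PyTorch in favor of assert_close:

```python
import torch

a = torch.tensor([1.0, 2.0])
b = torch.tensor([1.0, 2.0 + 1e-7])

# old, deprecated:
# torch.testing.assert_allclose(a, b)

# new, with explicit tolerances where a test needs them:
torch.testing.assert_close(a, b, atol=1e-5, rtol=1e-5)
```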
- c9d2e85
- 855b95c: Safe import of LRScheduler (huggingface#29919)
  * Safe import of LRScheduler
  * Update src/transformers/trainer_pt_utils.py
  * Fix up
  Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
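A sketch of the version-safe import pattern involved; the public LRScheduler name only exists in newer torch releases, and the exact guard in the PR may differ:

```python
try:
    from torch.optim.lr_scheduler import LRScheduler  # torch >= 2.0
except ImportError:
    from torch.optim.lr_scheduler import _LRScheduler as LRScheduler  # older torch
```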
- aac7099: add functions to inspect model and optimizer status to trainer.py (huggingface#29838)
  * add functions to get the number of params which require grad, get the optimizer group for parameters, and get learning rates of param groups to trainer.py
  * add tests and raise ValueError when optimizer is None
  * add second layer to test and freeze its weights
  * check if torch is available before running tests, via decorator
  * fix test indentation
  Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
  Co-authored-by: Zach Mueller <muellerzr@gmail.com>
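A hedged sketch of the new helpers; the method names are inferred from the PR description and may not match exactly:

```python
from transformers import AutoModelForSequenceClassification, Trainer, TrainingArguments

model = AutoModelForSequenceClassification.from_pretrained("distilbert-base-uncased")
trainer = Trainer(model=model, args=TrainingArguments(output_dir="out"))
trainer.create_optimizer()  # the optimizer-dependent helpers raise ValueError without one

print(trainer.get_num_trainable_parameters())  # params with requires_grad=True
print(trainer.get_learning_rates())            # current LR per optimizer param group
```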
- 441de62: RoPE models: add numerical sanity-check test for RoPE scaling (huggingface#29808)
  * add hard rope scaling test
  * make fixup
  * quick rope scaling tests
  * add copy statements
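For context, the RoPE scaling that the new test sanity-checks is requested through the model config; values here are arbitrary:

```python
from transformers import LlamaConfig

config = LlamaConfig(rope_scaling={"type": "linear", "factor": 2.0})
```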
- e677479: [Mamba] from_pretrained issue with self.embeddings (huggingface#29851)
  * nit
  * update
  * oups
  * Update src/transformers/models/mamba/modeling_mamba.py
  Co-authored-by: Lysandre Debut <hi@lysand.re>
- a2a7f71: [TokenizationLlama] fix the way we convert tokens to strings to keep leading spaces 🚨 breaking fix (huggingface#29453)
  * nit
  * update test and fix test
  * fixup
- 4df5b9b: Allow GradientAccumulationPlugin to be configured from AcceleratorConfig (huggingface#29589)
  * add gradient_accumulation_kwargs to AcceleratorConfig
  * add suggestions from @muellerzr to docstrings, new behavior and tests
  * addressed comments regarding tests and test utils
  * moved accelerate version to top of file
  * fix tests and docstrings
  Co-authored-by: Yu Chin Fabian Lim <flim@sg.ibm.com>
  Co-authored-by: Zach Mueller <muellerzr@gmail.com>
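A hedged sketch of the new plumbing; the gradient_accumulation_kwargs key comes from the commit message, while the specific plugin option shown is an assumed example:

```python
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="out",
    gradient_accumulation_steps=4,
    accelerator_config={
        # forwarded to accelerate's GradientAccumulationPlugin
        "gradient_accumulation_kwargs": {"sync_with_dataloader": False},
    },
)
```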
- 2bbbf1b: [BC] Fix BC for other libraries (huggingface#29934)
  * fix bc?
  * nit
- e203646: Fix doc issue huggingface#29758 in DebertaV2Config class (huggingface#29842)
  Co-authored-by: Vinayakk Garg <vigar@akamai.com>
- 536ea2a: [LlamaSlowConverter] Slow to Fast better support (huggingface#29797)
  * fix; fix test; style
  * rather rely on convert token to id
  * fix quality
  * Update src/transformers/convert_slow_tokenizer.py
- ba56ed0: Update installs in image classification doc (huggingface#29947)
  Trainer with PyTorch now requires accelerate to be installed. Partly resolves huggingface#29174.
Commits on Mar 29, 2024
- 43d17c1: Mark test_eager_matches_sdpa_generate flaky for some models (huggingface#29479)
  * fix
  * revert for qwen2
  * update
  Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
- 5ad7f17: Super tiny fix: 12 typos about "with with" (huggingface#29926)
  * with with
  * style
Commits on Mar 30, 2024
- 6fd93fe: Fix rope_theta for OpenLlama (huggingface#29893)
- 156d30d: Add warning message for run_qa.py (huggingface#29867)
  * improve: error message for best model metric
  * update: raise warning instead of error
- e644b60: fix: get mlflow version from mlflow-skinny (huggingface#29918)
  Co-authored-by: Alexander Jipa <azzhipa@amazon.com>
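A sketch of the underlying pattern, assuming the usual mlflow-skinny quirk: it installs the mlflow module but registers a different distribution name, so a version lookup must try both:

```python
import importlib.metadata

try:
    version = importlib.metadata.version("mlflow")
except importlib.metadata.PackageNotFoundError:
    version = importlib.metadata.version("mlflow-skinny")
```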
- f6701bc
- 46d6368: Update model card and link of blog post (huggingface#29928)
  * Update qwen2_moe.md
  * update link of blog post
  Co-authored-by: bozheng-hit <dsoul0621@gmail.com>
- 6e58407
Commits on Mar 31, 2024
- 3b8e293: Rework tests to compare trainer checkpoint args (huggingface#29883)
  * Start rework
  * Fix failing test
  * Include max
  Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
Commits on Apr 1, 2024
- 569f6c7: Fix FA2 tests (huggingface#29909)
  * fix FA2 tests
  * refactor inference test name
- fa2c49b: Fix copies main CI (huggingface#29979)
  * fix copies
  * nit
  * style
  * Update utils/check_copies.py
- e4f5b57: [tests] fix the wrong output in ImageToTextPipelineTests.test_conditional_generation_llava (huggingface#29975)
  * bug fix
- c9f6e5e
Commits on Apr 2, 2024
- 096f304
- 83b26dd: [generate] fix breaking change for patch (huggingface#29976)
  * fix bug and add tests
  * other way to get the cur len instead of attention mask
  * more places where this might have been broken
  * inputs_embeds vs input_embeds
  * test generated outputs
  * skip failing biogpt
- 416711c: Fix huggingface#29807, sinusoidal positional encodings in Flaubert, Informer and XLM (huggingface#29904)
  * Fix sinusoidal_embeddings in FlaubertModel
  * Fix for Informer
  * Fix for XLM
  * Move sinusoidal emb for XLM and Flaubert
  * Small cleanup
  * Add comments on tests code copied from
  * Add with Distilbert->
- 33288ff
- fed27ff: Adding FlaxNoRepeatNGramLogitsProcessor (huggingface#29677)
  * fix issue with logit processor in beam search in Flax
  * adding FlaxNoRepeatNGramLogitsProcessor class + unit test
  * add FlaxNoRepeatNGramLogitsProcessor to the test_processor_list and test_processor_list_jitted tests
  * fix an issue where ngrams are banned only if they appear exactly once + update description of get_previous_ngrams
  * replace non-jit-compatible masking of not-yet-generated ngrams with a jittable version
  * Revert "fix issue with logit processor in beam search in Flax" (reverts commit 09b70d7)
  * add FlaxNoRepeatNGramLogitsProcessor to _get_logits_processor
  * significantly faster computation of update indices using jax.lax.fori_loop; remove useless loop iterations
  * fix code style and format
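A hedged sketch of the user-facing effect: Flax generation can now honor no_repeat_ngram_size like the PyTorch path (model id illustrative; output shape follows Flax generate conventions):

```python
from transformers import AutoTokenizer, FlaxAutoModelForCausalLM

tok = AutoTokenizer.from_pretrained("gpt2")
model = FlaxAutoModelForCausalLM.from_pretrained("gpt2")

inputs = tok("The cat sat on the", return_tensors="np")
out = model.generate(**inputs, max_new_tokens=20, no_repeat_ngram_size=2)
print(tok.decode(out.sequences[0]))
```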
- 0d04b1e: Add Flash Attention 2 support to Musicgen and Musicgen Melody (huggingface#29939)
  * add FA2 to o.g. Musicgen and Musicgen Melody
  * add generation FA2 tests to o.g. Musicgen
  * add Musicgen to FA2 docs + deprecate list
  * add sdpa support to Musicgen's
  * refactor attention implementation arguments
  * add Copied from to sdpa and FA2 tests (incl. melody)
  * make style and fix copies
- cb5927c
- 15cd687: Fix skip_special_tokens for Wav2Vec2CTCTokenizer._decode (huggingface#29311)
  * Fix skip_special_tokens process for Wav2Vec2CTCTokenizer._decode
  * Exclude pad_token filtering since it is used as CTC-blank token
  * Add small test for skip_special_tokens
  * Update decoding test for added new token
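A minimal sketch of the decode path being fixed (standard example checkpoint): skip_special_tokens should drop added special tokens while keeping the pad token that doubles as the CTC blank:

```python
from transformers import Wav2Vec2CTCTokenizer

tok = Wav2Vec2CTCTokenizer.from_pretrained("facebook/wav2vec2-base-960h")
ids = tok("HELLO WORLD").input_ids
print(tok.decode(ids, skip_special_tokens=True))
```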
- 9b0a8ea: Hard error when ignoring tensors (huggingface#27484) (huggingface#29906)
  * Better selection/error when saving a checkpoint:
    - find all names we should normally drop (those are in the transformers config)
    - find all disjoint tensors (for those we can safely trigger a copy to get rid of the sharing before saving)
    - clone those disjoint tensors, getting rid of the issue
    - find all identical names (those should be declared in the config, but we try to find them all anyway); if they are in the config, ignore them; if not, warn about them
    - for all remaining tensors which are shared yet neither identical nor disjoint, raise a hard error
  * Adding a failing test on main that passes here
  * Fixing tied_weights_keys on generic models; T5 encoder/decoder tying (with different layers)
  * Fixing encoder name for other types of encoder/decoder combos; fix scoping
  * Fixing the tied weights after the call
  Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
  Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
- 5080ab1: Generate: fix logits processors doctests (huggingface#29718)
  * fix norm
  * fix logits processors doctests