
Fix pr 32013 #2

Closed
wants to merge 350 commits into from
350 commits
165116b
Remove conversational pipeline tests (#32099)
amyeroberts Jul 24, 2024
e0182f3
RoPE: relaxed rope validation (#32182)
gante Jul 24, 2024
8d2534c
let's not warn when someone is running a forward (#32176)
ArthurZucker Jul 24, 2024
1392a68
Fix resize embedding with Deepspeed (#32192)
zucchini-nlp Jul 24, 2024
af0e4b7
Fix float8_e4m3fn in modeling_utils (#32193)
SunMarc Jul 24, 2024
1c122a4
Support dequantizing GGUF FP16 format (#31783)
PenutChen Jul 24, 2024
edd68f4
🚨 No more default chat templates (#31733)
Rocketknight1 Jul 24, 2024
85a1269
fix: Replaced deprecated `unittest method` with the correct one (#32198)
Sai-Suraj-27 Jul 24, 2024
5658e74
[whisper] fix short-form output type (#32178)
sanchit-gandhi Jul 25, 2024
f53a5de
remove unnecessary guard code related with pytorch versions 1.4.2 ~ 1…
statelesshz Jul 25, 2024
1ecedf1
Update question_answering.py (#32208)
avlewis Jul 25, 2024
9b9a54e
[BigBird Pegasus] set _supports_param_buffer_assignment to False (#32…
kashif Jul 25, 2024
de23188
[warnings] fix E721 warnings (#32223)
kashif Jul 25, 2024
df6eee9
Follow up for #31973 (#32025)
ydshieh Jul 25, 2024
6ed0bf1
translate philosophy.md to chinese (#32177)
statelesshz Jul 25, 2024
3a83ec4
Allow a specific microphone to be used by the ffmpeg audio pipeline u…
jrhe Jul 25, 2024
9d6c064
Fix code snippet for Grounding DINO (#32229)
qubvel Jul 25, 2024
4ab33c2
Generation: stop at `eos` for assisted decoding (#31301)
zucchini-nlp Jul 26, 2024
fad15fb
Llava: generate without images (#32183)
zucchini-nlp Jul 26, 2024
c46edfb
Resize embeds with DeepSpeed (#32214)
zucchini-nlp Jul 26, 2024
1c7ebf1
don't log base model architecture in wandb if log model is false (#32…
joaonadkarni Jul 26, 2024
b8e5cd5
Refactor: Removed un-necessary `object` base class (#32230)
Sai-Suraj-27 Jul 26, 2024
f9756d9
Adds: extra_repr for RMSNorm layers in most models (#32204)
rohitdwivedula Jul 26, 2024
5f841c7
Add check for `target_sizes is None` in `post_process_image_guided_de…
catalys1 Jul 26, 2024
27c7f97
[tests] fix `static` cache implementation is not compatible with `att…
faaany Jul 26, 2024
81233c0
Flash-Attn: fix generation when no attention mask or no pading (#32241)
zucchini-nlp Jul 26, 2024
8da9068
More flexible trigger condition (#32251)
ydshieh Jul 26, 2024
44f6fdd
Llama 3.1: replace for loop by tensor ops at inv_freq initialization …
gante Jul 27, 2024
f739687
🚨 Bloom support for cache class (#31445)
zucchini-nlp Jul 29, 2024
f2122cc
Upload new model failure report to Hub (#32264)
ydshieh Jul 29, 2024
5019aab
Optimize t5 tokenize logic to avoid redundant calls (#32270)
leejet Jul 29, 2024
a2ad9d5
fix: Fixed wrong argument passed to `convert_blip_checkpoint` functio…
Sai-Suraj-27 Jul 29, 2024
535fe78
Repo: remove exceptions in `check_docstrings` (#32259)
gante Jul 29, 2024
6494479
make `p_mask` a numpy array before passing to `select_starts_ends` (#…
faaany Jul 29, 2024
4992889
fix(docs): Fixed a link in docs (#32274)
Sai-Suraj-27 Jul 29, 2024
7ffe25f
Generate: end-to-end compilation (#30788)
gante Jul 29, 2024
3fbaaaa
Whisper tokenizer word level timestamps (#32197)
kamilakesbi Jul 29, 2024
7f5d644
[pipeline] fix padding for 1-d tensors (#31776)
sanchit-gandhi Jul 29, 2024
811a9ca
Make static cache compatible with torch.export (#32168)
guangy10 Jul 29, 2024
a24a9a6
Add stream messages from agent run for gradio chatbot (#32142)
aymeric-roucher Jul 29, 2024
f0bc49e
use torch 2.4 in 2 CI jobs (#32302)
ydshieh Jul 29, 2024
3e8106d
Docs: fix GaLore optimizer code example (#32249)
gil2rok Jul 30, 2024
934fe15
Fix GGUF dequantize for `gguf==0.9.1` (#32298)
Isotr0py Jul 30, 2024
20528f0
Cast epochs_trained to int when resuming training (#32286)
teddy-f-47 Jul 30, 2024
084b509
feat(ci): set `fetch-depth: 0` in trufflehog checkout step (#31663)
McPatate Jul 30, 2024
2fbbcf5
Fix M4T for ASR pipeline (#32296)
ylacombe Jul 30, 2024
e68ec18
Docs: formatting nits (#32247)
gante Jul 30, 2024
bd54ed2
Alternative agent plan (#32295)
plaggy Jul 30, 2024
1627108
fix: Added missing raise keyword for few exceptions (#32333)
Sai-Suraj-27 Jul 30, 2024
62c60a3
fixes to properly shard FSDP across cpu and meta for cpu_efficient_lo…
winglian Jul 30, 2024
516af4b
fixes #32329 : The Torch code is correct - to get an average of 10% o…
fkrasnov2 Jul 30, 2024
026a173
Repo checks: skip docstring checks if not in the diff (#32328)
gante Jul 30, 2024
6e2d04e
Fix slow GemmaTokenizer and improve SPM slow -> fast conversion proce…
xenova Jul 30, 2024
a326433
LLaVA-NeXT: fix anyres shapes (#32314)
zucchini-nlp Jul 31, 2024
7f552e2
Gemma2 and flash-attention (#32188)
zucchini-nlp Jul 31, 2024
b75ad56
Llama 3.1: Fix incorrect `inv_freq` assignment (#32330)
gante Jul 31, 2024
5f1fcc2
[Idefics2] - Fix FA2 call for Perceiver layer (#32275)
amyeroberts Jul 31, 2024
ef177a5
Gemma 2: support assisted generation (#32357)
gante Jul 31, 2024
b46bd8b
Fix error when streaming to gradio with non-string tool arguments (#3…
aymeric-roucher Jul 31, 2024
92abe60
>3-5x faster torch.compile forward compilation for autoregressive dec…
fxmarty Jul 31, 2024
53f0c9c
fix: Removed unnecessary `@staticmethod` decorator (#32361)
Sai-Suraj-27 Jul 31, 2024
14ee232
fix: warmup_steps check for training_args (#32236)
Ricardo-L-C Jul 31, 2024
453e748
LLaVa: add cache class attribute (#32278)
zucchini-nlp Aug 1, 2024
9451a38
[enc-dec cache] fix bug in indexing (#32370)
sanchit-gandhi Aug 1, 2024
e234061
[whisper] compile compatibility with long-form decoding (#31772)
sanchit-gandhi Aug 1, 2024
48ed24c
Remove size check between attn_weights and kv_seq_len for phi3 (#32339)
helunwencser Aug 1, 2024
9e28284
add missing attribute _supports_param_buffer_assignment for gpt-j. (#…
nv-guomingz Aug 1, 2024
05c1f9a
Check device map for saving tokenizer config on TPU (fix for issue #3…
ayukh Aug 1, 2024
2229ebe
update clean_up_tokenization_spaces warning (#32371)
itazap Aug 1, 2024
db8c7ca
Empty list in defaults for LLaMA special tokens during weights conver…
ViktorooReps Aug 1, 2024
b4727a1
Fix conflicting key in init kwargs in PreTrainedTokenizerBase (#31233)
OmarManzoor Aug 1, 2024
ca59d6f
Offloaded KV Cache (#31325)
n17s Aug 1, 2024
e3d8285
Docker: add `speech` dep to the consistency docker image (#32374)
gante Aug 1, 2024
51ab25e
Fixed Hybrid Cache Shape Initialization. (#32163)
OsamaS99 Aug 1, 2024
82efc53
Yell at the user if zero-3 init wasn't performed, but expected to hav…
muellerzr Aug 1, 2024
2af199c
Update docs (#32368)
zucchini-nlp Aug 2, 2024
083e13b
RoPE: Add numerical tests ✨ (#32380)
gante Aug 2, 2024
c1aa0ed
[generate] only require an attention mask for mps with torch<2.4 (#32…
sanchit-gandhi Aug 2, 2024
7c31d05
fix: (issue #32124) Exception raised when running `transformers/examp…
fshp971 Aug 3, 2024
621fb3c
MixtralFlashAttention2: put "plus 1" inside parentheses when calculat…
xenshinu Aug 3, 2024
847bb85
Bump keras from 2.8.0 to 2.13.1 in /examples/research_projects/decisi…
dependabot[bot] Aug 5, 2024
05ae3a3
fix: SeamlessM4TFeatureExtractor stride remainder (#32088)
TechInterMezzo Aug 5, 2024
3bb646a
Phi3 tests: fix typing for Python 3.8 (#32388)
zucchini-nlp Aug 5, 2024
3d7c2f9
#32184 save total_vocab_size (#32240)
itazap Aug 5, 2024
ea5da52
add values for neftune (#32399)
nbroad1881 Aug 5, 2024
f5f1e52
Fix documentation references to google/bit-50 model (#32407)
JuanFKurucz Aug 5, 2024
baf7e5c
Persist embedding type of BART and mBART models after resize (#32242)
AbdiHaryadi Aug 5, 2024
458b0cd
fix: Updated `test_embeded_special_tokens` for luke and mluke models …
Sai-Suraj-27 Aug 5, 2024
7e5d46d
Respect the config's attn_implementation if set (#32383)
amyeroberts Aug 5, 2024
13dc6b0
Fix documentation links and code reference to model llava-next (#32434)
JuanFKurucz Aug 5, 2024
37c5ca5
Cache: create docs (#32150)
zucchini-nlp Aug 6, 2024
0aa8328
Llava: fix checkpoint_doc (#32458)
RUFFY-369 Aug 6, 2024
e85d863
add the missing flash attention test marker (#32419)
faaany Aug 6, 2024
fb66ef8
Update kwargs validation for `preprocess` with decorator (#32024)
qubvel Aug 6, 2024
438d06c
Fix get large model config for Switch Transformer encoder only tester…
JuanFKurucz Aug 6, 2024
36fd35e
Dependencies: fix typo (#32389)
gante Aug 6, 2024
6a03942
Add Nemotron HF Support (#31699)
suiyoubi Aug 6, 2024
3d8bd11
Generate: fix end to end compilation (#32465)
gante Aug 6, 2024
80b90e7
Add codestral mamba2 (#32080)
molbap Aug 6, 2024
194cf1f
Migrate import checks not need accelerate, and be more clear on min v…
muellerzr Aug 6, 2024
50c3ba8
Documentation: BOS token_id deprecation change for NLLB (#32443)
christoukmaji Aug 6, 2024
26a9443
dev version 4.45.0
ArthurZucker Aug 6, 2024
4fdc702
`is_torchdynamo_compiling` -- cast a wide exception net (#32476)
gante Aug 6, 2024
ac2707e
Revert "fixes to properly shard FSDP across cpu and meta for cpu_effc…
matthewdouglas Aug 6, 2024
5301b98
🌐 [i18n-KO] Translated `mask_generation.md` to Korean (#32257)
jeongiin Aug 6, 2024
3b193c7
🌐 [i18n-KO] Translated `idefics.md` to Korean (#32258)
boyunJang Aug 6, 2024
6af0854
🌐 [i18n-KO] Translated `image_to_image.md` to Korean (#32327)
shinhyunji36 Aug 6, 2024
a30c865
Cache: new Cache format in decoder-only models (#31421)
zucchini-nlp Aug 7, 2024
7ad784a
Gemma2: add cache warning (#32279)
zucchini-nlp Aug 7, 2024
46d09af
enable xla fsdp (#32048)
hanwen-sun Aug 7, 2024
c54a6f9
Fix typo in tokenization_utils_base.py (#32484)
blubitz Aug 7, 2024
e0d8253
Agents use grammar (#31735)
aymeric-roucher Aug 7, 2024
b640103
fix broken link in docs (#32491)
jorahn Aug 7, 2024
b7fb393
Docs: alert for the possibility of manipulating logits (#32467)
gante Aug 7, 2024
1124d95
🌐 [i18n-KO] Translated `gptq.md` to Korean (#32293)
1kmmk1 Aug 7, 2024
fcc4f2a
🌐 [i18n-KO] Translated `prompting.md` to Korean (#32294)
chhaewxn Aug 7, 2024
fa59fd8
🌐 [i18n-KO] Translated `quantization/quanto.md` to Korean (#32281)
fabxoe Aug 7, 2024
cba7bcf
🌐 [i18n-KO] Translated `image_feature_extraction.md` to Korean (#32239)
mreraser Aug 7, 2024
73a59a2
Fix references to model google mt5 small (#32497)
JuanFKurucz Aug 7, 2024
543df48
Docs: Fixed WhisperModel.forward’s docstring link (#32498)
Sai-Suraj-27 Aug 7, 2024
78566db
🌐 [i18n-KO] Translated `chat_templating.md` to Korean (#32362)
enchantee00 Aug 7, 2024
f5cdbf6
Fix link to autoclass_tutorial.md in i18n.md (#32501)
JuanFKurucz Aug 7, 2024
aefd3e2
Fix typo: depracted -> deprecated (#32489)
tomaarsen Aug 8, 2024
1c944ac
Fix issue #32518: Update llm_tutorial.md (#32523)
doomdagadiggiedahdah Aug 8, 2024
e28784f
Change Phi3 `_supports_sdpa` to True (#32457)
pocca2048 Aug 8, 2024
d3b3551
Uniformize kwargs for processors - GroundingDINO (#31964)
SangbumChoi Aug 8, 2024
b51d414
Fix add-new-model-like (#31773)
molbap Aug 8, 2024
16ed064
Add Qwen2-Audio (#32137)
faychu Aug 8, 2024
cc832cb
filter flash_attn optional imports loading remote code (#30954)
eaidova Aug 8, 2024
43f3fe8
🌐 [i18n-KO] Translated `ko-llm_tutorial_optimization.md` to Korean (#…
010kim Aug 8, 2024
96ba7f0
🌐 [i18n-KO] Translated `trainer.md` to Korean (#32260)
cjfghk5697 Aug 8, 2024
e0396bd
🌐 [i18n-KO] Translated `eetq.md` to Korean (#32352)
jun048098 Aug 8, 2024
496207a
🌐 [i18n-KO] Translated `fsdp.md` to Korean (#32261)
win2dvp21 Aug 8, 2024
b01f9c4
🌐 [i18n-KO] Translated `bitsandbytes.md` to Korean (#32408)
SeungAhSon Aug 8, 2024
0442816
Fix generate with `inputs_embeds` as input (#32493)
molbap Aug 8, 2024
0164560
Fixed test `test_static_cache_exportability` with torch 2.4.0 (#32516)
guangy10 Aug 8, 2024
54ac39c
Fix code example to load bigcode starcoder2 7b (#32474)
JuanFKurucz Aug 8, 2024
85817d9
[docs] Translation guide (#32547)
stevhliu Aug 8, 2024
838d141
Gemma2: fix FA2 generation (#32553)
zucchini-nlp Aug 9, 2024
7728b78
Fix a bug in Qwen2Audio (#32552)
faychu Aug 9, 2024
e4522fe
fix slow integration gemma2 test (#32534)
ArthurZucker Aug 9, 2024
e7f4ace
fix non contiguous tensor value error in save_pretrained (#32422)
congcongke Aug 9, 2024
48101cf
🌐 [i18n-KO] Translated `agent.md` to Korean (#32351)
Jwaminju Aug 9, 2024
7c11491
Add new model (#32615)
younesbelkada Aug 12, 2024
8f2b6d5
Fix: FA2 with packed training (#32487)
zucchini-nlp Aug 12, 2024
342e3f9
Fix sliding window attention used in Gemma2FlashAttention2 (#32522)
brcps12 Aug 12, 2024
bd251e4
fix: Fixed conditional check for `encodec` model names (#32581)
Sai-Suraj-27 Aug 12, 2024
e31a7a2
Fix `.push_to_hub(..., create_pr=True, revision="my-branch")` when cr…
Wauplin Aug 12, 2024
50837f2
Bump aiohttp from 3.9.4 to 3.10.2 in /examples/research_projects/deci…
dependabot[bot] Aug 12, 2024
8a3c55e
Bump torch from 1.13.1 to 2.2.0 in /examples/research_projects/visual…
dependabot[bot] Aug 12, 2024
b7ea171
Cleanup tool calling documentation and rename doc (#32337)
Rocketknight1 Aug 12, 2024
4996990
🌐 [i18n-KO] Translated `deepspeed.md` to Korean (#32431)
4N3MONE Aug 12, 2024
7f777ab
🌐 [i18n-KO] Translated `awq.md`to Korean (#32324)
ahnjj Aug 12, 2024
ce4b288
fix: Fixed failing `test_find_base_model_checkpoint` (#32638)
Sai-Suraj-27 Aug 12, 2024
126cbdb
Bump tensorflow from 2.11.1 to 2.12.1 in /examples/research_projects/…
dependabot[bot] Aug 12, 2024
f1c8542
"to be not" -> "not to be" (#32636)
qgallouedec Aug 12, 2024
2a5a6ad
fix: Updated the `is_torch_mps_available()` function to include `min_…
Sai-Suraj-27 Aug 12, 2024
a29eabd
Expand inputs in processors for VLMs (#30962)
zucchini-nlp Aug 13, 2024
29c3a0f
Automatically add `transformers` tag to the modelcard (#32623)
LysandreJik Aug 13, 2024
a5a8291
Fix tests (#32649)
molbap Aug 13, 2024
b5016d5
fix tensors on different devices in `WhisperGenerationMixin` (#32316)
faaany Aug 13, 2024
481e156
Add support for GrokAdamW optimizer (#32521)
ehartford Aug 13, 2024
cc25757
Add Depth Anything V2 Metric models (#32126)
bt2513 Aug 13, 2024
c3cd9d8
Fix: Fixed directory path for utils folder in `test_tokenization_util…
Sai-Suraj-27 Aug 13, 2024
5bcbdff
Modify ProcessorTesterMixin for better generalization (#32637)
yonigozlan Aug 13, 2024
9d2ab88
TF_Deberta supporting mixed precision (#32618)
pinesnow72 Aug 13, 2024
c135783
Fix tests recurrent (#32651)
molbap Aug 13, 2024
a22ff36
Support MUSA (Moore Threads GPU) backend in transformers (#31913)
fmo-mt Aug 14, 2024
df32347
fix: Fixed failing tests in `tests/utils/test_add_new_model_like.py` …
Sai-Suraj-27 Aug 14, 2024
9485289
Update translation docs review (#32662)
stevhliu Aug 14, 2024
78d78cd
Add TorchAOHfQuantizer (#32306)
jerryzh168 Aug 14, 2024
20a0449
Fix `JetMoeIntegrationTest` (#32332)
ydshieh Aug 14, 2024
6577c77
Update the distributed CPU training on Kubernetes documentation (#32669)
dmsuehir Aug 14, 2024
95a7781
fix: Fixed unknown pytest config option `doctest_glob` (#32475)
Sai-Suraj-27 Aug 14, 2024
0cea208
Unpin deepspeed in Docker image/tests (#32572)
muellerzr Aug 14, 2024
8820fe8
Updated workflows to the latest versions (#32405)
Sai-Suraj-27 Aug 14, 2024
e840127
reopen: llava-next fails to consider padding_side during Training (#3…
jp1924 Aug 15, 2024
ab7e893
fix: Corrected ` falcon-mamba-7b` model checkpoint name (#32837)
Sai-Suraj-27 Aug 15, 2024
d6751d9
fix: update doc link for runhouse in README.md (#32664)
muddlebee Aug 15, 2024
f3c8b18
VLMs: small clean-up for cache class (#32417)
zucchini-nlp Aug 16, 2024
c215523
add back the position ids (#32554)
ArthurZucker Aug 16, 2024
5fd7ca7
Use head_dim if in config for RoPE (#32495)
suiyoubi Aug 16, 2024
70d5df6
Generate: unify `LogitsWarper` and `LogitsProcessor` (#32626)
gante Aug 16, 2024
8f9fa3b
[tests] make test_sdpa_equivalence device-agnostic (#32520)
faaany Aug 16, 2024
cf32ee1
Cache: use `batch_size` instead of `max_batch_size` (#32657)
gante Aug 16, 2024
a27182b
Fix AutoConfig and AutoModel support for Llava-Next-Video (#32844)
TKONIY Aug 16, 2024
f20d0e8
improve _get_is_as_tensor_fns (#32596)
zrr1999 Aug 16, 2024
0b066be
Revert PR 32299, flag users when Zero-3 was missed (#32851)
muellerzr Aug 16, 2024
1c36db6
fix multi-gpu with static cache (#32543)
SunMarc Aug 16, 2024
8ec028a
Reduce the error log when using core models that need their weights r…
muellerzr Aug 16, 2024
6806d33
Make beam_constraints.Constraint.advance() docstring more accurate (#…
alex-calderwood Aug 16, 2024
52cb403
generate: missing `to` in DoLa body, causing exceptions in multi-gpu …
gante Aug 17, 2024
843e5e2
Add Flax Dinov2 (#31960)
MHRDYN7 Aug 19, 2024
8260cb3
Add Descript-Audio-Codec model (#31494)
kamilakesbi Aug 19, 2024
54b7703
support torch-speech (#32537)
itazap Aug 19, 2024
e55b33c
[tests] make `test_sdpa_can_compile_dynamic` device-agnostic (#32519)
faaany Aug 19, 2024
f1b720e
Add __repr__ for Conv1D (#32425)
AaronZLT Aug 19, 2024
8a4857c
Support save/load ckpt for XLA FSDP (#32311)
yitongh Aug 19, 2024
5f6c080
RT-DETR parameterized batchnorm freezing (#32631)
AlanBlanchet Aug 19, 2024
59e8f19
Fix incorrect vocab size retrieval in GGUF config (#32551)
Isotr0py Aug 19, 2024
93e538a
Mamba / FalconMamba: Fix mamba left padding (#32677)
younesbelkada Aug 19, 2024
61d89c1
Fix: Mamba2 generation mismatch between input_ids and inputs_embeds (…
vasqu Aug 19, 2024
3720484
Docs: Fixed `whisper-large-v2` model link in docs (#32871)
Sai-Suraj-27 Aug 19, 2024
85345bb
Add tip to clarify tool calling (#32883)
Rocketknight1 Aug 19, 2024
13e645b
Allow-head-dim (#32857)
ArthurZucker Aug 20, 2024
fd06ad5
🚨🚨🚨 Update min version of accelerate to 0.26.0 (#32627)
SunMarc Aug 20, 2024
65f4bc9
Fix repr for conv (#32897)
ArthurZucker Aug 20, 2024
01c4fc4
fix: jamba cache fails to use torch.nn.module (#32894)
xgal Aug 20, 2024
c63a3d0
Fix: Mamba2 `norm_before_gate` usage (#32686)
vasqu Aug 20, 2024
9800e6d
Bump nltk from 3.7 to 3.9 in /examples/research_projects/decision_tra…
dependabot[bot] Aug 20, 2024
078d5a8
Replace `tensor.norm()` with decomposed version for CLIP executorch e…
qubvel Aug 20, 2024
1dde50c
link for optimizer names (#32400)
nbroad1881 Aug 20, 2024
8713466
[i18n-ar] add README_ar.md to README.md (#32583)
AhmedAlmaghz Aug 20, 2024
c6d484e
fix: [whisper] don't overwrite GenerationConfig's `return_timestamps`…
hrl Aug 21, 2024
3bb7b05
Update docker image building (#32918)
ArthurZucker Aug 21, 2024
f6e2586
Jamba: update integration tests (#32250)
gante Aug 22, 2024
af638c4
fix: Added missing `huggingface_hub` installation to workflows (#32891)
Sai-Suraj-27 Aug 22, 2024
6baa6f2
fix: no need to dtype A in jamba (#32924)
xgal Aug 22, 2024
c42d264
FEAT / Trainer: Add adamw 4bit optimizer (#31865)
SunMarc Aug 22, 2024
8b94d28
CI: separate step to download nltk files (#32935)
gante Aug 22, 2024
eeea712
FIX / Hub: Also catch for `exceptions.ConnectionError` (#31469)
younesbelkada Aug 22, 2024
9282413
Add SynCode to llm_tutorial (#32884)
shubhamugare Aug 22, 2024
bf97d4a
Fix benchmark script (#32635)
ydshieh Aug 22, 2024
99d67f1
Improve greedy search memory usage (#32895)
regisss Aug 22, 2024
ee8c01f
Add chat_template for tokenizer extracted from GGUF model (#32908)
Isotr0py Aug 22, 2024
f1d822b
fix: (issue #32689) `AttributeError` raised when using `Trainer` with…
fshp971 Aug 22, 2024
975b988
Gemma2: eager attention by default (#32865)
gante Aug 22, 2024
18199b3
[run_slow] idefics2 (#32840)
andimarafioti Aug 22, 2024
273c0af
Fix regression on `Processor.save_pretrained` caused by #31691 (#32921)
leloykun Aug 22, 2024
09e6579
🌐 [i18n-KO] Translated `knowledge_distillation_for_image_classificati…
JinukHong Aug 22, 2024
a26de15
Generate: Deprecate returning legacy cache by default; Handle `use_ca…
gante Aug 22, 2024
d806fa3
docs: fix outdated link to TF32 explanation (#32947)
anakin87 Aug 22, 2024
22e6f14
Reducing memory usage: removing useless logits computation in generat…
Cyrilvallez Aug 23, 2024
970a16e
Forbid `PretrainedConfig` from saving `generate` parameters; Update d…
gante Aug 23, 2024
adb9117
Integrate Liger (Linkedin GPU Efficient Runtime) Kernel to Trainer (#…
JasonZhu1313 Aug 23, 2024
371b9c1
Enable some Jinja extensions and add datetime capabilities (#32684)
Rocketknight1 Aug 23, 2024
1dbd9d3
DeviceGuard added to use Deformable Attention more safely on multi-GP…
DonggeunYu Aug 23, 2024
e3a5f35
added doctring to SchedulerType class (#32898)
Arunprakash-A Aug 23, 2024
0a7af19
Update Jinja docs with new functions and general cleanup (#33097)
Rocketknight1 Aug 23, 2024
2cdc473
uniformize kwargs of Chameleon
leloykun Aug 16, 2024
43febe0
fix linter nit
leloykun Aug 16, 2024
f59ca5a
rm stride default
leloykun Aug 16, 2024
dcbfd17
add tests for chameleon processor
leloykun Aug 16, 2024
80fb7bb
fix tests
leloykun Aug 16, 2024
e57e988
fix chameleon tests
leloykun Aug 16, 2024
ed0e8aa
don't hardcode arg names
leloykun Aug 16, 2024
ae5d537
add comment on get_component
leloykun Aug 19, 2024
b252643
rm Chameleon's slow tokenizer
leloykun Aug 20, 2024
dae439c
add support for image generation and interleaved image-text generatio…
leloykun Aug 19, 2024
7607e4c
Fix issues in PR #32013
Sep 3, 2024
3 changes: 2 additions & 1 deletion .circleci/config.yml
Original file line number Diff line number Diff line change
@@ -142,6 +142,7 @@ jobs:
- run: python utils/custom_init_isort.py --check_only
- run: python utils/sort_auto_mappings.py --check_only
- run: python utils/check_doc_toc.py
- run: python utils/check_docstrings.py --check_all

check_repository_consistency:
working_directory: ~/transformers
@@ -190,4 +191,4 @@ workflows:
- check_circleci_user
- check_code_quality
- check_repository_consistency
- fetch_all_tests
- fetch_all_tests
11 changes: 6 additions & 5 deletions .circleci/create_circleci_config.py
@@ -121,11 +121,16 @@ def to_dict(self):
)

steps.append({"run": {"name": "Create `test-results` directory", "command": "mkdir test-results"}})

# Examples special case: we need to download NLTK files in advance to avoid concurrency issues
if "examples" in self.name:
steps.append({"run": {"name": "Download NLTK files", "command": """python -c "import nltk; nltk.download('punkt', quiet=True)" """}})

test_command = ""
if self.command_timeout:
test_command = f"timeout {self.command_timeout} "
# junit family xunit1 is necessary to support splitting on test name or class name with circleci split
test_command += f"python3 -m pytest -rsfE -p no:warnings -o junit_family=xunit1 --tb=short --junitxml=test-results/junit.xml -n {self.pytest_num_workers} " + " ".join(pytest_flags)
test_command += f"python3 -m pytest -rsfE -p no:warnings --tb=short -o junit_family=xunit1 --junitxml=test-results/junit.xml -n {self.pytest_num_workers} " + " ".join(pytest_flags)

if self.parallelism == 1:
if self.tests_to_run is None:
@@ -185,10 +190,6 @@ def to_dict(self):
steps.append({"store_artifacts": {"path": "tests.txt"}})
steps.append({"store_artifacts": {"path": "splitted_tests.txt"}})

test_command = ""
if self.command_timeout:
test_command = f"timeout {self.command_timeout} "
test_command += f"python3 -m pytest -rsfE -p no:warnings --tb=short -o junit_family=xunit1 --junitxml=test-results/junit.xml -n {self.pytest_num_workers} " + " ".join(pytest_flags)
test_command += " $(cat splitted_tests.txt)"
if self.marker is not None:
test_command += f" -m {self.marker}"
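The hunk above reorders the pytest flags so `-o junit_family=xunit1` follows `--tb=short` in the assembled command string. A minimal sketch of how such a command is built up, following the pattern in the diff (the function name and parameters here are illustrative, not the actual `create_circleci_config.py` API):

```python
def build_test_command(command_timeout, pytest_num_workers, pytest_flags, marker=None):
    """Assemble a pytest invocation string following the pattern in the hunk above."""
    test_command = ""
    if command_timeout:
        # A hard timeout prevents a hung test from stalling the whole CI job.
        test_command = f"timeout {command_timeout} "
    # junit_family=xunit1 is needed so CircleCI can split tests by name or class.
    test_command += (
        "python3 -m pytest -rsfE -p no:warnings --tb=short "
        "-o junit_family=xunit1 --junitxml=test-results/junit.xml "
        f"-n {pytest_num_workers} " + " ".join(pytest_flags)
    )
    test_command += " $(cat splitted_tests.txt)"
    if marker is not None:
        test_command += f" -m {marker}"
    return test_command

cmd = build_test_command(120, 8, ["--durations=10"], marker="slow")
print(cmd)
```

Keeping the assembly in one place means the parallelism and marker branches shown later in the diff only append to a single base string.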
17 changes: 14 additions & 3 deletions .github/ISSUE_TEMPLATE/bug-report.yml
@@ -1,6 +1,17 @@
name: "\U0001F41B Bug Report"
description: Submit a bug report to help us improve transformers
labels: [ "bug" ]
body:
- type: markdown
attributes:
value: |
Thanks for taking the time to fill out this bug report! 🤗

Before you submit your bug report:

- If it is your first time submitting, be sure to check our [bug report guidelines](https://github.com/huggingface/transformers/blob/main/CONTRIBUTING.md#did-you-find-a-bug)
- Try our [docs bot](https://huggingface.co/spaces/huggingchat/hf-docs-chat) -- it might be able to help you with your issue

- type: textarea
id: system-info
attributes:
@@ -25,7 +36,7 @@ body:

Models:

- text models: @ArthurZucker
- text models: @ArthurZucker
- vision models: @amyeroberts
- speech models: @sanchit-gandhi
- graph models: @clefourrier
@@ -38,9 +49,9 @@ body:
- tensorflow: @gante and @Rocketknight1
- tokenizers: @ArthurZucker
- trainer: @muellerzr @SunMarc

Integrations:

- deepspeed: HF Trainer/Accelerate: @muellerzr
- ray/raytune: @richardliaw, @amogkam
- Big Model Inference: @SunMarc
2 changes: 1 addition & 1 deletion .github/ISSUE_TEMPLATE/i18n.md
@@ -34,7 +34,7 @@ Some notes:

## Tutorial section
- [ ] [pipeline_tutorial.md](https://github.com/huggingface/transformers/blob/main/docs/source/en/pipeline_tutorial.md)
- [ ] [autoclass_tutorial.md](https://github.com/huggingface/transformers/blob/master/docs/source/autoclass_tutorial.md)
- [ ] [autoclass_tutorial.md](https://github.com/huggingface/transformers/blob/main/docs/source/en/autoclass_tutorial.md)
- [ ] [preprocessing.md](https://github.com/huggingface/transformers/blob/main/docs/source/en/preprocessing.md)
- [ ] [training.md](https://github.com/huggingface/transformers/blob/main/docs/source/en/training.md)
- [ ] [accelerate.md](https://github.com/huggingface/transformers/blob/main/docs/source/en/accelerate.md)
4 changes: 2 additions & 2 deletions .github/PULL_REQUEST_TEMPLATE.md
@@ -58,9 +58,9 @@ Integrations:
- deepspeed: HF Trainer/Accelerate: @muellerzr
- ray/raytune: @richardliaw, @amogkam
- Big Model Inference: @SunMarc
- quantization (bitsandbytes, autogpt): @SunMarc
- quantization (bitsandbytes, autogpt): @SunMarc

Documentation: @stevhliu and @MKhalusova
Documentation: @stevhliu

HF projects:

2 changes: 1 addition & 1 deletion .github/workflows/add-model-like.yml
@@ -23,7 +23,7 @@ jobs:
sudo apt -y update && sudo apt install -y libsndfile1-dev

- name: Load cached virtual environment
uses: actions/cache@v2
uses: actions/cache@v4
id: cache
with:
path: ~/venv/
4 changes: 2 additions & 2 deletions .github/workflows/benchmark.yml
@@ -31,12 +31,12 @@ jobs:
if: github.event_name == 'schedule'
working-directory: /transformers
run: |
python3 -m pip install optimum-benchmark>=0.2.0
python3 -m pip install optimum-benchmark>=0.3.0
HF_TOKEN=${{ secrets.TRANSFORMERS_BENCHMARK_TOKEN }} python3 benchmark/benchmark.py --repo_id hf-internal-testing/benchmark_results --path_in_repo $(date +'%Y-%m-%d') --config-dir benchmark/config --config-name generation --commit=${{ github.sha }} backend.model=google/gemma-2b backend.cache_implementation=null,static backend.torch_compile=false,true --multirun

- name: Benchmark (merged to main event)
if: github.event_name == 'push' && github.ref_name == 'main'
working-directory: /transformers
run: |
python3 -m pip install optimum-benchmark>=0.2.0
python3 -m pip install optimum-benchmark>=0.3.0
HF_TOKEN=${{ secrets.TRANSFORMERS_BENCHMARK_TOKEN }} python3 benchmark/benchmark.py --repo_id hf-internal-testing/benchmark_results_merge_event --path_in_repo $(date +'%Y-%m-%d') --config-dir benchmark/config --config-name generation --commit=${{ github.sha }} backend.model=google/gemma-2b backend.cache_implementation=null,static backend.torch_compile=false,true --multirun
19 changes: 16 additions & 3 deletions .github/workflows/build-ci-docker-images.yml
@@ -27,10 +27,10 @@ jobs:
strategy:
matrix:
file: ["quality", "consistency", "custom-tokenizers", "torch-light", "tf-light", "exotic-models", "torch-tf-light", "torch-jax-light", "jax-light", "examples-torch", "examples-tf"]
continue-on-error: true
continue-on-error: true

steps:
-
-
name: Set tag
run: |
if ${{contains(github.event.head_commit.message, '[build-ci-image]')}}; then
@@ -61,4 +61,17 @@ jobs:
REF=${{ github.sha }}
file: "./docker/${{ matrix.file }}.dockerfile"
push: ${{ contains(github.event.head_commit.message, 'ci-image]') || github.event_name == 'schedule' }}
tags: ${{ env.TAG }}
tags: ${{ env.TAG }}

notify:
runs-on: ubuntu-22.04
if: ${{ contains(github.event.head_commit.message, '[build-ci-image]') || contains(github.event.head_commit.message, '[push-ci-image]') && '!cancelled()' || github.event_name == 'schedule' }}
steps:
- name: Post to Slack
if: ${{ contains(github.event.head_commit.message, '[push-ci-image]') && github.event_name != 'schedule' }}
uses: huggingface/hf-workflows/.github/actions/post-slack@main
with:
slack_channel: "#transformers-ci-circleci-images"
title: 🤗 New docker images for CircleCI are pushed.
status: ${{ job.status }}
slack_token: ${{ secrets.SLACK_CIFEEDBACK_BOT_TOKEN }}
2 changes: 1 addition & 1 deletion .github/workflows/check_tiny_models.yml
@@ -23,7 +23,7 @@ jobs:

- uses: actions/checkout@v4
- name: Set up Python 3.8
uses: actions/setup-python@v4
uses: actions/setup-python@v5
with:
# Semantic version range syntax or exact version of a Python version
python-version: '3.8'
2 changes: 1 addition & 1 deletion .github/workflows/release-conda.yml
@@ -19,7 +19,7 @@ jobs:

steps:
- name: Checkout repository
uses: actions/checkout@v1
uses: actions/checkout@v4

- name: Install miniconda
uses: conda-incubator/setup-miniconda@v2
2 changes: 1 addition & 1 deletion .github/workflows/self-pr-slow-ci.yml
@@ -4,7 +4,7 @@ on:
pull_request:
paths:
- "src/transformers/models/*/modeling_*.py"
- "tests/models/*/test_*.py"
- "tests/**/test_*.py"

concurrency:
group: ${{ github.workflow }}-${{ github.head_ref || github.run_id }}
1 change: 1 addition & 0 deletions .github/workflows/self-push-amd.yml
@@ -324,6 +324,7 @@ jobs:
# We pass `needs.setup_gpu.outputs.matrix` as the argument. A processing in `notification_service.py` to change
# `models/bert` to `models_bert` is required, as the artifact names use `_` instead of `/`.
run: |
pip install huggingface_hub
pip install slack_sdk
pip show slack_sdk
python utils/notification_service.py "${{ needs.setup_gpu.outputs.matrix }}"
3 changes: 2 additions & 1 deletion .github/workflows/self-push.yml
@@ -563,6 +563,7 @@ jobs:
# We pass `needs.setup.outputs.matrix` as the argument. A processing in `notification_service.py` to change
# `models/bert` to `models_bert` is required, as the artifact names use `_` instead of `/`.
run: |
pip install slack_sdk
pip install huggingface_hub
pip install slack_sdk
pip show slack_sdk
python utils/notification_service.py "${{ needs.setup.outputs.matrix }}"
1 change: 1 addition & 0 deletions .github/workflows/self-scheduled-amd.yml
@@ -506,6 +506,7 @@ jobs:
# `models/bert` to `models_bert` is required, as the artifact names use `_` instead of `/`.
run: |
sudo apt-get install -y curl
pip install huggingface_hub
pip install slack_sdk
pip show slack_sdk
python utils/notification_service.py "${{ needs.setup.outputs.matrix }}"
2 changes: 1 addition & 1 deletion .github/workflows/stale.yml
@@ -15,7 +15,7 @@ jobs:
- uses: actions/checkout@v4

- name: Setup Python
uses: actions/setup-python@v4
uses: actions/setup-python@v5
with:
python-version: 3.8

23 changes: 6 additions & 17 deletions .github/workflows/trufflehog.yml
@@ -10,20 +10,9 @@ jobs:
trufflehog:
runs-on: ubuntu-latest
steps:
- shell: bash
run: |
if [ "${{ github.event_name }}" == "push" ]; then
echo "depth=$(($(jq length <<< '${{ toJson(github.event.commits) }}') + 2))" >> $GITHUB_ENV
echo "branch=${{ github.ref_name }}" >> $GITHUB_ENV
fi
if [ "${{ github.event_name }}" == "pull_request" ]; then
echo "depth=$((${{ github.event.pull_request.commits }}+2))" >> $GITHUB_ENV
echo "branch=${{ github.event.pull_request.head.ref }}" >> $GITHUB_ENV
fi
- name: Checkout code
uses: actions/checkout@v4
with:
ref: ${{env.branch}}
fetch-depth: ${{env.depth}}
- name: Secret Scanning
uses: trufflesecurity/trufflehog@main
- name: Checkout code
uses: actions/checkout@v4
with:
fetch-depth: 0
- name: Secret Scanning
uses: trufflesecurity/trufflehog@main
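The simplification above drops the step that derived a shallow-clone depth from the event payload and instead checks out the full history with `fetch-depth: 0`, so TruffleHog can always scan every commit. For reference, a minimal sketch of the arithmetic the removed step performed (function name is illustrative, not part of the workflow):

```python
# Sketch of the removed step's depth computation: it exported
# depth = (number of commits in the triggering event) + 2 for
# actions/checkout, instead of fetching the full history as the
# new `fetch-depth: 0` checkout does.
def shallow_depth(event_commits):
    """Clone depth the old workflow would have used for this event."""
    return len(event_commits) + 2

# A push event carrying three commits would have cloned with depth 5.
print(shallow_depth([{"id": "a1"}, {"id": "b2"}, {"id": "c3"}]))  # 5
```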
15 changes: 9 additions & 6 deletions CONTRIBUTING.md
@@ -61,7 +61,10 @@ feedback.
The 🤗 Transformers library is robust and reliable thanks to users who report the problems they encounter.

Before you report an issue, we would really appreciate it if you could **make sure the bug was not
already reported** (use the search bar on GitHub under Issues). Your issue should also be related to bugs in the library itself, and not your code. If you're unsure whether the bug is in your code or the library, please ask in the [forum](https://discuss.huggingface.co/) first. This helps us respond quicker to fixing issues related to the library versus general questions.
already reported** (use the search bar on GitHub under Issues). Your issue should also be related to bugs in the library itself, and not your code. If you're unsure whether the bug is in your code or the library, please ask in the [forum](https://discuss.huggingface.co/) or on our [discord](https://discord.com/invite/hugging-face-879548962464493619) first. This helps us respond quicker to fixing issues related to the library versus general questions.

> [!TIP]
> We have a [docs bot](https://huggingface.co/spaces/huggingchat/hf-docs-chat), and we highly encourage you to ask all your questions there. There is always a chance your bug can be fixed with a simple flag 👾🔫

Once you've confirmed the bug hasn't already been reported, please include the following information in your issue so we can quickly resolve it:

@@ -129,7 +132,7 @@ You will need basic `git` proficiency to contribute to
manual. Type `git --help` in a shell and enjoy! If you prefer books, [Pro
Git](https://git-scm.com/book/en/v2) is a very good reference.

You'll need **[Python 3.8](https://github.com/huggingface/transformers/blob/main/setup.py#L426)** or above to contribute to 🤗 Transformers. Follow the steps below to start contributing:
You'll need **[Python 3.8](https://github.com/huggingface/transformers/blob/main/setup.py#L449)** or above to contribute to 🤗 Transformers. Follow the steps below to start contributing:

1. Fork the [repository](https://github.com/huggingface/transformers) by
clicking on the **[Fork](https://github.com/huggingface/transformers/fork)** button on the repository's page. This creates a copy of the code
@@ -160,7 +163,7 @@ You'll need **[Python 3.8](https://github.com/huggingface/transformers/blob/main
If 🤗 Transformers was already installed in the virtual environment, remove
it with `pip uninstall transformers` before reinstalling it in editable
mode with the `-e` flag.

Depending on your OS, and since the number of optional dependencies of Transformers is growing, you might get a
failure with this command. If that's the case make sure to install the Deep Learning framework you are working with
(PyTorch, TensorFlow and/or Flax) then do:
@@ -219,7 +222,7 @@ You'll need **[Python 3.8](https://github.com/huggingface/transformers/blob/main

If you're modifying documents under the `docs/source` directory, make sure the documentation can still be built. This check will also run in the CI when you open a pull request. To run a local check
make sure you install the documentation builder:

```bash
pip install ".[docs]"
```
@@ -338,12 +341,12 @@ RUN_SLOW=yes python -m pytest -n auto --dist=loadfile -s -v ./tests/models/my_ne
RUN_SLOW=yes python -m pytest -n auto --dist=loadfile -s -v ./examples/pytorch/text-classification
```

Like the slow tests, there are other environment variables available which not enabled by default during testing:
Like the slow tests, there are other environment variables available which are not enabled by default during testing:
- `RUN_CUSTOM_TOKENIZERS`: Enables tests for custom tokenizers.
- `RUN_PT_FLAX_CROSS_TESTS`: Enables tests for PyTorch + Flax integration.
- `RUN_PT_TF_CROSS_TESTS`: Enables tests for TensorFlow + PyTorch integration.

More environment variables and additional information can be found in the [testing_utils.py](src/transformers/testing_utils.py).
More environment variables and additional information can be found in the [testing_utils.py](https://github.com/huggingface/transformers/blob/main/src/transformers/testing_utils.py).

🤗 Transformers uses `pytest` as a test runner only. It doesn't use any
`pytest`-specific features in the test suite itself.
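The opt-in variables listed in the CONTRIBUTING hunk above gate whole test suites that are skipped by default. A minimal sketch of how such a gate behaves (the helper name is invented for illustration and is not transformers' actual test code):

```python
# Minimal sketch of an opt-in test gate like RUN_CUSTOM_TOKENIZERS:
# the gated suite runs only when the variable is set to "yes".
import os

def custom_tokenizers_enabled(environ=None):
    """Return True when the opt-in variable is set to "yes"."""
    environ = os.environ if environ is None else environ
    return environ.get("RUN_CUSTOM_TOKENIZERS", "") == "yes"

print(custom_tokenizers_enabled({"RUN_CUSTOM_TOKENIZERS": "yes"}))  # True
print(custom_tokenizers_enabled({}))  # False
```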
1 change: 1 addition & 0 deletions Makefile
@@ -56,6 +56,7 @@ quality:
python utils/custom_init_isort.py --check_only
python utils/sort_auto_mappings.py --check_only
python utils/check_doc_toc.py
python utils/check_docstrings.py --check_all


# Format source code automatically and check is there are any problems left that need manual fixing
1 change: 1 addition & 0 deletions README.md
@@ -48,6 +48,7 @@ limitations under the License.
<a href="https://github.com/huggingface/transformers/blob/main/i18n/README_fr.md">Français</a> |
<a href="https://github.com/huggingface/transformers/blob/main/i18n/README_de.md">Deutsch</a> |
<a href="https://github.com/huggingface/transformers/blob/main/i18n/README_vi.md">Tiếng Việt</a> |
<a href="https://github.com/huggingface/transformers/blob/main/i18n/README_ar.md">العربية</a> |
</p>
</h4>

2 changes: 1 addition & 1 deletion benchmark/benchmark.py
@@ -101,7 +101,7 @@ def summarize(run_dir, metrics, expand_metrics=False):
# post-processing of report: show a few selected/important metric
for metric in metrics:
keys = metric.split(".")
value = report
value = report.to_dict()
current = metrics_values
for key in keys:
# Avoid KeyError when a user's specified metric has typo.
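The one-line benchmark fix above converts the report object to a plain dict before the dotted-key walk begins. A hedged sketch of the traversal that conversion enables (data and the helper name are illustrative, not the benchmark script's exact code):

```python
# Sketch of the dotted-key traversal in `summarize`: each metric name
# such as "forward.latency.mean" is split on "." and walked key by key,
# which only works on a plain mapping -- hence the `.to_dict()` call.
def lookup(report_dict, metric):
    value = report_dict
    for key in metric.split("."):
        value = value[key]  # a KeyError here would mean a typo in `metric`
    return value

report_dict = {"forward": {"latency": {"mean": 0.12}}}
print(lookup(report_dict, "forward.latency.mean"))  # 0.12
```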
11 changes: 6 additions & 5 deletions docker/consistency.dockerfile
@@ -2,14 +2,15 @@ FROM python:3.10-slim
ENV PYTHONDONTWRITEBYTECODE=1
USER root
ARG REF=main
RUN apt-get update && apt-get install -y time git pkg-config make git-lfs
RUN apt-get update && apt-get install -y time git g++ pkg-config make git-lfs
ENV UV_PYTHON=/usr/local/bin/python
RUN pip install uv && uv venv && uv pip install --no-cache-dir -U pip setuptools GitPython
RUN uv pip install --no-cache-dir --upgrade 'torch' --index-url https://download.pytorch.org/whl/cpu
RUN uv pip install --no-cache-dir tensorflow-cpu tf-keras
RUN uv pip install --no-cache-dir "git+https://github.com/huggingface/transformers.git@${REF}#egg=transformers[flax,quality,vision,testing]"
RUN pip install --no-cache-dir --upgrade 'torch' 'torchaudio' --index-url https://download.pytorch.org/whl/cpu
# tensorflow pin matching setup.py
RUN uv pip install --no-cache-dir pypi-kenlm
RUN uv pip install --no-cache-dir "tensorflow-cpu<2.16" "tf-keras<2.16"
RUN uv pip install --no-cache-dir "git+https://github.com/huggingface/transformers.git@${REF}#egg=transformers[flax,quality,testing,torch-speech,vision]"
RUN git lfs install

RUN pip uninstall -y transformers
RUN apt-get clean && rm -rf /var/lib/apt/lists/* && apt-get autoremove && apt-get autoclean

2 changes: 1 addition & 1 deletion docker/transformers-all-latest-gpu/Dockerfile
@@ -9,7 +9,7 @@ SHELL ["sh", "-lc"]
# The following `ARG` are mainly used to specify the versions explicitly & directly in this docker file, and not meant
# to be used as arguments for docker build (so far).

ARG PYTORCH='2.3.0'
ARG PYTORCH='2.4.0'
# (not always a valid torch version)
ARG INTEL_TORCH_EXT='2.3.0'
# Example: `cu102`, `cu113`, etc.
@@ -42,7 +42,7 @@ RUN python3 -m pip uninstall -y deepspeed
# This has to be run (again) inside the GPU VMs running the tests.
# The installation works here, but some tests fail, if we don't pre-build deepspeed again in the VMs running the tests.
# TODO: Find out why test fail.
RUN DS_BUILD_CPU_ADAM=1 DS_BUILD_FUSED_ADAM=1 python3 -m pip install "deepspeed<=0.14.0" --global-option="build_ext" --global-option="-j8" --no-cache -v --disable-pip-version-check 2>&1
RUN DS_BUILD_CPU_ADAM=1 DS_BUILD_FUSED_ADAM=1 python3 -m pip install deepspeed --global-option="build_ext" --global-option="-j8" --no-cache -v --disable-pip-version-check 2>&1

# When installing in editable mode, `transformers` is not recognized as a package.
# this line must be added in order for python to be aware of transformers.
2 changes: 1 addition & 1 deletion docker/transformers-pytorch-gpu/Dockerfile
@@ -11,7 +11,7 @@ ARG REF=main
RUN git clone https://github.com/huggingface/transformers && cd transformers && git checkout $REF

# If set to nothing, will install the latest version
ARG PYTORCH='2.3.0'
ARG PYTORCH='2.4.0'
ARG TORCH_VISION=''
ARG TORCH_AUDIO=''
# Example: `cu102`, `cu113`, etc.
2 changes: 1 addition & 1 deletion docs/TRANSLATING.md
@@ -54,4 +54,4 @@ The fields you should add are `local` (with the name of the file containing the

Once you have translated the `_toctree.yml` file, you can start translating the [MDX](https://mdxjs.com/) files associated with your docs chapter.

> 🙋 If you'd like others to help you with the translation, you should [open an issue](https://github.com/huggingface/transformers/issues) and tag @stevhliu and @MKhalusova.
> 🙋 If you'd like others to help you with the translation, you should [open an issue](https://github.com/huggingface/transformers/issues) and tag @stevhliu.