Merged
Changes from all commits (824 commits)
4668ef1
Update notification service MI325 (#40078)
ivarflakstad Aug 12, 2025
3ff2e98
Fix PerceptionLM image preprocessing for non-tiled image input. (#40006)
shuminghu Aug 12, 2025
86bb1fc
Revert FA2 kwargs construction (#40029)
zucchini-nlp Aug 12, 2025
c6fbfab
[fix] batch inference for llava_onevision (#40021)
cyr0930 Aug 12, 2025
913c0a8
[docs] Zero Shot Object Detection Task (#40096)
ariG23498 Aug 12, 2025
1c5e17c
Update Glm4V processor and add tests (#39988)
zucchini-nlp Aug 12, 2025
f6b6e17
Add glm4.5&&glm4.5V doc (#40095)
lambertwjh Aug 12, 2025
4b3a1a6
Causal loss for `ForConditionalGeneration` (#39973)
qgallouedec Aug 12, 2025
ab455e0
Audio encodings now match conv2d weight dtype in Gemma3nAudioSSCPConv…
Malav-P Aug 12, 2025
41d1717
New DynamicSlidingWindowLayer & associated Cache (#40039)
Cyrilvallez Aug 12, 2025
952fac1
Enable SIM rules (#39806)
cyyever Aug 12, 2025
a07b5e9
feat: add `is_fast` to ImageProcessor (#39603)
MilkClouds Aug 12, 2025
b1b4655
Re-apply make style (#40106)
Cyrilvallez Aug 12, 2025
35dc888
Replace `logger.warning` with `logger.warning_once` in `GradientCheck…
qgallouedec Aug 12, 2025
f7cbd5f
Fix regression in mllama vision encoder (#40083)
Isotr0py Aug 12, 2025
2ce0dae
Switch the order of args in StaticCache (for BC and future logic) (#4…
Cyrilvallez Aug 12, 2025
085e023
Fix Qwen3 MoE GGUF architecture mismatch (#39976)
ctcanbol Aug 12, 2025
a5fac1c
Fix error on importing unavailable torch.distributed (#40038)
m-gallus Aug 12, 2025
b6ba595
Default to dequantize if cpu in device_map for mxfp4 (#39993)
MekkCyber Aug 12, 2025
9977cf1
[`Flash Attention`] Fix flash attention integration (#40002)
vasqu Aug 12, 2025
83dbebc
[trainer] ensure special tokens in model configs are aligned with tok…
gante Aug 12, 2025
0ce24f5
Fix Causality Handling in Flash Attention to Support Bidirectional At…
lucaswychan Aug 12, 2025
e5e73e4
[docs] Add reference to HF-maintained `custom_generate` collections (…
gante Aug 12, 2025
a1a4fcd
Add model card for MobileViT (#40033)
Shivamjan Aug 12, 2025
31ab716
remove sequence parallel in llama4 (#40084)
3outeille Aug 12, 2025
85d536a
🌐 [i18n-KO] Translated `tiny_agents.md` to Korean (#39913)
AhnJoonSung Aug 13, 2025
849c377
[bugfix] Fix tensor device in Idefics2, Idefics3, and SmolVLM (#39975)
qgallouedec Aug 13, 2025
060b86e
changed xLSTMRMSNorm to RMSNorm (#40113)
nikitazuevblago Aug 13, 2025
34a1fc6
Fix QuantoQuantizedCache import issues (#40109)
manueldeprada Aug 13, 2025
8d19231
[serve] allow array `content` inputs for LLMs (#39829)
gante Aug 13, 2025
e78571f
`decoding_method` argument in generate (#40085)
manueldeprada Aug 13, 2025
ebceef3
Collated reports (#40080)
ivarflakstad Aug 13, 2025
8ef5cd6
DOCS: Add missing space in SECURITY.md (#40087)
shivaheidari Aug 13, 2025
11537c3
[trainer] handle case where EOS token is None in `generation_config` …
gante Aug 13, 2025
f445cae
Fix hidden torchvision>=0.15 dependency issue (#39928)
yonigozlan Aug 13, 2025
4868445
🌐 [i18n-KO] Translated `main_classes/processors.md` to Korean (#39519)
TaskerJang Aug 13, 2025
9e21e50
🌐 [i18n-KO] Translated `jamba.md` to Korean (#39890)
skwh54 Aug 13, 2025
e4223fa
🌐 [i18n-KO] Translated `main_classes/optimizer_schedules.md` to Korea…
luckyvickyricky Aug 13, 2025
5337f30
🚨🚨 [generate] ignore `cache_implementation="hybrid"` hub defaults (#…
gante Aug 13, 2025
ac52c77
🌐 [i18n-KO] Translated `gpt2.md` to Korean (#39808)
taemincode Aug 13, 2025
127e33f
🌐 [i18n-KO] Translated `optimizers.md` to Korean (#40011)
chelsseeey Aug 13, 2025
6b728f1
🌐 [i18n-KO] Translated grounding-dino.md to Korean (#39861)
TaskerJang Aug 13, 2025
20c6b47
🚨 Use lru_cache for sine pos embeddings MaskFormer (#40007)
yonigozlan Aug 13, 2025
ab91085
🌐 [i18n-KO] Translated `pipelines.md` to Korean (#39577)
xhaktm00 Aug 13, 2025
bec6926
gpt oss is important (#40139)
ArthurZucker Aug 13, 2025
25ad9c8
Fix Janus (#40140)
Cyrilvallez Aug 13, 2025
68a13cd
Add Segment Anything 2 (SAM2) (#32317)
SangbumChoi Aug 13, 2025
eb5768a
[docs] Fix ko toctree (#40138)
stevhliu Aug 13, 2025
412c9c3
Remove an old badly designed test (#40142)
Cyrilvallez Aug 13, 2025
0f9c259
updated visualBERT modelcard (#40057)
Anil-Red Aug 13, 2025
e651ae0
🌐 [i18n-KO] Translated `gemma3.md` to Korean (#39865)
seopp Aug 13, 2025
12e49cd
Fix quantized cache with only cache_implementation in generate (#40144)
Cyrilvallez Aug 13, 2025
591708d
Add pytest marker: `torch_compile_test` and `torch_export_test` (#39950)
ydshieh Aug 13, 2025
be1ab51
Update Dockerfiles to install packages inside a virtual environment (…
Sai-Suraj-27 Aug 13, 2025
e446372
Create self-scheduled-amd-mi355-caller.yml (#40134)
glegendre01 Aug 13, 2025
252364f
[Cohere2Vision] remove unused arg (#40103)
zucchini-nlp Aug 14, 2025
22e89e5
[efficientloftr] fix bugs and follow original cross attn implementati…
sbucaille Aug 14, 2025
c47544b
Fix CI: Use correct import in SAM for torchvision InterpolationMode (…
manueldeprada Aug 14, 2025
cfe52ff
[Continous Batching] set head_dim when config.head_dim is None (#40159)
kashif Aug 14, 2025
1c5d2f7
Replace `self.tokenizer` by `self.processing_class` (#40119)
qgallouedec Aug 14, 2025
eba1d62
[FA2] Fix it finally - revert fa kwargs preparation (#40161)
Cyrilvallez Aug 14, 2025
41980ce
[bugfix] fix flash-attention2 unavailable error for Ascend NPU (#40151)
FightingZhen Aug 14, 2025
6f259bc
Fix docs typo (#40167)
qubvel Aug 14, 2025
b834cb8
build: Add fast image processor tvp (#39529)
adutchengineer Aug 14, 2025
2b6cbed
Add GptOssForSequenceClassification for GPT-OSS models (#40043)
zyfedward Aug 14, 2025
8a658ac
Standardize BARTpho model card: badges, new examples, fixed broken im…
eshwanthkartitr Aug 14, 2025
b02f2d8
Add dates to the model docs (#39320)
MHRDYN7 Aug 14, 2025
31b6e6e
Pin torch to 2.7.1 on CircleCI for now (#40174)
ydshieh Aug 14, 2025
52c6c1b
Update dynamic attnt setter for multimodals (#39908)
zucchini-nlp Aug 14, 2025
85fce2e
[MINOR:TYPO] Update base.py (#40169)
cakiki Aug 15, 2025
cc99978
make model doc device agnostic (#40143)
yao-matrix Aug 15, 2025
4912d5b
fix to avoid modifying a view in place (#40162)
3outeille Aug 15, 2025
4211756
Fix fsdp for generic-task models (#40191)
Cyrilvallez Aug 15, 2025
5068fcd
Add repr to EncoderDecoderCache (#40195)
Cyrilvallez Aug 15, 2025
c167faa
Fix typos (#40175)
cyyever Aug 15, 2025
c7afaa5
Remove _prepare_flash_attention_from_position_ids (#40069)
cyyever Aug 15, 2025
ec85d2c
Avoid CUDA stream sync (#40060)
cyyever Aug 15, 2025
28a03fb
Fix various Pylint warnings (#40107)
cyyever Aug 15, 2025
de437d0
Update: add type hints to check_tokenizers.py (#40094)
ajeet214 Aug 15, 2025
29e4e35
Benchmarking improvements (#39768)
ahadnagy Aug 15, 2025
3f4c85f
Add X-Codec model (#38248)
Manalelaidouni Aug 15, 2025
05000ae
Fix GPT-OSS `swiglu_limit` not passed in for MXFP4 (#40197)
danielhanchen Aug 15, 2025
cd22550
docs: Update LayoutLM model card according to new standardized format…
Jin-HoMLee Aug 15, 2025
2914cec
Revert "Pin torch to 2.7.1 on CircleCI for now" + Final fix for `too …
ydshieh Aug 18, 2025
6ce8f05
Use correct `model_input_names` for PixtralImageProcessor (#40226)
rohitrango Aug 18, 2025
eb2f9da
fix error vocab_size at Qwen2_5_VLForConditionalGeneration loss_funct…
killight98 Aug 18, 2025
e5886f9
[SAM 2] Change checkpoints in docs and tests (#40213)
yonigozlan Aug 18, 2025
6333eb9
Fix more typos (#40212)
cyyever Aug 18, 2025
e4bd2c8
Fix ESM token_dropout crash when using inputs_embeds instead of input…
notkisk Aug 18, 2025
2fe4337
AMD scheduled CI ref env file (#40243)
ivarflakstad Aug 18, 2025
47938f8
Add Ovis2 model and processor implementation (#37088)
thisisiron Aug 18, 2025
57e230c
Fix more pylint warnings (#40204)
cyyever Aug 18, 2025
a36d51e
🚨 Always return Cache objects in modelings (to align with generate) (…
manueldeprada Aug 18, 2025
f417a1a
remove transpose_for_scores call in ESM-2 (#40210)
pstjohn Aug 18, 2025
00b4dfb
Add `chat_template` (`jinja2`) as an extra dependency (#40128)
tboerstad Aug 18, 2025
7a0ba0d
[typing] fix type annotation error in DepthPro model image processor …
MengAiDev Aug 18, 2025
d6fad86
[serve] guard imports (#39825)
gante Aug 18, 2025
aa45824
[`CI`] Fix repo consistency (#40249)
vasqu Aug 18, 2025
2bcf9f6
Fixes for EncoderDecoderCache (#40008)
remi-or Aug 18, 2025
01c03bf
fix: Catch correct ConnectionError for additional_chat_templates (#39…
akug Aug 18, 2025
a7eabf1
Model card for NLLB (#40074)
sahil-kabir Aug 18, 2025
5986220
Correct typo and update notes in docs Readme (#40234)
PavloFesenko Aug 18, 2025
e472efb
Fix benchmark workflow (#40254)
ahadnagy Aug 18, 2025
6b5bd11
docs: Update OLMo model card (#40233)
rafakatri Aug 18, 2025
debc92e
Skip broken tests (#40157)
zucchini-nlp Aug 19, 2025
28746cd
Remove MI300 CI (#40270)
ivarflakstad Aug 19, 2025
5d9a715
set inputs_embeds to None while generate to avoid audio encoder forwa…
BakerBunker Aug 19, 2025
56c4421
[detection] fix attention mask for RT-DETR-based models (#40269)
materight Aug 19, 2025
2b59207
Fix slow static cache export tests (#40261)
jackzhxng Aug 19, 2025
a2e76b9
🚨🚨 Switch default compilation to fullgraph=False (#40137)
Cyrilvallez Aug 19, 2025
2f1a8ad
Fix setting attention for multimodal models (#39984)
zucchini-nlp Aug 19, 2025
c93594e
[detection] fix correct `k_proj` weight and bias slicing in D-FINE (#…
notkisk Aug 19, 2025
5b3b7ea
Add Kosmos-2.5 (#31711)
tic-top Aug 19, 2025
57bb6db
Skipping pytree registration in case fsdp is enabled (#40075)
romitjain Aug 19, 2025
249d7c6
Update image_processing_perception_lm_fast.py to allow for proper ove…
tyleryzhu Aug 19, 2025
bebeccb
fix which routing method (#40283)
ArthurZucker Aug 19, 2025
8636b30
Fix chat CLI GPU loading and request_id validation issues (#40230) (#…
robin-ede Aug 19, 2025
bd96e1e
docs(layoutlm): add missing `id=usage` to `<hfoptions>` tag in Layout…
Jin-HoMLee Aug 19, 2025
46d3854
Standardize RAG model card (#40222)
aayush226 Aug 19, 2025
3a4b275
docs: Update TrOCR model card to new format (#40240)
AceHunterr Aug 19, 2025
92f40da
Update model card for gpt neox japanese (#39862)
ahnjj Aug 19, 2025
6ceb13f
SmolVLM and InternVL: Ensure pixel values are converted to the correc…
qgallouedec Aug 19, 2025
0f9ce43
Standardize BertGeneration model card (#40250)
nemitha2005 Aug 19, 2025
4c01746
Adjust ROCm test output expectations (#40279)
ahadnagy Aug 19, 2025
42fe769
SmolVLM test fixes (#40275)
ahadnagy Aug 19, 2025
eaa48c8
make model docs device agnostic (2) (#40256)
yao-matrix Aug 19, 2025
0f9c908
[3/3] make docs device agnostic, all en docs for existing models done…
yao-matrix Aug 20, 2025
1d46091
Add MetaCLIP 2 (#39826)
NielsRogge Aug 20, 2025
126bc03
Allow to be able to run `torch.compile` tests with `fullgraph=True` (…
ydshieh Aug 20, 2025
a4e1fee
[`FA`] Fix dtype in varlen with position ids (#40295)
vasqu Aug 20, 2025
da9452a
[docs] delete more TF/Flax docs (#40289)
gante Aug 20, 2025
d0f1a6e
Clean up X-Codec. (#40271)
ebezzam Aug 20, 2025
a5f0b50
Remove OTel SDK dependencies (#40305)
anuraaga Aug 20, 2025
a01f38b
Fix GOT-OCR2 and Cohere2Vision image processor patches caculation (#4…
Isotr0py Aug 20, 2025
ca0aaa8
[`fix`] Pass adamw optimizer parameters to StableAdamW (#40184)
emapco Aug 20, 2025
3128db6
chore: fix typo in `find_executable_batch_size` to match new 0.9 rati…
MilkClouds Aug 20, 2025
7d2aa5d
:rotating_light: [`Flash Attention`] Fix sliding window size (#40163)
vasqu Aug 20, 2025
959239d
Remove unnecessary contiguous calls for modern torch (#40315)
Rocketknight1 Aug 20, 2025
ca543f8
Add support for Florence-2 (#38188)
ducviet00 Aug 20, 2025
a97213d
Qwen2.5-Omni test fixes (#40307)
ahadnagy Aug 20, 2025
c50f140
Add back `_tp_plan` attribute (#39944)
rishub-tamirisa Aug 20, 2025
2df0c32
byebye torch 2.1 (#40317)
Rocketknight1 Aug 20, 2025
3b72301
No more `natten` (#40287)
ydshieh Aug 20, 2025
4977ec2
[`GPT OSS`] Refactor the tests as it was not properly checking the ou…
ArthurZucker Aug 20, 2025
5d90674
Update CI with nightly torch workflow file (#40306)
ydshieh Aug 20, 2025
139cd91
Fix: Apply `get_placeholder_mask` in Ovis2 (#40280)
thisisiron Aug 20, 2025
1054494
Update notification service amd_daily_ci_workflows definition (#40314)
ivarflakstad Aug 20, 2025
242bb2c
One cache class to rule them all (#40276)
Cyrilvallez Aug 20, 2025
c2e3cc2
Fix chunked attention mask with left-padding (#40324)
Cyrilvallez Aug 21, 2025
c99ed49
[docs] remove flax references from `/en/model_doc` (#40311)
gante Aug 21, 2025
022af24
Fix qwen-omni processor text only mode (#40336)
yuekaizhang Aug 21, 2025
1e2e28f
Change Qwen2RMSNorm to RMSNorm from PyTorch (#40066)
cyyever Aug 21, 2025
adf84ae
Add DeepseekV3ForSequenceClassification for Deepseek V3 models (#40200)
abdokaseb Aug 21, 2025
6ad7f29
Fix deprecation warning version (#40343)
Cyrilvallez Aug 21, 2025
7b060e5
Add missing arguments to class constructors (#40068)
cyyever Aug 21, 2025
c031f6f
[docs] remove TF references from `/en/model_doc` (#40344)
gante Aug 21, 2025
5c88d8f
Fix: Only call Trainer.align_special_tokens if model has "config" att…
tomaarsen Aug 21, 2025
e95441b
add type hints (#40319)
wirthual Aug 21, 2025
c7e6f9a
Fix an infinite loop bug in recursive search of relative imports (#40…
eladsegal Aug 21, 2025
c4513a9
Fix links in Glm4vMoe configuration classes to point to the correct H…
vvvdwbvvv Aug 21, 2025
11a49dd
T5 test and target device fixes (#40313)
ahadnagy Aug 21, 2025
7f2f534
Update `test_spm_converter_bytefallback_warning` (#40284)
ydshieh Aug 21, 2025
1e1db12
(small) fix conditional for input_ids and input_embeds in marian (#40…
Aug 21, 2025
04b751f
Fix attention vizualizer (#40285)
molbap Aug 21, 2025
75aa7c7
[ModernBert] Prevent the attention mask from being None in ModernBert…
ashmikuz Aug 21, 2025
b40b834
Clean up XCodec and other codecs (#40348)
ebezzam Aug 21, 2025
2121d09
[serve] add cors warnings (#40112)
gante Aug 21, 2025
128f42d
[detection] use consistent dtype for Conditional and DAB DETR positio…
agkphysics Aug 21, 2025
f46f29d
Remove more PyTorch 2.2 compatible code (#40337)
cyyever Aug 21, 2025
cb1df4d
[`FA`] Fix some model tests (#40350)
vasqu Aug 21, 2025
7f38068
Qwen2.5-VL test fixes for ROCm (#40308)
ahadnagy Aug 21, 2025
9568b50
[generate] handle support for cache classes when num enc layers != nu…
gante Aug 21, 2025
7c1169e
[4/N]more docs to device agnostic (#40355)
yao-matrix Aug 21, 2025
8365f70
DOCS: Clarification on the use of `label_names` as an argument to Tra…
huzaifa-jawad367 Aug 22, 2025
cf487cd
HunYuan opensource (#39606)
yjc9696 Aug 22, 2025
d7fe311
Fix idefics3 vision embeddings indices dtype (#40360)
Isotr0py Aug 22, 2025
e018b77
wav2vec2 fixes (#40341)
remi-or Aug 22, 2025
5c40e7a
Change multimodal data links to HF hub (#40309)
zucchini-nlp Aug 22, 2025
9c25820
[pipelines] add support to `skip_special_tokens` in the main text gen…
gante Aug 22, 2025
d8f6d37
⚠️⚠️ Use `dtype` instead of `torch_dtype` everywhere! (#39782)
Cyrilvallez Aug 22, 2025
19ffe02
[processor] move commonalities to mixin (#40339)
zucchini-nlp Aug 22, 2025
7db228a
[configuration] allow to overwrite kwargs from subconfigs (#40241)
zucchini-nlp Aug 22, 2025
8a6908c
fix(example): align parameter names with the latest function definiti…
developer0hye Aug 22, 2025
56d68c6
Addiing ByteDance Seed Seed-OSS (#40272)
Fazziekey Aug 22, 2025
894b2d8
Add GptOssForTokenClassification for GPT-OSS models (#40190)
abdokaseb Aug 22, 2025
0a21e87
Bug Fix: Dynamically set return_lse flag in FlexAttention (#40352)
amd-lalithnc Aug 22, 2025
dab66f1
Chat Template Doc Fixes (#40173)
Rocketknight1 Aug 22, 2025
29ddcac
Rework the Cache documentation (#40373)
Cyrilvallez Aug 22, 2025
7d88f57
Update README_zh-hans.md (#40380)
TardC Aug 22, 2025
28ca27c
HF papers in doc (#40381)
qgallouedec Aug 22, 2025
4f9b4e6
Run FA2 tests in CI (#40397)
ydshieh Aug 23, 2025
2c55c7f
Reactivate a lot of tests skipped for no reason anymore (#40378)
Cyrilvallez Aug 25, 2025
ba095d3
:broom: :broom: :broom: Get set decoder cleanup (#39509)
molbap Aug 25, 2025
14b89fe
fix to accept cumulative_seqlens from TransformersKwargs in FA (#40194)
Kurt232 Aug 25, 2025
0031c04
[docs] flax/jax purge (#40372)
gante Aug 25, 2025
a2b37bf
Fix typo: 'casual' -> 'causal' in code and documentation (#40371) (#4…
akintunero Aug 25, 2025
4029913
Fix CI (hunyuan moe does not support fullgraph) (#40423)
Cyrilvallez Aug 25, 2025
11e12a7
Fix typo: 'seperator' to 'separator' in variable names (#40389)
Prawal-Sharma Aug 25, 2025
d73181b
Fix UnboundLocalError in WER metric computation (#40402)
prxshetty Aug 25, 2025
a0a37b3
Gpt oss optim (#40304)
jiqing-feng Aug 25, 2025
3b5b9f6
Fix processing tests (#40379)
zucchini-nlp Aug 25, 2025
04c2bae
Fix label smoothing incompatibility with multi-label classification (…
avchauzov Aug 25, 2025
6bf6f84
[`Mxfp4`] Add a way to save with a quantization method (#40176)
ArthurZucker Aug 25, 2025
ea8d9c8
🚨 Remove DoLa decoding strategy (#40082)
manueldeprada Aug 25, 2025
399cd5c
Fix modular for modernbert-decoder (#40431)
Cyrilvallez Aug 25, 2025
1a35d07
Update collated reports working directory and --path (#40433)
ivarflakstad Aug 25, 2025
d8f2edc
Add `tokenizer_kwargs` argument to the text generation pipeline (#40…
Joshua-Chin Aug 25, 2025
eac4f00
Fix typo and improve GPU kernel check error message in MXFP4 quantiza…
akintunero Aug 25, 2025
1763ef2
[docs] remove last references to `transformers` TF classes/methods (#…
gante Aug 25, 2025
6b5eab7
Remove working-dir from collated reports job (#40435)
ivarflakstad Aug 25, 2025
c81723d
🌐 [i18n-KO] Translated `models.md` to Korean (#39518)
Judy-Choi Aug 25, 2025
ef40690
Gemma3 text fixes: Add expectations for MI325 (#40384)
ahadnagy Aug 25, 2025
f0e87b4
Fix collated reports model directory traversal (#40437)
ivarflakstad Aug 25, 2025
fa59cf9
Fix https://github.com/huggingface/transformers/issues/40292 (#40439)
id01 Aug 25, 2025
7637d29
Fix collated reports uploading (#40440)
ivarflakstad Aug 25, 2025
8ce633c
InternVL MI325 test expectations (#40387)
ahadnagy Aug 25, 2025
e68146f
Fix collated reports model name entry (#40441)
ivarflakstad Aug 25, 2025
922e65b
Fix non FA2 tests after FA2 installed in CI docker image (#40430)
ydshieh Aug 26, 2025
63caaea
Refactor ViT-like models (#39816)
qubvel Aug 26, 2025
6d2bb1e
[Trainer] accelerate contextparallel support in trainer (#40205)
kashif Aug 26, 2025
64ae6e6
fix qwen25-vl grad acc (#40333)
iMountTai Aug 26, 2025
f690a2a
[video processors] decode only sampled videos -> less RAM and faster …
zucchini-nlp Aug 26, 2025
32fcc24
rename get_cuda_warm_up_factor to get_accelerator_warm_up_factor (#40…
yao-matrix Aug 26, 2025
b8184b7
Make cache_config not mandatory (#40316)
remi-or Aug 26, 2025
49e168f
🚨 Remove Contrastive Search decoding strategy (#40428)
manueldeprada Aug 26, 2025
34108a2
Continuous batching refactor (#40426)
remi-or Aug 26, 2025
58cebc8
flash_paged: s_aux may not exist (#40434)
pcuenca Aug 26, 2025
263d06f
Fix extra template loading (#40455)
Rocketknight1 Aug 26, 2025
0ce6709
deci gguf support (#38669)
ved1beta Aug 26, 2025
5a8ba87
[fast_image_processor] fix image normalization for resize (#40436)
audioXD Aug 26, 2025
6451294
[RoPE] explicit factor > implicit factor in YaRN (#40320)
gante Aug 26, 2025
78f32c3
[pipeline] Add Keypoint Matching pipeline (#39970)
sbucaille Aug 26, 2025
c8c7623
Update SegFormer model card (#40417)
GSNCodes Aug 26, 2025
74ad608
Not to shock AMD team by the cancelled workflow run notification ❤️ 💖…
ydshieh Aug 26, 2025
ff8b88a
Fix nightly torch CI (#40469)
ydshieh Aug 26, 2025
bb90351
O26 sync (#1)
Guo-Chenxu Aug 27, 2025
80f4c0c
CI when PR merged to `main` (#40451)
ydshieh Aug 27, 2025
75d6f17
Validate GptOssConfig rope config after it's fully initialized (#40474)
zifeitong Aug 27, 2025
a3afebb
[modular] Use multi-processing + fix model import issue (#40481)
Cyrilvallez Aug 27, 2025
8b80431
[modular] Remove ambiguity in all calls to parent class methods + fix…
Cyrilvallez Aug 27, 2025
ed5dd29
[ESM] support attention API (#40370)
zucchini-nlp Aug 27, 2025
52aaa3f
[EfficientLoFTR] dynamic image size support (#40329)
sbucaille Aug 27, 2025
6350636
Fix `qwen2_moe` tests (#40494)
ydshieh Aug 27, 2025
3c343c6
[Whisper] Add rocm expected results to certain tests (#40482)
ivarflakstad Aug 27, 2025
304225a
Collated reports: no need to upload artifact (#40502)
ivarflakstad Aug 27, 2025
821384d
Fix the CI workflow of `merge to main` (#40503)
ydshieh Aug 27, 2025
e3d8fd7
docs(pixtral): Update Pixtral model card to new format (#40442)
BryanBradfo Aug 27, 2025
98289c5
[modular] Classes can now be defined and referenced in arbitrary orde…
Cyrilvallez Aug 27, 2025
cf2a81c
Merge branch 'main' into minicpm_o_2_6
Guo-Chenxu Aug 28, 2025
05a2d72
ruff format
Guo-Chenxu Aug 28, 2025
15 changes: 8 additions & 7 deletions .circleci/create_circleci_config.py
@@ -109,7 +109,9 @@ def __post_init__(self):
self.docker_image[0]["image"] = f"{self.docker_image[0]['image']}:dev"
print(f"Using {self.docker_image} docker image")
if self.install_steps is None:
self.install_steps = ["uv venv && uv pip install ."]
self.install_steps = ["uv pip install ."]
# Use a custom patched pytest to force exit the process at the end, to avoid `Too long with no output (exceeded 10m0s): context deadline exceeded`
self.install_steps.append("uv pip install git+https://github.com/ydshieh/pytest.git@8.4.1-ydshieh")
if self.pytest_options is None:
self.pytest_options = {}
if isinstance(self.tests_to_run, str):
@@ -213,7 +215,7 @@ def job_name(self):
docker_image=[{"image": "huggingface/transformers-torch-light"}],
# networkx==3.3 (after #36957) cause some issues
# TODO: remove this once it works directly
install_steps=["uv venv && uv pip install ."],
install_steps=["uv pip install ."],
marker="generate",
parallelism=6,
)
@@ -250,7 +252,7 @@ def job_name(self):
additional_env={"OMP_NUM_THREADS": 8},
docker_image=[{"image":"huggingface/transformers-examples-torch"}],
# TODO @ArthurZucker remove this once docker is easier to build
install_steps=["uv venv && uv pip install . && uv pip install -r examples/pytorch/_tests_requirements.txt"],
install_steps=["uv pip install . && uv pip install -r examples/pytorch/_tests_requirements.txt"],
pytest_num_workers=4,
)

@@ -259,7 +261,7 @@ def job_name(self):
additional_env={"HUGGINGFACE_CO_STAGING": True},
docker_image=[{"image":"huggingface/transformers-torch-light"}],
install_steps=[
'uv venv && uv pip install .',
'uv pip install .',
'git config --global user.email "ci@dummy.com"',
'git config --global user.name "ci"',
],
@@ -273,7 +275,6 @@ def job_name(self):
"onnx",
docker_image=[{"image":"huggingface/transformers-torch-tf-light"}],
install_steps=[
"uv venv",
"uv pip install .[testing,sentencepiece,onnxruntime,vision,rjieba]",
],
pytest_options={"k onnx": None},
@@ -303,7 +304,7 @@ def job_name(self):
docker_image=[{"image": "huggingface/transformers-torch-light"}],
# networkx==3.3 (after #36957) cause some issues
# TODO: remove this once it works directly
install_steps=["uv venv && uv pip install ."],
install_steps=["uv pip install .[serving]"],
marker="not generate",
parallelism=6,
)
@@ -321,7 +322,7 @@ def job_name(self):
additional_env={"TRANSFORMERS_VERBOSITY": "error", "DATASETS_VERBOSITY": "error", "SKIP_CUDA_DOCTEST": "1"},
install_steps=[
# Add an empty file to keep the test step running correctly even no file is selected to be tested.
"uv venv && pip install .",
"uv pip install .",
"touch dummy.py",
command,
"cat pr_documentation_tests_temp.txt",
2 changes: 1 addition & 1 deletion .github/workflows/benchmark.yml
@@ -48,7 +48,7 @@ jobs:

- name: Run database init script
run: |
psql -f benchmark/init_db.sql
psql -f benchmark/utils/init_db.sql
env:
PGDATABASE: metrics
PGHOST: ${{ secrets.TRANSFORMERS_BENCHMARKS_PGHOST }}
7 changes: 5 additions & 2 deletions .github/workflows/check_failed_tests.yml
@@ -21,6 +21,9 @@ on:
report_repo_id:
required: true
type: string
commit_sha:
required: false
type: string


env:
@@ -41,7 +44,7 @@ jobs:
check_new_failures:
name: " "
runs-on:
group: aws-g4dn-4xlarge-cache
group: aws-g5-4xlarge-cache
container:
image: ${{ inputs.docker }}
options: --gpus all --shm-size "16gb" --ipc host -v /mnt/cache/.cache/huggingface:/mnt/cache/
@@ -87,7 +90,7 @@ jobs:
- name: Update clone
working-directory: /transformers
if: ${{ env.process == 'true' }}
run: git fetch && git checkout ${{ github.sha }}
run: git fetch && git checkout ${{ inputs.commit_sha || github.sha }}

- name: Get target commit
working-directory: /transformers/utils
43 changes: 43 additions & 0 deletions .github/workflows/collated-reports.yml
@@ -0,0 +1,43 @@
name: CI collated reports

on:
workflow_call:
inputs:
job:
required: true
type: string
report_repo_id:
required: true
type: string
machine_type:
required: true
type: string
gpu_name:
description: Name of the GPU used for the job. Its enough that the value contains the name of the GPU, e.g. "noise-h100-more-noise". Case insensitive.
required: true
type: string

jobs:
collated_reports:
name: Collated reports
runs-on: ubuntu-22.04
if: always()
steps:
- uses: actions/checkout@v4
- uses: actions/download-artifact@v4

- name: Collated reports
shell: bash
env:
ACCESS_REPO_INFO_TOKEN: ${{ secrets.ACCESS_REPO_INFO_TOKEN }}
CI_SHA: ${{ github.sha }}
TRANSFORMERS_CI_RESULTS_UPLOAD_TOKEN: ${{ secrets.TRANSFORMERS_CI_RESULTS_UPLOAD_TOKEN }}
run: |
pip install huggingface_hub
python3 utils/collated_reports.py \
--path . \
--machine-type ${{ inputs.machine_type }} \
--commit-hash ${{ env.CI_SHA }} \
--job ${{ inputs.job }} \
--report-repo-id ${{ inputs.report_repo_id }} \
--gpu-name ${{ inputs.gpu_name }}
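
For reference, a caller job could wire this reusable workflow in roughly as follows; this is a minimal sketch and not part of the diff, with the upstream job name, dataset id, and GPU name used as illustrative placeholders:

jobs:
  collated_reports:
    if: always()
    needs: [run_models_gpu]          # hypothetical upstream test job
    uses: ./.github/workflows/collated-reports.yml
    with:
      job: run_models_gpu
      report_repo_id: <org>/<reports-dataset>   # placeholder dataset id
      machine_type: aws-g5-4xlarge-cache
      gpu_name: a10                  # value only needs to contain the GPU name
    secrets: inherit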
4 changes: 2 additions & 2 deletions .github/workflows/doctest_job.yml
@@ -28,10 +28,10 @@ jobs:
matrix:
split_keys: ${{ fromJson(inputs.split_keys) }}
runs-on:
group: aws-g4dn-4xlarge-cache
group: aws-g5-4xlarge-cache
container:
image: huggingface/transformers-all-latest-gpu
options: --gpus 0 --shm-size "16gb" --ipc host -v /mnt/cache/.cache/huggingface:/mnt/cache/
options: --gpus all --shm-size "16gb" --ipc host -v /mnt/cache/.cache/huggingface:/mnt/cache/
steps:
- name: Update clone
working-directory: /transformers
4 changes: 2 additions & 2 deletions .github/workflows/doctests.yml
@@ -15,10 +15,10 @@ jobs:
setup:
name: Setup
runs-on:
group: aws-g4dn-4xlarge-cache
group: aws-g5-4xlarge-cache
container:
image: huggingface/transformers-all-latest-gpu
options: --gpus 0 --shm-size "16gb" --ipc host -v /mnt/cache/.cache/huggingface:/mnt/cache/
options: --gpus all --shm-size "16gb" --ipc host -v /mnt/cache/.cache/huggingface:/mnt/cache/
outputs:
job_splits: ${{ steps.set-matrix.outputs.job_splits }}
split_keys: ${{ steps.set-matrix.outputs.split_keys }}
157 changes: 157 additions & 0 deletions .github/workflows/get-pr-info.yml
@@ -0,0 +1,157 @@
name: Get PR commit SHA
on:
workflow_call:
inputs:
pr_number:
required: true
type: string
outputs:
PR_HEAD_REPO_FULL_NAME:
description: "The full name of the repository from which the pull request is created"
value: ${{ jobs.get-pr-info.outputs.PR_HEAD_REPO_FULL_NAME }}
PR_BASE_REPO_FULL_NAME:
description: "The full name of the repository to which the pull request is created"
value: ${{ jobs.get-pr-info.outputs.PR_BASE_REPO_FULL_NAME }}
PR_HEAD_REPO_OWNER:
description: "The owner of the repository from which the pull request is created"
value: ${{ jobs.get-pr-info.outputs.PR_HEAD_REPO_OWNER }}
PR_BASE_REPO_OWNER:
description: "The owner of the repository to which the pull request is created"
value: ${{ jobs.get-pr-info.outputs.PR_BASE_REPO_OWNER }}
PR_HEAD_REPO_NAME:
description: "The name of the repository from which the pull request is created"
value: ${{ jobs.get-pr-info.outputs.PR_HEAD_REPO_NAME }}
PR_BASE_REPO_NAME:
description: "The name of the repository to which the pull request is created"
value: ${{ jobs.get-pr-info.outputs.PR_BASE_REPO_NAME }}
PR_HEAD_REF:
description: "The branch name of the pull request in the head repository"
value: ${{ jobs.get-pr-info.outputs.PR_HEAD_REF }}
PR_BASE_REF:
description: "The branch name in the base repository (to merge into)"
value: ${{ jobs.get-pr-info.outputs.PR_BASE_REF }}
PR_HEAD_SHA:
description: "The head sha of the pull request branch in the head repository"
value: ${{ jobs.get-pr-info.outputs.PR_HEAD_SHA }}
PR_BASE_SHA:
description: "The head sha of the target branch in the base repository"
value: ${{ jobs.get-pr-info.outputs.PR_BASE_SHA }}
PR_MERGE_COMMIT_SHA:
description: "The sha of the merge commit for the pull request (created by GitHub) in the base repository"
value: ${{ jobs.get-pr-info.outputs.PR_MERGE_COMMIT_SHA }}
PR_HEAD_COMMIT_DATE:
description: "The date of the head sha of the pull request branch in the head repository"
value: ${{ jobs.get-pr-info.outputs.PR_HEAD_COMMIT_DATE }}
PR_MERGE_COMMIT_DATE:
description: "The date of the merge commit for the pull request (created by GitHub) in the base repository"
value: ${{ jobs.get-pr-info.outputs.PR_MERGE_COMMIT_DATE }}
PR_HEAD_COMMIT_TIMESTAMP:
description: "The timestamp of the head sha of the pull request branch in the head repository"
value: ${{ jobs.get-pr-info.outputs.PR_HEAD_COMMIT_TIMESTAMP }}
PR_MERGE_COMMIT_TIMESTAMP:
description: "The timestamp of the merge commit for the pull request (created by GitHub) in the base repository"
value: ${{ jobs.get-pr-info.outputs.PR_MERGE_COMMIT_TIMESTAMP }}
PR:
description: "The PR"
value: ${{ jobs.get-pr-info.outputs.PR }}
PR_FILES:
description: "The files touched in the PR"
value: ${{ jobs.get-pr-info.outputs.PR_FILES }}


jobs:
get-pr-info:
runs-on: ubuntu-22.04
name: Get PR commit SHA better
outputs:
PR_HEAD_REPO_FULL_NAME: ${{ steps.pr_info.outputs.head_repo_full_name }}
PR_BASE_REPO_FULL_NAME: ${{ steps.pr_info.outputs.base_repo_full_name }}
PR_HEAD_REPO_OWNER: ${{ steps.pr_info.outputs.head_repo_owner }}
PR_BASE_REPO_OWNER: ${{ steps.pr_info.outputs.base_repo_owner }}
PR_HEAD_REPO_NAME: ${{ steps.pr_info.outputs.head_repo_name }}
PR_BASE_REPO_NAME: ${{ steps.pr_info.outputs.base_repo_name }}
PR_HEAD_REF: ${{ steps.pr_info.outputs.head_ref }}
PR_BASE_REF: ${{ steps.pr_info.outputs.base_ref }}
PR_HEAD_SHA: ${{ steps.pr_info.outputs.head_sha }}
PR_BASE_SHA: ${{ steps.pr_info.outputs.base_sha }}
PR_MERGE_COMMIT_SHA: ${{ steps.pr_info.outputs.merge_commit_sha }}
PR_HEAD_COMMIT_DATE: ${{ steps.pr_info.outputs.head_commit_date }}
PR_MERGE_COMMIT_DATE: ${{ steps.pr_info.outputs.merge_commit_date }}
PR_HEAD_COMMIT_TIMESTAMP: ${{ steps.get_timestamps.outputs.head_commit_timestamp }}
PR_MERGE_COMMIT_TIMESTAMP: ${{ steps.get_timestamps.outputs.merge_commit_timestamp }}
PR: ${{ steps.pr_info.outputs.pr }}
PR_FILES: ${{ steps.pr_info.outputs.files }}
if: ${{ inputs.pr_number != '' }}
steps:
- name: Extract PR details
id: pr_info
uses: actions/github-script@v6
with:
script: |
const { data: pr } = await github.rest.pulls.get({
owner: context.repo.owner,
repo: context.repo.repo,
pull_number: ${{ inputs.pr_number }}
});

const { data: head_commit } = await github.rest.repos.getCommit({
owner: pr.head.repo.owner.login,
repo: pr.head.repo.name,
ref: pr.head.ref
});

const { data: merge_commit } = await github.rest.repos.getCommit({
owner: pr.base.repo.owner.login,
repo: pr.base.repo.name,
ref: pr.merge_commit_sha,
});

const { data: files } = await github.rest.pulls.listFiles({
owner: context.repo.owner,
repo: context.repo.repo,
pull_number: ${{ inputs.pr_number }}
});

core.setOutput('head_repo_full_name', pr.head.repo.full_name);
core.setOutput('base_repo_full_name', pr.base.repo.full_name);
core.setOutput('head_repo_owner', pr.head.repo.owner.login);
core.setOutput('base_repo_owner', pr.base.repo.owner.login);
core.setOutput('head_repo_name', pr.head.repo.name);
core.setOutput('base_repo_name', pr.base.repo.name);
core.setOutput('head_ref', pr.head.ref);
core.setOutput('base_ref', pr.base.ref);
core.setOutput('head_sha', pr.head.sha);
core.setOutput('base_sha', pr.base.sha);
core.setOutput('merge_commit_sha', pr.merge_commit_sha);
core.setOutput('pr', pr);

core.setOutput('head_commit_date', head_commit.commit.committer.date);
core.setOutput('merge_commit_date', merge_commit.commit.committer.date);

core.setOutput('files', files);

console.log('PR head commit:', {
head_commit: head_commit,
commit: head_commit.commit,
date: head_commit.commit.committer.date
});

console.log('PR merge commit:', {
merge_commit: merge_commit,
commit: merge_commit.commit,
date: merge_commit.commit.committer.date
});

- name: Convert dates to timestamps
id: get_timestamps
run: |
head_commit_date=${{ steps.pr_info.outputs.head_commit_date }}
merge_commit_date=${{ steps.pr_info.outputs.merge_commit_date }}
echo $head_commit_date
echo $merge_commit_date
head_commit_timestamp=$(date -d "$head_commit_date" +%s)
merge_commit_timestamp=$(date -d "$merge_commit_date" +%s)
echo $head_commit_timestamp
echo $merge_commit_timestamp
echo "head_commit_timestamp=$head_commit_timestamp" >> $GITHUB_OUTPUT
echo "merge_commit_timestamp=$merge_commit_timestamp" >> $GITHUB_OUTPUT
36 changes: 36 additions & 0 deletions .github/workflows/get-pr-number.yml
@@ -0,0 +1,36 @@
name: Get PR number
on:
workflow_call:
outputs:
PR_NUMBER:
description: "The extracted PR number"
value: ${{ jobs.get-pr-number.outputs.PR_NUMBER }}

jobs:
get-pr-number:
runs-on: ubuntu-22.04
name: Get PR number
outputs:
PR_NUMBER: ${{ steps.set_pr_number.outputs.PR_NUMBER }}
steps:
- name: Get PR number
shell: bash
run: |
if [[ "${{ github.event.issue.number }}" != "" && "${{ github.event.issue.pull_request }}" != "" ]]; then
echo "PR_NUMBER=${{ github.event.issue.number }}" >> $GITHUB_ENV
elif [[ "${{ github.event.pull_request.number }}" != "" ]]; then
echo "PR_NUMBER=${{ github.event.pull_request.number }}" >> $GITHUB_ENV
elif [[ "${{ github.event.pull_request }}" != "" ]]; then
echo "PR_NUMBER=${{ github.event.number }}" >> $GITHUB_ENV
else
echo "PR_NUMBER=" >> $GITHUB_ENV
fi

- name: Check PR number
shell: bash
run: |
echo "${{ env.PR_NUMBER }}"

- name: Set PR number
id: set_pr_number
run: echo "PR_NUMBER=${{ env.PR_NUMBER }}" >> "$GITHUB_OUTPUT"
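
Taken together with get-pr-info.yml above, a caller workflow might chain the two roughly as follows (a minimal sketch assuming both files are used from the same repository; job names are illustrative):

jobs:
  get-pr-number:
    uses: ./.github/workflows/get-pr-number.yml

  get-pr-info:
    needs: get-pr-number
    if: ${{ needs.get-pr-number.outputs.PR_NUMBER != '' }}
    uses: ./.github/workflows/get-pr-info.yml
    with:
      pr_number: ${{ needs.get-pr-number.outputs.PR_NUMBER }}

  # A downstream job that needs get-pr-info can then check out the exact
  # merge commit, e.g. via the commit_sha input added to model_jobs.yml below:
  # git fetch && git checkout ${{ needs.get-pr-info.outputs.PR_MERGE_COMMIT_SHA }}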
9 changes: 6 additions & 3 deletions .github/workflows/model_jobs.yml
@@ -18,6 +18,9 @@ on:
docker:
required: true
type: string
commit_sha:
required: false
type: string
report_name_prefix:
required: false
default: run_models_gpu
@@ -70,7 +73,7 @@ jobs:

- name: Update clone
working-directory: /transformers
run: git fetch && git checkout ${{ github.sha }}
run: git fetch && git checkout ${{ inputs.commit_sha || github.sha }}

- name: Reinstall transformers in edit mode (remove the one installed during docker image build)
working-directory: /transformers
@@ -107,9 +110,9 @@ jobs:
run: |
echo "${{ inputs.machine_type }}"

if [ "${{ inputs.machine_type }}" = "aws-g4dn-4xlarge-cache" ]; then
if [ "${{ inputs.machine_type }}" = "aws-g5-4xlarge-cache" ]; then
machine_type=single-gpu
elif [ "${{ inputs.machine_type }}" = "aws-g4dn-12xlarge-cache" ]; then
elif [ "${{ inputs.machine_type }}" = "aws-g5-12xlarge-cache" ]; then
machine_type=multi-gpu
else
machine_type=${{ inputs.machine_type }}