v4.8.0 Integration with the Hub and Flax/JAX support
Integration with the Hub
Our example scripts and Trainer are now optimized for publishing your model on the Hugging Face Hub, with TensorBoard training metrics and an automatically authored model card that contains all the relevant metadata, including evaluation results.
Trainer Hub integration
Use the --push_to_hub flag to create a model repo for your training: the model will be saved there, with all relevant metadata, at the end of training. Other flags are:

- `push_to_hub_model_id` to control the repo name
- `push_to_hub_organization` to specify an organization
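Since the example scripts parse these flags into TrainingArguments, the same options can be set in code when using the Trainer directly. A minimal sketch (the output directory, repo name and organization below are placeholders):

```python
from transformers import TrainingArguments

# Minimal sketch: only the Hub-related arguments are shown; output_dir,
# the repo name and the organization are placeholders.
training_args = TrainingArguments(
    output_dir="my-model",
    push_to_hub=True,                    # create a repo and push at the end of training
    push_to_hub_model_id="my-model",     # controls the repo name
    push_to_hub_organization="my-org",   # push under an organization namespace
)
```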
Visualizing training metrics on huggingface.co (based on TensorBoard)
By default, if you have `tensorboard` installed, the training scripts will use it for logging, and the logging traces folder is conveniently located inside your model output directory, so the traces are pushed to your model repo by default.

Any model repo that contains TensorBoard traces will spawn a TensorBoard server, which makes it very convenient to see how the training went! See this model repo for an example. This Hub feature is in beta, so let us know if anything looks weird :)
Model card generation
The model card contains info about the datasets used, the eval results, and more. Many users were already adding their eval results to their model cards in Markdown format, but this is a more structured way of adding them, which will make them easier to parse and, for example, represent in leaderboards such as the ones on Papers With Code!
We use a format specified in collaboration with [Papers with Code](https://github.com/huggingface/huggingface_hub/blame/main/modelcard.md); see also this repo.
Models, tokenizers and configurations
All models, tokenizers and configurations now have a revamped `push_to_hub()` method, as well as a `push_to_hub` argument in their `save_pretrained()` method. The workflow of this method has changed a bit to be more git-like, with a local clone of the repo in a folder of the working directory, to make it easier to apply patches (use `use_temp_dir=True` to clone in temporary folders for the same behavior as the experimental API).
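For instance (a minimal sketch; the checkpoint and repo names below are placeholders, and you need to be authenticated with the Hub):

```python
from transformers import AutoModelForMaskedLM, AutoTokenizer

# Placeholder checkpoint and repo names, for illustration only.
model = AutoModelForMaskedLM.from_pretrained("bert-base-cased")
tokenizer = AutoTokenizer.from_pretrained("bert-base-cased")

# By default the repo is cloned in a folder of the working directory;
# use_temp_dir=True clones in a temporary folder instead.
model.push_to_hub("my-bert-model", use_temp_dir=True)
tokenizer.push_to_hub("my-bert-model", use_temp_dir=True)

# Alternatively, push while saving:
model.save_pretrained("my-bert-model", push_to_hub=True)
```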
Flax/JAX support
Flax/JAX is becoming a fully supported backend of the Transformers library, with more and more models getting an implementation. BART, CLIP and T5 join the already existing models; find the whole list here. A short usage sketch follows the list of related PRs below.
- [Flax] FlaxAutoModelForSeq2SeqLM #12228 (@patil-suraj)
- [FlaxBart] few small fixes #12247 (@patil-suraj)
- [FlaxClip] fix test from/save pretrained test #12284 (@patil-suraj)
- [Flax] [WIP] allow loading head model with base model weights #12255 (@patil-suraj)
- [Flax] Fix flax test save pretrained #12256 (@patrickvonplaten)
- [Flax] Add jax flax to env command #12251 (@patrickvonplaten)
- add FlaxAutoModelForImageClassification in main init #12298 (@patil-suraj)
- Flax T5 #12150 (@vasudevgupta7)
- [Flax T5] Fix weight initialization and fix docs #12327 (@patrickvonplaten)
- Flax summarization script #12230 (@patil-suraj)
- FlaxBartPretrainedModel -> FlaxBartPreTrainedModel #12313 (@sgugger)
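As a quick taste of the new backend, here is a minimal sketch of a forward pass with one of the new Flax seq2seq models, using the FlaxAutoModelForSeq2SeqLM class added in #12228 (it assumes `jax` and `flax` are installed and that the checkpoint ships Flax weights):

```python
from transformers import AutoTokenizer, FlaxAutoModelForSeq2SeqLM

tokenizer = AutoTokenizer.from_pretrained("facebook/bart-base")
model = FlaxAutoModelForSeq2SeqLM.from_pretrained("facebook/bart-base")

# Flax models consume NumPy arrays rather than framework-specific tensors.
inputs = tokenizer("Hello, my dog is cute", return_tensors="np")

# Decoder inputs are derived from input_ids when not provided explicitly.
outputs = model(**inputs)
print(outputs.logits.shape)  # (batch_size, sequence_length, vocab_size)
```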
General improvements and bug fixes
- AutoTokenizer: infer the class from the tokenizer config if possible #12208 (@sgugger)
- update desc for map in all examples #12226 (@bhavitvyamalik)
- Depreciate pythonic Mish and support PyTorch 1.9 version of Mish #12240 (@digantamisra98)
- [t5 doc] make the example work out of the box #12239 (@stas00)
- Better CI feedback #12279 (@LysandreJik)
- Fix for making student ProphetNet for Seq2Seq Distillation #12130 (@vishal-burman)
- [DeepSpeed] don't ignore --adafactor #12257 (@stas00)
- Tensorflow QA example #12252 (@Rocketknight1)
- [tests] reset report_to to none, avoid deprecation warning #12293 (@stas00)
- [trainer + examples] set log level from CLI #12276 (@stas00)
- [tests] multiple improvements #12294 (@stas00)
- Trainer: adjust wandb installation example #12291 (@stefan-it)
- Fix and improve documentation for LEDForConditionalGeneration #12303 (@ionicsolutions)
- [Flax] Main doc for event orga #12305 (@patrickvonplaten)
- [trainer] 2 bug fixes and a rename #12309 (@stas00)
- [docs] performance #12258 (@stas00)
- Add CodeCarbon Integration #12304 (@JetRunner)
- Optimizing away the `fill-mask` pipeline. #12113 (@Narsil)
- Add output in a dictionary for TF `generate` method #12139 (@stancld)
- Rewrite ProphetNet to adapt converting ONNX friendly #11981 (@jiafatom)
- Add mention of the huggingface_hub methods for offline mode #12320 (@LysandreJik)
- [Flax/JAX] Add how to propose projects markdown #12311 (@patrickvonplaten)
- [TFWav2Vec2] Fix docs #12283 (@chenht2010)
- Add all XxxPreTrainedModel to the main init #12314 (@sgugger)
- Conda build #12323 (@LysandreJik)
- Changed modeling_fx_utils.py to utils/fx.py for clarity #12326 (@michaelbenayoun)