The Hugging Face Hub makes hosting and sharing models with the community easy. It supports
dozens of libraries in the Open Source ecosystem. We are always
working on expanding this support to push collaborative Machine Learning forward. The huggingface_hub library plays a
key role in this process, allowing any Python script to easily push and load files.
There are four main ways to integrate a library with the Hub:
- Push to Hub: implement a method to upload a model to the Hub. This includes the model weights, as well as the model card and any other relevant information or data necessary to run the model (for example, training logs). This method is often called push_to_hub().
- Download from Hub: implement a method to load a model from the Hub. The method should download the model configuration/weights and load the model. This method is often called from_pretrained or load_from_hub().
- Inference API: use our servers to run inference on models supported by your library for free.
- Widgets: display a widget on the landing page of your models on the Hub. It allows users to quickly try a model from the browser.
In this guide, we will focus on the first two topics. We will present the two main approaches you can use to integrate a library, with their advantages and drawbacks. Everything is summarized at the end of the guide to help you choose between the two. Please keep in mind that these are only guidelines that you are free to adapt to your requirements.
If you are interested in Inference and Widgets, you can follow this guide. In both cases, you can reach out to us if you are integrating a library with the Hub and want to be listed in our docs.
The first approach to integrate a library with the Hub is to implement the push_to_hub and from_pretrained methods yourself. This gives you full flexibility on which files you need to upload/download and how to handle inputs
specific to your framework. You can refer to the two upload files and download files guides
to learn more about how to do that. This is, for example, how the FastAI integration is implemented (see [push_to_hub_fastai] and [from_pretrained_fastai]).
Implementation can differ between libraries, but the workflow is often similar.
This is what a from_pretrained method usually looks like:
from huggingface_hub import hf_hub_download

def from_pretrained(model_id: str) -> MyModelClass:
    # Download model from Hub
    cached_model = hf_hub_download(
        repo_id=model_id,  # the repo to download from is the `model_id` argument
        filename="model.pkl",
        library_name="fastai",
        library_version=get_fastai_version(),
    )
    # Load model
    return load_model(cached_model)
The push_to_hub method often requires a bit more complexity to handle repo creation, generate the model card and save weights.
A common approach is to save all of these files in a temporary folder, upload it and then delete it.
from pathlib import Path
from tempfile import TemporaryDirectory
from huggingface_hub import HfApi

def push_to_hub(model: MyModelClass, repo_name: str) -> None:
    api = HfApi()
    # Create repo if not existing yet and get the associated repo_id
    # (create_repo returns the repo URL; its repo_id attribute is the "namespace/name" id)
    repo_id = api.create_repo(repo_name, exist_ok=True).repo_id
    # Save all files in a temporary directory and push them in a single commit
    with TemporaryDirectory() as tmpdir:
        tmpdir = Path(tmpdir)
        # Save weights
        save_model(model, tmpdir / "model.safetensors")
        # Generate model card
        card = generate_model_card(model)
        (tmpdir / "README.md").write_text(card)
        # Save logs
        # Save figures
        # Save evaluation metrics
        # ...
        # Push to hub
        api.upload_folder(repo_id=repo_id, folder_path=tmpdir)
This is of course only an example. If you are interested in more complex manipulations (delete remote files, upload weights on the fly, persist weights locally, etc.) please refer to the upload files guide.
While being flexible, this approach has some drawbacks, especially in terms of maintenance. Hugging Face users are often
used to additional features when working with huggingface_hub. For example, when loading files from the Hub, it is
common to offer parameters like:
- token: to download from a private repo
- revision: to download from a specific branch
- cache_dir: to cache files in a specific directory
- force_download/local_files_only: to reuse the cache or not
- proxies: configure HTTP session
When pushing models, similar parameters are supported:
- commit_message: custom commit message
- private: create a private repo if missing
- create_pr: create a PR instead of pushing to main
- branch: push to a branch instead of the main branch
- allow_patterns/ignore_patterns: filter which files to upload
- token
- ...
All of these parameters can be added to the implementations we saw above and passed to the huggingface_hub methods.
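As an illustration of forwarding such parameters, here is a minimal sketch of a loading helper that passes them straight through to hf_hub_download. The pickle-based loading and the `filename` default are assumptions made for the example, not part of any existing integration, and only a subset of the parameters listed above is shown.

```python
import pickle
from pathlib import Path
from typing import Any, Optional, Union


def from_pretrained(
    model_id: str,
    *,
    revision: Optional[str] = None,
    cache_dir: Optional[Union[str, Path]] = None,
    force_download: bool = False,
    local_files_only: bool = False,
    token: Optional[Union[str, bool]] = None,
    filename: str = "model.pkl",  # hypothetical weights filename
) -> Any:
    """Load a pickled model from a local directory or from the Hub."""
    if Path(model_id).is_dir():
        # Local directory: nothing to download.
        model_file = Path(model_id) / filename
    else:
        # Imported lazily so that loading from disk works without the dependency.
        from huggingface_hub import hf_hub_download

        # Every user-facing option is forwarded unchanged to huggingface_hub.
        model_file = hf_hub_download(
            repo_id=model_id,
            filename=filename,
            revision=revision,
            cache_dir=cache_dir,
            force_download=force_download,
            local_files_only=local_files_only,
            token=token,
        )
    # Only unpickle files that come from a source you trust.
    with open(model_file, "rb") as f:
        return pickle.load(f)
```

Each keyword in the signature mirrors one accepted by hf_hub_download, which is exactly why the signature has to be kept in sync by hand.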
However, if a parameter changes or a new feature is added, you will need to update your package. Supporting those
parameters also means more documentation to maintain on your side. To see how to mitigate these limitations, let's jump
to the next section on class inheritance.
As we saw above, there are two main methods to include in your library to integrate it with the Hub: upload files
(push_to_hub) and download files (from_pretrained). You can implement those methods by yourself but it comes with
caveats. To tackle this, huggingface_hub provides a tool that uses class inheritance. Let's see how it works!
In a lot of cases, a library already implements its model using a Python class. The class contains the properties of
the model and methods to load, run, train, and evaluate it. Our approach is to extend this class to include upload and
download features using mixins. A mixin is a class that is meant to extend an
existing class with a set of specific features using multiple inheritance. huggingface_hub provides its own mixin,
the [ModelHubMixin]. The key here is to understand its behavior and how to customize it.
The [ModelHubMixin] class implements 3 public methods (push_to_hub, save_pretrained and from_pretrained). Those
are the methods that your users will call to load/save models with your library. [ModelHubMixin] also defines 2
private methods (_save_pretrained and _from_pretrained). Those are the ones you must implement. So to integrate
your library, you should:
- Make your Model class inherit from [ModelHubMixin].
- Implement the private methods:
  - [~ModelHubMixin._save_pretrained]: method taking as input a path to a directory and saving the model to it. You must write all the logic to dump your model in this method: model card, model weights, configuration files, training logs, and figures. Any relevant information for this model must be handled by this method. Model Cards are particularly important to describe your model. Check out our implementation guide for more details.
  - [~ModelHubMixin._from_pretrained]: class method taking as input a model_id and returning an instantiated model. The method must download the relevant files and load them.
- You are done!
The advantage of using [ModelHubMixin] is that once you take care of the serialization/loading of the files, you are ready to go. You don't need to worry about stuff like repo creation, commits, PRs, or revisions. The [ModelHubMixin] also ensures public methods are documented and type annotated, and you'll be able to view your model's download count on the Hub. All of this is handled by the [ModelHubMixin] and available to your users.
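The split between public methods and private hooks can be sketched with a toy mixin. This is a simplified stand-in written for illustration only, not the real ModelHubMixin: the `ToyHubMixin` and `TinyModel` names are hypothetical, and the sketch only handles local directories.

```python
import json
from pathlib import Path
from typing import Union


class ToyHubMixin:
    """Simplified stand-in for a Hub mixin (not the real ModelHubMixin)."""

    def save_pretrained(self, save_directory: Union[str, Path]) -> Path:
        # Public method: generic bookkeeping lives here...
        save_directory = Path(save_directory)
        save_directory.mkdir(parents=True, exist_ok=True)
        # ...and the framework-specific part is delegated to the private hook.
        self._save_pretrained(save_directory)
        return save_directory

    @classmethod
    def from_pretrained(cls, model_id: Union[str, Path]):
        # A real mixin would also resolve Hub repos, revisions, tokens, etc.
        return cls._from_pretrained(model_id=str(model_id))

    def _save_pretrained(self, save_directory: Path) -> None:
        raise NotImplementedError

    @classmethod
    def _from_pretrained(cls, *, model_id: str):
        raise NotImplementedError


class TinyModel(ToyHubMixin):
    def __init__(self, weights):
        self.weights = list(weights)

    def _save_pretrained(self, save_directory: Path) -> None:
        # Library-specific serialization: here, a single JSON file.
        (save_directory / "weights.json").write_text(json.dumps(self.weights))

    @classmethod
    def _from_pretrained(cls, *, model_id: str):
        # Library-specific loading: read the file back and rebuild the model.
        return cls(json.loads((Path(model_id) / "weights.json").read_text()))
```

Users only ever call save_pretrained and from_pretrained, so the generic logic can evolve in the mixin without touching the two hooks written by the library author.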
A good example of what we saw above is [PyTorchModelHubMixin], our integration for the PyTorch framework. This is a ready-to-use integration.
Here is how any user can load/save a PyTorch model from/to the Hub:
>>> import torch
>>> import torch.nn as nn
>>> from huggingface_hub import PyTorchModelHubMixin
# Define your PyTorch model exactly the same way you are used to
>>> class MyModel(
...         nn.Module,
...         PyTorchModelHubMixin, # multiple inheritance
...         library_name="keras-nlp",
...         tags=["keras"],
...         repo_url="https://github.com/keras-team/keras-nlp",
...         docs_url="https://keras.io/keras_nlp/",
...         # ^ optional metadata to generate model card
...     ):
...     def __init__(self, hidden_size: int = 512, vocab_size: int = 30000, output_size: int = 4):
...         super().__init__()
...         self.param = nn.Parameter(torch.rand(hidden_size, vocab_size))
...         # in_features must match the last dimension of `x + self.param`
...         self.linear = nn.Linear(vocab_size, output_size)
...     def forward(self, x):
...         return self.linear(x + self.param)
# 1. Create model
>>> model = MyModel(hidden_size=128)
# Config is automatically created based on input + default values
>>> model.param.shape[0]
128
# 2. (optional) Save model to local directory
>>> model.save_pretrained("path/to/my-awesome-model")
# 3. Push model weights to the Hub
>>> model.push_to_hub("my-awesome-model")
# 4. Initialize model from the Hub => config has been preserved
>>> model = MyModel.from_pretrained("username/my-awesome-model")
>>> model.param.shape[0]
128
# Model card has been correctly populated
>>> from huggingface_hub import ModelCard
>>> card = ModelCard.load("username/my-awesome-model")
>>> card.data.tags
["keras", "pytorch_model_hub_mixin", "model_hub_mixin"]
>>> card.data.library_name
"keras-nlp"
The implementation is actually very straightforward, and the full implementation can be found here.
- First, inherit your class from ModelHubMixin:
from huggingface_hub import ModelHubMixin
class PyTorchModelHubMixin(ModelHubMixin):
    (...)
- Implement the _save_pretrained method:
from huggingface_hub import ModelHubMixin
class PyTorchModelHubMixin(ModelHubMixin):
    (...)
    def _save_pretrained(self, save_directory: Path) -> None:
        """Save weights from a PyTorch model to a local directory."""
        save_model_as_safetensor(self.module, str(save_directory / SAFETENSORS_SINGLE_FILE))
- Implement the _from_pretrained method:
class PyTorchModelHubMixin(ModelHubMixin):
    (...)
    @classmethod # Must be a classmethod!
    def _from_pretrained(
        cls,
        *,
        model_id: str,
        revision: str,
        cache_dir: str,
        force_download: bool,
        proxies: Optional[Dict],
        resume_download: bool,
        local_files_only: bool,
        token: Union[str, bool, None],
        map_location: str = "cpu", # additional argument
        strict: bool = False, # additional argument
        **model_kwargs,
    ):
        """Load PyTorch pretrained weights and return the loaded model."""
        model = cls(**model_kwargs)
        if os.path.isdir(model_id):
            print("Loading weights from local directory")
            model_file = os.path.join(model_id, SAFETENSORS_SINGLE_FILE)
            return cls._load_as_safetensor(model, model_file, map_location, strict)
        model_file = hf_hub_download(
            repo_id=model_id,
            filename=SAFETENSORS_SINGLE_FILE,
            revision=revision,
            cache_dir=cache_dir,
            force_download=force_download,
            proxies=proxies,
            resume_download=resume_download,
            token=token,
            local_files_only=local_files_only,
        )
        return cls._load_as_safetensor(model, model_file, map_location, strict)
And that's it! Your library now enables users to upload and download files to and from the Hub.
In the section above, we quickly discussed how the [ModelHubMixin] works. In this section, we will see some of its more advanced features to improve your library integration with the Hugging Face Hub.
[ModelHubMixin] generates the model card for you. Model cards are files that accompany the models and provide important information about them. Under the hood, model cards are simple Markdown files with additional metadata. Model cards are essential for discoverability, reproducibility, and sharing! Check out the Model Cards guide for more details.
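To make the "Markdown file with additional metadata" structure concrete, here is a minimal sketch that renders such a file by hand, assuming the metadata sits in a YAML block delimited by `---` at the top of the README. The `render_model_card` helper and the sample values are hypothetical, and the YAML output is deliberately naive (plain strings and lists of strings only); in practice the ModelCard class from huggingface_hub is the tool to use.

```python
from typing import Dict, List, Union


def render_model_card(metadata: Dict[str, Union[str, List[str]]], body: str) -> str:
    """Render a README.md: a YAML metadata block followed by Markdown text."""
    lines = ["---"]
    for key, value in metadata.items():
        if isinstance(value, list):
            # Lists become YAML sequences, one item per line.
            lines.append(f"{key}:")
            lines.extend(f"- {item}" for item in value)
        else:
            lines.append(f"{key}: {value}")
    lines += ["---", "", body]
    return "\n".join(lines)


card = render_model_card(
    {"library_name": "my-library", "tags": ["my-tag"]},
    "# My model\n\nShort description of the model.",
)
print(card)
```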
Generating model cards semi-automatically is a good way to ensure that all models pushed with your library will share common metadata: library_name, tags, license, pipeline_tag, etc. This makes all models backed by your library easily searchable on the Hub and provides some resource links for users landing on the Hub. You can define the metadata directly when inheriting from [ModelHubMixin]:
class UniDepthV1(
    nn.Module,
    PyTorchModelHubMixin,
    library_name="unidepth",
    repo_url="https://github.com/lpiccinelli-eth/UniDepth",
    docs_url=...,
    pipeline_tag="depth-estimation",
    license="cc-by-nc-4.0",
    tags=["monocular-metric-depth-estimation", "arxiv:1234.56789"]
):
    ...
By default, a generic model card will be generated with the info you've provided (example: pyp1/VoiceCraft_giga830M). But you can define your own model card template as well!
In this example, all models pushed with the VoiceCraft class will automatically include a citation section and license details. For more details on how to define a model card template, please check the Model Cards guide.
MODEL_CARD_TEMPLATE = """
---
# For reference on model card metadata, see the spec: https://github.com/huggingface/hub-docs/blob/main/modelcard.md?plain=1
# Doc / guide: https://huggingface.co/docs/hub/model-cards
{{ card_data }}
---
This is a VoiceCraft model. For more details, please check out the official GitHub repo: https://github.com/jasonppy/VoiceCraft. This model is shared under an Attribution-NonCommercial-ShareAlike 4.0 International license.
## Citation
@article{peng2024voicecraft,
author = {Peng, Puyuan and Huang, Po-Yao and Li, Daniel and Mohamed, Abdelrahman and Harwath, David},
title = {VoiceCraft: Zero-Shot Speech Editing and Text-to-Speech in the Wild},
journal = {arXiv},
year = {2024},
}
"""
class VoiceCraft(
    nn.Module,
    PyTorchModelHubMixin,
    library_name="voicecraft",
    model_card_template=MODEL_CARD_TEMPLATE,
    ...
):
    ...
Finally, if you want to extend the model card generation process with dynamic values, you can override the [~ModelHubMixin.generate_model_card] method:
from huggingface_hub import ModelCard, PyTorchModelHubMixin
class UniDepthV1(nn.Module, PyTorchModelHubMixin, ...):
    (...)
    def generate_model_card(self, *args, **kwargs) -> ModelCard:
        card = super().generate_model_card(*args, **kwargs)
        card.data.metrics = ...  # add metrics to the metadata
        card.text += ...  # append section to the modelcard
        return card
[ModelHubMixin] handles the model configuration for you. It automatically checks the input values when you instantiate the model and serializes them in a config.json file. This provides 2 benefits:
- Users will be able to reload the model with the exact same parameters as you.
- Having a config.json file automatically enables analytics on the Hub (i.e. the "downloads" count).
But how does it work in practice? Several rules make the process as smooth as possible from a user perspective:
- if your __init__ method expects a config input, it will be automatically saved in the repo as config.json.
- if the config input parameter is annotated with a dataclass type (e.g. config: Optional[MyConfigClass] = None), then the config value will be correctly deserialized for you.
- all values passed at initialization will also be stored in the config file. This means you don't necessarily have to expect a config input to benefit from it.
Example:
class MyModel(ModelHubMixin):
    def __init__(self, value: str, size: int = 3):
        self.value = value
        self.size = size
    (...) # implement _save_pretrained / _from_pretrained
model = MyModel(value="my_value")
model.save_pretrained(...)
# config.json contains passed and default values
{"value": "my_value", "size": 3}
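To give an intuition for the last rule, the sketch below shows one way passed and default __init__ values could be collected into a JSON-friendly dictionary using the standard inspect module. It is an illustration written for this guide, not the mixin's actual implementation, and the `init_config` helper is a hypothetical name.

```python
import inspect
import json


def init_config(cls, *args, **kwargs) -> dict:
    """Collect passed and default __init__ values that are JSON-serializable."""
    signature = inspect.signature(cls.__init__)
    bound = signature.bind(None, *args, **kwargs)  # None stands in for `self`
    bound.apply_defaults()  # fill in the defaults that were not passed explicitly
    config = {}
    for name, value in list(bound.arguments.items())[1:]:  # skip `self`
        try:
            json.dumps(value)
        except (TypeError, ValueError):
            continue  # values that are not jsonable are ignored
        config[name] = value
    return config


class MyModel:
    def __init__(self, value: str, size: int = 3):
        self.value = value
        self.size = size


print(json.dumps(init_config(MyModel, value="my_value")))
# → {"value": "my_value", "size": 3}
```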
But what if a value cannot be serialized as JSON? By default, the value will be ignored when saving the config file. However, in some cases your library already expects a custom object as input that cannot be serialized, and you don't want to rewrite your internal logic to change its type. No worries! You can pass custom encoders/decoders for any type when inheriting from [ModelHubMixin]. This is a bit more work but ensures your internal logic is untouched when integrating your library with the Hub.
Here is a concrete example where a class expects an argparse.Namespace config as input:
class VoiceCraft(nn.Module):
    def __init__(self, args):
        self.pattern = args.pattern
        self.hidden_size = args.hidden_size
        ...
One solution can be to update the __init__ signature to def __init__(self, pattern: str, hidden_size: int) and update all snippets that instantiate your class. This is a perfectly valid way to fix it but it might break downstream applications using your library.
Another solution is to provide a simple encoder/decoder to convert argparse.Namespace to a dictionary.
from argparse import Namespace
class VoiceCraft(
    nn.Module,
    PyTorchModelHubMixin,  # inherit from mixin
    coders={
        Namespace: (
            lambda x: vars(x),  # Encoder: how to convert a `Namespace` to a valid jsonable value?
            lambda data: Namespace(**data),  # Decoder: how to reconstruct a `Namespace` from a dictionary?
        )
    }
):
    def __init__(self, args: Namespace):  # annotate `args`
        self.pattern = args.pattern
        self.hidden_size = args.hidden_size
        ...
In the snippet above, both the internal logic and the __init__ signature of the class did not change. This means all existing code snippets for your library will continue to work. To achieve this, we had to:
- Inherit from the mixin (PyTorchModelHubMixin in this case).
- Pass a coders parameter in the inheritance. This is a dictionary where keys are custom types you want to process. Values are a tuple (encoder, decoder).
  - The encoder expects an object of the specified type as input and returns a jsonable value. This will be used when saving a model with save_pretrained.
  - The decoder expects raw data (typically a dictionary) as input and reconstructs the initial object. This will be used when loading the model with from_pretrained.
- Add a type annotation to the __init__ signature. This is important to let the mixin know which type is expected by the class and, therefore, which decoder to use.
For the sake of simplicity, the encoder/decoder functions in the example above are not robust. For a concrete implementation, you would most likely have to handle corner cases properly.
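As an example of what handling corner cases might involve, the sketch below covers two of them: nested Namespace objects and pathlib.Path values. The `encode_namespace` and `decode_namespace` names and the `__namespace__` marker key are hypothetical choices made for this illustration, and paths are deliberately restored as plain strings.

```python
import json
from argparse import Namespace
from pathlib import Path
from typing import Any, Dict


def encode_namespace(args: Namespace) -> Dict[str, Any]:
    """Convert a Namespace to a jsonable dict, handling nested Namespaces and paths."""
    encoded = {}
    for key, value in vars(args).items():
        if isinstance(value, Namespace):
            # Mark nested namespaces so the decoder can tell them apart from dicts.
            encoded[key] = {"__namespace__": encode_namespace(value)}
        elif isinstance(value, Path):
            encoded[key] = str(value)  # paths come back as plain strings
        else:
            encoded[key] = value
    return encoded


def decode_namespace(data: Dict[str, Any]) -> Namespace:
    """Rebuild a Namespace from the dictionary produced by encode_namespace."""
    decoded = {}
    for key, value in data.items():
        if isinstance(value, dict) and set(value) == {"__namespace__"}:
            decoded[key] = decode_namespace(value["__namespace__"])
        else:
            decoded[key] = value
    return Namespace(**decoded)


# Round trip through JSON, as would happen when writing and reading a config file.
args = Namespace(pattern="delay", hidden_size=16, audio=Namespace(rate=16000))
restored = decode_namespace(json.loads(json.dumps(encode_namespace(args))))
```

Such a pair could then be passed as coders={Namespace: (encode_namespace, decode_namespace)} when inheriting from the mixin.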
Let's quickly sum up the two approaches we saw with their advantages and drawbacks. The table below is only indicative. Your framework might have some specificities that you need to address. This guide is only here to give guidelines and ideas on how to handle integration. In any case, feel free to contact us if you have any questions!
Integration | Using helpers | Using [ModelHubMixin] |
---|---|---|
User experience | model = load_from_hub(...); push_to_hub(model, ...) | model = MyModel.from_pretrained(...); model.push_to_hub(...) |
Flexibility | Very flexible. You fully control the implementation. | Less flexible. Your framework must have a model class. |
Maintenance | More maintenance to add support for configuration and new features. Might also require fixing issues reported by users. | Less maintenance as most of the interactions with the Hub are implemented in huggingface_hub. |
Documentation / Type annotation | To be written manually. | Partially handled by huggingface_hub. |
Download counter | To be handled manually. | Enabled by default if class has a config attribute. |
Model card | To be handled manually. | Generated by default with library_name, tags, etc. |