Model Card metadata documentation update (#1116)
Co-authored-by: Omar Sanseviero <osanseviero@gmail.com>
davanstrien and osanseviero authored Nov 24, 2023
1 parent 3a9eea9 commit 5eabb67
Showing 2 changed files with 81 additions and 13 deletions.
92 changes: 80 additions & 12 deletions docs/hub/model-cards.md
@@ -21,7 +21,41 @@ The model card template is available [here](https://github.com/huggingface/huggi

## Model card metadata

A model repo will render its `README.md` as a model card. To control how the Hub displays the card, you should create a YAML section in the README file to define some metadata. Start by adding three `---` at the top, then include all of the relevant metadata, and close the section with another group of `---` like the example below:
A model repo will render its `README.md` as a model card. The model card is a [Markdown](https://en.wikipedia.org/wiki/Markdown) file, with a [YAML](https://en.wikipedia.org/wiki/YAML) section at the top that contains metadata about the model.

The metadata you add to the model card supports discovery and easier use of your model. For example:

* Allowing users to filter models at https://huggingface.co/models.
* Displaying the model's license.
* Adding datasets to the metadata will add a message reading `Datasets used to train:` to your model card and link the relevant datasets, if they're available on the Hub.

Dataset, metric, and language identifiers are those listed on the [Datasets](https://huggingface.co/datasets), [Metrics](https://huggingface.co/metrics) and [Languages](https://huggingface.co/languages) pages.


### Adding metadata to your model card

There are a few different ways to add metadata to your model card, including:
- Using the metadata UI
- Directly editing the YAML section of the `README.md` file
- Via the [`huggingface_hub`](https://huggingface.co/docs/huggingface_hub) Python library (see the [docs](https://huggingface.co/docs/huggingface_hub/guides/model-cards#update-metadata) for more details)

Many libraries with [Hub integration](./models-libraries) will automatically add metadata to the model card when you upload a model.

#### Using the metadata UI

You can add metadata to your model card using the metadata UI. To access the metadata UI, go to the model page and click on the `Edit model card` button in the top right corner of the model card. This will open an editor showing the model card `README.md` file, as well as a UI for editing the metadata.


<div class="flex justify-center">
<img class="block dark:hidden" src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/hub/metadata-ui-editor.png"/>
<img class="hidden dark:block" src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/hub/metadata-ui-editor-dark.png"/>
</div>

This UI lets you add key metadata to your model card, and many of the fields will autocomplete based on the information you provide. Using the UI is the easiest way to add metadata to your model card, but it doesn't support all of the metadata fields. If you want to add metadata that isn't supported by the UI, you can edit the YAML section of the `README.md` file directly.

#### Editing the YAML section of the `README.md` file

You can also directly edit the YAML section of the `README.md` file. If the model card doesn't already have a YAML section, you can create one by adding three dashes (`---`) at the top of the file, including all of the relevant metadata, and closing the section with another `---`, as in the example below:

```yaml
---
@@ -40,23 +74,17 @@ datasets:
metrics:
- metric1
- metric2
base_model: "base model Hub identifier"
---
```

The metadata that you add to the model card enables certain interactions on the Hub. For example:
* Allow users to filter and discover models at https://huggingface.co/models.
* If you choose a license using the keywords listed in the right column of [this table](./repositories-licenses), the license will be displayed on the model page.
* Adding datasets to the metadata will add a message reading `Datasets used to train:` to your model card and link the relevant datasets, if they're available on the Hub.

Dataset, metric, and language identifiers are those listed on the [Datasets](https://huggingface.co/datasets), [Metrics](https://huggingface.co/metrics) and [Languages](https://huggingface.co/languages) pages and in the [`datasets`](https://github.com/huggingface/datasets) repository.

See the detailed model card metadata specification [here](https://github.com/huggingface/hub-docs/blob/main/modelcard.md?plain=1).
You can find the detailed model card metadata specification <a href="https://github.com/huggingface/hub-docs/blob/main/modelcard.md?plain=1" target="_blank">here</a>.

### Specifying a library

You can also specify the supported libraries in the model card metadata section. Find more about our supported libraries [here](./models-libraries). The library can be specified in the following order of priority
You can specify the supported libraries in the model card metadata section. Find more about our supported libraries [here](./models-libraries). The library will be specified in the following order of priority:

1. Specifying `library_name` in the model card (recommended if your model is not a `transformers` model)
1. Specifying `library_name` in the model card (recommended if your model is not a `transformers` model). This information can be added via the metadata UI or directly in the model card YAML section:

```yaml
library_name: flair
@@ -69,7 +97,7 @@ tags:
- flair
```
If it's not specified, the Hub will try to automatically detect the library type. Unless your model is from `transformers`, this approach is discouraged and repo creators should use the explicit `library_name` as much as possible.
If it's not specified, the Hub will try to automatically detect the library type. Unless your model is from `transformers`, this approach is discouraged and repo creators should use the explicit `library_name` as much as possible.

1. By looking into the presence of files such as `*.nemo` or `*saved_model.pb*`, the Hub can determine if a model is from NeMo or Keras.
2. If nothing is detected and there is a `config.json` file, it's assumed the library is `transformers`.
@@ -89,8 +117,38 @@ This metadata will be used to display the base model on the model page. Users ca
<img class="hidden dark:block" src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/hub/base-model-ui-dark.png"/>
</div>

### Specifying a dataset

You can specify the datasets used to train your model in the model card metadata section. The datasets will be displayed on the model page and users will be able to filter models by dataset. Use the Hub dataset identifier, which is the same as the dataset's repo name:

```yaml
datasets:
- imdb
- HuggingFaceH4/no_robots
```

### Specifying a task (`pipeline_tag`)

You can specify the `pipeline_tag` in the model card metadata. The `pipeline_tag` indicates the type of task the model is intended for. This tag will be displayed on the model page and users can filter models on the Hub by task. This tag is also used to determine which [widget](./models-widgets.md#enabling-a-widget) to use for the model and which APIs to use under the hood.

For `transformers` models, the pipeline tag is automatically inferred from the model's `config.json` file, but you can override it in the model card metadata if required. Editing this field in the metadata UI will ensure that the pipeline tag is valid. Some other libraries with Hub integration will also automatically add the pipeline tag to the model card metadata.
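
As a minimal sketch of setting the task explicitly, the fragment below adds the field to the YAML section of the model card. The value `text-classification` is only an illustrative choice; substitute the task that matches your model.

```yaml
# Illustrative value; pick the task your model actually performs.
pipeline_tag: text-classification
```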

### Specifying a license

You can specify the license in the model card metadata section. The license will be displayed on the model page and users will be able to filter models by license. Using the metadata UI, you will see a dropdown of the most common licenses.

If required, you can also specify a custom license by adding `other` as the license value and specifying the name and a link to the license in the metadata.

```yaml
# Example from https://huggingface.co/coqui/XTTS-v1
---
license: other
license_name: coqui-public-model-license
license_link: https://coqui.ai/cpml
---
```

If the license is not available via a URL, you can link to a LICENSE file stored in the model repo.
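
A sketch of what linking to a license in the repo could look like is shown below. The license name is a hypothetical placeholder, and the example assumes a file named `LICENSE` at the root of the model repo that `license_link` can reference by its relative path.

```yaml
---
license: other
# Hypothetical name, for illustration only.
license_name: my-custom-license
# Assumes a LICENSE file at the root of the model repo.
license_link: LICENSE
---
```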

### Evaluation Results

@@ -144,6 +202,14 @@ Read more about Paper pages [here](./paper-pages).

Each model page lists all the model's tags in the page header, below the model name. These are primarily computed from the model card metadata, although some are added automatically, as described in [Creating a Widget](./models-widgets#creating-a-widget).

### Can I add custom tags to my model?

Yes, you can add custom tags to your model by adding them to the `tags` field in the model card metadata. The metadata UI will suggest some popular tags, but you can add any tag you want. For example, you could indicate that your model is focused on finance by adding a `finance` tag.
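
For instance, the `finance` tag mentioned above could be added to the YAML section like this. The second tag is a hypothetical extra, included only to show that the field takes a list.

```yaml
tags:
- finance
# Hypothetical additional custom tag, for illustration only.
- earnings-calls
```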

### How can I indicate that my model is not suitable for all audiences?

You can add a `not-for-all-audiences` tag to your model card metadata. When this tag is present, a message will be displayed on the model page indicating that the model is not for all audiences. Users can click through this message to view the model card.

### Can I write LaTeX in my model card?

Yes! The Hub uses the [KaTeX](https://katex.org/) math typesetting library to render math formulas server-side before parsing the Markdown.
@@ -163,3 +229,5 @@ $$
$$

$$ E=mc^2 $$


2 changes: 1 addition & 1 deletion docs/hub/models-faq.md
@@ -2,7 +2,7 @@

## How can I see what dataset was used to train the model?

It's up to the person who uploaded the model to include the training information! You may find the information about the datasets that the model was trained on in the model card. If the datasets used for the model are on the Hub, the uploader may have included them in the [model card's metadata](https://huggingface.co/Jiva/xlm-roberta-large-it-mnli/blob/main/README.md#L7-L9). In that case, the datasets would be linked with a handy card on the right side of the model page:
It's up to the person who uploaded the model to include the training information! A user can [specify](./model-cards#specifying-a-dataset) the dataset used for training a model. If the datasets used for the model are on the Hub, the uploader may have included them in the [model card's metadata](https://huggingface.co/Jiva/xlm-roberta-large-it-mnli/blob/main/README.md#L7-L9). In that case, the datasets would be linked with a handy card on the right side of the model page:

<div class="flex justify-center">
<img class="block dark:hidden" width="350" src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/hub/models-linked-datasets.png"/>