Update NeMo's countDownloads #1090

Open · wants to merge 2 commits into main
Conversation

@Haoxiang-Wang (Author)
Many model repos tagged with "library_name: nemo" also ship weights in the Hugging Face transformers format, such as https://huggingface.co/nvidia/Minitron-4B-Base and https://huggingface.co/nvidia/Llama-3.1-Minitron-4B-Width-Base.

Tracking downloads of only ".nemo" and "model_config.yaml" files undercounts these repos. This PR adds config.json so that downloads of the transformers-format weights are counted as well.
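To make the intended change concrete, here is a minimal sketch of the matching logic as a standalone predicate. Note this is only an illustration: in huggingface.js the per-library `countDownloads` filter is expressed declaratively rather than as a function like this, and the function name below is hypothetical.

```typescript
// Hedged sketch: which file paths in a repo count toward NeMo download stats.
// ".nemo" and "model_config.yaml" are the pre-existing filters; "config.json"
// is the addition this PR proposes, to also count transformers-format weights.
function countsAsNemoDownload(path: string): boolean {
  const filename = path.split("/").pop() ?? "";
  return (
    path.endsWith(".nemo") ||            // native NeMo checkpoints
    filename === "model_config.yaml" ||  // NeMo model config
    filename === "config.json"           // transformers-format weights (this PR)
  );
}
```

Under this sketch, a download of `config.json` from a transformers-format copy of the weights would now be counted, while unrelated files (e.g. `tokenizer.json`) still would not.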

@pcuenca (Member) commented on Jan 8, 2025

In this case, since the main weights are transformers, I would use library_name: transformers and add nemo as a tag.
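This suggestion corresponds to model card metadata along these lines (an illustrative sketch; the exact tag list for any given repo is up to its maintainers):

```yaml
# YAML front matter at the top of the repo's README.md
library_name: transformers
tags:
  - nemo
```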

Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
@Haoxiang-Wang (Author)

Thank you @pcuenca! I applied your suggested commit and have let my Minitron colleagues know about your recommendation.
Please merge this PR if possible. Although I've told some NeMo colleagues about this issue, I'm sure many others won't be aware of it when they upload both NeMo and transformers weights.

@pcuenca (Member) commented on Jan 8, 2025

Hi @Haoxiang-Wang! It's not just about the download stats. Having the correct library name and metadata helps the community immediately understand what the model is about, makes it possible for the models to appear in searches, unblocks features such as code snippets (this is what's currently displayed for one of the models above), shows instructions to use the models in local apps such as vLLM and others, allows deployment through Inference Endpoints, etc. Distribution through the Hub can be greatly improved with a complete Hub integration that leverages all the features in the HF ecosystem and unblocks other downstream uses. This all relies on the metadata fields.

I think it'd be much better for these models to follow these practices to ensure they gather all the visibility they can possibly get. Please, let us know if we can help make this happen.

If there's a pressing need to retrieve download stats for some of these models, we can run an ad-hoc query for you. In addition, please note that download stats are not lost while the metadata is incorrect – the correct numbers will appear once it's fixed.

@Haoxiang-Wang (Author)

Thank you for the clarifications! @pcuenca

In this case, is it possible to allow one model repo to be attached to two libraries?

For NVIDIA models, we prefer to put both the NeMo version and the Hugging Face version in the same repo. I also noticed that Llama and Mistral do something similar: Llama-3.1-8B puts the Hugging Face weights in the root and the Llama-format weights in original/, and Ministral-8B puts both the Hugging Face and Mistral versions of the weights in its root.

I think it's more accurate to record download statistics for both versions. If huggingface.js allowed attaching multiple libraries, that would easily resolve this issue.

@Haoxiang-Wang (Author)

Also, users should be able to see when a model repo is compatible with two libraries, so allowing library_name to accept a list of library names could be a good idea.

@pcuenca (Member) commented on Jan 9, 2025

It is not possible to assign a list of values to library_name: we use it for the main library, and tags for other compatible libraries.

The recommended way to deal with multiple versions of the checkpoints (as opposed to the same weights that work with different libraries) is to use separate repos. For example, PaliGemma 2 transformers vs big_vision.

pcuenca mentioned this pull request on Jan 13, 2025