Update NeMo's countDownloads #1090
base: main
Conversation
In this case, since the main weights are in the `transformers` format, I would use …
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
Thank you @pcuenca! I applied your suggestion as a commit, and I've let my Minitron colleagues know about it.
Hi @Haoxiang-Wang! It's not just about the download stats. Having the correct library name and metadata helps the community immediately understand what the model is about, makes it possible for the models to appear in searches, unblocks features such as code snippets (this is what's currently displayed for one of the models above), shows instructions to use the models in local apps such as vLLM and others, allows deployment through Inference Endpoints, etc.

Distribution through the Hub can be greatly improved with a complete Hub integration that leverages all the features in the HF ecosystem and unblocks other downstream uses. This all relies on the metadata fields. I think it'd be much better for these models to follow these practices to ensure they gather all the visibility they can possibly get. Please let us know if we can help make this happen.

If there's a pressing need to retrieve download stats for some of these models, we can run an ad-hoc query for you. In addition, please note that download stats are not lost while the metadata is incorrect – the correct numbers will appear once it's fixed.
Thank you for the clarifications, @pcuenca! In this case, is it possible to allow one model repo to be attached to two libraries? For NVIDIA models, we prefer to put both the NeMo version and the HuggingFace version in the same repo. I also noticed that Llama and Mistral do something similar: Llama-3.1-8B puts the HuggingFace model weights in the repo root and the Llama-format weights in a separate folder. I think it's more accurate to record the download statistics of both versions. If huggingface.js allowed attaching multiple libraries, that would easily resolve this issue.
Besides, I think users should be aware when a model repo is compatible with two libraries, so making `library_name` accept multiple values would also make that compatibility explicit.
It is not possible to assign a list of values to `library_name`. The recommended way to deal with multiple versions of the checkpoints (as opposed to the same weights working with different libraries) is to use separate repos. For example, PaliGemma 2 has separate repos for the `transformers` and `big_vision` versions.
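For context, a minimal sketch of why a list isn't expressible today, assuming the model card metadata is typed roughly the way huggingface.js types model data (the interface and field names below are assumptions for illustration, not the exact source):

```ts
// Sketch: model card metadata holds a single library name, so
// `library_name: [nemo, transformers]` cannot be expressed.
// (Interface/field names are assumptions, not the exact huggingface.js source.)
interface ModelCardMetadata {
	license?: string;
	pipeline_tag?: string;
	library_name?: string; // a single string, not string[]
}
```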
Many model repos that are tagged with `library_name: nemo` actually also have weights in the HuggingFace `transformers` format, such as https://huggingface.co/nvidia/Minitron-4B-Base and https://huggingface.co/nvidia/Llama-3.1-Minitron-4B-Width-Base. It's inaccurate to only track downloads of `.nemo` and `model_config.yaml` files for them. This PR adds `config.json` to track the downloads of their `transformers` version of the weights.
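For reference, a sketch of the kind of change this implies for the NeMo entry in the library registry. It follows the query-string convention `countDownloads` uses in `model-libraries.ts`, but treat it as illustrative rather than the literal diff:

```ts
// Sketch of the NeMo entry in packages/tasks/src/model-libraries.ts
// (field values follow the registry's convention; not the literal diff).
const nemo = {
	prettyLabel: "NeMo",
	repoName: "NeMo",
	repoUrl: "https://github.com/NVIDIA/NeMo",
	// Previously only NeMo-native files (".nemo", "model_config.yaml")
	// were counted; adding config.json also counts downloads of the
	// transformers-format weights stored in the same repo.
	countDownloads: `path_extension:"nemo" OR path:"model_config.yaml" OR path:"config.json"`,
};
```

With a query like this, a download of either checkpoint format in a dual-format repo would register in the repo's stats.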