Removed GeneralTags, ModelTags and DatasetTags (#1761)
* removed tags from endpoint tests

* removed tags from endpoints

* removed tags from hf_api

* removed tags from docstrings in endpoint_helpers

* removed tags from hf_api

* removed model search argument from test_hf_api

* removed ModelSearchArguments and DataSearchArguments

* removed DatasetSearchArguments and ModelSearchArguments

* removed DatasetSearchArguments and

* removed ModelSearchArguments and DatasetSearchArguments from the docs

* Revert "removed DatasetSearchArguments and"

This reverts commit ce6b91b.

* removed tags from __init__.py

* ran make style

* Removed ## How to explore filter options ? section

* Revert "removed tags from __init__.py"

This reverts commit ad1a31c.

Reverting the removal of get_dataset_tags and get_model_tags for comment 2

* Revert "removed DatasetSearchArguments and ModelSearchArguments"

This reverts commit fbf6dd0.

* Revert "removed tags from __init__.py"

This reverts commit ad1a31c.

* Revert "removed tags from hf_api"

This reverts commit 2cefee1.

* Revert "removed tags from hf_api"

This reverts commit dd3b8f1.

* Removed attribute dictionary from imports
and removed model search argument class

* Completely removed class AttributeDictionary(dict)

* Removed attribute dictionary tests

* Updating ModelTags and DatasetTags
so that they just return the raw dictionary

* Removed final DatasetTags import

* Removed 'ModelTags' import

* Ran make style

* fix: remove useless token (#1765)

* Retry on ConnectionError/ReadTimeout when streaming file from server (#1766)

* Retry on ConnectionError/ReadTimeout when streaming file from server

* add test

* fix testing utils

* Adding `InferenceClient.get_recommended_model` (#1770)

* Moved logger info to InferenceClient, so the get_recommended_model function can bypass it

* Added get_recommended_model to InferenceClient

* Ran make style to generate the async client

* Added tests of get_recommended_model

* Update src/huggingface_hub/inference/_client.py

Co-authored-by: Lucain <lucainp@gmail.com>

* Fixed ordering of logger info and _get_recommended_model, so the model string is populated first

* Removed _get_recommended_model private function, in favor of get_recommended_model in InferenceClient

* Fixed wording of ValueError to use 'model' not 'task'

* Ran make style for AsyncInferenceClient

---------

Co-authored-by: Lucain <lucainp@gmail.com>

* Fix document link for manage-cache (#1774)

* Fix document link for manage-cache

* Use redirects in _redirects.yml

* Update docs/source/en/package_reference/file_download.md

---------

Co-authored-by: Lucain <lucainp@gmail.com>

* Minor doc fixes (#1775)

* Don't use `api` in `list_repo_refs` example.

* Minor typo fssepc -> fsspec

* Use `.item_object_id` instead of `._id`

* Ran make style

---------

Co-authored-by: Remy <remy@huggingface.co>
Co-authored-by: Lucain <lucainp@gmail.com>
Co-authored-by: James Braza <jamesbraza@gmail.com>
Co-authored-by: liuxueyang <liuxueyang457@gmail.com>
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
6 people authored Oct 26, 2023
1 parent ef48c7f commit 0a2503d
Showing 8 changed files with 10 additions and 791 deletions.
159 changes: 1 addition & 158 deletions docs/source/de/guides/search.md
@@ -60,164 +60,7 @@ For example, the following example fetches the top 5 most downloaded datasets on the Hub:
```


## How to explore filter options?

Now you know how to filter your list of models/datasets/spaces.
The problem might be that you don't know exactly what you are looking for. No worries!
We also provide some helpers that let you discover which arguments can be passed in your query.

[`ModelSearchArguments`] and [`DatasetSearchArguments`] are nested namespace objects that
have **every single option** available on the Hub and that return what should be passed
to `filter`. Best of all: they have tab completion 🎊.

```python
>>> from huggingface_hub import ModelSearchArguments, DatasetSearchArguments

>>> model_args = ModelSearchArguments()
>>> dataset_args = DatasetSearchArguments()
```

<Tip warning={true}>

Before continuing, please be aware that [`ModelSearchArguments`] and [`DatasetSearchArguments`]
are legacy helpers meant for exploratory purposes only. Initializing them requires listing
all models and datasets on the Hub, which makes them increasingly slower as the number of
repos on the Hub grows. For production-ready code, consider passing raw strings when making
a filtered search on the Hub.

</Tip>
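
As a quick illustration of that recommendation, here is a minimal sketch of a filtered
search built from raw strings only (the tag values are illustrative; any tag the Hub
knows about works the same way):

```python
>>> from huggingface_hub import HfApi

>>> api = HfApi()
>>> # No helper objects needed: pass raw tag strings directly to `filter`.
>>> models = api.list_models(filter=["pytorch", "text-classification"])
```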

Now, let's see what is available in `model_args` by inspecting its output:

```python
>>> model_args
Available Attributes or Keys:
* author
* dataset
* language
* library
* license
* model_name
* pipeline_tag
```

A variety of attributes and keys are available to you.
This is because it is both an object and a dictionary,
so you can use either `model_args["author"]` or `model_args.author`.

The first criterion is getting all PyTorch models.
This would be found under the `library` attribute, so let's see if it is there:

```python
>>> model_args.library
Available Attributes or Keys:
* AdapterTransformers
* Asteroid
* ESPnet
* Fairseq
* Flair
* JAX
* Joblib
* Keras
* ONNX
* PyTorch
* Rust
* Scikit_learn
* SentenceTransformers
* Stable_Baselines3 (Key only)
* Stanza
* TFLite
* TensorBoard
* TensorFlow
* TensorFlowTTS
* Timm
* Transformers
* allenNLP
* fastText
* fastai
* pyannote_audio
* spaCy
* speechbrain
```

It is! The `PyTorch` name is there, so you'll need to use `model_args.library.PyTorch`:

```python
>>> model_args.library.PyTorch
'pytorch'
```

Below is an animation repeating the process for finding both the `Text Classification` and `glue` requirements:

![Animation exploring `model_args.pipeline_tag`](https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/search_text_classification.gif)

![Animation exploring `model_args.dataset`](https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/search_glue.gif)

Now that all the pieces are there, the last step is to combine them into something the
API can use through the [`ModelFilter`] and [`DatasetFilter`] classes (i.e. strings).


```python
>>> from huggingface_hub import ModelFilter, DatasetFilter

>>> filt = ModelFilter(
... task=model_args.pipeline_tag.TextClassification,
... trained_dataset=dataset_args.dataset_name.glue,
... library=model_args.library.PyTorch
... )
>>> api.list_models(filter=filt)[0]
ModelInfo: {
modelId: Jiva/xlm-roberta-large-it-mnli
sha: c6e64469ec4aa17fedbd1b2522256f90a90b5b86
lastModified: 2021-12-10T14:56:38.000Z
tags: ['pytorch', 'xlm-roberta', 'text-classification', 'it', 'dataset:multi_nli', 'dataset:glue', 'arxiv:1911.02116', 'transformers', 'tensorflow', 'license:mit', 'zero-shot-classification']
pipeline_tag: zero-shot-classification
siblings: [ModelFile(rfilename='.gitattributes'), ModelFile(rfilename='README.md'), ModelFile(rfilename='config.json'), ModelFile(rfilename='pytorch_model.bin'), ModelFile(rfilename='sentencepiece.bpe.model'), ModelFile(rfilename='special_tokens_map.json'), ModelFile(rfilename='tokenizer.json'), ModelFile(rfilename='tokenizer_config.json')]
config: None
id: Jiva/xlm-roberta-large-it-mnli
private: False
downloads: 11061
library_name: transformers
likes: 1
}
```

As you can see, it found the models that fit all the criteria. You can even take it further
by passing in an array for each of the previous parameters.
For example, let's look at the same configuration, but also include `TensorFlow` in the filter:

```python
>>> filt = ModelFilter(
... task=model_args.pipeline_tag.TextClassification,
... library=[model_args.library.PyTorch, model_args.library.TensorFlow]
... )
>>> api.list_models(filter=filt)[0]
ModelInfo: {
modelId: distilbert-base-uncased-finetuned-sst-2-english
sha: ada5cc01a40ea664f0a490d0b5f88c97ab460470
lastModified: 2022-03-22T19:47:08.000Z
tags: ['pytorch', 'tf', 'rust', 'distilbert', 'text-classification', 'en', 'dataset:sst-2', 'transformers', 'license:apache-2.0', 'infinity_compatible']
pipeline_tag: text-classification
siblings: [ModelFile(rfilename='.gitattributes'), ModelFile(rfilename='README.md'), ModelFile(rfilename='config.json'), ModelFile(rfilename='map.jpeg'), ModelFile(rfilename='pytorch_model.bin'), ModelFile(rfilename='rust_model.ot'), ModelFile(rfilename='tf_model.h5'), ModelFile(rfilename='tokenizer_config.json'), ModelFile(rfilename='vocab.txt')]
config: None
id: distilbert-base-uncased-finetuned-sst-2-english
private: False
downloads: 3917525
library_name: transformers
likes: 49
}
```

This query is strictly equivalent to:

```py
>>> filt = ModelFilter(
... task="text-classification",
... library=["pytorch", "tensorflow"],
... )
```

Here, [`ModelSearchArguments`] was a helper to explore the options available on the Hub.
However, it is not a requirement for a search. Another way to do this is to visit the
[models](https://huggingface.co/models) and [datasets](https://huggingface.co/datasets) pages
in your browser, search for some parameters, and look at the values in the URL.
159 changes: 1 addition & 158 deletions docs/source/en/guides/search.md
@@ -60,164 +60,7 @@ the following example fetches the top 5 most downloaded datasets on the Hub:
```


## How to explore filter options?

Now you know how to filter your list of models/datasets/spaces. The problem you might
have is that you don't know exactly what you are looking for. No worries! We also provide
some helpers that allow you to discover which arguments can be passed in your query.

[`ModelSearchArguments`] and [`DatasetSearchArguments`] are nested namespace objects that
have **every single option** available on the Hub and that return what should be passed
to `filter`. Best of all: they have tab completion 🎊.

```python
>>> from huggingface_hub import ModelSearchArguments, DatasetSearchArguments

>>> model_args = ModelSearchArguments()
>>> dataset_args = DatasetSearchArguments()
```

<Tip warning={true}>

Before continuing, please be aware that [`ModelSearchArguments`] and [`DatasetSearchArguments`]
are legacy helpers meant for exploratory purposes only. Initializing them requires listing
all models and datasets on the Hub, which makes them increasingly slower as the number of repos
on the Hub grows. For production-ready code, consider passing raw strings when making
a filtered search on the Hub.

</Tip>

Now, let's see what is available in `model_args` by inspecting its output:

```python
>>> model_args
Available Attributes or Keys:
* author
* dataset
* language
* library
* license
* model_name
* pipeline_tag
```

It has a variety of attributes and keys available to you. This is because it is both an object
and a dictionary, so you can use either `model_args["author"]` or `model_args.author`.
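
To make that dual access pattern concrete, here is a minimal sketch of a `dict` subclass
that also exposes its keys as attributes, written in the spirit of the `AttributeDictionary`
helper this commit removes (illustrative only, not the library's actual implementation):

```python
# Minimal sketch: a dict whose keys are also readable as attributes.
# Illustrative only; not the removed AttributeDictionary implementation.
class AttrDict(dict):
    def __getattr__(self, key):
        # __getattr__ is only called when normal attribute lookup fails,
        # so real dict attributes and methods keep working.
        try:
            return self[key]
        except KeyError as err:
            raise AttributeError(key) from err

args = AttrDict(author="author", library="library")
assert args["author"] == args.author  # both access styles return the same value
```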

The first criterion is getting all PyTorch models. This would be found under the `library` attribute, so let's see if it is there:

```python
>>> model_args.library
Available Attributes or Keys:
* AdapterTransformers
* Asteroid
* ESPnet
* Fairseq
* Flair
* JAX
* Joblib
* Keras
* ONNX
* PyTorch
* Rust
* Scikit_learn
* SentenceTransformers
* Stable_Baselines3 (Key only)
* Stanza
* TFLite
* TensorBoard
* TensorFlow
* TensorFlowTTS
* Timm
* Transformers
* allenNLP
* fastText
* fastai
* pyannote_audio
* spaCy
* speechbrain
```

It is! The `PyTorch` name is there, so you'll need to use `model_args.library.PyTorch`:

```python
>>> model_args.library.PyTorch
'pytorch'
```

Below is an animation repeating the process for finding both the `Text Classification` and `glue` requirements:

![Animation exploring `model_args.pipeline_tag`](https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/search_text_classification.gif)

![Animation exploring `model_args.dataset`](https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/search_glue.gif)

Now that all the pieces are there, the last step is to combine them into something the
API can use through the [`ModelFilter`] and [`DatasetFilter`] classes (i.e. strings).


```python
>>> from huggingface_hub import ModelFilter, DatasetFilter

>>> filt = ModelFilter(
... task=model_args.pipeline_tag.TextClassification,
... trained_dataset=dataset_args.dataset_name.glue,
... library=model_args.library.PyTorch
... )
>>> api.list_models(filter=filt)[0]
ModelInfo: {
modelId: Jiva/xlm-roberta-large-it-mnli
sha: c6e64469ec4aa17fedbd1b2522256f90a90b5b86
lastModified: 2021-12-10T14:56:38.000Z
tags: ['pytorch', 'xlm-roberta', 'text-classification', 'it', 'dataset:multi_nli', 'dataset:glue', 'arxiv:1911.02116', 'transformers', 'tensorflow', 'license:mit', 'zero-shot-classification']
pipeline_tag: zero-shot-classification
siblings: [ModelFile(rfilename='.gitattributes'), ModelFile(rfilename='README.md'), ModelFile(rfilename='config.json'), ModelFile(rfilename='pytorch_model.bin'), ModelFile(rfilename='sentencepiece.bpe.model'), ModelFile(rfilename='special_tokens_map.json'), ModelFile(rfilename='tokenizer.json'), ModelFile(rfilename='tokenizer_config.json')]
config: None
id: Jiva/xlm-roberta-large-it-mnli
private: False
downloads: 11061
library_name: transformers
likes: 1
}
```

As you can see, it found the models that fit all the criteria. You can even take it further
by passing in an array for each of the parameters from before. For example, let's look at
the same configuration, but also include `TensorFlow` in the filter:


```python
>>> filt = ModelFilter(
... task=model_args.pipeline_tag.TextClassification,
... library=[model_args.library.PyTorch, model_args.library.TensorFlow]
... )
>>> api.list_models(filter=filt)[0]
ModelInfo: {
modelId: distilbert-base-uncased-finetuned-sst-2-english
sha: ada5cc01a40ea664f0a490d0b5f88c97ab460470
lastModified: 2022-03-22T19:47:08.000Z
tags: ['pytorch', 'tf', 'rust', 'distilbert', 'text-classification', 'en', 'dataset:sst-2', 'transformers', 'license:apache-2.0', 'infinity_compatible']
pipeline_tag: text-classification
siblings: [ModelFile(rfilename='.gitattributes'), ModelFile(rfilename='README.md'), ModelFile(rfilename='config.json'), ModelFile(rfilename='map.jpeg'), ModelFile(rfilename='pytorch_model.bin'), ModelFile(rfilename='rust_model.ot'), ModelFile(rfilename='tf_model.h5'), ModelFile(rfilename='tokenizer_config.json'), ModelFile(rfilename='vocab.txt')]
config: None
id: distilbert-base-uncased-finetuned-sst-2-english
private: False
downloads: 3917525
library_name: transformers
likes: 49
}
```

This query is strictly equivalent to:

```py
>>> filt = ModelFilter(
... task="text-classification",
... library=["pytorch", "tensorflow"],
... )
```
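
Putting it together, the same search can then be run without any helper objects; a minimal
sketch, assuming an [`HfApi`] instance as in the earlier examples:

```python
>>> from huggingface_hub import HfApi, ModelFilter

>>> api = HfApi()
>>> # Raw strings replace the ModelSearchArguments lookups entirely.
>>> filt = ModelFilter(task="text-classification", library=["pytorch", "tensorflow"])
>>> models = api.list_models(filter=filt)
```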

Here, [`ModelSearchArguments`] has been a helper to explore the options available on the Hub.
However, it is not a requirement to make a search. Another way to explore the available
filters on the Hub is to visit the [models](https://huggingface.co/models) and
[datasets](https://huggingface.co/datasets) pages in your browser, search for some parameters,
and look at the values in the URL.

7 changes: 0 additions & 7 deletions docs/source/en/package_reference/hf_api.md
@@ -114,10 +114,3 @@ Some helpers to filter repositories on the Hub are available in the `huggingface

[[autodoc]] ModelFilter

### DatasetSearchArguments

[[autodoc]] DatasetSearchArguments

### ModelSearchArguments

[[autodoc]] ModelSearchArguments
4 changes: 0 additions & 4 deletions src/huggingface_hub/__init__.py
@@ -136,12 +136,10 @@
"CommitOperationAdd",
"CommitOperationCopy",
"CommitOperationDelete",
"DatasetSearchArguments",
"GitCommitInfo",
"GitRefInfo",
"GitRefs",
"HfApi",
"ModelSearchArguments",
"RepoUrl",
"User",
"UserLikes",
@@ -457,12 +455,10 @@ def __dir__():
CommitOperationAdd, # noqa: F401
CommitOperationCopy, # noqa: F401
CommitOperationDelete, # noqa: F401
DatasetSearchArguments, # noqa: F401
GitCommitInfo, # noqa: F401
GitRefInfo, # noqa: F401
GitRefs, # noqa: F401
HfApi, # noqa: F401
ModelSearchArguments, # noqa: F401
RepoUrl, # noqa: F401
User, # noqa: F401
UserLikes, # noqa: F401
