Introduce Tag Listing (#537)
* Initial attempt at a tag getter

* Clean + format

* Add delattr similar to getattr, start tests

* Add some more tests (more to come)

* Increase verbosity

* Add quality

* Improve docstring

* Style

* Finish testing

* Clean + docstring verbosity

* Fix up tests

* Adjust for periods and numbers, improve tests

* Fixup documentation + final changes

* Table error

* Update docs/hub/endpoints.md

Co-authored-by: Omar Sanseviero <osanseviero@users.noreply.github.com>

* Expand docstrings, fix edgecase

* Style

* Fix tests

* Fix tests, finalize implementation

* Bring back some tests

* Update some_text.txt

* Update some_text.txt

* Update some_text.txt

* Fix

* Test readability

Co-authored-by: Omar Sanseviero <osanseviero@users.noreply.github.com>
muellerzr and osanseviero authored Dec 20, 2021
1 parent 1e36d73 commit 3d814e0
Showing 6 changed files with 346 additions and 1 deletion.
4 changes: 3 additions & 1 deletion docs/hub/endpoints.md
@@ -14,15 +14,17 @@ We have open endpoints that you can use to retrieve information from the Hub as
|------------------------------------------------------------------------------|------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|----------------------------------------|-----------------------------------------------------------------------------------------------------------------------------------------------------------------------|-----------------------------------------------------|
| /api/models GET | Get information from all models in the Hub. You can specify additional parameters to have more specific results. - `search`: Filter based on substrings for repos and their usernames, such as `resnet` or `microsoft` - `author`: Filter models by an author or organization, such as `huggingface` or `microsoft` - `filter`: Filter based on tags, such as `text-classification` or `spacy`. - `sort`: Property to use when sorting, such as `downloads` or `author`. - `direction`: Direction in which to sort, such as `-1` for descending, and anything else for ascending. - `limit`: Limit the number of models fetched. - `full`: Whether to fetch most model data, such as all tags, the files, etc. - `config`: Whether to also fetch the repo config. | `list_models()` | ```params= { "search":"search", "author":"author", "filter":"filter", "sort":"sort", "direction":"direction", "limit":"limit", "full":"full", "config":"config"}``` | |
| /api/models/{repo_id} /api/models/{repo_id}/revision/{revision} GET | Get all information for a specific model. | `model_info(repo_id, revision)` | ```headers = { "authorization" : "Bearer $token" }``` | |
| /api/models-tags-by-type GET | Gets all the available model tags hosted in the Hub | `get_model_tags()` | | |
| /api/datasets GET | Get information from all datasets in the Hub. You can specify additional parameters to have more specific results. - `search`: Filter based on substrings for repos and their usernames, such as `pets` or `microsoft` - `author`: Filter datasets by an author or organization, such as `huggingface` or `microsoft` - `filter`: Filter based on tags, such as `task_categories:text-classification` or `languages:en`. - `sort`: Property to use when sorting, such as `downloads` or `author`. - `direction`: Direction in which to sort, such as `-1` for descending, and anything else for ascending. - `limit`: Limit the number of datasets fetched. - `full`: Whether to fetch most dataset data, such as all tags, the files, etc. | `list_datasets()` | ```params= { "search":"search", "author":"author", "filter":"filter", "sort":"sort", "direction":"direction", "limit":"limit", "full":"full", "config":"config"}``` | |
| /api/datasets/{repo_id} /api/datasets/{repo_id}/revision/{revision} GET | Get all information for a specific dataset. - `full`: Whether to fetch most dataset data, such as all tags, the files, etc. | `dataset_info(repo_id, revision)` | ```headers = { "authorization" : "Bearer $token", "full" : "full" }``` | |
| /api/datasets-tags-by-type GET | Gets all the available dataset tags hosted in the Hub | `get_dataset_tags()` | | |
| /api/metrics GET | Get information from all metrics in the Hub. | `list_metrics()` | | |
| /api/repos/ls GET ⚠️ deprecated | Get list of all stored files for user or organization. | `list_repos_objs(token, organization)` | ```headers = { "authorization" : "Bearer $token" }``` ```params= { "organization":"organization"}``` | |
| /api/repos/create POST | Create a repository. It's a model repo by default. - type: Type of repo (datasets or spaces; model by default). - name: Name of repo. - organization: Name of organization. - private: Whether the repo is private. | `create_repo()` | ```headers = { "authorization" : "Bearer $token" }``` ```json= {"type":"type", "name":"name", "organization":"organization", "private":"private"}``` | |
| /api/repos/delete DELETE | Delete a repository. It's a model repo by default. - type: Type of repo (datasets or spaces; model by default). - name: Name of repo. - organization: Name of organization. | `delete_repo()` | ```headers = { "authorization" : "Bearer $token" }``` ```json= {"type":"type", "name":"name", "organization":"organization"}``` | |
| /api/repos/{type}/{repo_id}/settings PUT | Update repo visibility. | `update_repo_visibility()` | ```headers = { "authorization" : "Bearer $token" }``` ```json= {"private":"private"}``` | |
| /api/{type}/{repo_id}/ upload/{revision}/{path_in_repo} POST | Upload a file to a specific repository. | `upload_file()` | ```headers = { "authorization" : "Bearer $token" }``` ```"data"="bytestream"``` | |
| /api/login POST | Login user and obtain authentication token. | `login(username, password)` | ```json = { "username" : "username", "password": "password" }``` | |
| /api/whoami GET | Get username and organizations the user belongs to. | `whoami(token)` | ```headers = { "authorization" : "Bearer $token" }``` | |
| /api/logout POST | Log out user. | `logout(token)` | ```headers = { "authorization" : "Bearer $token" }``` | |

2 changes: 2 additions & 0 deletions src/huggingface_hub/README.md
@@ -121,9 +121,11 @@ With the `HfApi` class there are methods to query models, datasets, and metrics
- **Models**:
  - `list_models()`
  - `model_info()`
  - `get_model_tags()`
- **Datasets**:
  - `list_datasets()`
  - `dataset_info()`
  - `get_dataset_tags()`

These methods are thin wrappers around the API endpoints. Valid parameters and their descriptions are documented [here](https://huggingface.co/docs/hub/endpoints).
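
As a usage sketch for the two tag methods listed above, the helper below collects both tag sets from any object that exposes `get_model_tags()` and `get_dataset_tags()`, such as an `HfApi` instance. `fetch_all_tags` is a hypothetical helper introduced for illustration and is not part of the library.

```python
from typing import Any, Dict


def fetch_all_tags(api: Any) -> Dict[str, Any]:
    """Collect model and dataset tags from an HfApi-like object.

    `api` only needs to expose get_model_tags() and get_dataset_tags(),
    so a stub can stand in for the real client in tests.
    """
    return {
        "models": api.get_model_tags(),
        "datasets": api.get_dataset_tags(),
    }
```

With the library installed, `fetch_all_tags(HfApi())` would trigger one request per endpoint and return the two tag objects side by side.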

2 changes: 2 additions & 0 deletions src/huggingface_hub/__init__.py
@@ -38,7 +38,9 @@
    dataset_info,
    delete_file,
    delete_repo,
    get_dataset_tags,
    get_full_repo_name,
    get_model_tags,
    list_datasets,
    list_metrics,
    list_models,
20 changes: 20 additions & 0 deletions src/huggingface_hub/hf_api.py
@@ -32,6 +32,7 @@
    REPO_TYPES_URL_PREFIXES,
    SPACES_SDK_TYPES,
)
from .utils.tags import DatasetTags, ModelTags


if sys.version_info >= (3, 8):
@@ -417,6 +418,22 @@ def set_access_token(access_token: str):
    def unset_access_token():
        erase_from_credential_store(USERNAME_PLACEHOLDER)

    def get_model_tags(self) -> ModelTags:
        "Gets all valid model tags as a nested namespace object"
        path = f"{self.endpoint}/api/models-tags-by-type"
        r = requests.get(path)
        r.raise_for_status()
        d = r.json()
        return ModelTags(d)

    def get_dataset_tags(self) -> DatasetTags:
        "Gets all valid dataset tags as a nested namespace object"
        path = f"{self.endpoint}/api/datasets-tags-by-type"
        r = requests.get(path)
        r.raise_for_status()
        d = r.json()
        return DatasetTags(d)

    def list_models(
        self,
        filter: Union[str, Iterable[str], None] = None,
@@ -1154,6 +1171,9 @@ def delete_token(cls):

list_metrics = api.list_metrics

get_model_tags = api.get_model_tags
get_dataset_tags = api.get_dataset_tags

create_repo = api.create_repo
delete_repo = api.delete_repo
update_repo_visibility = api.update_repo_visibility
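
The diff for `utils/tags.py` is not shown on this page, so the following is only a hypothetical sketch of what a "nested namespace object" could look like. It is not the code from this commit. It assumes the endpoint returns a mapping from tag type to a list of entries with `"id"` and `"label"` keys. `AttrDict`, `to_attr_name`, and `build_tag_namespace` are names made up for this example, and the rule for labels containing periods or leading digits is one possible choice rather than the library's.

```python
import re
from typing import Any, Dict, List


def to_attr_name(label: str) -> str:
    """Turn a tag label into a valid Python identifier (hypothetical rule)."""
    name = re.sub(r"\W", "_", label)  # spaces, dashes, periods -> underscores
    if not name or name[0].isdigit():
        name = "_" + name  # identifiers cannot be empty or start with a digit
    return name


class AttrDict(dict):
    """A dict whose keys can also be read, set, and deleted as attributes."""

    def __getattr__(self, key: str) -> Any:
        try:
            return self[key]
        except KeyError:
            raise AttributeError(key) from None

    def __setattr__(self, key: str, value: Any) -> None:
        self[key] = value

    def __delattr__(self, key: str) -> None:
        try:
            del self[key]
        except KeyError:
            raise AttributeError(key) from None


def build_tag_namespace(tags_by_type: Dict[str, List[Dict[str, str]]]) -> AttrDict:
    """Map {type: [{"id", "label"}, ...]} to namespace.type.label == id."""
    root = AttrDict()
    for tag_type, entries in tags_by_type.items():
        group = AttrDict()
        for entry in entries:
            group[to_attr_name(entry["label"])] = entry["id"]
        root[to_attr_name(tag_type)] = group
    return root
```

Under these assumptions, a payload entry such as `{"id": "pytorch", "label": "PyTorch"}` under `"library"` would be reachable as `namespace.library.PyTorch`, which makes tags discoverable through tab completion.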