feat(client): adapt openAPI "getStatistics"
graczhual committed Sep 27, 2021
1 parent 3072ee3 commit 44eb6f1
Showing 9 changed files with 255 additions and 28 deletions.
1 change: 1 addition & 0 deletions docs/source/api/client/client_module.rst
@@ -16,3 +16,4 @@ tensorbay.client
version
diff
profile
statistics
6 changes: 6 additions & 0 deletions docs/source/api/client/statistics.rst
@@ -0,0 +1,6 @@
tensorbay.client.statistics
===========================

.. automodule:: tensorbay.client.statistics
:members:
:show-inheritance:
65 changes: 65 additions & 0 deletions docs/source/examples/get_label_statistics.rst
@@ -0,0 +1,65 @@
######################
Get Label Statistics
######################

This topic describes how to get the label statistics of a dataset.

The label statistics of a dataset can be obtained via :meth:`~tensorbay.client.dataset.DatasetClientBase.get_label_statistics`
as follows:

>>> from tensorbay import GAS
>>> ACCESS_KEY = "Accesskey-*****"
>>> gas = GAS(ACCESS_KEY)
>>> dataset_client = gas.get_dataset("targetDataset")
>>> statistics = dataset_client.get_label_statistics()
>>> statistics
Statistics {
  'BOX2D': {...},
  'BOX3D': {...},
  'KEYPOINTS2D': {...}
}

The details of the statistics structure for the targetDataset are as follows:

.. code-block:: json

    {
        "BOX2D": {
            "quantity": 1508722,
            "categories": [
                {
                    "name": "vehicle.bike",
                    "quantity": 8425,
                    "attributes": [
                        {
                            "name": "trafficLightColor",
                            "enum": ["none", "red", "yellow"],
                            "quantities": [8420, 3, 2]
                        }
                    ]
                }
            ],
            "attributes": [
                {
                    "name": "trafficLightColor",
                    "enum": ["none", "red", "yellow", "green"],
                    "quantities": [1356224, 54481, 4107, 93910]
                }
            ]
        },
        "BOX3D": {
            "quantity": 1234
        },
        "KEYPOINTS2D": {
            "quantity": 43234,
            "categories": [
                {
                    "name": "person.person",
                    "quantity": 43234
                }
            ]
        }
    }

.. note::
   The method :meth:`~tensorbay.client.statistics.Statistics.dumps` of :class:`~tensorbay.client.statistics.Statistics` returns the statistics as a dict.
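Because the dumped statistics are a plain nested dict, they can be post-processed with ordinary Python. The following is a minimal sketch, assuming the structure shown in the JSON example above. The helper name `attribute_shares` is hypothetical and not part of TensorBay, and the dict literal stands in for the value of `statistics.dumps()["BOX2D"]`.

```python
from typing import Any, Dict


def attribute_shares(label_stats: Dict[str, Any]) -> Dict[str, Dict[Any, float]]:
    """Return, per attribute, the fraction of labels carrying each enum value.

    Hypothetical helper (not part of TensorBay). ``label_stats`` is the entry
    for one label type, for example ``statistics.dumps()["BOX2D"]``.
    """
    shares: Dict[str, Dict[Any, float]] = {}
    # "attributes" is optional: BOX3D in the example has only "quantity".
    for attribute in label_stats.get("attributes", []):
        total = sum(attribute["quantities"])
        if total == 0:
            continue  # avoid dividing by zero for an attribute with no labels
        # "enum" and "quantities" are parallel lists, so pair them up.
        shares[attribute["name"]] = {
            value: count / total
            for value, count in zip(attribute["enum"], attribute["quantities"])
        }
    return shares


# Stand-in for ``statistics.dumps()["BOX2D"]``, copied from the example data.
box2d = {
    "quantity": 1508722,
    "attributes": [
        {
            "name": "trafficLightColor",
            "enum": ["none", "red", "yellow", "green"],
            "quantities": [1356224, 54481, 4107, 93910],
        }
    ],
}

shares = attribute_shares(box2d)
print(round(shares["trafficLightColor"]["red"], 4))
```

The sketch relies only on the `enum` and `quantities` lists being the same length and in the same order, which is how they appear in the example data.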
9 changes: 9 additions & 0 deletions docs/source/features/dataset_management.rst
@@ -10,6 +10,7 @@ This topic describes dataset management, including:
- :ref:`features/dataset_management:Update Dataset`
- :ref:`features/dataset_management:Move and Copy`
- :ref:`features/dataset_management:Merge Datasets`
- :ref:`features/dataset_management:Get Label Statistics`


******************
@@ -120,3 +121,11 @@ Please see :ref:`Move and copy<examples/move_and_copy:Move And Copy>` example fo
Since TensorBay supports copy operation between different datasets, users can use it to merge datasets.

Please see :ref:`examples/merge_datasets:Merge Datasets` example for more details.

**********************
Get Label Statistics
**********************

TensorBay supports getting the label statistics of a dataset.

Please see :ref:`examples/get_label_statistics:Get Label Statistics` example for more details.
58 changes: 30 additions & 28 deletions docs/source/quick_start/examples.rst
@@ -10,35 +10,36 @@ The following table lists a series of examples to help developers to use TensorB
:align: center
:widths: auto

======================================================== =========================================================================
Examples Description
======================================================== =========================================================================
:ref:`examples/DogsVsCats:Dogs vs Cats` | Topic: Dataset Management
| Data Type: Image
| Label Type: :ref:`reference/label_format/Classification:Classification`
:ref:`examples/Newsgroups20:20 Newsgroups` | Topic: Dataset Management
| Data Type: Text
| Label Type: :ref:`reference/label_format/Classification:Classification`
:ref:`examples/BSTLD:BSTLD` | Topic: Dataset Management
| Data Type: Image
| Label Type: :ref:`reference/label_format/Box2D:Box2D`
:ref:`examples/NeolixOD:Neolix OD` | Topic: Dataset Management
| Data Type: Point Cloud
| Label Type: :ref:`reference/label_format/Box3D:Box3D`
:ref:`examples/LeedsSportsPose:Leeds Sports Pose` | Topic: Dataset Management
| Data Type: Image
| Label Type: :ref:`reference/label_format/Keypoints2D:Keypoints2D`
:ref:`examples/THCHS30:THCHS-30` | Topic: Dataset Management
| Data Type: Audio
| Label Type: :ref:`reference/label_format/Sentence:Sentence`
:ref:`examples/VOC2012Segmentation:VOC2012 Segmentation` | Topic: Dataset Management
| Data Type: Image
| Label Types: :ref:`reference/label_format/SemanticMask:SemanticMask`,
:ref:`reference/label_format/InstanceMask:InstanceMask`
:ref:`examples/update_dataset:Update Dataset` | Topic: Update Dataset
:ref:`examples/move_and_copy:Move And Copy` | Topic: Move And Copy
========================================================= =========================================================================
Examples Description
========================================================= =========================================================================
:ref:`examples/DogsVsCats:Dogs vs Cats` | Topic: Dataset Management
| Data Type: Image
| Label Type: :ref:`reference/label_format/Classification:Classification`
:ref:`examples/Newsgroups20:20 Newsgroups` | Topic: Dataset Management
| Data Type: Text
| Label Type: :ref:`reference/label_format/Classification:Classification`
:ref:`examples/BSTLD:BSTLD` | Topic: Dataset Management
| Data Type: Image
| Label Type: :ref:`reference/label_format/Box2D:Box2D`
:ref:`examples/NeolixOD:Neolix OD` | Topic: Dataset Management
| Data Type: Point Cloud
| Label Type: :ref:`reference/label_format/Box3D:Box3D`
:ref:`examples/LeedsSportsPose:Leeds Sports Pose` | Topic: Dataset Management
| Data Type: Image
| Label Type: :ref:`reference/label_format/Keypoints2D:Keypoints2D`
:ref:`examples/THCHS30:THCHS-30` | Topic: Dataset Management
| Data Type: Audio
| Label Type: :ref:`reference/label_format/Sentence:Sentence`
:ref:`examples/VOC2012Segmentation:VOC2012 Segmentation` | Topic: Dataset Management
| Data Type: Image
| Label Types: :ref:`reference/label_format/SemanticMask:SemanticMask`,
:ref:`reference/label_format/InstanceMask:InstanceMask`
:ref:`examples/update_dataset:Update Dataset` | Topic: Update Dataset
:ref:`examples/move_and_copy:Move And Copy` | Topic: Move And Copy
:ref:`examples/merge_datasets:Merge Datasets` | Topic: Merge Datasets
======================================================== =========================================================================
:ref:`examples/get_label_statistics:Get Label Statistics` | Topic: Get Label Statistics
========================================================= =========================================================================

.. toctree::
:hidden:
@@ -54,3 +55,4 @@ The following table lists a series of examples to help developers to use TensorB
../examples/update_dataset
../examples/move_and_copy
../examples/merge_datasets
../examples/get_label_statistics
15 changes: 15 additions & 0 deletions tensorbay/client/dataset.py
@@ -35,6 +35,7 @@
from .log import UPLOAD_SEGMENT_RESUME_TEMPLATE
from .requests import Tqdm, multithread_upload
from .segment import _STRATEGIES, FusionSegmentClient, SegmentClient
from .statistics import Statistics
from .status import Status
from .version import VersionControlClient

@@ -278,6 +279,20 @@ def delete_segment(self, name: str) -> None:

self._client.open_api_do("DELETE", "segments", self._dataset_id, json=delete_data)

    def get_label_statistics(self) -> Statistics:
        """Get label statistics of the dataset.

        Returns:
            Required :class:`~tensorbay.client.statistics.Statistics`.

        """
        params: Dict[str, Any] = self._status.get_status_info()
        return Statistics(
            self._client.open_api_do(
                "GET", "labels/statistics", self._dataset_id, params=params
            ).json()["labelStatistics"]
        )
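To make the data flow of `get_label_statistics` concrete, here is a self-contained sketch that replaces the real HTTP client with a stub. `FakeResponse`, `FakeClient`, and the standalone `get_label_statistics` function are hypothetical stand-ins. Only the call shape (`"GET"`, `"labels/statistics"`, the dataset id, `params`) and the `"labelStatistics"` key are taken from the method above. The sketch returns a plain dict instead of `Statistics`, and the `{"commit": ...}` params are an assumed example of what `get_status_info()` might return.

```python
from typing import Any, Dict, List, Tuple


class FakeResponse:
    """Hypothetical stand-in for the HTTP response object."""

    def __init__(self, payload: Dict[str, Any]) -> None:
        self._payload = payload

    def json(self) -> Dict[str, Any]:
        return self._payload


class FakeClient:
    """Hypothetical stand-in for the client; records calls instead of sending them."""

    def __init__(self, payload: Dict[str, Any]) -> None:
        self._payload = payload
        self.calls: List[Tuple[str, str, str, Dict[str, Any]]] = []

    def open_api_do(
        self, method: str, section: str, dataset_id: str = "", **kwargs: Any
    ) -> FakeResponse:
        self.calls.append((method, section, dataset_id, kwargs))
        return FakeResponse(self._payload)


def get_label_statistics(
    client: FakeClient, dataset_id: str, params: Dict[str, Any]
) -> Dict[str, Any]:
    """Mirror the method above: one GET request, then unwrap "labelStatistics"."""
    response = client.open_api_do("GET", "labels/statistics", dataset_id, params=params)
    return response.json()["labelStatistics"]


client = FakeClient({"labelStatistics": {"BOX3D": {"quantity": 1234}}})
result = get_label_statistics(client, "dataset-id", {"commit": "abc"})
print(result)  # {'BOX3D': {'quantity': 1234}}
```

The recorded `client.calls` list plays the same role as `assert_called_once_with` in the unit test: it lets a test check the request arguments without touching the network.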


class DatasetClient(DatasetClientBase):
"""This class defines :class:`DatasetClient`.
68 changes: 68 additions & 0 deletions tensorbay/client/statistics.py
@@ -0,0 +1,68 @@
#!/usr/bin/env python3
#
# Copyright 2021 Graviti. Licensed under MIT License.
#

"""Class Statistics.
:class:`Statistics` defines the basic structure of the label statistics obtained by
:meth:`DatasetClientBase.get_label_statistics`.
"""
from typing import Any, Dict

from ..utility import UserMapping


class Statistics(UserMapping[str, Any]):  # pylint: disable=too-many-ancestors
    """This class defines the basic structure of the label statistics.

    Arguments:
        data: The dict containing label statistics.

    """

    def __init__(self, data: Dict[str, Any]) -> None:
        self._data: Dict[str, Any] = data

    def dumps(self) -> Dict[str, Any]:
        """Dumps the label statistics into a dict.

        Returns:
            A dict containing all the information of the label statistics.

        Examples:
            >>> label_statistics = Statistics(
            ...     {
            ...         'BOX3D': {
            ...             'quantity': 1234
            ...         },
            ...         'KEYPOINTS2D': {
            ...             'quantity': 43234,
            ...             'categories': [
            ...                 {
            ...                     'name': 'person.person',
            ...                     'quantity': 43234
            ...                 }
            ...             ]
            ...         }
            ...     }
            ... )
            >>> label_statistics.dumps()
            {
                'BOX3D': {
                    'quantity': 1234
                },
                'KEYPOINTS2D': {
                    'quantity': 43234,
                    'categories': [
                        {
                            'name': 'person.person',
                            'quantity': 43234
                        }
                    ]
                }
            }

        """
        return self._data
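The class above gets its dict-like behaviour from `UserMapping`, whose implementation is not shown in this commit. As a rough sketch of what such a wrapper involves, the example below uses the standard library's `collections.abc.Mapping` as a stand-in. This is an assumption about `UserMapping`, and the name `StatisticsSketch` is hypothetical.

```python
from collections.abc import Mapping
from typing import Any, Dict, Iterator


class StatisticsSketch(Mapping):
    """Read-only mapping over a label-statistics dict (illustrative stand-in)."""

    def __init__(self, data: Dict[str, Any]) -> None:
        self._data = data

    # Mapping needs only these three methods; it derives keys(), items(),
    # get(), "in" and "==" from them.
    def __getitem__(self, key: str) -> Any:
        return self._data[key]

    def __iter__(self) -> Iterator[str]:
        return iter(self._data)

    def __len__(self) -> int:
        return len(self._data)

    def dumps(self) -> Dict[str, Any]:
        # Like the method above, this returns the wrapped dict itself, not a copy.
        return self._data


stats = StatisticsSketch({"BOX3D": {"quantity": 1234}})
print(stats["BOX3D"]["quantity"])  # 1234
print(list(stats))  # ['BOX3D']
print(stats == StatisticsSketch({"BOX3D": {"quantity": 1234}}))  # True
print(stats.dumps() is stats._data)  # True
```

Since `dumps` returns the wrapped dict rather than a copy, changing the returned dict also changes the statistics object. A caller that needs an independent copy can pass the result through `copy.deepcopy`.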
33 changes: 33 additions & 0 deletions tensorbay/client/tests/test_dataset.py
@@ -16,6 +16,7 @@
from ..lazy import ReturnGenerator
from ..requests import Tqdm
from ..segment import FusionSegmentClient, SegmentClient
from ..statistics import Statistics
from ..status import Status
from ..struct import ROOT_COMMIT_ID
from .utility import mock_response
@@ -247,6 +248,38 @@ def test_delete_segment(self, mocker):
"DELETE", "segments", self.dataset_client._dataset_id, json=delete_data
)

    def test_get_label_statistics(self, mocker):
        params = self.dataset_client._status.get_status_info()
        response_data = {
            "labelStatistics": {
                "BOX2D": {
                    "quantity": 10,
                    "categories": [
                        {
                            "name": "vehicles.bike",
                            "quantity": 10,
                            "attributes": [
                                {
                                    "name": "trafficLightColor",
                                    "enum": ["none", "red", "yellow"],
                                    "quantities": [5, 3, 2],
                                }
                            ],
                        }
                    ],
                }
            }
        }
        open_api_do = mocker.patch(
            f"{gas.__name__}.Client.open_api_do",
            return_value=mock_response(data=response_data),
        )
        statistics1 = self.dataset_client.get_label_statistics()
        open_api_do.assert_called_once_with(
            "GET", "labels/statistics", self.dataset_client.dataset_id, params=params
        )
        assert statistics1 == Statistics(response_data["labelStatistics"])


class TestDatasetClient(TestDatasetClientBase):
def test__generate_segments(self, mocker):
28 changes: 28 additions & 0 deletions tests/test_upload.py
@@ -3,11 +3,14 @@
# Copyright 2021 Graviti. Licensed under MIT License.
#

import enum

import pytest
import ulid

from tensorbay import GAS
from tensorbay.client.gas import DEFAULT_BRANCH
from tensorbay.client.statistics import Statistics
from tensorbay.dataset import Data, Dataset, Frame, FusionSegment, Segment
from tensorbay.exception import FrameError, ResourceNotExistError, ResponseError
from tensorbay.label import Catalog, Label
@@ -75,6 +78,28 @@
]
}

STATISTICS = {
"BOX2D": {
"quantity": 10,
"categories": [
{
"name": "01",
"quantity": 10,
"attributes": [
{"name": "Vertical angle", "enum": [-90], "quantities": [10]},
{"name": "Horizontal angle", "enum": [60], "quantities": [10]},
{"name": "Serie", "enum": [1], "quantities": [10]},
],
}
],
"attributes": [
{"name": "Vertical angle", "enum": [-90], "quantities": [10]},
{"name": "Horizontal angle", "enum": [60], "quantities": [10]},
{"name": "Serie", "enum": [1], "quantities": [10]},
],
}
}


class TestUploadDataset:
def test_upload_dataset_only_with_file(self, accesskey, url, tmp_path):
@@ -107,6 +132,7 @@ def test_upload_dataset_only_with_file(self, accesskey, url, tmp_path):

gas_client.delete_dataset(dataset_name)

@pytest.mark.xfail(reason="backend statistics are wrong")
def test_upload_dataset_with_label(self, accesskey, url, tmp_path):
gas_client = GAS(access_key=accesskey, url=url)
dataset_name = get_dataset_name()
@@ -133,6 +159,8 @@ def test_upload_dataset_with_label(self, accesskey, url, tmp_path):
assert segment1[0].path == "hello0.txt"
assert segment1[0].label

statistics1 = dataset_client.get_label_statistics()
assert statistics1 == Statistics(STATISTICS)
gas_client.delete_dataset(dataset_name)

def test_upload_dataset_to_given_branch(self, accesskey, url, tmp_path):