Add document descriptions for Python APIs (#761)
* add document descriptions for python api

* modify CHANGELOG

* remark-lint line length 100

* update CHANGELOG

* correct whitespace
wonjuleee authored Nov 16, 2022
1 parent 48e4a47 commit c5327b3
Showing 5 changed files with 74 additions and 6 deletions.
4 changes: 3 additions & 1 deletion CHANGELOG.md
@@ -25,8 +25,10 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
(<https://github.com/openvinotoolkit/datumaro/pull/746>)
- Add Mask, SuperResolution, Depth visualization features
(<https://github.com/openvinotoolkit/datumaro/pull/747>)
- Add a documentation tab menu for Python API
- Add a documentation for Python API
(<https://github.com/openvinotoolkit/datumaro/pull/753>)
- Add dataset handler, visualizer, filter descriptions
(<https://github.com/openvinotoolkit/datumaro/pull/761>)
- Add `__repr__` for Dataset
(<https://github.com/openvinotoolkit/datumaro/pull/750>)
- Support for exporting as CVAT video format
1 change: 1 addition & 0 deletions site/content/en/docs/python-api/_index.md
@@ -10,6 +10,7 @@ Need to update the description.

## Contents
- [Python API examples](./python-api-examples)
- [Dataset Handler](./python-api-examples/dataset-handler)
- [Visualizer](./python-api-examples/visualizer)
- [Filter](./python-api-examples/filter)
- [Transform](./python-api-examples/transform)
@@ -0,0 +1,16 @@
---
title: 'Dataset handler'
linkTitle: 'dataset-handler'
description: ''
---

Datumaro provides dataset import and export functionality.

When several datasets are imported, Datumaro can manipulate them and merge them into a single
dataset. Manipulations such as reidentification, label redefinition, and filtration are mostly
covered under transformation, so here we describe how to merge two heterogeneous datasets through
`IntersectMerge`.

Jupyter Notebook Examples:
{{< blocks/notebook 01_merge_multiple_datasets_for_classification >}}
{{< blocks/notebook 02_merge_heterogeneous_datasets_for_detection >}}
49 changes: 46 additions & 3 deletions site/content/en/docs/python-api/python-api-examples/filter.md
@@ -4,7 +4,50 @@ linkTitle: 'filter'
description: ''
---

Need to update the description.
This API lets you filter a dataset so that only the items satisfying given conditions remain.
Filter queries are written as XML [XPath](https://devhints.io/xpath) expressions.

Jupyter Notebook Examples:
Need to update the description.
For instance, given the XML below, we can filter a dataset by subset name with
`/item[subset="minival2014"]`, by media id with `/item[id="290768"]`, by image size with
`/item[image/width=image/height]`, and by annotation fields such as id (`id`), type (`type`),
label (`label_id`), and bounding box (`x`, `y`, `w`, `h`).

``` xml
<item>
  <id>290768</id>
  <subset>minival2014</subset>
  <image>
    <width>612</width>
    <height>612</height>
    <depth>3</depth>
  </image>
  <annotation>
    <id>80154</id>
    <type>bbox</type>
    <label_id>39</label_id>
    <x>264.59</x>
    <y>150.25</y>
    <w>11.19</w>
    <h>42.31</h>
    <area>473.87</area>
  </annotation>
  <annotation>
    <id>669839</id>
    <type>bbox</type>
    <label_id>41</label_id>
    <x>163.58</x>
    <y>191.75</y>
    <w>76.98</w>
    <h>73.63</h>
    <area>5668.77</area>
  </annotation>
  ...
</item>
```
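
To get a feel for how such predicates select items, the following is a minimal sketch that
evaluates similar queries with Python's standard `xml.etree.ElementTree`. It is not Datumaro's own
filter: ElementTree implements only a subset of XPath, so the paths are written relative to a
hypothetical `<dataset>` wrapper element and the width/height comparison is done in Python. The
`train2014` subset name is likewise an assumption, used only to show a non-matching query.

```python
import xml.etree.ElementTree as ET

# Abbreviated copy of the item shown above (bounding-box fields omitted).
ITEM_XML = """
<item>
  <id>290768</id>
  <subset>minival2014</subset>
  <image><width>612</width><height>612</height><depth>3</depth></image>
  <annotation><id>80154</id><type>bbox</type><label_id>39</label_id></annotation>
  <annotation><id>669839</id><type>bbox</type><label_id>41</label_id></annotation>
</item>
"""

# Hypothetical <dataset> root, so that "item[...]" can be evaluated relative to it.
root = ET.fromstring(f"<dataset>{ITEM_XML}</dataset>")

# Item-level predicates: match on the text of a direct child element.
by_subset = root.findall("item[subset='minival2014']")
by_id = root.findall("item[id='290768']")
other_subset = root.findall("item[subset='train2014']")  # assumed name, matches nothing


def is_square(item):
    """Stand-in for image/width=image/height, which ElementTree cannot express."""
    return item.findtext("image/width") == item.findtext("image/height")


square_items = [item for item in root.findall("item") if is_square(item)]

# Annotation-level predicate: select annotations by label id.
label_41 = root.findall("item/annotation[label_id='41']")
```

A full XPath engine can evaluate the expressions from the text directly, including the comparison
between two child values.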

To filter by annotation, set the `filter_annotations` argument to `True`. The `remove_empty`
argument removes every media item whose annotations are empty. Note that datasets are updated
in place by default.
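
The effect of these two arguments can be sketched in plain Python. This is an illustration of the
behavior described above, not Datumaro's implementation: the helper `filter_annotations_sketch`,
the dictionary layout, and the item id `hypothetical-0` are all assumptions made for the example.

```python
from typing import Callable, Dict, List

Annotation = Dict[str, object]
Item = Dict[str, object]


def filter_annotations_sketch(
    items: List[Item],
    keep: Callable[[Annotation], bool],
    remove_empty: bool = False,
) -> List[Item]:
    """Hypothetical stand-in for annotation-level filtering; mutates `items` in place."""
    for item in items:
        # Drop every annotation that does not satisfy the condition.
        item["annotations"] = [a for a in item["annotations"] if keep(a)]
    if remove_empty:
        # Drop items that are left without any annotation.
        items[:] = [item for item in items if item["annotations"]]
    return items


def make_items() -> List[Item]:
    return [
        {"id": "290768", "annotations": [{"id": 80154, "label_id": 39},
                                         {"id": 669839, "label_id": 41}]},
        {"id": "hypothetical-0", "annotations": [{"id": 1, "label_id": 39}]},
    ]


def has_label_41(annotation: Annotation) -> bool:
    return annotation["label_id"] == 41


# Without remove_empty, the second item stays but has no annotations left.
kept_all = filter_annotations_sketch(make_items(), has_label_41)

# With remove_empty, the emptied item is removed and the input list itself is updated.
items = make_items()
result = filter_annotations_sketch(items, has_label_41, remove_empty=True)
```

Because the list is modified in place, keep a copy of the original if you still need the
unfiltered items afterwards.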

Jupyter Notebook Example:
{{< blocks/notebook 04_filter >}}
@@ -4,7 +4,13 @@ linkTitle: 'visualizer'
description: ''
---

Need to update the description.
This API lets you visualize dataset items selected by their media ids.

Jupyter Notebook Examples:
Although Datumaro supports various kinds of vision tasks, e.g., classification, object detection,
semantic segmentation, keypoint estimation, visual captioning, etc., we provide a task-agnostic
visualization tool. That is, regardless of annotation type, `vis_gallery` draws a gallery of
images with their annotations overlaid, given a list of media ids. We can control the
transparency of annotations over images by adjusting `alpha`.
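
As a rough illustration of what a transparency value does, the sketch below composites one
overlay pixel over one image pixel. It assumes the usual convention that `alpha` lies in [0, 1],
with 0 leaving the image untouched and 1 drawing the annotation fully opaque. The helper `blend`
is hypothetical and is not part of Datumaro.

```python
from typing import Tuple

Pixel = Tuple[int, int, int]


def blend(image_px: Pixel, overlay_px: Pixel, alpha: float) -> Pixel:
    """Composite an RGB overlay pixel over an image pixel (hypothetical helper)."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be between 0 and 1")
    # Per channel: weighted average of overlay and image.
    return tuple(
        round(alpha * o + (1.0 - alpha) * i) for i, o in zip(image_px, overlay_px)
    )


transparent = blend((100, 100, 100), (200, 0, 100), alpha=0.0)
half = blend((100, 100, 100), (200, 0, 100), alpha=0.5)
opaque = blend((100, 100, 100), (200, 0, 100), alpha=1.0)
```

Lower values keep more of the underlying image visible, which helps when annotations such as
masks cover large areas.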

Jupyter Notebook Example:
{{< blocks/notebook 03_visualize >}}
