Fix wrong example of Datumaro dataset creation in document #1195

Merged (4 commits) on Nov 21, 2023
4 changes: 4 additions & 0 deletions CHANGELOG.md
@@ -12,6 +12,10 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
- Add ImageColorScale context manager
(<https://github.com/openvinotoolkit/datumaro/pull/1194>)

+### Bug fixes
+- Fix wrong example of Datumaro dataset creation in document
+  (<https://github.com/openvinotoolkit/datumaro/pull/1195>)
+
## 16/11/2023 - Release 1.5.1
### Enhancements
- Enhance Datumaro data format stream importer performance
@@ -8,7 +8,7 @@ Explorer is a feature that operates on hash basis. Once you put dataset that use

To explore similar data in dataset, you need to set query first. Query could be image, text, list of images, list of texts and list of images and texts. The query does not need to be an image that exists in the dataset. You can put in any data that you want to explore similar dataset. And you need to set top-k that how much you want to find similar data. The default value for top-k is 50, so if you hope to find more smaller results, you would set top-k. For single query, we computed hamming distance of hash between whole dataset and query. And we sorted those distance and select top-k data which have short distance. For list query, we repeated computing distance for each query and select top-k data based on distance among all dataset.
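The ranking described in the paragraph above can be sketched in plain Python. This is only an illustration, not Datumaro's implementation: `hamming_distance` and `top_k_similar` are hypothetical helper names, hashes are assumed to be plain integers, and for a list query the sketch assumes each item is scored by its smallest distance to any query.

```python
def hamming_distance(a: int, b: int) -> int:
    """Count the bits that differ between two integer-encoded hashes."""
    return bin(a ^ b).count("1")


def top_k_similar(query_hashes, dataset_hashes, k=50):
    """Return the ids of the k dataset items closest to the query.

    query_hashes: list of integer hashes (one entry for a single query).
    dataset_hashes: dict mapping item id -> integer hash.
    """
    scored = []
    for item_id, item_hash in dataset_hashes.items():
        # Assumption: with several queries, keep the smallest distance.
        best = min(hamming_distance(q, item_hash) for q in query_hashes)
        scored.append((best, item_id))
    # Shortest distance first; ties fall back to the item id.
    scored.sort()
    return [item_id for _, item_id in scored[:k]]
```

A single query is passed as a one-element list, and `k` plays the role of top-k with the documented default of 50.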

-The command can be applied to a dataset. And if you want to use multiple dataset as database, you could use merged dataset. The current project (`-p/--project`) is also used a context for plugins, so it can be useful for dataset paths having custom formats. When not specified, the current project's working tree is used. To save visualized result (`-s/--save`) is turned off as default. This visualized result is based on [Visualizer](../../jupyter_notebook_examples/visualizer).
+The command can be applied to a dataset. And if you want to use multiple dataset as database, you could use merged dataset. The current project (`-p/--project`) is also used a context for plugins, so it can be useful for dataset paths having custom formats. When not specified, the current project's working tree is used. To save visualized result (`-s/--save`) is turned off as default. This visualized result is based on [Visualizer](../../jupyter_notebook_examples/notebooks/03_visualize).

Usage:
```console
3 changes: 1 addition & 2 deletions docs/source/docs/data-formats/datumaro_format.md
@@ -4,8 +4,7 @@ So far, in the field of computer vision, there are various tasks such as classif
and segmentation, as well as pose estimation and visual tracking, and public data is used by providing
a format suitable for each task. Even within the same segmentation task, some data formats provide
annotation information as polygons, while others provide mask form. In order to ensure compatibility
-with different tasks and formats, we provide a novel Datumaro format with `.json` ([Datumaro](../explanation/formats/datumaro.md)) or `.datum` ([DatumaroBinary](../explanation/formats/datumaro.md))
-extensions.
+with different tasks and formats, we provide a novel Datumaro format with `.json` ([Datumaro](./formats/datumaro)) or `.datum` ([DatumaroBinary](./formats/datumaro_binary)) extensions.

A variety of metadata can be stored in the datumaro format. First of all, `dm_format_version` field
is provided for backward compatibility to help with data version tracing and various metadata can be
4 changes: 2 additions & 2 deletions docs/source/docs/get-started/quick-start-guide/examples.rst
@@ -85,9 +85,9 @@ Examples
 import numpy as np
 import datumaro as dm

-dataset = dm.Dataset([
+dataset = dm.Dataset.from_iterable([
     dm.DatasetItem(id='image1', subset='train',
-        image=np.ones((5, 5, 3)),
+        media=dm.Image.from_numpy(data=np.ones((5, 5, 3))),
         annotations=[
             dm.Bbox(1, 2, 3, 4, label=0),
         ]
@@ -10,7 +10,7 @@ be paid, and sometimes it may be necessary to filter or correct the data in adva
data validation functionality for this purpose.

More detailed descriptions about validation errors and warnings are given by :ref:`here <Validate>`.
-The Python example for the usage of validator is described in this `notebook <../../jupyter_notebook_examples/notebooks/11_validate>`_.
+The Python example for the usage of validator is described in this :doc:`notebook <../../jupyter_notebook_examples/notebooks/11_validate>`.


.. tab-set::