[Datumaro] CLI updates + better documentation (#1057)
zhiltsov-max authored and nmanovic committed Jan 27, 2020
1 parent 095d6d4 commit 93b3c09
Showing 73 changed files with 2,461 additions and 1,678 deletions.
8 changes: 7 additions & 1 deletion .vscode/settings.json
@@ -25,5 +25,11 @@
}
],
"python.linting.pylintEnabled": true,
"python.envFile": "${workspaceFolder}/.vscode/python.env"
"python.envFile": "${workspaceFolder}/.vscode/python.env",
"python.testing.unittestEnabled": true,
"python.testing.unittestArgs": [
"-v",
"-s",
"./datumaro",
],
}
3 changes: 3 additions & 0 deletions README.md
@@ -16,6 +16,7 @@ CVAT is free, online, interactive video and image annotation tool for computer v
- [Installation guide](cvat/apps/documentation/installation.md)
- [User's guide](cvat/apps/documentation/user_guide.md)
- [Django REST API documentation](#rest-api)
- [Datumaro dataset framework](datumaro/README.md)
- [Command line interface](utils/cli/)
- [XML annotation format](cvat/apps/documentation/xml_format.md)
- [AWS Deployment Guide](cvat/apps/documentation/AWS-Deployment-Guide.md)
@@ -34,6 +35,8 @@ CVAT is free, online, interactive video and image annotation tool for computer v
## Supported annotation formats

Format selection is possible after clicking on the Upload annotation / Dump annotation button.
The [Datumaro](datumaro/README.md) dataset framework allows additional dataset transformations
via its command line tool.

| Annotation format | Dumper | Loader |
| ---------------------------------------------------------------------------------- | ------ | ------ |
34 changes: 26 additions & 8 deletions cvat/apps/dataset_manager/bindings.py
@@ -1,3 +1,8 @@

# Copyright (C) 2019-2020 Intel Corporation
#
# SPDX-License-Identifier: MIT

from collections import OrderedDict
import os
import os.path as osp
@@ -6,7 +11,7 @@

from cvat.apps.annotation.annotation import Annotation
from cvat.apps.engine.annotation import TaskAnnotation
from cvat.apps.engine.models import Task, ShapeType
from cvat.apps.engine.models import Task, ShapeType, AttributeType

import datumaro.components.extractor as datumaro
from datumaro.util.image import lazy_image
@@ -128,18 +133,33 @@ def _read_cvat_anno(self, cvat_anno):
attrs = {}
db_attributes = db_label.attributespec_set.all()
for db_attr in db_attributes:
attrs[db_attr.name] = db_attr.default_value
attrs[db_attr.name] = db_attr
label_attrs[db_label.name] = attrs
map_label = lambda label_db_name: label_map[label_db_name]

def convert_attrs(label, cvat_attrs):
cvat_attrs = {a.name: a.value for a in cvat_attrs}
dm_attr = dict()
for attr_name, attr_spec in label_attrs[label].items():
attr_value = cvat_attrs.get(attr_name, attr_spec.default_value)
try:
if attr_spec.input_type == AttributeType.NUMBER:
attr_value = float(attr_value)
elif attr_spec.input_type == AttributeType.CHECKBOX:
attr_value = attr_value.lower() == 'true'
dm_attr[attr_name] = attr_value
except Exception as e:
slogger.task[self._db_task.id].error(
"Failed to convert attribute '%s'='%s': %s" % \
(attr_name, attr_value, e))
return dm_attr

for tag_obj in cvat_anno.tags:
anno_group = tag_obj.group
if isinstance(anno_group, int):
anno_group = anno_group
anno_label = map_label(tag_obj.label)
anno_attr = dict(label_attrs[tag_obj.label])
for attr in tag_obj.attributes:
anno_attr[attr.name] = attr.value
anno_attr = convert_attrs(tag_obj.label, tag_obj.attributes)

anno = datumaro.LabelObject(label=anno_label,
attributes=anno_attr, group=anno_group)
@@ -150,9 +170,7 @@ def _read_cvat_anno(self, cvat_anno):
if isinstance(anno_group, int):
anno_group = anno_group
anno_label = map_label(shape_obj.label)
anno_attr = dict(label_attrs[shape_obj.label])
for attr in shape_obj.attributes:
anno_attr[attr.name] = attr.value
anno_attr = convert_attrs(shape_obj.label, shape_obj.attributes)

anno_points = shape_obj.points
if shape_obj.type == ShapeType.POINTS:
8 changes: 3 additions & 5 deletions cvat/apps/dataset_manager/export_templates/README.md
@@ -6,17 +6,15 @@ python -m virtualenv .venv
. .venv/bin/activate

# install dependencies
sed -r "s/^(.*)#.*$/\1/g" datumaro/requirements.txt | xargs -n 1 -L 1 pip install
pip install -e datumaro/
pip install -r cvat/utils/cli/requirements.txt

# set up environment
PYTHONPATH=':'
export PYTHONPATH
ln -s $PWD/datumaro/datum.py ./datum
chmod a+x datum

# use Datumaro
./datum --help
datum --help
```

Check Datumaro [QUICKSTART.md](datumaro/docs/quickstart.md) for further info.
Check Datumaro [docs](datumaro/README.md) for more info.
@@ -1,3 +1,8 @@

# Copyright (C) 2019-2020 Intel Corporation
#
# SPDX-License-Identifier: MIT

from collections import OrderedDict
import getpass
import json
@@ -27,7 +32,7 @@ class cvat_rest_api_task_images(datumaro.Extractor):
def _image_local_path(self, item_id):
task_id = self._config.task_id
return osp.join(self._cache_dir,
'task_{}_frame_{:06d}.jpg'.format(task_id, item_id))
'task_{}_frame_{:06d}.jpg'.format(task_id, int(item_id)))

def _make_image_loader(self, item_id):
return lazy_image(item_id,
12 changes: 9 additions & 3 deletions cvat/apps/dataset_manager/task.py
@@ -1,3 +1,8 @@

# Copyright (C) 2019-2020 Intel Corporation
#
# SPDX-License-Identifier: MIT

from datetime import timedelta
import json
import os
@@ -217,8 +222,9 @@ def export(self, dst_format, save_dir, save_images=False, server_url=None):
if dst_format == EXPORT_FORMAT_DATUMARO_PROJECT:
self._remote_export(save_dir=save_dir, server_url=server_url)
else:
self._dataset.export_project(output_format=dst_format,
save_dir=save_dir, save_images=save_images)
converter = self._dataset.env.make_converter(dst_format,
save_images=save_images)
self._dataset.export_project(converter=converter, save_dir=save_dir)

def _remote_image_converter(self, save_dir, server_url=None):
os.makedirs(save_dir, exist_ok=True)
@@ -246,7 +252,7 @@ def _remote_image_converter(self, save_dir, server_url=None):
if db_video is not None:
for i in range(self._db_task.size):
frame_info = {
'id': str(i),
'id': i,
'width': db_video.width,
'height': db_video.height,
}
5 changes: 5 additions & 0 deletions cvat/apps/dataset_manager/util.py
@@ -1,3 +1,8 @@

# Copyright (C) 2019-2020 Intel Corporation
#
# SPDX-License-Identifier: MIT

import inspect
import os, os.path as osp
import zipfile
119 changes: 119 additions & 0 deletions datumaro/CONTRIBUTING.md
@@ -0,0 +1,119 @@
## Table of Contents

- [Installation](#installation)
- [Usage](#usage)
- [Testing](#testing)
- [Design](#design-and-code-structure)

## Installation

### Prerequisites

- Python (3.5+)
- OpenVINO (optional)

``` bash
git clone https://github.com/opencv/cvat
```

Optionally, create and activate a virtual environment:

``` bash
python -m pip install virtualenv
python -m virtualenv venv
. venv/bin/activate
```

Then install all dependencies:

``` bash
while read -r p; do pip install $p; done < requirements.txt
```

If you're working inside the CVAT environment:
``` bash
. .env/bin/activate
while read -r p; do pip install $p; done < datumaro/requirements.txt
```

## Usage

> The directory containing Datumaro should be in the `PYTHONPATH`
> environment variable or `cvat/datumaro/` should be the current directory.
``` bash
datum --help
python -m datumaro --help
python datumaro/ --help
python datum.py --help
```

``` python
import datumaro
```

## Testing

All Datumaro functionality is expected to be covered and checked by
unit tests. Tests are located in the `tests/` directory.

To run the tests, use:

``` bash
python -m unittest discover -s tests
```
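
As an illustration, a test module placed under `tests/` might look like the sketch below. The `clamp` helper and the `ClampTest` class are hypothetical stand-ins and are not part of Datumaro; they only show the shape of a test that `unittest` discovery would pick up.

``` python
import unittest


def clamp(value, lower, upper):
    # Hypothetical function under test; it stands in for real Datumaro code.
    return max(lower, min(value, upper))


class ClampTest(unittest.TestCase):
    def test_value_inside_range_is_unchanged(self):
        self.assertEqual(5, clamp(5, 0, 10))

    def test_value_outside_range_is_clamped(self):
        self.assertEqual(0, clamp(-3, 0, 10))
        self.assertEqual(10, clamp(42, 0, 10))
```

By default, `unittest` discovery only loads files matching `test*.py`, so a module like this would need a name such as `tests/test_clamp.py` (an assumed file name).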

If you're working inside the CVAT environment, you can also use:

``` bash
python manage.py test datumaro/
```

## Design and code structure

- [Design document](docs/design.md)

### Command-line

Take [Docker](https://www.docker.com/) as an example of the approach.
The interface is divided into contexts and single commands.
Contexts are semantically grouped commands
related to a single topic or target. Single commands are handy shorter
alternatives for the most used commands, as well as special commands
that are hard to fit into any specific context.

![cli-design-image](docs/images/cli_design.png)

- The diagram above was created with [FreeMind](http://freemind.sourceforge.net/wiki/index.php/Main_Page)
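
The split between contexts and single commands can be sketched with nested `argparse` subparsers. This is a minimal illustration under assumptions, not the actual Datumaro parser: the program name `tool`, the `project` context, and the `create` and `info` commands are hypothetical names chosen for the example.

``` python
import argparse


def build_parser():
    parser = argparse.ArgumentParser(prog='tool')
    commands = parser.add_subparsers(dest='command')

    # A context groups commands about one topic: "tool project <command>".
    project = commands.add_parser('project', help='project-related commands')
    project_commands = project.add_subparsers(dest='project_command')
    create = project_commands.add_parser('create')
    create.add_argument('-o', '--output-dir', default='.')
    project_commands.add_parser('info')

    # A single command is a shorter top-level alternative: "tool create".
    short_create = commands.add_parser('create',
        help='shortcut for "project create"')
    short_create.add_argument('-o', '--output-dir', default='.')

    return parser


args = build_parser().parse_args(['project', 'create', '-o', 'my_project'])
print(args.command, args.project_command, args.output_dir)
```

In this sketch, both `tool project create` and `tool create` accept the same option, which is one way a frequently used command can get a shorter spelling.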

The Model-View-ViewModel (MVVM) UI pattern is used.

![mvvm-image](docs/images/mvvm.png)
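
For readers unfamiliar with MVVM, the roles can be sketched in a few lines. All class and function names below are hypothetical and only illustrate the pattern in general; they do not correspond to Datumaro classes.

``` python
class ProjectModel:
    # Model: plain data, unaware of any user interface.
    def __init__(self):
        self.sources = []


class ProjectViewModel:
    # ViewModel: exposes UI-ready state and commands on top of the model.
    def __init__(self, model):
        self._model = model

    def add_source(self, name):
        self._model.sources.append(name)

    @property
    def summary(self):
        return '%d source(s): %s' % (
            len(self._model.sources), ', '.join(self._model.sources))


def render(view_model):
    # View: only formats what the view model exposes.
    return 'Project with ' + view_model.summary
```

With this separation, a command-line view and any other view could share the same view model without touching the model directly.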

### Datumaro project and environment structure

<!--lint disable fenced-code-flag-->
```
├── [datumaro module]
└── [project folder]
    ├── .datumaro/
    │   ├── config.yml
    │   ├── .git/
    │   ├── importers/
    │   │   ├── custom_format_importer1.py
    │   │   └── ...
    │   ├── statistics/
    │   │   ├── custom_statistic1.py
    │   │   └── ...
    │   ├── visualizers/
    │   │   ├── custom_visualizer1.py
    │   │   └── ...
    │   └── extractors/
    │       ├── custom_extractor1.py
    │       └── ...
    ├── dataset/
    └── sources/
        ├── source1
        └── ...
```
<!--lint enable fenced-code-flag-->
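
To make the layout concrete, the sketch below creates an empty directory skeleton that mirrors the tree above. It is an illustration only: `make_project_skeleton` is a hypothetical helper, the `.git/` directory is skipped, and `config.yml` is left empty because its contents are not described here. It does not show how Datumaro itself creates projects.

``` python
import os
import os.path as osp
import tempfile


def make_project_skeleton(project_dir):
    # Directory names are taken from the tree above; the helper is hypothetical.
    env_dir = osp.join(project_dir, '.datumaro')
    for plugin_dir in ('importers', 'statistics', 'visualizers', 'extractors'):
        os.makedirs(osp.join(env_dir, plugin_dir), exist_ok=True)
    os.makedirs(osp.join(project_dir, 'dataset'), exist_ok=True)
    os.makedirs(osp.join(project_dir, 'sources'), exist_ok=True)
    # Empty placeholder; the real file's contents are not covered here.
    open(osp.join(env_dir, 'config.yml'), 'a').close()
    return env_dir


with tempfile.TemporaryDirectory() as tmp:
    make_project_skeleton(tmp)
    print(sorted(os.listdir(tmp)))
```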