We appreciate any contribution to Datumaro, whether it's in the form of a Pull Request, a Feature Request, or general comments/issues that you find. For feature requests and issues, please feel free to create a GitHub Issue in this repository.
- Python (3.9+)
To set up your development environment, please follow the steps below.
- Because Datumaro has some C++ and Rust implementations to improve Python performance,
  you should install a C++ compiler (`apt-get install build-essential`) and a Rust
  toolchain on your system to build the binary extensions.

- Fork the repo.
- Clone the forked repo:

  ```bash
  git clone <forked_repo>
  ```
- Optionally, install a virtual environment (recommended):

  ```bash
  python -m pip install virtualenv
  python -m virtualenv venv
  . venv/bin/activate
  ```
- Install Datumaro with optional dependencies:

  ```bash
  cd /path/to/the/cloned/repo/
  pip install -e .[tf,tfds,torch,default]
  ```
- Install dev & test dependencies:

  ```bash
  pip install -r requirements-dev.txt
  pip install -r tests/requirements.txt
  ```
- Set up pre-commit hooks in the repo. See Code style.

  ```bash
  pre-commit install
  pre-commit run
  ```
- Create your branch based off the `develop` branch and make changes.
- Verify your code by running unit tests and integration tests. See Testing.

  ```bash
  pytest -v
  ```

  or

  ```bash
  python -m pytest -v
  ```

- Push your changes.
Now you are ready to create a PR (Pull Request) and get it reviewed.
Developers should install the following optional components to run our tests; a quick way to check which of them are importable is sketched after the list:
- OpenVINO
- Accuracy Checker
- TensorFlow
- PyTorch
- MxNet
- Caffe
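
For illustration, here is a quick way to check which of these optional backends are importable in your current environment. The module names below are the usual import names and are assumptions that may differ from the pip package names (Accuracy Checker is omitted because its import name varies):

```python
# Illustrative availability check for the optional test components; the
# module names are assumptions and may differ from the pip package names.
import importlib

for module in ("openvino", "tensorflow", "torch", "mxnet", "caffe"):
    try:
        importlib.import_module(module)
        print(f"{module}: available")
    except ImportError:
        print(f"{module}: missing")
```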
Datumaro can be invoked from the command line in any of the following ways:

```bash
datum --help
python -m datumaro --help
python datumaro/ --help
python datum.py --help
```
or used as a Python library:

```python
import datumaro
```
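
As a quick sanity check of the Python API, here is a minimal sketch; the dataset path and the `coco` format name are placeholders for whatever data you have locally:

```python
# Minimal, illustrative use of the Python API; the path and the format
# name are placeholders, not files from this repository.
import datumaro as dm

dataset = dm.Dataset.import_from("path/to/dataset", "coco")
print(len(dataset))
```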
Try to be readable and consistent with the existing codebase.
The project uses Black for code formatting and isort for sorting import statements.
You can find the corresponding configurations in `pyproject.toml` in the repository root.
No trailing whitespace; at most 100 characters per line.
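
For illustration only, this is how isort's default grouping arranges imports (standard library, then third-party, then first-party); the specific modules below are arbitrary examples:

```python
# Standard library imports come first,
import os.path as osp

# then third-party imports,
import numpy as np

# and finally first-party (local) imports.
from datumaro.components.dataset import Dataset
```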
Datumaro includes a Git pre-commit hook configuration, `.pre-commit-config.yaml`, that can help you follow the style requirements. To set it up, make sure isort and black are installed on your system, then run `pre-commit install`; after that you can check your changes with `pre-commit run`.
The recommended editor is VS Code with the Python language plugin.
It is expected that all Datumaro functionality is covered and checked by unit tests. Tests are placed in the `tests/unit/` directory. Additional pre-generated files for tests can be stored in the `tests/assets/` directory.

CLI tests are separated from the core tests; they are stored in the `tests/integration/cli/` directory.
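
For reference, the directories mentioned above form the following layout:

```
tests/
├── unit/               # unit tests for core functionality
├── integration/
│   └── cli/            # CLI integration tests
├── assets/             # pre-generated files used by tests
└── requirements.py     # requirement and bug identifiers (see below)
```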
Currently, we use `pytest` for testing.
To run tests use:

```bash
pytest -v
```

or

```bash
python -m pytest -v
```
For better integration with CI and requirements tracking, we use special annotations for tests.
A test needs to be linked with the requirement it relates to. To link a test, use:
```python
from unittest import TestCase

from .requirements import Requirements, mark_requirement


class MyTests(TestCase):
    @mark_requirement(Requirements.DATUM_GENERAL_REQ)
    def test_my_requirement(self):
        ...  # do stuff
```
Such a marking will apply the markings from the specified requirement. They can be overridden for a specific test:
```python
import pytest


class MyTests(TestCase):
    @pytest.mark.priority_low
    @mark_requirement(Requirements.DATUM_GENERAL_REQ)
    def test_my_requirement(self):
        ...  # do stuff
```
Requirements and other links need to be added to `tests/requirements.py`:
```python
DATUM_244 = "Add Snyk integration"
DATUM_BUG_219 = "Return format is not uniform"
```
```python
# Fully defined in GitHub issues:
@pytest.mark.reqids(Requirements.DATUM_244, Requirements.DATUM_333)

# And defined any other way:
@pytest.mark.reqids(Requirements.DATUM_GENERAL_REQ)
```
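
Since the decorators reference these identifiers as attributes of `Requirements`, new entries presumably go into that class; a minimal, hypothetical sketch of `tests/requirements.py` (the real file may organize things differently):

```python
# Hypothetical layout only; see the actual tests/requirements.py for the
# authoritative structure.
class Requirements:
    # Requirements fully defined in GitHub issues:
    DATUM_244 = "Add Snyk integration"

    # Bug reports:
    DATUM_BUG_219 = "Return format is not uniform"
```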
Markings are defined in `tests/conftest.py`.
A list of requirements and bugs:

```python
@pytest.mark.reqids(Requirements.DATUM_123)
@pytest.mark.bugs(Requirements.DATUM_BUG_456)
```
A priority:

```python
@pytest.mark.priority_low
@pytest.mark.priority_medium
@pytest.mark.priority_high
```
A component. This marking is used to indicate different system components:

```python
@pytest.mark.components(DatumaroComponent.Datumaro)
```
Skipping tests:

```python
@pytest.mark.skip(SkipMessages.NOT_IMPLEMENTED)
```
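
To show how these markers combine on a single test, here is a hypothetical example; the import location of `DatumaroComponent` is an assumption, so adjust it to wherever the symbol is actually defined in the repo:

```python
# Hypothetical test combining the markers described above; the import of
# DatumaroComponent is an assumption and may need adjusting.
from unittest import TestCase

import pytest

from .requirements import DatumaroComponent, Requirements, mark_requirement


class MyMarkedTests(TestCase):
    @pytest.mark.components(DatumaroComponent.Datumaro)
    @pytest.mark.priority_medium
    @pytest.mark.bugs(Requirements.DATUM_BUG_219)
    @mark_requirement(Requirements.DATUM_GENERAL_REQ)
    def test_something(self):
        ...  # do stuff
```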
Parametrized runs

Parameters are used for running the same test with different parameters, e.g.:

```python
@pytest.mark.parametrize("numpy_array, batch_size", [
    (np.zeros([2]), 0),
    (np.zeros([2]), 1),
    (np.zeros([2]), 2),
    (np.zeros([2]), 5),
    (np.zeros([5]), 2),
])
```
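
For illustration, here is a hypothetical, self-contained test that such a decorator could be attached to; note that `pytest.mark.parametrize` does not work on `unittest.TestCase` methods, so a plain pytest-style function is assumed here:

```python
# Each (numpy_array, batch_size) pair runs as a separate test case.
import numpy as np
import pytest


@pytest.mark.parametrize("numpy_array, batch_size", [
    (np.zeros([2]), 0),
    (np.zeros([2]), 1),
    (np.zeros([5]), 2),
])
def test_each_parameter_combination_runs_separately(numpy_array, batch_size):
    assert numpy_array.dtype == np.float64
    assert batch_size >= 0
```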
Tests are documented with docstrings. Test descriptions must contain
the following sections: Description, Expected results, and Steps.
```python
def test_can_convert_polygons_to_mask(self):
    """
    <b>Description:</b>
    Ensure that the dataset polygon annotation can be properly converted
    into dataset segmentation mask.

    <b>Expected results:</b>
    Dataset segmentation mask converted from dataset polygon annotation
    is equal to an expected mask.

    <b>Steps:</b>
    1. Prepare dataset with polygon annotation
    2. Prepare dataset with expected mask segmentation mode
    3. Convert source dataset to target, with conversion of annotation
       from polygon to mask.
    4. Verify that resulting segmentation mask is equal to the expected mask.
    """
```