
Add auto-annotation support to SDK and CLI #6483

Merged: 13 commits into develop on Aug 2, 2023

Conversation

@SpecLad (Contributor) commented Jul 14, 2023

Motivation and context

Introduce a cvat-sdk auto-annotate command that downloads data for a task,
then runs a function on the local computer on that data, and uploads
resulting annotations back to the task.

To support this functionality, add a new SDK module,
cvat_sdk.auto_annotation, that contains an interface that the functions
must follow, and a driver that applies a function to a task.

This will let users easily annotate their tasks with custom DL models.
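
For illustration, a minimal sketch of such a function and a driver call (the helper and parameter names follow the `cvat_sdk.auto_annotation` module added here; the host, credentials, task ID, and label are placeholders):

```python
# my_function.py - a minimal detection function (illustrative sketch)
import PIL.Image

import cvat_sdk.auto_annotation as cvataa
import cvat_sdk.models as models

# declare the labels the function can output
spec = cvataa.DetectionFunctionSpec(
    labels=[cvataa.label_spec("car", 0)],
)

# called by the driver once per frame
def detect(
    context: cvataa.DetectionFunctionContext, image: PIL.Image.Image
) -> list[models.LabeledShapeRequest]:
    # a real function would run a model here; this returns a fixed box
    return [cvataa.rectangle(0, [10, 10, 100, 100])]
```

The driver side then boils down to a single call:

```python
# apply the function to a task (sketch)
from cvat_sdk import make_client

import cvat_sdk.auto_annotation as cvataa
import my_function

with make_client(host="localhost", credentials=("user", "password")) as client:
    cvataa.annotate_task(client, 123, my_function)
```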

How has this been tested?

Unit tests, manual tests.

Checklist

  • I submit my changes into the develop branch
  • I have added a description of my changes into the CHANGELOG file
  • I have updated the documentation accordingly
  • I have added tests to cover my changes
  • [ ] I have linked related issues (see GitHub docs)
  • [ ] I have increased versions of npm packages if it is necessary
    (cvat-canvas,
    cvat-core,
    cvat-data and
    cvat-ui)

License

  • I submit my code changes under the same MIT License that covers the project.
    Feel free to contact the maintainers if that's a concern.

@SpecLad force-pushed the client-side-lambdas branch 2 times, most recently from 6c1f16b to beb0ca6 on July 18, 2023 10:18
metavar='MODULE',
required=True,
dest="function_module",
)
@zhiltsov-max (Contributor) commented Jul 18, 2023

Probably, it makes sense to add other parameters:

  • job_id(s)
  • clear_existing / append
  • script file path
  • an extra parameter for the module root directory
  • strict mode, where we fail if the script labels do not exist in the dataset ("label 'car' is not in dataset; any annotations using it will be ignored" - is not something I'd like to see as a user after the model has been running for some time)
  • skip-errors to skip the frame if there is an error

Contributor commented:

Need to discuss this. From one point of view, some of these parameters can be convenient:

  • job_id(s) - we should probably be able to annotate jobs as well; I agree. But I don't think we need the argument here.
  • clear_existing - better to have a separate command to remove annotations. Otherwise we end up with a very complicated interface.
  • script file path - OK, instead of a module name.
  • strict - agreed, and maybe it should be the default mode. That way we avoid many complaints that a function doesn't annotate a task. I ran into this problem today.
  • skip-errors - I'm not sure we want to skip any errors.

Contributor Author commented:

My thoughts:

job_id(s)

I think ideally the CLI should have separate sets of task- and job-based commands (e.g. cvat-cli task list / cvat-cli job list). IIUC, job auto-annotation won't require you to use any task-based APIs, so it would make more sense to implement it as a job subcommand (cvat-cli job auto-annotate).

However, that would be beyond the scope of this PR.

clear_existing / append

I agree with Maxim that this should be an option. Moreover, the default should probably be "append". Currently, the command clears existing annotations, but that's mainly just because it's easier. Defaulting to append will avoid accidental data loss.

@nmanovic I think we should have a separate command to clear annotations, but it's not very convenient to run a separate command every time you want to reannotate, so I think an option for this command would be useful to have.

script file path

I'm thinking of adding a --function-file option as an alternative to --function (and maybe renaming the latter to --function-module). Instead of importing, this will load the file as a string and compile it. It will require that the file has no relative imports, though.
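
For illustration, loading a file that way could be done with the standard machinery below (the helper name is hypothetical; the merged implementation may differ):

```python
import importlib.util

def load_function_file(path: str):
    # hypothetical helper: load a script by path instead of importing a module
    spec = importlib.util.spec_from_file_location("__cvat_function__", path)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)  # executes the file; relative imports fail
    return module  # expected to expose `spec` and `detect`
```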

an extra parameter for the module root directory

I don't know what this means.

strict

Yeah, fair point. I also agree with Nikita that this should be the default.

skip-errors

I get the idea, but then what do you do with frames that were skipped? There's no easy way to find them afterwards, and even if we could find them, we can't rerun the function on just those frames.

We could consider some solution for this, but IMO it goes beyond the current scope.

Contributor commented:

Instead of importing, this will load the file as a string and compile it. It will require that the file has no relative imports, though.

an extra parameter for the module root directory

I don't know what this means.

I've given an example in the test comments. Basically, it's needed both for relative imports and for fully-qualified imports, to specify where to look for the import path (like PYTHONPATH does). Without this, it's hard to use scripts that rely on a customized framework (e.g. mmdetection) living next to the CVAT adapter script, or just outside the current directory:

# .../some/dir/mymodel/mmdetection/...
# .../some/dir/mymodel/cvat_adapter.py

cvat-cli auto-annotate --module-root '.../some/dir/' --module 'mymodel.cvat_adapter' ...
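
For illustration, such an option would essentially amount to the following (hypothetical helper; equivalent to prepending the directory to PYTHONPATH):

```python
import importlib
import sys

def load_function_module(module_root: str, module_name: str):
    # hypothetical helper: make module_root importable, then import the module
    sys.path.insert(0, module_root)              # e.g. '.../some/dir/'
    return importlib.import_module(module_name)  # e.g. 'mymodel.cvat_adapter'
```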

skip-errors

I get the idea, but then what do you do with frames that were skipped? There's no easy way to find them afterwards, and even if we could find them, we can't rerun the function on just those frames.

I thought about things like OOM errors on some frames. Not sure how often it happens in real applications.

Actually, the implementation allows execution on specific frames (except for tracks), since the function is executed on separate frames, so it's only a question of API/CLI. I also see an option to execute the model on only part of the task as a useful feature, though the CLI usage may be complicated in this case. Probably, we can leave this for future updates.

Contributor Author commented:

I've given an example in the test comments. Basically, it's needed both for relative imports and for fully-qualified imports to specify where to look for the import path (like PYTHONPATH does).

So why not just use PYTHONPATH?

Actually, the implementation allows execution on specific frames (except for tracks), as the function is executed on separate frames, so it's only a question of API/CLI.

It's true that we can run the function on specific frames, but how do we know which frames to use (in the context of recovery from a partially successful auto-annotation run)?

I also see an option to execute the model on only part of the task as a useful feature, though the CLI usage may be complicated in this case. Probably, we can leave this for future updates.

I agree.

Contributor commented:

So why not just use PYTHONPATH?

Probably because it's an implementation detail, and it obliges users to learn environment variables.

how do we know which frames to use (in the context of recovery from a partially successful auto-annotation run)?

It could be reported somehow, e.g. in the logs.

Contributor Author commented:

I added --clear-existing and --allow-unmatched-labels. Correspondingly, I changed the defaults to not erase old annotations, and abort when a function declares labels not configured in the task.

Probably because it's an implementation detail, and it obliges users to learn environment variables.

The user would have to learn something either way, and at least they might already know about PYTHONPATH.
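
For reference, a sketch of how the options mentioned above surface on the SDK driver (keyword names as added in this PR; host, credentials, IDs, and the function module are placeholders):

```python
from cvat_sdk import make_client

import cvat_sdk.auto_annotation as cvataa
import my_function  # hypothetical module exposing `spec` and `detect`

with make_client(host="localhost", credentials=("user", "password")) as client:
    cvataa.annotate_task(
        client,
        123,
        my_function,
        clear_existing=True,          # default False: append, don't erase
        allow_unmatched_labels=True,  # default False: unmatched labels abort
    )
```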

"auto-annotate",
str(fxt_new_task.id),
f"--function={__package__}.example_function",
)
@zhiltsov-max (Contributor) commented Jul 18, 2023

I suggest adding several other tests for different file placements, because we don't have a package when calling it from the actual CLI (in the tests we cheat a little; in practice you have to use something like PYTHONPATH="$PWD:$PYTHONPATH" cvat-cli ...).

Contributor Author commented:

Well... what else is there to test? Either way, the auto-annotate command just does an import on the provided module name.

@zhiltsov-max (Contributor) commented Jul 20, 2023

  • A test that we can execute a function from a file/module in the current directory
  • A test that we can execute a function from a file/module outside the current directory subtree, one that is not installed in the current system or Python environment (e.g. cvat-cli auto-annotate --function ../../my_cvat_model.py ...)
  • A test that we can execute a function from a file/module that references other modules (e.g. the function script wraps a patched framework in a nested directory; an example is mmdetection plus a custom script for CVAT on top)

Contributor Author commented:

I don't see the point in these tests to be honest. We don't do anything unusual with the import mechanics. If Python can import a module, then so can we (and vice versa).

@zhiltsov-max (Contributor) left a comment:

Related: #5324

@nmanovic changed the title from "[Dependent] Client side lambdas" to "Client side lambdas" Jul 18, 2023


@attrs.frozen(kw_only=True)
class DetectionFunctionSpec:
@nmanovic (Contributor) commented Jul 18, 2023

From the user's perspective, I would not recommend using PatchedLabelRequest and LabeledShapeRequest. They are counterintuitive.

I would recommend defining alternative names, something like Label or Shape.

Another approach is to define some high-level interfaces like LabelingSpec and LabeledFrame inside the SDK and use them.

In both cases, the interface will be clear and attractive.

Also, even though I can read the comments about DetectionFunctionContext, I'm not sure that we need it. How are you going to use it in the future?

Contributor Author commented:

I think I'll do something of a middle ground. I'll add factory functions with nice names and nice interfaces that construct the same old *Request objects. That way, the function implementation can look nice and clean, but we're not introducing any new concepts (and the users can still construct the low-level request objects if the factory functions are insufficient for some reason).

Also, even though I can read the comments about DetectionFunctionContext, I'm not sure that we need it. How are you going to use it in the future?

See the thread directly above this one.
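
For illustration, the middle ground described above looks roughly like this (factory names as added later in this PR; the label and points are placeholders):

```python
import cvat_sdk.auto_annotation as cvataa

# readable factories that construct the same low-level request objects
spec = cvataa.DetectionFunctionSpec(
    labels=[cvataa.label_spec("car", 0)],      # -> models.PatchedLabelRequest
)
box = cvataa.rectangle(0, [10, 10, 100, 100])  # -> models.LabeledShapeRequest
```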

Contributor Author commented:

I'll add factory functions with nice names and nice interfaces

I did it.

@@ -0,0 +1,18 @@
# Copyright (C) 2023 CVAT.ai Corporation
Contributor commented:

Let's try to add a real example based on https://github.com/ultralytics/ultralytics. As you suggested, we can create a Python module with a detection function based on ultralytics and use it to explain how to write such functions. Also, we can use it for a demonstration to our users.
Without a way to run the functionality out of the box, we will demotivate 90% of regular users.
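
For illustration, such an adapter could look roughly like the sketch below (a hypothetical module, assuming the ultralytics package and its YOLO API; the merged example may differ):

```python
# ultralytics_function.py - hypothetical ultralytics-based detection function
import PIL.Image
from ultralytics import YOLO

import cvat_sdk.auto_annotation as cvataa
import cvat_sdk.models as models

_model = YOLO("yolov8n.pt")

# expose the model's own label names to CVAT
spec = cvataa.DetectionFunctionSpec(
    labels=[cvataa.label_spec(name, i) for i, name in _model.names.items()],
)

def detect(
    context: cvataa.DetectionFunctionContext, image: PIL.Image.Image
) -> list[models.LabeledShapeRequest]:
    results = _model(image)[0]
    return [
        cvataa.rectangle(int(box.cls.item()), box.xyxy[0].tolist())
        for box in results.boxes
    ]
```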

Contributor Author commented:

Added.

self.run_cli(
"auto-annotate",
str(fxt_new_task.id),
f"--function={__package__}.example_function",
Contributor commented:

Again, from a regular user's perspective, I would recommend supporting at least several interfaces. I'm not sure that we have to force users to know what a Python module is. At the same time, many users know what a directory is. Do you think we can redesign the interface to use concepts that are widely known among CLI users?

Contributor Author commented:

I added an alternate option that takes a file path.

mapper.validate_and_remap(frame_shapes, sample.frame_index)
shapes.extend(frame_shapes)

client.logger.info("Uploading annotations to task %d", task_id)
Contributor commented:

Here we force users to wait until all frames are annotated. In the CLI we should be able to show progress, and I also believe it would be useful to have a configuration option to upload data in batches. What do you think?

Contributor Author commented:

Regarding progress: yes we should.

Regarding uploading data in batches: I think notionally it could be useful, but in practice there are problems:

  • I'm not sure if the API supports partial uploads like this.
  • If the annotation fails mid-way, there's no way to restart it from where it left off (and if it doesn't fail mid-way, then what's the point of uploading in batches?).

Contributor Author commented:

I added progress reporting.
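
For reference, a sketch of how a caller can surface that progress through the SDK (the `pbar` parameter and `TqdmProgressReporter` as in the SDK at the time of this PR):

```python
import tqdm

import cvat_sdk.auto_annotation as cvataa
from cvat_sdk.core.helpers import TqdmProgressReporter

import my_function  # hypothetical function module from the earlier sketches

def annotate_with_progress(client, task_id: int) -> None:
    # the driver reports per-frame progress through the given reporter
    cvataa.annotate_task(
        client, task_id, my_function,
        pbar=TqdmProgressReporter(tqdm.tqdm()),
    )
```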

@SpecLad mentioned this pull request Jul 24, 2023
SpecLad added a commit that referenced this pull request Jul 28, 2023
There are several problems with how progress reporting is handled in the
SDK, both on the interface and implementation level:

* The user is supposed to know, for a given function, which units it
will report progress in. This is unnecessary coupling and prevents us
from switching to different units, or to have several progress bars
using different units.

* To create a TqdmProgressReporter, you have to create a tqdm instance,
which immediately draws a progress bar. This works poorly if the
function prints any log messages before the progress actually starts.

* There's no easy way to automatically call `finish` on a progress bar,
so some functions don't (for example, `Downloader.download_file`). This
can cause unexpected output, since tqdm will refresh the progress bar in
a background thread (possibly after we've already printed something
else).

* `iter` is basically broken, because it divides by `period`, which is 0
in all current implementations.

* Even ignoring that, it's hard to use correctly, because you need to
manually call `finish` in case of an exception.

* `split` is not implemented and not used.

* `StreamWithProgress.seek` assumes that the progress bar is at 0 at the
start, and so does `ProgressReporter.iter`. The former also works
incorrectly if the second argument is not `SEEK_SET`.

Fix these problems by doing the following:

* Add a new `start2` method which accepts more parameters. The
default implementation calls `start`, so that if a user has implemented
the original interface, it'll keep working.

* Add a `DeferredTqdmProgressReporter` that accepts tqdm parameters instead of
a tqdm instance, and only creates an instance after `start2` is called.
Use it where `TqdmProgressReporter` was used before. The old
`TqdmProgressReporter` is kept for compatibility, but it doesn't support
any `start2` arguments other than those supported by the original
`start`.

* Add a `task` context manager, which automatically calls `start2` and
`finish`. Use it everywhere instead of explicit `start`/`finish` calls.
Remove `start`/`finish` calls from `StreamWithProgress` and `iter`.

* Implement basic assertions to ensure that `start2` and `finish` are
used correctly.

* Remove `period` and `split`.

* Rewrite `StreamWithProgress.seek` and `ProgressReporter.iter` to use
relative progress reports.

These changes should be backwards compatible for users who pass
predefined or custom progress reporters into SDK functions. They are not
backwards compatible for users who try to use progress reporters
directly (e.g. calling `start`/`finish`). I don't consider that a
significant issue, since the purpose of the `ProgressReporter` interface
is for the user to get progress information from the SDK, not for them
to use it in their own code.

Originally developed for #6483.
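
For illustration, the usage pattern these changes aim for looks roughly like this (a sketch; parameter names approximate):

```python
# the `task` context manager pairs start2/finish automatically,
# even if the loop body raises
def process(reporter, items):
    with reporter.task(total=len(items), desc="Processing", unit="item"):
        for item in items:
            ...  # do the work
            reporter.advance(1)  # relative progress report
```
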
@SpecLad force-pushed the client-side-lambdas branch 4 times, most recently from 352358b to acf50c0 on July 28, 2023 16:18
@SpecLad marked this pull request as ready for review July 28, 2023 16:22
@SpecLad changed the title from "Client side lambdas" to "Add auto-annotation support to SDK and CLI" Jul 28, 2023
codecov bot commented Jul 28, 2023

Codecov Report

Merging #6483 (0df5821) into develop (c56f2e9) will increase coverage by 0.06%.
The diff coverage is 98.06%.

@@             Coverage Diff             @@
##           develop    #6483      +/-   ##
===========================================
+ Coverage    81.17%   81.23%   +0.06%     
===========================================
  Files          363      368       +5     
  Lines        39606    39813     +207     
  Branches      3557     3557              
===========================================
+ Hits         32150    32342     +192     
- Misses        7456     7471      +15     
Components Coverage Δ
cvat-ui 74.89% <ø> (-0.05%) ⬇️
cvat-server 86.79% <98.06%> (+0.09%) ⬆️

Comment on lines +21 to +23
return [
cvataa.rectangle(0, [1, 2, 3, 4]),
]
Contributor commented:

How about changing the interface to be the following:

  • The function does not return anything
  • There is a function to add an annotation for a frame (e.g. context.add_rectangle())

This way:

  • users won't need to know the return annotation types (there can be another function for users who do)
  • we can change the function API without breaking it
  • we can manage annotation storage (and optimize it, potentially)
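
For illustration, a purely hypothetical sketch of the proposed interface (none of these context methods exist in the merged PR, and `run_model` is a placeholder):

```python
def detect(context, image):
    # shapes are accumulated by the context rather than returned,
    # so the driver owns (and can optimize) annotation storage
    for label_id, points in run_model(image):
        context.add_rectangle(label_id, points)
```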

Contributor Author commented:

It is a tempting proposal. 🤔 I'm just not sure if I'll have time to implement it; it's a pretty big change.

@SpecLad (Contributor Author) commented Aug 2, 2023

I added a test for the built-in yolov8n function, as well as an ultralytics extra for the SDK to support said function.
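
For reference, a sketch of using that built-in function from the SDK (assuming it is exposed as `cvat_sdk.auto_annotation.functions.yolov8n`; it requires the SDK's ultralytics extra to be installed):

```python
import cvat_sdk.auto_annotation as cvataa
from cvat_sdk.auto_annotation.functions import yolov8n

def annotate_with_yolov8n(client, task_id: int) -> None:
    cvataa.annotate_task(client, task_id, yolov8n)
```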

@SpecLad merged commit 1c0a49f into cvat-ai:develop Aug 2, 2023
34 checks passed
@SpecLad deleted the client-side-lambdas branch August 2, 2023 17:30
SpecLad added a commit that referenced this pull request Aug 8, 2023
@SpecLad mentioned this pull request Aug 11, 2023
SpecLad added a commit that referenced this pull request Aug 11, 2023
## [2.6.0] - 2023-08-11

### Added

- [SDK] Introduced the `DeferredTqdmProgressReporter` class,
  which avoids the glitchy output seen with the `TqdmProgressReporter` under certain circumstances
  (<#6556>)
- [SDK, CLI] Added the `cvat_sdk.auto_annotation`
  module, providing functionality to automatically annotate tasks
  by executing a user-provided function on the local machine.
  A corresponding CLI command (`auto-annotate`) is also available.
  Some predefined functions using torchvision are also available.
  (<#6483>,
  <#6649>)
- Included an indication for cached frames in the interface
  (<#6586>)

### Changed

- Raised the default guide assets limitations to 30 assets,
  with a maximum size of 10MB each
  (<#6575>)
- [SDK] Custom `ProgressReporter` implementations should now override `start2` instead of `start`.
  The old implementation is still supported.
  (<#6556>)
- Memory optimization and code improvements in the decoding module (<#6585>)

### Removed

- Removed the YOLOv5 serverless function
  (<#6618>)

### Fixed

- Corrected an issue where the prebuilt FFmpeg bundled in PyAV
  was being used instead of the custom build.
- Fixed the filename for labels in the CamVid format (<#6600>)
mikhail-treskin pushed several commits to retailnext/cvat that referenced this pull request Oct 25, 2023