cvat_sdk task.export_dataset fails if dataset export has been called once recently. #8256

Closed
2 tasks done
davidrs opened this issue Aug 2, 2024 · 3 comments
Labels
bug Something isn't working

Comments

@davidrs

davidrs commented Aug 2, 2024

Actions before raising this issue

  • I searched the existing issues and did not find anything similar.
  • I read/searched the docs

Steps to Reproduce

Assumes the high-level SDK is being used, with a client created via from cvat_sdk import make_client.

  1. Call export_dataset once: task.export_dataset("CVAT for images 1.1", dest_path_zip, include_images=False)
  2. Wait until it's done.
  3. Call export_dataset a second time: task.export_dataset("CVAT for images 1.1", dest_path_zip, include_images=False)

The second call throws an exception because the server responds with status 201 (the export was already created and is ready for download), while the SDK code treats 202 (the request was received and the server will start preparing the download) as the only successful response:

expect_status(202, response)
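
To make the distinction between the two codes concrete, here is a minimal sketch of how a caller might interpret them. The function name next_export_step and the returned labels are hypothetical and not part of cvat_sdk; only the meanings of 201 and 202 are taken from the description above.

```python
def next_export_step(status: int) -> str:
    """Map the export endpoint's HTTP status to the action a client should take.

    Hypothetical helper for illustration, not part of cvat_sdk.
    """
    if status == 202:
        # Request accepted: the server has started preparing the file.
        return "poll"
    if status == 201:
        # A prepared file already exists: it can be downloaded right away.
        return "download"
    raise ValueError(f"unexpected export status: {status}")
```

With a check of this shape, a 201 would lead to a download step instead of an exception.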

Expected Behavior

The second call should successfully download the dataset to dest_path_zip.

Possible Solution

If the downloading helper receives a 201, it should find the corresponding request and download the zip.
I'm not certain whether matching the response to the right request id will be trivial. Could the 201 response also return the request id?
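
One way the matching step could look, as a rough sketch: given a list of request records, pick the most recently finished one for the task that has a download URL. The function name pick_finished_export is hypothetical, and the field names (status, result_url, operation.task_id, finished_date) are assumptions about what a requests listing returns, so they should be checked against the actual API response.

```python
from typing import List, Optional


def pick_finished_export(requests: List[dict], task_id: int) -> Optional[dict]:
    """Return the most recently finished request for task_id that has a result URL.

    Hypothetical helper; the record field names are assumptions.
    """
    candidates = [
        r for r in requests
        if r.get("status") == "finished"
        and r.get("result_url")
        and (r.get("operation") or {}).get("task_id") == task_id
    ]
    # ISO 8601 timestamps in the same format and timezone sort correctly as strings.
    return max(candidates, key=lambda r: r.get("finished_date", ""), default=None)
```

Choosing the latest finished record avoids having to assume there is exactly one match, at the cost of possibly picking an export made with a different format if the records are not filtered further.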

Context

I intend to work around this by writing my own code that uses the low-level API to query the requests for a completed one whenever I get the 201 exception.

Environment

Interacting with the hosted cvat.ai app.
@davidrs davidrs added the bug Something isn't working label Aug 2, 2024
@davidrs davidrs changed the title cvat_sdk task.export_dataset fails if dataset export has been called once. cvat_sdk task.export_dataset fails if dataset export has been called once recently. Aug 2, 2024
@davidrs
Author

davidrs commented Aug 2, 2024

Here is my workaround if it helps anyone else:


from pathlib import Path
from cvat_sdk import make_client
from cvat_sdk.core.downloading import Downloader
from cvat_sdk.api_client import Configuration, ApiClient, exceptions
...
        try:
            task.export_dataset("CVAT for images 1.1", dest_zip, include_images=False)
        except Exception:  # TODO: check error is 202 vs 201 issue
            print("Trying low level API approach")
            # Pass the task object itself: the helper reads task.id
            get_request_result(task, dest_zip)
...

def get_request_result(task, dest_path):
    """Download the result of the already finished export request for this task."""
    # HOST, CVAT_NAME and CVAT_PASSWORD are expected to be defined elsewhere
    task_id = task.id
    configuration = Configuration(
        host=HOST,
        username=CVAT_NAME,
        password=CVAT_PASSWORD,
    )

    with ApiClient(configuration) as api_client:
        try:
            (data, response) = api_client.requests_api.list(
                status="finished",
                task_id=task_id,
            )
        except exceptions.ApiException as e:
            print("Exception when calling RequestsApi.list(): %s\n" % e)
            raise  # without a result there is nothing to download

        assert len(data['results']) == 1, "can't assume correct result if more than 1"
        result_url = data['results'][0]['result_url']

    with make_client(host=HOST, credentials=(CVAT_NAME, CVAT_PASSWORD)) as client:
        downloader = Downloader(client)
        downloader.download_file(result_url, output_path=Path(dest_path))

@vigi30

vigi30 commented Aug 6, 2024

With CVAT version 2.16.2,
I am facing the same issue when exporting jobs with the include_images argument set to True. When I set it to False, I am able to download the annotations.

code:

for job in client.jobs.list():
    if job.id == job_number:
        job.export_dataset(filename=f'{job_number}.zip', format_name='COCO 1.0', include_images=True)

Error
An exception is raised with the following traceback:
Status Code: 201
Reason: Created
HTTP response headers: HTTPHeaderDict({'Allow': 'GET, HEAD, OPTIONS', 'Content-Length': '0', 'Cross-Origin-Opener-Policy': 'same-origin', 'Date': 'Tue, 06 Aug 2024 22:18:29 GMT', 'Referrer-Policy': 'same-origin, strict-origin-when-cross-origin', 'Server': 'nginx', 'Vary': 'Accept, Origin, Cookie', 'X-Content-Type-Options': 'nosniff, nosniff', 'X-Frame-Options': 'DENY, deny', 'X-Request-Id': '9406bba6-e97e-4e6f-8ed7-22305898df33'})
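
Both reports end in the same exception carrying status 201. A caller-side wrapper could treat exactly that status as "already prepared" and let every other error propagate. The sketch below is generic: export_with_fallback, its parameters, and the lookup of a status attribute on the exception are illustrative assumptions, not cvat_sdk API.

```python
from typing import Callable


def export_with_fallback(
    export: Callable[[], None],
    fetch_ready: Callable[[], None],
    ready_status: int = 201,
) -> str:
    """Run export(); if it fails with ready_status, run fetch_ready() instead.

    Returns "exported" or "fetched" so the caller can log which path ran.
    Hypothetical helper for illustration.
    """
    try:
        export()
        return "exported"
    except Exception as exc:
        # Only the known "already prepared" case is handled here;
        # anything else is re-raised unchanged.
        if getattr(exc, "status", None) != ready_status:
            raise
        fetch_ready()
        return "fetched"
```

With the workaround from the earlier comment, this could be called with one lambda wrapping task.export_dataset(...) and another wrapping get_request_result(task, dest_zip).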

Marishka17 added a commit that referenced this issue Aug 30, 2024
…sets`|`backups` (#8255)

- Fixed exporting the same dataset or backup twice in a row using
high-level SDK (switched to new export API version) (related
#8256)
- Fixed exporting a dataset or backup using high-level SDK when the
default project or task location refers to cloud storage
- Added ability to explicitly specify location when exporting datasets
and backups using high-level SDK

## Summary by CodeRabbit

- **New Features**
  - Introduced mixins for exporting datasets and downloading backups,
    enhancing functionality across multiple classes.
  - Added a new fixture for testing tasks with specified target storage,
    improving test coverage.

- **Bug Fixes**
  - Improved error handling in the file download process to ensure
    validity before proceeding.

- **Refactor**
  - Restructured the downloading mechanism for better modularity and
    maintainability.
  - Removed outdated methods in favor of mixin functionality, streamlining
    class design.

- **Tests**
  - Enhanced the test suite with additional scenarios and flexibility for
    task management and dataset downloading.


---------

Co-authored-by: Maxim Zhiltsov <zhiltsov.max35@gmail.com>
@Marishka17
Contributor

Fixed by #8255

bschultz96 pushed a commit to bschultz96/cvat that referenced this issue Sep 12, 2024
…sets`|`backups` (cvat-ai#8255)
