cvat-cli --sorting-method predefined results in seemingly random order of images #5061

Closed
nikste opened this issue Oct 8, 2022 · 10 comments · Fixed by #5083
Labels
bug Something isn't working sdk/cli server

Comments

@nikste

nikste commented Oct 8, 2022

My actions before raising this issue

With the current master (v2.2.0), creating a task with cvat-cli and the predefined sorting method results in a seemingly random order of images in the task.
Version 2.1.0 works as expected, but only sometimes.
(I am creating many tasks with a script.)

@nikste nikste changed the title cvat-cli --sorting-method predefined not working as expected cvat-cli --sorting-method predefined results in seemingly random order of images Oct 8, 2022
@zhiltsov-max zhiltsov-max added the bug Something isn't working label Oct 10, 2022
@zhiltsov-max
Contributor

zhiltsov-max commented Oct 10, 2022

Hi, thanks for reporting the problem! I was able to reproduce it. There is currently no workaround with the CLI or the SDK core API, but if it fits your data, please consider using the natural sorting method.
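
For reference, natural sorting compares runs of digits numerically, so `img2.jpg` comes before `img10.jpg`. The sketch below only illustrates the idea; `natural_key` is a hypothetical helper, not CVAT's own implementation, which may differ in details.

```python
import re

def natural_key(name: str) -> list:
    # re.split with a capturing group alternates non-digit and digit runs,
    # always starting with a (possibly empty) non-digit run.
    parts = re.split(r"(\d+)", name)
    # Odd positions are digit runs: compare them as integers.
    return [int(p) if i % 2 else p.lower() for i, p in enumerate(parts)]

files = ["img10.jpg", "img2.jpg", "img1.jpg"]
ordered = sorted(files, key=natural_key)
```

If the file names already encode the intended order this way, natural sorting gives the same result as a predefined list.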

@zhiltsov-max zhiltsov-max self-assigned this Oct 10, 2022
@nikste
Author

nikste commented Oct 10, 2022

So, using utils/cli/cli.py from v2.1.0 (with the CVAT server at v2.2.0), the upload order seems "less" random: it is either the order specified or its inverse. At first glance, this looks like something went wrong in the CLI and not the server.

For people with the same problem: after creating the task with cli.py .. create .., I could identify the order by running cli.py ... dump .. and checking in the JSON whether the first image name was correct. If the order is still wrong, run cli.py .. create .. again with the inverse of the order used in the previous call, and check the order again.
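
The check described above can be sketched in Python. This assumes the frame file names have already been extracted from the dumped JSON in frame order; the dump layout itself is not shown in this thread, so that step is left out. `classify_order` is a hypothetical helper name.

```python
from typing import Sequence

def classify_order(expected: Sequence[str], actual: Sequence[str]) -> str:
    """Compare the order passed to `create` with the order found in the task."""
    expected, actual = list(expected), list(actual)
    if actual == expected:
        return "as_specified"
    if actual == expected[::-1]:
        return "reversed"
    return "other"

requested = ["c.jpg", "a.jpg", "b.jpg"]
found_in_task = ["b.jpg", "a.jpg", "c.jpg"]
result = classify_order(requested, found_in_task)
```

A `"reversed"` result corresponds to the case where the task has to be recreated with the inverse order; `"other"` means reversing will not help.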

@zhiltsov-max
Contributor

No, the reason for this problem is different, and the behavior depends on the OS implementation. It can be reproduced in both the UI and the SDK; all versions since TUS protocol support was introduced are affected. The file order is random, though it can be partially sorted. One way to avoid the problem is to send all files in a single POST tasks/{id}/data request.
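
A rough sketch of what a single-request upload could look like on the client side is shown below. It only builds the form fields and does not send anything. The field names (`client_files[i]`, `image_quality`, `sorting_method`) are assumptions based on how the older utils/cli client is remembered to build this request, so check them against the API schema of your server version. `build_single_request_fields` is a hypothetical helper.

```python
from typing import Dict, List, Sequence, Tuple

def build_single_request_fields(
    paths: Sequence[str], image_quality: int = 70
) -> Tuple[Dict[str, str], List[Tuple[str, str]]]:
    # Plain form fields (assumed names, see the note above).
    data = {
        "image_quality": str(image_quality),
        "sorting_method": "predefined",
    }
    # One indexed file field per path, so the index carries the intended order.
    file_fields = [(f"client_files[{i}]", path) for i, path in enumerate(paths)]
    return data, file_fields

data, file_fields = build_single_request_fields(["b.jpg", "a.jpg", "c.jpg"])
```

The returned pairs could then be turned into a multipart body by an HTTP client of your choice and posted once to tasks/{id}/data.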

@nikste
Author

nikste commented Oct 10, 2022

OK, I'm not an expert in this :) My observation was that the CLI in the previous version would produce either the correct order or the inverse order, but I guess it sends everything in one POST request. With the newest CLI version, it looks completely random to me. But I didn't look into the code or how it's handled at all.

@nikste
Author

nikste commented Oct 10, 2022

thanks for looking into this though!

@zhiltsov-max
Contributor

It's still possible to create a task with older versions of the CLI (the ones that were placed in utils/cli/). They can be obtained only from this repository at specific commits (prior to v2.2.0, e.g. here). These versions sent all data in a single request, so they were not affected by this problem.

@nikste
Author

nikste commented Oct 10, 2022

What I was observing is that they (v2.1.0) do maintain order, but it's either as specified or reversed for some reason. I suspect there is another issue in that case (at least with v2.1.0).

@nikste
Author

nikste commented Oct 10, 2022

Random as in: for some tasks you need to specify the image order in reverse, and for others not. It seems to be pretty stable when creating with the same cli.py .. create .. command.

@AljoSt

AljoSt commented Apr 26, 2023

Is there any update on this? With the current version of the SDK (2.4.0), the predefined sorting method still doesn't work.

@AljoSt

AljoSt commented May 17, 2023

Just in case other people come here: job_file_mapping (https://opencv.github.io/cvat/docs/api_sdk/sdk/reference/models/data-request/) can be used for this.
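
As a rough illustration of that suggestion, the sketch below splits an explicitly ordered file list into per-job lists, which is the shape `job_file_mapping` expects according to the linked reference (a list of file-name lists, one per job). `build_job_file_mapping` and the `data_request` dict are hypothetical; how the mapping is passed to the SDK or API depends on your client version.

```python
from typing import List, Sequence

def build_job_file_mapping(
    ordered_files: Sequence[str], files_per_job: int
) -> List[List[str]]:
    if files_per_job < 1:
        raise ValueError("files_per_job must be at least 1")
    # Consecutive slices keep the caller's order within and across jobs.
    return [
        list(ordered_files[i:i + files_per_job])
        for i in range(0, len(ordered_files), files_per_job)
    ]

mapping = build_job_file_mapping(
    ["c.jpg", "a.jpg", "b.jpg", "d.jpg", "e.jpg"], files_per_job=2
)
# Illustrative payload fragment only; other data request fields are omitted.
data_request = {"job_file_mapping": mapping}
```

Each file appears in exactly one job, so the resulting lists do not overlap or repeat.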

SpecLad added a commit that referenced this issue Jun 8, 2023
Fixes #5061, #4179

- Added a way to declare custom file ordering for the local task data
uploads via TUS protocol
- Added an option to use a manifest to support the `predefined` sorting
method
- This file is required for the `predefined` sorting mode with image
archives
- Fixed file ordering when tasks are created from SDK or CLI in the
`predefined` sorting mode
- Added more tests for task data uploading API

The uploading protocol is implemented:

The user specifies `sorting_method=predefined` in the task creation
request. Then the data is uploaded.

1. Client files uploading
1.1. The files are uploaded as separate files (using the TUS protocol)
or grouped files (using the `Upload-Multiple` requests).
1.2. The `Upload-Finish` request comes (or its unlabeled legacy
equivalent). A new optional field can be supplied: `upload_file_order`,
a list of strings. It allows overriding the input file order, if
necessary, and is only valid with the `predefined` sorting method
specified.
1.2.1. If the field is empty or missing, the client files in the data
requests are considered ordered.
1.2.2. If the field is not empty, a list containing the file list in the
required order is expected in the `upload_file_order` field.
1.2.2.1. If there are `client_files` in the request, the files are
sorted accordingly.
1.2.2.2. If file lists mismatch, an explanatory error is raised.
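
The rules in step 1.2 can be sketched as a small function. This is an illustration of the described behavior under simplifying assumptions, not the server code; `resolve_upload_order` is a hypothetical name.

```python
from typing import List, Optional, Sequence

def resolve_upload_order(
    client_files: Sequence[str],
    upload_file_order: Optional[Sequence[str]] = None,
) -> List[str]:
    # 1.2.1: empty or missing field, the uploaded files are taken as ordered.
    if not upload_file_order:
        return list(client_files)
    # 1.2.2.2: both lists must contain the same files.
    if sorted(client_files) != sorted(upload_file_order):
        raise ValueError("upload_file_order does not match the uploaded files")
    # 1.2.2.1: the files are put in the requested order.
    return list(upload_file_order)

received = ["b.jpg", "c.jpg", "a.jpg"]  # arrival order is not guaranteed
final_order = resolve_upload_order(received, ["a.jpg", "b.jpg", "c.jpg"])
```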

2. Data processing
2.1. At this point, all `*_files` are considered ordered as requested.
2.2. Require a metafile for zip uploads with predefined sorting. The
file is expected to accompany the uploaded zip file, not to be inside of
the archive.
2.3. If there is a metafile in the input data, files are ordered after
the metafile.
2.3.1. If the data is extracted from cloud, only the specified subset of
the files is kept in the manifest.
2.3.2. If the upload data doesn't exist in the metafile, an error is
raised.
2.3.3. A `job_file_mapping` has higher priority than metafile, if
specified.
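
Steps 2.3 to 2.3.2 can be sketched in a similar way. The function below is an illustration only and assumes the metafile has already been parsed into a list of file names; the priority of `job_file_mapping` (2.3.3) is not modeled. `order_by_manifest` is a hypothetical name.

```python
from typing import List, Sequence

def order_by_manifest(
    uploaded: Sequence[str], manifest_order: Sequence[str]
) -> List[str]:
    known = set(manifest_order)
    # 2.3.2: every uploaded file must be listed in the metafile.
    missing = [name for name in uploaded if name not in known]
    if missing:
        raise ValueError(f"files missing from the metafile: {missing}")
    # 2.3 and 2.3.1: follow the metafile order, keeping only the uploaded subset.
    uploaded_set = set(uploaded)
    return [name for name in manifest_order if name in uploaded_set]

ordered_files = order_by_manifest(
    ["c.jpg", "a.jpg"], ["a.jpg", "b.jpg", "c.jpg"]
)
```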

Co-authored-by: Roman Donchenko <roman@cvat.ai>