Fix an issue with Imagenet dataset import #4861

Merged
nmanovic merged 17 commits into develop from ay/fix-imagenet-import on Oct 21, 2022

Conversation

yasakova-anastasia
Contributor

@yasakova-anastasia yasakova-anastasia commented Aug 26, 2022

Motivation and context

How has this been tested?

Checklist

License

  • I submit my code changes under the same MIT License that covers the project.
    Feel free to contact the maintainers if that's a concern.

@yasakova-anastasia yasakova-anastasia marked this pull request as draft August 29, 2022 12:16
@cvat-ai cvat-ai deleted a comment from github-actions bot Sep 29, 2022
@cvat-ai cvat-ai deleted a comment from github-actions bot Oct 17, 2022
@yasakova-anastasia
Contributor Author

/check

@github-actions
Contributor

github-actions bot commented Oct 17, 2022

❌ Some checks failed
📄 See logs here

@yasakova-anastasia yasakova-anastasia marked this pull request as ready for review October 18, 2022 05:25
Contributor

@zhiltsov-max zhiltsov-max left a comment

I was able to import the dataset.

However, I've found another problem: if I import an ImageNet dataset, change the label on an image, export it with images as ImageNet, and then import it again, the dataset is missing one image (the changed one). The image exists in the exported archive, but the file layout looks like this:

/cat/cat_1.jpg
...
/dog/cat/cat_0.jpg
/dog/dog_0.jpg
...
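
To make the layout problem concrete, here is a minimal sketch that flags archive entries which do not fit a flat `<label>/<image>` layout. It assumes that is the layout the importer expects; the helper names `split_label_and_name` and `find_misplaced` are hypothetical and are not part of CVAT or Datumaro.

```python
from pathlib import PurePosixPath


def split_label_and_name(path):
    """Split an archive path into (label, image name), assuming <label>/<image>."""
    parts = PurePosixPath(path.lstrip("/")).parts
    if len(parts) != 2:
        raise ValueError(f"expected <label>/<image>, got {path!r}")
    return parts[0], parts[1]


def find_misplaced(paths):
    """Return the paths that do not fit the flat <label>/<image> layout."""
    misplaced = []
    for path in paths:
        try:
            split_label_and_name(path)
        except ValueError:
            misplaced.append(path)
    return misplaced


# Entries taken from the listing above (elided entries are left out).
exported = ["/cat/cat_1.jpg", "/dog/cat/cat_0.jpg", "/dog/dog_0.jpg"]
print(find_misplaced(exported))  # ['/dog/cat/cat_0.jpg']
```

Only the relabeled image is flagged: its old label directory `cat` is kept as part of the file name under the new label `dog`.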

@yasakova-anastasia
Contributor Author

@zhiltsov-max, this problem needs to be fixed in Datumaro. I created an issue.

Contributor

@zhiltsov-max zhiltsov-max left a comment

The problem still needs to be resolved. I can get the error with a dataset like this:

images.zip/
/a/b/c.png
/q/e.jpg
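
An archive with this layout can be produced with a short script. This is a sketch only: the helper name `build_repro_archive` is hypothetical, and the entries contain placeholder bytes rather than valid image data, so real image files would have to be substituted before trying an actual import.

```python
import io
import zipfile


def build_repro_archive(entries):
    """Return the bytes of a zip archive containing the given entries."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        for name in entries:
            # Placeholder bytes only; use real image data for an import test.
            archive.writestr(name, b"placeholder")
    return buffer.getvalue()


data = build_repro_archive(["a/b/c.png", "q/e.jpg"])

# Read the archive back to confirm the nested layout was stored as intended.
with zipfile.ZipFile(io.BytesIO(data)) as archive:
    names = archive.namelist()
print(names)  # ['a/b/c.png', 'q/e.jpg']
```

To save it to disk, write `data` to a file such as `images.zip` opened in binary mode.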

@zhiltsov-max
Contributor

zhiltsov-max commented Oct 20, 2022

I see the previous problem is gone, but I was lucky enough to find another one. If we export an ImageNet dataset in CVAT format and then import the dataset, there is an error:

  File "cvat/cvat/apps/dataset_manager/bindings.py", line 1685, in import_dm_annotations
    match_dm_item(item, instance_data, root_hint=root_hint))
  File "cvat/cvat/apps/dataset_manager/bindings.py", line 1636, in match_dm_item
    raise CvatImportError("Could not match item id: "
cvat.apps.dataset_manager.bindings.CvatImportError: Could not match item id: 'cat/cat_0.jpg' with any task frame

I don't think this new problem should be fixed here, let's do it in another PR to unblock this one.
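
For illustration, the sketch below shows one way an item id with a label directory prefix could be matched against frame names. It assumes, without confirmation from this thread, that the task frames are stored without the label directory (for example `cat_0.jpg`). The function `match_frame` is hypothetical and does not reproduce the logic of `match_dm_item`.

```python
def match_frame(item_id, frame_names):
    """Return the frame matching item_id, trying progressively shorter suffixes.

    Hypothetical helper: tries the full id first, then drops leading
    directories one at a time until a known frame name is found.
    """
    known = set(frame_names)
    parts = item_id.split("/")
    for start in range(len(parts)):
        candidate = "/".join(parts[start:])
        if candidate in known:
            return candidate
    return None


# Assumed frame names; the real task frames are not shown in this thread.
frames = ["cat_0.jpg", "dog_0.jpg"]
print(match_frame("cat/cat_0.jpg", frames))  # cat_0.jpg
print(match_frame("cat/cat_9.jpg", frames))  # None
```

Suffix matching like this can be ambiguous when two labels contain files with the same name, so an actual fix may need a stricter rule.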

@yasakova-anastasia
Contributor Author

> I don't think this new problem should be fixed here, let's do it in another PR to unblock this one.

Opened a PR with a fix for this issue.

@nmanovic nmanovic merged commit 2311b10 into develop Oct 21, 2022
@zhiltsov-max zhiltsov-max deleted the ay/fix-imagenet-import branch November 30, 2022 05:32
Successfully merging this pull request may close these issues.

Import dataset of Imagenet format fail
4 participants