ToolBox/ Scripts to extract annotations from CVAT XML files #275

Closed
pnambiar opened this issue Jan 14, 2019 · 10 comments

@pnambiar

It looks like the CVAT XML converter script (to Pascal VOC) works only on data annotated in annotation mode, not in interpolation mode. It would be very useful to have MATLAB or Python scripts, or a toolbox, to extract bounding box coordinates from files annotated in interpolation mode (with track ID). That would help us convert bounding boxes to whatever format we want (e.g. YOLO format).

nmanovic self-assigned this Jan 14, 2019
nmanovic added the enhancement (New feature or request) and good first issue labels Jan 14, 2019
nmanovic added this to the 0.4.0 - Release milestone Jan 14, 2019
@nmanovic
Contributor

nmanovic commented Jan 16, 2019

Sample scripts can be found here: https://github.com/opencv/cvat/tree/develop/utils
In the context of this task it will be necessary to:

  • Prepare a Python library that parses CVAT XML annotations into Python structures (you can call it cvat/utils/cvat/parser or something like that). The library should support both the interpolation and annotation CVAT XML formats.
  • Improve the documentation for the CVAT XML format if necessary (https://github.com/opencv/cvat/blob/develop/cvat/apps/documentation/xml_format.md)
  • Reuse the library in all existing converters inside the cvat/utils directory (to eliminate code duplication)
  • Make the Pascal VOC converter support "interpolation mode" as well
  • Support YOLO format
  • Support TF Record format

All converters should have some tests to be sure that we don't break them in the future.
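
As an illustration of the first item, here is a minimal sketch of a parser that reads both flavours into one flat structure. It is not the library that was eventually written. The element and attribute names (`image`, `track`, `box`, `frame`, `outside`, `xtl`, and so on) are assumed from the CVAT XML format description linked above, and `Box` and `iter_boxes` are hypothetical names chosen for this example; check them against your own dump before relying on them.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass
from typing import Iterator, Optional


@dataclass
class Box:
    frame: int                      # frame number (interpolation) or image id (annotation)
    label: str
    xtl: float
    ytl: float
    xbr: float
    ybr: float
    track_id: Optional[int] = None  # only set for interpolation-mode boxes


def _corners(box_el: ET.Element) -> tuple:
    """Read the four corner attributes of a <box> element as floats."""
    return tuple(float(box_el.attrib[k]) for k in ("xtl", "ytl", "xbr", "ybr"))


def iter_boxes(root: ET.Element) -> Iterator[Box]:
    """Yield boxes from either CVAT XML flavour (assumed layout, see above)."""
    # Annotation mode: <image id=...> elements contain <box label=...> children.
    for image in root.iter("image"):
        for box in image.iter("box"):
            yield Box(int(image.attrib["id"]), box.attrib["label"], *_corners(box))
    # Interpolation mode: <track id=... label=...> elements contain <box frame=...>.
    for track in root.iter("track"):
        for box in track.iter("box"):
            if box.attrib.get("outside") == "1":
                continue  # the object is marked as not visible on this frame
            yield Box(int(box.attrib["frame"]), track.attrib["label"],
                      *_corners(box), track_id=int(track.attrib["id"]))


# Two tiny hand-written samples, one per mode (not real exports).
ANNOTATION_SAMPLE = """<annotations>
  <image id="3" name="a.jpg" width="640" height="480">
    <box label="car" xtl="10" ytl="20" xbr="110" ybr="220" occluded="0"/>
  </image>
</annotations>"""

INTERPOLATION_SAMPLE = """<annotations>
  <track id="0" label="car">
    <box frame="0" xtl="10" ytl="20" xbr="110" ybr="220" outside="0" occluded="0" keyframe="1"/>
    <box frame="1" xtl="12" ytl="20" xbr="112" ybr="220" outside="1" occluded="0" keyframe="1"/>
  </track>
</annotations>"""

for sample in (ANNOTATION_SAMPLE, INTERPOLATION_SAMPLE):
    for parsed_box in iter_boxes(ET.fromstring(sample)):
        print(parsed_box)
```

Flattening both modes into the same record type means a converter only has to group the records by `frame`, whichever mode produced them; the track id stays available for formats that need it.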

@jrjbertram
Contributor

https://gist.github.com/jrjbertram/7cb44b58c590eefe48f44274a1dc3fee

A quick hack of converter.py that handles interpolation. This could be a stepping stone towards the PR solution.

Note that the hack has some rough edges. It breaks support for non-interpolated annotation, so that would need to be fixed in the full version. It also has hard-coded video frame sizes, which would need to be fixed as well.

It wasn't clear to me whether interpolated and non-interpolated annotations can be used in the same video file.

One other issue is that when CVAT is used to annotate a video and you want to generate a Pascal VOC dataset, you have to manually extract each frame into a directory. I used ffmpeg to do this.

Hopefully this helps someone!

- Josh

@dbarrejon

Hello everyone! I see that our goals with this annotation tool are pretty much the same, @jrjbertram, at least sort of! I want to obtain a dataset of annotated video, and your contribution for converting from CVAT to Pascal VOC will surely help me.
But I still need to extract the corresponding annotated frames from the video. You used ffmpeg to do so, right? Which command did you use? Did you have to specify every frame you wanted to extract manually, could you pass a .txt file or something similar to specify them, or did you use a Python script?

I would be really grateful if you could help me with this :)

Thank you very much!

Daniel

@jrjbertram
Contributor

@100330734 I think we've determined on the CVAT chat channel that CVAT uses -vsync 0, so the following should work. I'll hopefully be testing it on my dataset sometime in the next couple of days.

ffmpeg -i <input_video> -vsync 0 <output_dir>/frame_%08d.jpg
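
For anyone who prefers to drive this from a script, the sketch below wraps the command quoted above with Python's `subprocess`. The function names are made up for this example, and it assumes `ffmpeg` is on the PATH. One thing to check on your own data: ffmpeg numbers image-sequence output from 1 by default, so if the frame numbers in your annotation file start at 0, the file names may be off by one.

```python
import subprocess
from pathlib import Path
from typing import List


def build_ffmpeg_command(video_path: str, output_dir: str) -> List[str]:
    """Build the argument list for the frame-extraction command from the thread."""
    pattern = str(Path(output_dir) / "frame_%08d.jpg")
    return ["ffmpeg", "-i", video_path, "-vsync", "0", pattern]


def extract_frames(video_path: str, output_dir: str) -> None:
    """Create output_dir if needed and run ffmpeg.

    Raises subprocess.CalledProcessError if ffmpeg exits with an error.
    """
    Path(output_dir).mkdir(parents=True, exist_ok=True)
    subprocess.run(build_ffmpeg_command(video_path, output_dir), check=True)


# Only show the command here; call extract_frames(...) to actually run it.
print(" ".join(build_ffmpeg_command("clip.mp4", "frames")))
```

Passing the arguments as a list rather than one shell string avoids quoting problems with paths that contain spaces.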

@nmanovic
Contributor

nmanovic commented Aug 2, 2019

I'm going to close the issue. CVAT now supports downloading annotations in Pascal VOC format directly from the UI. Soon (in about a week) it will also support YOLO and TF Record.


nmanovic closed this as completed Aug 2, 2019
@tanmayGIT

tanmayGIT commented Jun 5, 2023

Hi,
I am facing a similar problem. I have image-level annotations: I labeled some images as invalid during annotation.
[Screenshot from 2023-06-05 09-37-03]

But when I export in the "Pascal VOC XML" format, I don't get the information related to these image-level labels, i.e. the "unvalid" (invalid) label.

Any idea?

@nmanovic
Contributor

nmanovic commented Jun 5, 2023

@tanmayGIT, how should information about image-level labels be represented in the Pascal VOC XML format?

@tanmayGIT

Thanks for your response. Then how can I obtain the image-level label information from the exported annotations? Should I use some other annotation format to get this information?
I really appreciate any help you can provide.

@tanmayGIT

Hello again,
In fact, I found the answer. If we export in the "CVAT for Images 1.1" format, we get the information at the image level as well as at the annotation level. Then we just need to parse the XML and convert it into the desired format. In my case, I needed the YOLO format.

[Image: MicrosoftTeams-image]
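
To make the idea concrete, here is a small self-contained sketch of how image-level tags could be read from such an export. The sample XML is hand-written for this example rather than taken from a real export, and it assumes that image-level labels appear as `<tag label="...">` children of each `<image>` element; `image_tags` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# Hand-written sample in the assumed "CVAT for Images 1.1" layout.
SAMPLE = """<annotations>
  <image id="0" name="a.jpg" width="640" height="480">
    <tag label="unvalid"/>
  </image>
  <image id="1" name="b.jpg" width="640" height="480">
    <box label="car" xtl="10" ytl="20" xbr="110" ybr="220" occluded="0"/>
  </image>
</annotations>"""


def image_tags(root: ET.Element) -> dict:
    """Map each image name to the list of its image-level tag labels."""
    return {
        image.attrib["name"]: [tag.attrib["label"] for tag in image.findall("tag")]
        for image in root.findall("image")
    }


tags = image_tags(ET.fromstring(SAMPLE))
for name, labels in tags.items():
    print(name, labels)
```

Images without any tag map to an empty list, so filtering on a label such as "unvalid" is a simple membership test.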

@tanmayGIT

tanmayGIT commented Jun 8, 2023

Here is the code I wrote to parse the CVAT-formatted XML file; it may help others.
Note that the image-level tags are obtained with image_tag_meta = elem.findall("tag"). Then I filter out the images tagged "unvalid" (i.e. invalid).

    import xml.etree.ElementTree as EleTree


    def parse_cvat_xml(xml_file):
        """
        Read a CVAT XML file and extract the per-image information as dictionaries.

        Args:
            xml_file: The path of the XML file
        Returns:
            A list with one dictionary per kept image.
        """
        root = EleTree.parse(xml_file).getroot()

        keep_all_valid_imag_info_together = []
        get_meta_data = root.findall("image")

        # Parse the XML Tree
        for elem in get_meta_data:

            image_whole_info = {}

            image_id = int(elem.attrib["id"])
            image_full_path = elem.attrib["name"]
            image_width = int(elem.attrib["width"])
            image_height = int(elem.attrib["height"])

            image_whole_info["image_id"] = image_id
            image_whole_info["full_path"] = image_full_path
            image_whole_info["image_width"] = image_width
            image_whole_info["image_height"] = image_height

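            # AnalyzeAnnotationFiles is not defined in this snippet. The call below
            # skips images whose id is outside the required range; drop or replace
            # it to run this function on its own.
            imag_id_valid_flag = AnalyzeAnnotationFiles.verify_annotation_image_ids_range(image_id)
            if not imag_id_valid_flag:
                continue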

            image_tag_meta = elem.findall("tag")

            # Skip the image if any of its image-level tags is labelled "unvalid"
            unvalid_flag = any(tag.attrib["label"] == "unvalid" for tag in image_tag_meta)

            if unvalid_flag:
                continue

            image_level_all_bbox_yolo_coords = {'yolo_coords': []}
            image_level_all_bbox_attrib = {'boox_attrib': []}

            object_metas = elem.findall("box")
            for bbox in object_metas:

                box_label = bbox.attrib["label"]
                xtl = float(bbox.attrib["xtl"])
                ytl = float(bbox.attrib["ytl"])
                xbr = float(bbox.attrib["xbr"])
                ybr = float(bbox.attrib["ybr"])

                # CVAT to yolo
                yolo_x = round(((xtl + xbr) / 2) / image_width, 6)
                yolo_y = round(((ytl + ybr) / 2) / image_height, 6)
                yolo_w = round((xbr - xtl) / image_width, 6)
                yolo_h = round((ybr - ytl) / image_height, 6)

                # Keeping all the yolo coords
                image_level_all_bbox_yolo_coords['yolo_coords'].append([box_label, yolo_x, yolo_y, yolo_w, yolo_h])

                bbox_metas = bbox.findall("attribute")

                keep_each_attribute = {}
                for attr in bbox_metas:
                    attribute_name = attr.attrib["name"]
                    attribute_value = attr.text

                    keep_each_attribute[attribute_name] = attribute_value

                image_level_all_bbox_attrib['boox_attrib'].append(keep_each_attribute)

            # keeping all the bbox yolo coordinates info of the image
            image_whole_info["all_bbox_yolo_coords"] = image_level_all_bbox_yolo_coords

            # keeping all the bbox attribute info of the image
            image_whole_info["all_bbox_attributes_info"] = image_level_all_bbox_attrib

            keep_all_valid_imag_info_together.append(image_whole_info)

        return keep_all_valid_imag_info_together

The final result is a list with one dictionary per valid image in the dataset (e.g. 2107 images here):
[Screenshot from 2023-06-08 11-33-29]
For example, the entry for the 7th image looks like this. It keeps the image id, image path, width, and height. It also keeps the coordinates of all the bounding boxes in the image in a dictionary called "all_bbox_yolo_coords", and the attributes of each bounding box in a dictionary called "all_bbox_attributes_info":

[Screenshot from 2023-06-08 11-33-56]

The "all_bbox_yolo_coords" dictionary looks like this; it holds the information for all the bounding boxes:

[Screenshot from 2023-06-08 11-34-15]

The dictionary of attributes for each bounding box looks like this:

[Screenshot from 2023-06-08 11-34-35]
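
As a possible next step, the per-image dictionaries can be written out as YOLO label files. The sketch below is an assumption-laden example and not part of the code above: `write_yolo_labels` is a hypothetical helper, the order of `class_names` is something you have to define yourself (it must match your training configuration), and the usual YOLO text layout of one `<class_index> <x_center> <y_center> <width> <height>` line per box is assumed. The input mirrors the "full_path" and "all_bbox_yolo_coords" keys described above.

```python
import tempfile
from pathlib import Path


def write_yolo_labels(images, class_names, out_dir):
    """Write one .txt file per image and return the list of written paths."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    index_of = {name: i for i, name in enumerate(class_names)}
    written = []
    for info in images:
        lines = [
            f"{index_of[label]} {x:.6f} {y:.6f} {w:.6f} {h:.6f}"
            for label, x, y, w, h in info["all_bbox_yolo_coords"]["yolo_coords"]
        ]
        # The label file is named after the image file (same stem, .txt extension).
        target = out / (Path(info["full_path"]).stem + ".txt")
        target.write_text("\n".join(lines) + ("\n" if lines else ""))
        written.append(target)
    return written


# Made-up entry in the same shape as the parser output described above.
example = [{
    "full_path": "images/frame_000007.jpg",
    "all_bbox_yolo_coords": {"yolo_coords": [["car", 0.5, 0.5, 0.25, 0.4]]},
}]

with tempfile.TemporaryDirectory() as tmp:
    paths = write_yolo_labels(example, ["person", "car"], tmp)
    label_name = paths[0].name
    label_text = paths[0].read_text()

print(label_name)
print(label_text, end="")
```

Note that images with the same file name in different folders would overwrite each other's label file in this sketch; keep the folder structure in `target` if that applies to your dataset.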
