C4KC-KiTS: Refinement of the submitted DICOM SEGs #2

Closed
fedorov opened this issue Jan 10, 2020 · 17 comments

@fedorov
Member

fedorov commented Jan 10, 2020

Dataset: https://dx.doi.org/10.7937/TCIA.2019.IX49E8NX

Description: This collection contains subjects from the training set of the 2019 Kidney and Kidney Tumor Segmentation Challenge (KiTS19). The challenge aimed to accelerate progress in automatic 3D semantic segmentation by releasing a dataset of CT scans for 210 patients with manual semantic segmentations of the kidneys and tumors in the corticomedullary phase.

The imaging was collected during routine care of patients who were treated by either partial or radical nephrectomy at the University of Minnesota Medical Center. Many of the CT scans were acquired at referring institutions and are therefore heterogeneous in terms of scanner manufacturers and acquisition protocols. Semantic segmentations were performed by students under the supervision of an experienced urologic cancer surgeon.

Segmentations were created using dcmqi.

The issue is that segmentations are stored as sagittal series, while CT images are axial. This is one of the reasons there are difficulties loading this dataset into OHIF Viewer (see OHIF/Viewers#1345), and potentially this can cause problems for other tools and users.

Storing those segmentations as axial would be a completely lossless operation, should not be very difficult, and should not affect users of the collection (the dataset has only just been released), so it may be worthwhile to fix this now.

@fedorov added the "existing dataset" label (Harmonization of a dataset already released publicly) on Jan 10, 2020
@fedorov self-assigned this on Jan 10, 2020
@fedorov
Member Author

fedorov commented Jan 10, 2020

I emailed Nick Heller asking him to clarify why the segmentations are sagittal.

@fedorov
Member Author

fedorov commented Feb 4, 2020

Sent another reminder on Jan 15, no response.

@fedorov
Member Author

fedorov commented Mar 29, 2020

Finally had a conversation with Nick Heller. The sagittal orientation of segmentations is an accident, and he agrees it makes a lot of sense to store those in axial orientation consistent with the imaging data orientation. The non-DICOM segmentations are in this repository: https://github.com/neheller/kits19/tree/master/data. However, due to some problem on TCIA, the images corresponding to this dataset are currently not available and appear as restricted access. Waiting to have that issue resolved.

@fedorov
Member Author

fedorov commented Apr 22, 2020

@afshinmessiah what needs to be done:

  1. reorient segmentations from this repository https://github.com/neheller/kits19/tree/master/data to be in axial orientation - you can use itk-python to read/reorient/write NIfTI files
  2. download the CT images from the collection here: https://dx.doi.org/10.7937/TCIA.2019.IX49E8NX
  3. extract JSON metadata from the SEG files in that collection using segimage2itkimage in dcmqi
  4. run conversion of the reoriented axial SEGs using JSON metadata file and the corresponding CT series using itkimage2segimage dcmqi tool

Let me know if this makes sense! Thank you for your help with this.
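
Steps 3 and 4 above could be scripted roughly as follows. This is only a sketch: it builds the command lines for the two dcmqi tools without running them, all paths are hypothetical placeholders, and the flag names and the `meta.json` output name are written from memory of the dcmqi command-line help, so check them against `--help` for the installed version.

```python
import subprocess


def seg_to_itk_cmd(seg_file, out_dir, output_type="nifti"):
    """Command that unpacks a DICOM SEG into per-segment volumes plus JSON metadata."""
    return [
        "segimage2itkimage",
        "--inputDICOM", str(seg_file),
        "--outputDirectory", str(out_dir),
        "--outputType", output_type,
    ]


def itk_to_seg_cmd(label_files, ct_dir, meta_json, out_seg):
    """Command that packs reoriented label volumes back into a DICOM SEG."""
    return [
        "itkimage2segimage",
        "--inputImageList", ",".join(str(f) for f in label_files),
        "--inputDICOMDirectory", str(ct_dir),
        "--inputMetadata", str(meta_json),
        "--outputDICOM", str(out_seg),
    ]


def run(cmd):
    """Run one command and fail loudly if the tool returns an error."""
    subprocess.run(cmd, check=True)


# Hypothetical paths for a single case; nothing is executed here.
extract = seg_to_itk_cmd("case_00002/seg.dcm", "case_00002/extracted")
convert = itk_to_seg_cmd(
    ["case_00002/axial/1.nii.gz", "case_00002/axial/2.nii.gz"],
    "case_00002/ct",
    "case_00002/extracted/meta.json",  # assumed name of the extracted metadata file
    "case_00002/seg_axial.dcm",
)
print(" ".join(extract))
print(" ".join(convert))
```

Calling `run(extract)` and `run(convert)` would execute the tools, provided the dcmqi binaries are on the `PATH`.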

@fedorov
Member Author

fedorov commented Apr 27, 2020

Given further clarification from the data submitter, we should use the DICOM SEG content instead of the NIfTI files in the GitHub repo. Let's adjust the process as follows:

  1. take the DICOM SEG series from a subject;
  2. convert it into NRRD or NIfTI using segimage2itkimage;
  3. reorient the segmentation;
  4. convert the reoriented volume back into SEG

You can get the CT series that corresponds to the SEG from ReferencedSeriesSequence > SeriesInstanceUID (the ReferencedInstanceSequence in the same item lists the individual referenced instances).

Once you have the process worked out for a single case, please let me know so we can review together before proceeding with the conversion for the whole collection.
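
A minimal sketch of the series lookup, assuming the SEG follows the standard DICOM layout in which each ReferencedSeriesSequence item carries a SeriesInstanceUID next to its ReferencedInstanceSequence. The function only uses attribute access, so it should work on a dataset read with pydicom; the example below uses a stand-in object with a made-up UID so that it runs without any DICOM files.

```python
from types import SimpleNamespace


def referenced_series_uids(seg_dataset):
    """Return the SeriesInstanceUID of every item in ReferencedSeriesSequence."""
    items = getattr(seg_dataset, "ReferencedSeriesSequence", None) or []
    return [item.SeriesInstanceUID for item in items]


# Stand-in for a SEG dataset; the UID is invented for illustration.
# With pydicom installed, something like
#   seg = pydicom.dcmread("seg.dcm", stop_before_pixels=True)
# would supply a real dataset instead.
seg = SimpleNamespace(
    ReferencedSeriesSequence=[
        SimpleNamespace(
            SeriesInstanceUID="1.2.3.4.5",
            ReferencedInstanceSequence=[],
        )
    ]
)

print(referenced_series_uids(seg))
```

A SEG that references a single CT series yields a one-element list; an empty list means the sequence is missing and the CT has to be matched some other way.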

@afshinmessiah
Contributor

Here are the code and result for case 2 of the data:
Transform.zip

@fedorov
Member Author

fedorov commented Apr 28, 2020

There's a mismatch in the SEG boundaries. I think this is because you transform the origin, which I don't think you need to do: the resampler can just be initialized with the geometry of the reference image volume.

image
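
To illustrate what "initialized with the geometry of the reference image volume" means, here is a toy, pure-Python nearest-neighbour resampler. It is not the code from the attached archive; the geometry dictionaries, sizes, and label positions are invented for the example. The output grid takes its size, origin, spacing, and direction from the reference (axial) volume, and each output voxel looks up the source (sagittal) voxel at the same physical position, so no origin has to be transformed by hand.

```python
from itertools import product


def index_to_point(index, geom):
    """Map a voxel index (i, j, k) to a physical point (x, y, z)."""
    origin, spacing, direction = geom["origin"], geom["spacing"], geom["direction"]
    return tuple(
        origin[r] + sum(direction[r][c] * spacing[c] * index[c] for c in range(3))
        for r in range(3)
    )


def point_to_index(point, geom):
    """Map a physical point to the nearest voxel index.

    Assumes an orthonormal direction matrix, so its inverse is its transpose.
    """
    origin, spacing, direction = geom["origin"], geom["spacing"], geom["direction"]
    offset = [point[r] - origin[r] for r in range(3)]
    return tuple(
        round(sum(direction[r][c] * offset[r] for r in range(3)) / spacing[c])
        for c in range(3)
    )


def resample_to_reference(labels, src_geom, ref_geom, background=0):
    """Resample a sparse label map (index -> label) onto the reference grid."""
    size = src_geom["size"]
    out = {}
    for idx in product(*(range(n) for n in ref_geom["size"])):
        src = point_to_index(index_to_point(idx, ref_geom), src_geom)
        inside = all(0 <= src[a] < size[a] for a in range(3))
        out[idx] = labels.get(src, background) if inside else background
    return out


# Axial reference grid: index axes follow x, y, z.
axial = {
    "size": (4, 3, 2),
    "origin": (0.0, 0.0, 0.0),
    "spacing": (1.0, 1.0, 2.0),
    "direction": [[1, 0, 0], [0, 1, 0], [0, 0, 1]],
}
# Sagittal storage of the same volume: index axes follow y, z, x.
sagittal = {
    "size": (3, 2, 4),
    "origin": (0.0, 0.0, 0.0),
    "spacing": (1.0, 2.0, 1.0),
    "direction": [[0, 0, 1], [1, 0, 0], [0, 1, 0]],
}
sagittal_labels = {(1, 0, 2): 1, (2, 1, 3): 2}  # two labelled voxels

axial_labels = resample_to_reference(sagittal_labels, sagittal, axial)
print({idx: v for idx, v in axial_labels.items() if v})
```

When the two grids cover the same voxels, as in this toy case, the operation is a pure re-indexing and no label is lost, which is why nearest-neighbour lookup is the natural choice for label maps.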

@afshinmessiah
Contributor

You're right. At first, I wrote the code to use this segmentation set. Since those files lack the origin, I couldn't use the DICOM image as the reference image directly and had to take care of the output image properties myself. For the DICOM segmentation, though, the image properties were correct, so I could use the DICOM image as the reference image.
The problem stems not from a miscalculation but from a slight difference in image size between the pre-transformed DICOM segmentation (for case 2 the size is 513, 512, 261) and the corresponding NIfTI files (size is 512, 512, 261). Anyway, I have attached the results and the modified code:
Transform2.zip
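
A size difference like this is easy to catch before resampling. The helper below is a hypothetical check, not part of the attached code; it reports which axes differ, using the two sizes quoted above for case 2.

```python
def mismatched_axes(size_a, size_b):
    """Return the axis numbers on which two image sizes differ."""
    if len(size_a) != len(size_b):
        raise ValueError("images have different dimensionality")
    return [axis for axis, (a, b) in enumerate(zip(size_a, size_b)) if a != b]


dicom_seg_size = (513, 512, 261)  # pre-transformed DICOM segmentation, case 2
nifti_size = (512, 512, 261)      # corresponding NIfTI file

print(mismatched_axes(dicom_seg_size, nifti_size))  # → [0]
```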

@fedorov
Member Author

fedorov commented Apr 29, 2020

Yes, looks good now - thanks! Can you make a folder named issue-2 in this repo, and put the final conversion code in that folder?

Let's wait to hear back about the issue above before proceeding with the conversion for the whole dataset.

@fedorov
Member Author

fedorov commented May 1, 2020

@afshinmessiah I confirmed the segmentation you generated loads correctly in OHIF Viewer (you can use the link in the issue above and try yourself - it does not always work, but that is due to issues in OHIF Viewer, not the data).

image

Please go ahead with the conversion of the complete dataset!

@afshinmessiah
Contributor

Here you can find all cases.

@fedorov
Member Author

fedorov commented May 5, 2020

Thank you @afshinmessiah! Next time, it would be great if you could upload the resulting dataset into the cloud bucket instead of Dropbox (for this one, I am already uploading). There is an issue-specific folder here where I organize data: https://console.cloud.google.com/storage/browser/tcia-idc-datareviewcoordination/?forceOnBucketsSortingFiltering=false&project=idc-tcia

@afshinmessiah
Contributor

Sorry, @fedorov! Sure, I will. I checked the link with my Gmail account, and it says:
"Additional permissions required to list objects in this bucket: Ask a project or bucket owner to grant you 'storage.buckets.list' permissions (e.g., by giving your account the IAM Storage Object Viewer role)"

@fedorov
Member Author

fedorov commented May 5, 2020

@wlongabaugh can you please add @afshinmessiah to the idc-tcia project with the same permissions as me?

@fedorov
Member Author

fedorov commented May 6, 2020

@afshinmessiah
Contributor

Done.

@fedorov
Member Author

fedorov commented May 20, 2020

spot checks completed in OHIF viewer, dataset shared with TCIA via https://drive.google.com/drive/folders/1XiQEnGNxCCUkGK_pwjIVwJsZ-QNS7MPB?usp=sharing

@fedorov closed this as completed on May 20, 2020