Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Error Parsing DICOM with implicit transfer syntax and non-standard tag #220

Open
klibertowski opened this issue Oct 8, 2021 · 15 comments
Open
Labels
bug Something isn't working enhancement New feature or request

Comments

@klibertowski
Copy link

I'm getting an error error reading value unexpected EOF when attempting to parse a DICOM.

Screen Shot 2021-10-08 at 4 47 28 PM

After doing a bit of debugging, I determined the problem is occurring because the DICOM has an implicit transfer syntax and a tag that is outside of the DICOM standard (0029,1140 ApplicationHeaderSequence SQ). This is causing the VR to be "UN" and the tag to be read as a string with a length of 4GB, which obviously fails.

Is there a way I can add a custom tag to the DICOM dict so it is correctly identified with a VR of "SQ"? Or is there another way around this problem?

I can upload the DICOM I'm getting this error on later if that will be helpful.

@suyashkumar suyashkumar added bug Something isn't working enhancement New feature or request labels Oct 23, 2021
@suyashkumar
Copy link
Owner

suyashkumar commented Oct 23, 2021

Hi @klibertowski thanks for reporting!

This is an interesting situation indeed, and I wonder how other DICOMs handle this in general--my guess is that for private tags like this it would probably be easiest if they were encoded in explicit transfer syntax at the start, as I'm not sure how parsers are supposed to even read these (or parse the whole dicom correctly) in the general case in an implicit transfer syntax setting without having special lookup tables as you suggest. I'm not familiar with existing opinions on how to handle things like this or what the standard suggests for private tags in an implicit transfer syntax, so need to do some reading on that.

Thoughts on a solution

Anyhow, I think this is something we should make easy for our users, even if unconventional. Assuming there isn't another opinion on how to handle these cases in general (for which I'll need to do some more reading), I think it could be reasonable for us to introduce a new ParseOption that can take a side input of private tag information that could be used by the Parser to parse DICOMs along the lines of what you suggest. Shouldn't be too difficult, imo. I can take a look at this, perhaps after taking a quick look around to see what context there might be on this kind of thing from the parser perspective.

@klibertowski
Copy link
Author

Hi @suyashkumar, thanks for looking into this!

I did a bit of research and it seems pydicom has recently ran into this issue as well. Their solutions was to parse UN tags with undefined length as SQ tags. This is based on this dicom standard.

This seems like a good solution since it doesn't require the user to manually input private tags.

@suyashkumar
Copy link
Owner

@klibertowski fantastic, thanks so much for looking into that! Indeed I was wondering about this, and figured there would be an opinion on it--this seems like a reasonable and at the very least consistent approach! I think we should go with that. I wonder if there will ever be UN tags that are not undefined length SQ (in which case extra info would be needed to parse them correctly--but it sounds like they would be non compliant if that was the case).

@suyashkumar
Copy link
Owner

suyashkumar commented Oct 29, 2021

Also, if there is an example DICOM with such a tag you can find or freely contribute to this repository (e.g. has the correct licensing to allow that, and is de-identified), it could be helpful to add it to the test set. We cite all the sources and licenses for any such DICOMs as well here and here. We'd have to do this after the fix goes in, but could be handy to have when kicking the tires on a fix too, in addition to unit tests.

The cancer imaging archive (TCIA) is usually a great source of DICOMs, if you happen to have run across any from there with such an unknown tag. We can always synthetically create one too if needed. I'll probably hunt around TCIA when I get a chance sometime next week to see if I can find something!

(If you don't have something handy, no worries at all--I know it can be challenging to find such dicoms that are "contributable" licensing and de-id wise, and I'll take a closer look when I can).

@klibertowski
Copy link
Author

I tried anonymizing the DICOM I was getting the problem on using Osirix and DicomBrowser. However, when I did this I was no longer getting the error when trying to parse it. The DICOM editing programs kept changing the VL on the tag I was getting the issue on from undefined to a set value when exporting for some reason. So it seems we might need to find another DICOM for testing or make our own as you mentioned.

I did implement a quick fix based on what we discussed and it worked when testing with the original DICOM I was having issues with. So at least we know that should fix the problem.

@klibertowski
Copy link
Author

I had some time to spend on this and was able to get an anonymized DICOM with the correct tag info by manually editing some values. The DICOM should be safe to use without any licenses.

I could write some tests and put in a PR for this issue if you'd like?

@suyashkumar
Copy link
Owner

Thanks @klibertowski! hm, I think it might be safer to not use that particular DICOM without an explicit license from the owner, even if de-identified. However, writing unit tests for the fix (and confirming it works on the dicom if you run it locally) I think would be sufficient at this time.

Thanks again!

I took a tiny stab at this earlier, but need to make some adjustments and write unit tests, which I can try to do this week, or next week if I have some downtime! You are also welcome to send a PR with a fix and unit tests if you would like and have the time, but no pressure if not (though, do let me know in advance if you're going to send something so we don't overlap).

Thanks!

@klibertowski
Copy link
Author

Hi @suyashkumar, I'll let you send the PR whenever you have time since you started working on it. Thanks for your help on this!

@krishpranav
Copy link

krishpranav commented Dec 29, 2021

hey @everyone I am facing this issue while parsing dcm to jpg

  • code:
package main

import (
	"fmt"
	"image/jpeg"
	"os"

	"github.com/suyashkumar/dicom"
	"github.com/suyashkumar/dicom/pkg/tag"
)

func main() {
	dataset, _ := dicom.ParseFile("dicom/2.dcm", nil)
	pixelDataElement, _ := dataset.FindElementByTag(tag.PixelData)
	pixelDataInfo := dicom.MustGetPixelDataInfo(pixelDataElement.Value)
	for i, fr := range pixelDataInfo.Frames {
		img, _ := fr.GetImage()
		f, _ := os.Create(fmt.Sprintf("image_%d.jpg", i))
		_ = jpeg.Encode(f, img, &jpeg.Options{Quality: 100})
		_ = f.Close()
	}
}

@krishpranav
Copy link

  • it is giving the output file as image_0.jpg

@krishpranav
Copy link

image_0

@krishpranav
Copy link

2

@krishpranav
Copy link

image_0

it is the output by running the code

@krishpranav
Copy link

2

I used another converter online

@suyashkumar
Copy link
Owner

suyashkumar commented Dec 29, 2021

Hi @krishpranav are you getting the EOF error reported earlier, or just an apparently dark image? The image appearing black may be working as intended, because we do not apply any pixel value auto scaling yet:

Note: for some DICOMs (with native pixel data) no automatic intensity scaling is applied yet (this is coming). You can apply this in your image viewer if needed (in Preview on mac, go to Tools->Adjust Color).

Also, regarding the images you posted, by posting them I assume you have the license to post those and that they are de-identified (they appear to be, but that is for you to determine as the poster--if not, please delete those comments).

If you're not getting the EOF error, and it is not working-as-intended due to no auto scaling being applied, please open another issue instead of this one. Thanks!

suyashkumar added a commit that referenced this issue Jun 9, 2024
This addresses #220 (and one of the dicoms in #327) by treating elements with a VR=UN and undefined VL as Sequences.

This pulls from work that I and @jabillings did previously (#235).

Some follow on changes may be needed to support this part of [the spec CP](https://dicom.nema.org/dicom/cp/cp246_01.pdf):
> If at some point an application knows the actual VR for an Attribute of VR UN (e.g. has its own applicable data dictionary), it can assume that the Value Field of the Attribute is encoded in Little Endian byte ordering with implicit VR encoding, irrespective of the current TransferSyntax.

---------

Co-authored-by: jabillings <jabillings@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working enhancement New feature or request
Projects
None yet
Development

No branches or pull requests

3 participants