Dataset HU range #27

Open
kirmans opened this issue Jun 24, 2021 · 1 comment


kirmans commented Jun 24, 2021

Hi @frankkramer

When I check the dataset, there are two parts: coronacases and Radiopaedia. For the Radiopaedia part, the images are scaled to 0-255, but for the coronacases part the HU range is -1250 to 250. I wonder how you overcome this problem.

kirmans changed the title from "Dataset" to "Dataset HU range" on Jun 24, 2021

muellerdo commented Jun 24, 2021

Hi @kirmans,

thanks for your interest in our study.

You are right, the Ma et al. dataset consists of coronacases and Radiopaedia data.
Whereas the Radiopaedia data was already normalized to grayscale, the coronacases kept their original HU range. However, they were not originally clipped to -1250 to 250; they had a normal CT range from about -1000 up to several thousand HU (maxima between roughly 1,800 and 9,600 in the data exploration output below).

As you already mentioned, we first clipped all samples to the range -1250 to 250. Ideally, only the coronacases should be clipped (which is what we did for the result data in our publication), but we noticed that the high-intensity regions (250-255) in the Radiopaedia volumes are not relevant for performance. That is why we published the simplistic approach of clipping all samples (coronacases and Radiopaedia) to -1250 to 250. On the grayscale-normalized Radiopaedia data this cuts off only the top 5 intensity values, while it performs a reasonable clipping on the coronacases.
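
To make the effect of clipping both kinds of data concrete, here is a minimal sketch using NumPy. The arrays are synthetic stand-ins rather than real samples, and `clip_volume` is a hypothetical helper name, not a function from the repository.

```python
import numpy as np

HU_MIN, HU_MAX = -1250, 250  # clipping window quoted in this thread


def clip_volume(volume: np.ndarray) -> np.ndarray:
    """Clip every voxel to the [HU_MIN, HU_MAX] window (hypothetical helper)."""
    return np.clip(volume, HU_MIN, HU_MAX)


# Synthetic stand-ins (assumptions, not dataset samples):
# one array in raw HU, one already scaled to 0-255 grayscale.
hu_like = np.array([-1021.0, -800.0, 75.0, 2996.0])
gray_like = np.array([0.0, 128.0, 250.0, 253.0, 255.0])

clipped_hu = clip_volume(hu_like)
clipped_gray = clip_volume(gray_like)

print(clipped_hu)    # the value above 250 collapses to 250
print(clipped_gray)  # only 253 and 255 change; everything else is untouched
```

For the grayscale-like array, the lower bound never applies and only the few values above 250 are altered, which is why clipping everything is close to a no-op for volumes that are already in the 0-255 range.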

Paper extract:

We exploited the Hounsfield units (HU) scale by clipping the pixel intensity values of the images to -1,250 as minimum and +250 as maximum, because we were interested in infected regions (+50 to +100 HU) and lung regions (-1,000 to -700 HU). It was only possible to apply the clipping approach on the Coronacases Initiative CTs, because the Radiopaedia volumes were already normalized to a grayscale range between 0 and 255.

Therefore, long story short: we performed clipping only on the coronacases and none on the Radiopaedia volumes, sadly.
We also performed grayscale normalization on the coronacases afterwards, so that they are processed in the same way as the Radiopaedia volumes. As a final step, we further normalized both of them via Z-score to increase the efficiency of the model fitting process.
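
The order of steps described above (clip, grayscale normalization, Z-score) can be sketched as follows. This is only an illustration under stated assumptions: the grayscale step is assumed to be a linear min-max rescaling of the clipping window to 0-255, the Z-score is assumed to be computed per volume, and `to_grayscale` and `zscore` are hypothetical names. The actual preprocessing code in the repository may differ.

```python
import numpy as np


def to_grayscale(volume, lo=-1250.0, hi=250.0):
    """Clip to [lo, hi], then rescale linearly to 0-255 (assumed min-max scaling)."""
    clipped = np.clip(volume.astype(np.float64), lo, hi)
    return (clipped - lo) / (hi - lo) * 255.0


def zscore(volume):
    """Per-volume Z-score: subtract the mean, divide by the standard deviation."""
    std = volume.std()
    if std == 0:
        # Constant volume: only center it to avoid dividing by zero.
        return volume - volume.mean()
    return (volume - volume.mean()) / std


# Synthetic stand-ins for the two data sources (assumptions, not real scans).
rng = np.random.default_rng(0)
coronacases_like = rng.uniform(-1021, 3000, size=(8, 8, 4))  # raw HU-like values
radiopaedia_like = rng.uniform(0, 255, size=(8, 8, 4))       # already grayscale

gray = to_grayscale(coronacases_like)  # HU-like input is brought to 0-255
cc = zscore(gray)                      # then Z-scored
rp = zscore(radiopaedia_like)          # grayscale input is only Z-scored

print(gray.min(), gray.max())
print(cc.mean(), cc.std(), rp.mean(), rp.std())
```

After these steps both kinds of volumes have zero mean and unit standard deviation, regardless of whether they started as raw HU or as 0-255 grayscale.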

Here is also a small output of our data exploration, which can be reproduced by running scripts/data_exploration.py:

                                 vol_shape  vol_minimum  vol_maximum  \
coronacases_001         (512, 512, 301, 1)      -1021.0       2996.0   
coronacases_002         (512, 512, 200, 1)      -1023.0       9567.0   
coronacases_003         (512, 512, 200, 1)      -1023.0       8931.0   
coronacases_004         (512, 512, 270, 1)      -1021.0       2020.0   
coronacases_005         (512, 512, 290, 1)      -1021.0       5528.0   
coronacases_006         (512, 512, 213, 1)      -1023.0       2217.0   
coronacases_007         (512, 512, 249, 1)      -1023.0       2515.0   
coronacases_008         (512, 512, 301, 1)      -1021.0       8575.0   
coronacases_009         (512, 512, 256, 1)      -1021.0       1845.0   
coronacases_010         (512, 512, 301, 1)      -1021.0       1920.0   
radiopaedia_10_85902_1   (630, 630, 39, 1)          0.0        255.0   
radiopaedia_10_85902_3  (630, 630, 418, 1)          0.0        255.0   
radiopaedia_14_85914_0  (630, 401, 110, 1)          0.0        255.0   
radiopaedia_27_86410_0   (630, 630, 66, 1)          4.0        255.0   
radiopaedia_29_86490_1   (630, 630, 42, 1)          0.0        255.0   
radiopaedia_29_86491_1   (630, 630, 42, 1)          0.0        255.0   
radiopaedia_36_86526_0   (630, 630, 45, 1)          0.0        255.0   
radiopaedia_40_86625_0   (630, 630, 93, 1)         12.0        255.0   
radiopaedia_4_85506_1    (630, 630, 39, 1)          0.0        255.0   
radiopaedia_7_85703_0    (630, 630, 45, 1)          0.0        255.0   
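
A summary like the one above can be approximated with a few lines of NumPy. The sketch below is not the repository's `scripts/data_exploration.py`; it uses synthetic arrays instead of loading the dataset, `summarize` is a hypothetical helper, and the "negative minimum means raw HU" check is only a heuristic assumption based on the ranges shown in the table.

```python
import numpy as np


def summarize(volumes):
    """Return (name, shape, minimum, maximum, looks_like_hu) for each volume."""
    rows = []
    for name, vol in volumes.items():
        vmin, vmax = float(vol.min()), float(vol.max())
        # Heuristic assumption: negative intensities suggest raw HU values,
        # while 0-255 grayscale volumes never go below zero.
        rows.append((name, vol.shape, vmin, vmax, vmin < 0))
    return rows


# Synthetic stand-ins with the same kind of value ranges as in the table.
volumes = {
    "hu_like_example": np.linspace(-1021, 2996, 12).reshape(2, 2, 3, 1),
    "gray_like_example": np.linspace(0, 255, 12).reshape(2, 2, 3, 1),
}

rows = summarize(volumes)
for name, shape, vmin, vmax, is_hu in rows:
    print(f"{name:<20} {str(shape):<14} {vmin:>9.1f} {vmax:>9.1f}  raw_hu={is_hu}")
```

A check of this kind could be used to decide per volume whether clipping and grayscale normalization are still needed.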

If you have more questions, feel free to ask.

Cheers,
Dominik

muellerdo self-assigned this on Jun 24, 2021
muellerdo added the "question" (Further information is requested) label on Jun 24, 2021