Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

KK discrepancies #579

Closed
ntustison opened this issue Mar 16, 2024 · 14 comments · Fixed by #590
Closed

KK discrepancies #579

ntustison opened this issue Mar 16, 2024 · 14 comments · Fixed by #590
Assignees
Labels

Comments

@ntustison
Copy link
Member

Okay, just a quick update on this issue. It appears that there's something about running KK through the ANTsPy or ANTsR interface which causes a "failure", i.e. instances where the thickness value(s) blow up beyond what is expected. Although KK is deterministic, the problem occurs completely at random, occurs with a single thread or multiple threads, and the deviation from the correct or expected thickness image occurs in the first couple of optimization iterations. Also, as a check, it does not occur when running it via the command line.

Below are minimal examples comparing the three scripts running KK on the same data in a loop of 10 iterations: shell, python, and R. On my machine, the shell script produces 10 images which are exactly the same and have the expected values. The R and Python scripts produce the expected image anywhere from 50% to 80% of the time (very loose estimate).

I'll be diving deeper and seeing which exact internal image variable deviates but just wanted to provide an update. Also, any thoughts are welcome on why running KK through R and Python causes this.

Shell

#! /usr/bin/sh

export ITK_GLOBAL_DEFAULT_NUMBER_OF_THREADS=8

inDirectory=./

seg=${inDirectory}/seg.nii.gz
gm=${inDirectory}/gm.nii.gz
wm=${inDirectory}/wm.nii.gz

iter=3

for i in {0..9};
  do 
    echo "KKsh $i"
    kk=${inDirectory}/sh_kk_${i}.nii.gz
    KellyKapowski -d 3 \
                  -s $seg \
                  -g $gm \
                  -w $wm \
                  -v 1 \
                  -m 1.5 \
                  -r 0.0025 \
                  -t 10 \
                  -c [${iter}] \
                  -x 0 \
                  -o $kk 
  done

Python

import ants
import os

os.environ['ITK_GLOBAL_DEFAULT_NUMBER_OF_THREADS'] = '8'

in_directory = "./"

seg = ants.image_read(in_directory + "seg.nii.gz")
gm = ants.image_read(in_directory + "gm.nii.gz")
wm = ants.image_read(in_directory + "wm.nii.gz")

iter=3

for i in range(10):
    print("KKpy " + str(i))
    kk = ants.kelly_kapowski(s=seg, g=gm, w=wm,
                            its=iter, r=0.0025, m=1.5, x=0, t=10,
                            verbose=1)
    ants.image_write(kk, in_directory + "py_kk_" + str(i) + ".nii.gz")

R

library( ANTsR )

Sys.setenv(ITK_GLOBAL_DEFAULT_NUMBER_OF_THREADS = "8")

inDirectory <- "./"

seg <- antsImageRead( paste0( inDirectory, "seg.nii.gz" ) )
gm <- antsImageRead( paste0( inDirectory, "gm.nii.gz" ) )
wm <- antsImageRead( paste0( inDirectory, "wm.nii.gz" ) )

iter <- 3

for( i in seq.int(9) )
  {
  kk <- kellyKapowski( s = seg, g = gm, w = wm, its = iter, 
                       r = 0.0025, m = 1.5, x = FALSE, t = 10, 
                       verbose = TRUE )
  antsImageWrite( kk, paste0( inDirectory, "r_kk_", i, ".nii.gz") )   
  }

gm.nii.gz
seg.nii.gz
wm.nii.gz

@ncullen93
Copy link
Member

Other people have definitely posted issues with things being different or memory issues in python or R... It must be related to the wrappers, right? Perhaps the C++ versions need to be updated?

@cookpa
Copy link
Member

cookpa commented Mar 16, 2024

As the OP in #552 tried to warn us, it is some kind of memory leak in ANTsPy and presumably R too.

I modified the script slightly

import sys
import os

os.environ['ITK_GLOBAL_DEFAULT_NUMBER_OF_THREADS'] = '4'

import ants

in_directory = "./"

seg = ants.image_read(in_directory + "seg.nii.gz")
gm = ants.image_read(in_directory + "gm.nii.gz")
wm = ants.image_read(in_directory + "wm.nii.gz")

iter=3

label=sys.argv[1]

print("KKpy " + label)
kk = ants.kelly_kapowski(s=seg, g=gm, w=wm,
                        its=iter, r=0.0025, m=1.5, x=0, t=10,
                        verbose=1)
ants.image_write(kk, in_directory + "results/py_kk_" + label + ".nii.gz")

This gives identical results if I run a bash for loop calling this multiple times. Iif I run a for loop inside python, it varies some of the time.

@ntustison
Copy link
Member Author

Thanks @cookpa and @ncullen93 for looking at this. I've narrowed it down a bit in the actual ITK filter but am still completely perplexed as to why this is happening.

@cookpa, I pulled out the for loop from my original Python script and it looks as though the problem might persist. I'm double checking that right now.

@cookpa
Copy link
Member

cookpa commented Mar 16, 2024

You're right. It happened 0 out of 10 times last night, but today I tried it again and it failed 1 out of 50 times. The 49 other runs were identical to KellyKapowski

@ntustison
Copy link
Member Author

Thanks @cookpa for the confirmation.

To add to my confusion, I'm currently running the original Python script where the for loop is internal to the Python script on my GPU linux box. So far I'm 40 iterations in and it hasn't failed once.

@cookpa
Copy link
Member

cookpa commented Mar 17, 2024

Bizarre. I haven't run on Linux but on my Mac, running with a for loop in Python produces different results nearly every run. Without a for loop the failure rate is approximately 1%, it's otherwise consistent with the CLI. I don't get it at all

@cookpa
Copy link
Member

cookpa commented Mar 17, 2024

I tried re-reading the input images at every iteration, but it doesn't help. So I don't think the input images are changing during the for loop.

@ntustison
Copy link
Member Author

Yeah, I initially tried that but ultimately put image reading outside the loop to rule out that as a possibility. I also ran the original Python script on the university linux cluster and it reproduced the expected result every time. Within the DiReCT filter I've isolated it to which variable is deviating from the expected value on my Mac but I haven't figured out why yet.

@stnava
Copy link
Member

stnava commented Mar 17, 2024

I also dont know what's happening and have seen this issue in my large-scale studies that are run on a linux cluster ... but my guess is that the issue comes down to something to do with (reliability of) precision for relatively small float values. if some of the very small values are shrunk ( lost in some sense ) then the algorithm would lead to overly large thickness estimates. I wonder if the "count" image ( or any other image used in the denominator ) should be stored as double precision.

@ntustison
Copy link
Member Author

@stnava , In this case I should've mentioned that my initial run on the UVa cluster showed some variation in output. After updating with @ncullen93 's updating of the tags, I was able to get the expected repeatability in output.

In reference to my comment to @cookpa , the two main image variables in the filter in constructing the thickness image, hitImage and totalImage, the hitImage is consistent through the different runs. It's the totalImage that seems to be the immediate cause of variation in output. I'll try using double precision for that image and see if that fixes the issue.

@stnava
Copy link
Member

stnava commented Mar 17, 2024

@ntustison thanks - good to know. re hit vs total I would have guessed the opposite effect. I wonder if any insight could be gained from writing out every state of any image involved in the total ... if all else fails.

@ntustison
Copy link
Member Author

Yeah, will do after I investigate the effect of changing RealType = float to RealType = double in KellyKapowski.cxx. It gets a little messy with the integration step but it would definitely narrow it down.

@ntustison
Copy link
Member Author

Okay, I'm pretty sure this was the issue.

These are always bittersweet---glad to have figured it out but it's so damn obvious in hindsight.

@cookpa
Copy link
Member

cookpa commented Mar 18, 2024

Re-opening, we will fix by updating ANTs version to incorporate updates from @ntustison

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants