antsApplyTransforms hangs in ANTs-2.1.0rc2 #142

Closed
ccraddock opened this issue Jan 5, 2015 · 1 comment

@ccraddock

I am experiencing a problem where antsApplyTransforms hangs in ANTs-2.1.0rc2. I am running several scripts that call antsApplyTransforms under GNU parallel to achieve parallel execution on a multicore workstation (64 processors, 256 GB RAM). I am applying affine and warp transforms to heterogeneous fMRI data; on average the resolution is 3.5 mm x 3.5 mm x 4 mm x 300 TRs. Five minutes or so after starting the processing, I notice at least one process that is consuming ~1.4 GB of RAM and using 0% CPU. I can reproduce this problem fairly reliably. I am transforming 4,000 files and keep restarting to pick up the files that hung in the previous iteration.
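
The manual restart cycle described here could be made less painful by capping the wall-clock time of each call, so a hung process is killed and reported instead of stalling the batch. The following is only a sketch, assuming GNU coreutils `timeout` is available; `run_with_timeout` is a hypothetical helper name, and it works around the hang rather than fixing it.

```shell
#!/bin/sh
# Hypothetical helper: run a command under a wall-clock limit.
# Usage: run_with_timeout SECONDS COMMAND [ARGS...]
run_with_timeout() {
    limit=$1
    shift
    status=0
    # `timeout` sends SIGTERM after $limit seconds and then exits with 124.
    # -k 10 follows up with SIGKILL if the process is still alive 10 s later.
    timeout -k 10 "$limit" "$@" || status=$?
    if [ "$status" -eq 124 ]; then
        echo "timed out after ${limit}s: $*" >&2
    fi
    return "$status"
}
```

A caller can test for exit status 124 to collect the inputs that hung and feed only those into the next pass. The limit itself has to be chosen from how long a healthy run takes on the data at hand.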

Here is what I have been able to figure out, so far.

This is the information that is returned from ps:

63483  9.4  0.2 4678684 683964 pts/6  S+   22:01   1:27 antsApplyTransforms -d 3 -e 3 -i /tmp/tmp.J6PLI4JNa4/infile_smoothed.nii.gz -r /usr/share/fsl/5.0/data/standard/MNI152_T1_3mm_symmetric.nii.gz -o /mnt/3tb/abide_vmhc/Output_2014-06-14_symmetric/filt_global/0050126_symmetric_functional.nii.gz -n Linear -t /data/Projects/ABIDE_Initiative/Derivatives/CPAC/vmhc/Out/pipeline_symmetric/0050126_session_1/anatomical_to_mni_nonlinear_xfm/ants_Warp.nii.gz -t /data/Projects/ABIDE_Initiative/Derivatives/CPAC/vmhc/Out/pipeline_symmetric/0050126_session_1/ants_affine_xfm/ants_Affine.txt -t /tmp/tmp.J6PLI4JNa4/func_to_anat_affine.txt

strace shows that the process is blocked in a futex wait:

strace -p 63483
futex(0x31a9818, FUTEX_WAIT, 1, NULL

Next, I attached gdb to the process:

gdb -p 63483

Backtrace shows that the process is blocked in a pthread_join() call

(gdb) bt
#0  0x00007fb8a96e1148 in pthread_join (threadid=52073800, thread_return=0x0) at pthread_join.c:89
#1  0x000000000155407e in itk::MultiThreader::WaitForSingleMethodThread(unsigned long) ()
#2  0x00000000015554b5 in itk::MultiThreader::SingleMethodExecute() ()
#3  0x0000000000b831c9 in itk::ImageSource<itk::Image<double, 3u> >::GenerateData (this=0x31a94d0)
    at /opt/ants/ITKv4-install/include/ITK-4.6/itkImageSource.hxx:242
#4  0x000000000154c223 in itk::ProcessObject::UpdateOutputData(itk::DataObject*) ()
#5  0x0000000000a72906 in itk::ImageBase<3u>::UpdateOutputData (this=0x31b72a0)
    at /opt/ants/ITKv4-install/include/ITK-4.6/itkImageBase.hxx:287
#6  0x00000000009a3cc2 in ants::antsApplyTransforms<double, 3u> (parser=..., inputImageType=3)
    at src/ANTs-2.1.0rc2/Examples/antsApplyTransforms.cxx:367
#7  0x0000000000991168 in ants::antsApplyTransforms (args=std::vector of length 19, capacity 36 = {...})
    at src/ANTs-2.1.0rc2/Examples/antsApplyTransforms.cxx:977
#8  0x000000000098b9b1 in main (argc=19, argv=0x7fff12c7ebc8) at /opt/ants/ANTS-build/Examples/cli_antsApplyTransforms.cxx:11

But there only seems to be a single thread:

(gdb) info threads
  Id   Target Id         Frame 
* 1    Thread 0x7fb8a9b13740 (LWP 63483) "antsApplyTransf" 0x000000000155407e in itk::MultiThreader::WaitForSingleMethodThread(unsigned long) ()
(gdb) thread
[Current thread is 1 (Thread 0x7fb8a9b13740 (LWP 63483))]
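
The thread count can also be checked without attaching a debugger. This is a minimal sketch, assuming a Linux system with `/proc` mounted; `thread_count` is a hypothetical helper name. It reads the `Threads:` field that the kernel reports for a process.

```shell
#!/bin/sh
# Hypothetical helper: print the number of threads in a process (Linux only).
# Usage: thread_count PID
thread_count() {
    # /proc/PID/status contains a line of the form "Threads:<TAB>N".
    while read -r key value; do
        if [ "$key" = "Threads:" ]; then
            echo "$value"
            return 0
        fi
    done < "/proc/$1/status"
    # No "Threads:" line was found.
    return 1
}
```

For the process above, `thread_count 63483` would be expected to agree with gdb's `info threads`. A count of 1 while the main thread sits in `pthread_join()` means the thread being joined is no longer listed.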

I moved to the deepest frame in ITK for which I have symbols. I was able to recompile ANTs with debug symbols, but it didn't work for ITK. Do you have any suggestions for how to turn on DEBUG mode in ITK? I am compiling ITK through the ANTs build and not standalone.

(gdb) select-frame 3
(gdb) frame
#3  0x0000000000b831c9 in itk::ImageSource<itk::Image<double, 3u> >::GenerateData (this=0x31a94d0)
    at /opt/ants/ITKv4-install/include/ITK-4.6/itkImageSource.hxx:242
242   this->GetMultiThreader()->SingleMethodExecute();
(gdb) info locals
str = {Filter = {m_Pointer = 0x31a94d0}}
outputPtr = 0x31b72a0
splitter = 0x3164440
validThreads = 61

ITK is configuring itself to use up to 61 threads based on the number of processors on my system. I limited this value to 2 using the environment variable:

ITK_GLOBAL_DEFAULT_NUMBER_OF_THREADS=2

But I still had the same problem.
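
One thing worth ruling out is whether the variable actually reaches each antsApplyTransforms process: a variable that is set but not exported in the launching shell is not inherited by child processes. The sketch below sets it per command with `env`; `with_itk_threads` is a hypothetical wrapper name, and nothing here implies that this resolves the hang.

```shell
#!/bin/sh
# Hypothetical wrapper: run one command with ITK's default thread count pinned.
# Usage: with_itk_threads N COMMAND [ARGS...]
with_itk_threads() {
    n=$1
    shift
    # `env` puts the variable directly into the child's environment, so it
    # does not matter whether the calling shell exported it.
    env ITK_GLOBAL_DEFAULT_NUMBER_OF_THREADS="$n" "$@"
}
```

Running `with_itk_threads 2 sh -c 'echo "$ITK_GLOBAL_DEFAULT_NUMBER_OF_THREADS"'` is a quick way to confirm what a child process sees before substituting the real command.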

I am also having a problem where antsApplyTransforms periodically dies with a segmentation fault. If I keep rerunning the script on the same data it will eventually work.
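
As a stopgap while the crash is unexplained, the reruns can be automated with a bounded retry that also records how each attempt failed. This is a sketch under assumptions: `retry` is a hypothetical helper name, and the check for status 139 relies on the shell convention of reporting death by signal as 128 plus the signal number (SIGSEGV is 11 on Linux).

```shell
#!/bin/sh
# Hypothetical helper: rerun a command until it succeeds or attempts run out.
# Usage: retry MAX_ATTEMPTS COMMAND [ARGS...]
retry() {
    max=$1
    shift
    attempt=1
    status=1
    while [ "$attempt" -le "$max" ]; do
        status=0
        "$@" || status=$?
        if [ "$status" -eq 0 ]; then
            return 0
        fi
        # 139 = 128 + 11, i.e. the command was killed by SIGSEGV.
        if [ "$status" -eq 139 ]; then
            echo "attempt $attempt: segmentation fault" >&2
        else
            echo "attempt $attempt: exit status $status" >&2
        fi
        attempt=$((attempt + 1))
    done
    # Give up and propagate the status of the last failed attempt.
    return "$status"
}
```

The per-attempt log on stderr shows how often the segmentation fault occurs for a given input, which may help when reporting it.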

@stnava

stnava commented Jan 5, 2015

probably need

ITK_GLOBAL_DEFAULT_NUMBER_OF_THREADS=1

brian

@stnava stnava closed this as completed Jan 15, 2015