Floating-point exception when concatenating IAEA phase space files with addphsp #423

mchamberland · 2018-04-14T18:46:32Z

I get a floating-point exception (IEEE_INVALID_FLAG) when concatenating IAEA phase space files. I re-compiled addphsp with the -ffpe-trap=invalid flag and no optimization to produce the backtrace below:

[~/Desktop/TrueBeam_v2_phsp_10FFF] Marc$ addphsp TrueBeam_v2_10FFF test 2 0 1 1

 Will sum from phsp file TrueBeam_v2_10FFF_w0.1.IAEAphsp
 to TrueBeam_v2_10FFF_w1.1.IAEAphsp
 And output result to test.1.IAEAphsp


 Adding TrueBeam_v2_10FFF_w0.1.IAEAphsp to test.1.IAEAphsp: 


 Header information for TrueBeam_v2_10FFF_w0.1.IAEAphsp:

  Warning: IAEA format phsp file does not store LATCH

            TOTAL NUMBER OF PARTICLES IN FILE:     45862111
                     TOTAL NUMBER OF PHOTONS:     45427146
THE REST ARE ELECTRONS/POSITRONS.
 
      MAXIMUM KINETIC ENERGY OF THE PARTICLES:       10.377 MeV
 # OF INCIDENT PARTICLES FROM ORIGINAL SOURCE:    324000000

                       Z AT WHICH PHSP SCORED:       26.700 cm



 Header information for test.1.IAEAphsp:


 First time writing to this file.
 No header data to display.


 BEGIN READING/WRITING PH-SP DATA .....


Program received signal SIGFPE: Floating-point exception - erroneous arithmetic operation.

Backtrace for this error:
#0  0x10280c092
#1  0x10280b3b0
#2  0x7fff6a4bdf59
#3  0x1027fe4e6
#4  0x1027f967b
#5  0x1027d733a
#6  0x1027daaa8
#7  0x1027dd3e8
Floating point exception: 8

I'm not sure if I can provide anything more than that since those are the Varian phase space files which I'm not allowed to share. Let me know if I can help debug.


mchamberland · 2018-04-14T18:54:20Z

Slightly more informative backtrace:

Process 32052 stopped
* thread #1, queue = 'com.apple.main-thread', stop reason = EXC_ARITHMETIC (code=EXC_I386_SSEEXTERR, subcode=0x1f21)
    frame #0: 0x00000001000c84e6 libiaea_phsp.dylib`iaea_record_type::read_particle() + 614
libiaea_phsp.dylib`iaea_record_type::read_particle:
->  0x1000c84e6 <+614>: mulss  %xmm1, %xmm0
    0x1000c84ea <+618>: movss  0x39da(%rip), %xmm3       ; xmm3 = mem[0],zero,zero,zero 
    0x1000c84f2 <+626>: mulss  %xmm0, %xmm3
    0x1000c84f6 <+630>: mulss  %xmm1, %xmm0
Target 0: (addphsp) stopped.

Edit: and that’s as far as I get because my debugger refuses to print any variables.
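For anyone reading the stop reason above: the subcode 0x1f21 looks like the value of the x86 MXCSR control/status register at the time of the fault. That interpretation is an assumption on my part (it is not stated in the debugger output), but the bit layout of MXCSR itself is documented in the Intel manuals. A minimal Python sketch that decodes it under that assumption (decode_mxcsr is a hypothetical helper name, not part of any tool used here):

```python
# Bit layout of the x86 MXCSR register: bits 0-5 are sticky exception flags,
# bit 6 is DAZ, bits 7-12 are the matching exception masks, bits 13-14 select
# the rounding mode, and bit 15 is FTZ.
FLAG_NAMES = ["invalid", "denormal", "divide-by-zero",
              "overflow", "underflow", "precision"]


def decode_mxcsr(value):
    """Split an MXCSR value into raised flags, unmasked exceptions, and modes.

    Assumption: `value` really is an MXCSR snapshot (here, the lldb subcode).
    """
    raised = [name for i, name in enumerate(FLAG_NAMES)
              if (value >> i) & 1]
    # A cleared mask bit means the exception is unmasked, i.e. it traps.
    unmasked = [name for i, name in enumerate(FLAG_NAMES)
                if not (value >> (7 + i)) & 1]
    return {
        "raised": raised,
        "unmasked": unmasked,
        "daz": bool((value >> 6) & 1),
        "rounding": (value >> 13) & 0b11,  # 0 means round-to-nearest
        "ftz": bool((value >> 15) & 1),
    }


if __name__ == "__main__":
    print(decode_mxcsr(0x1F21))
```

Read this way, 0x1f21 has the invalid and precision flags raised, and invalid is the only unmasked exception, which would be consistent with a build that traps on invalid operations only (-ffpe-trap=invalid).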

mchamberland · 2018-04-15T18:25:49Z

A bit more testing: the error does not occur with a fresh debug configuration of EGSnrc with all optimizations turned off. However, I can't figure out exactly which combination of optimization flags triggers it.

mchamberland · 2018-04-16T21:39:03Z

I think I isolated the problem to, what else, the -ffast-math flag. Perhaps Iwan is right and it's time to move away from that pesky flag (see, e.g., #174).
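As background on the faulting instruction in the disassembly above: mulss is a scalar single-precision multiply, and under IEEE 754 a multiply signals "invalid" only when an operand is a signaling NaN or when the operands are zero and infinity. This thread does not establish which operand values were actually involved, so the following is only an illustrative Python sketch of those conditions, working on raw 32-bit patterns. The helper names (float_bits, classify, multiply_is_invalid) are hypothetical.

```python
import struct


def float_bits(x):
    """Return the 32-bit IEEE 754 single-precision pattern of x."""
    return struct.unpack("<I", struct.pack("<f", x))[0]


def classify(bits):
    """Classify a 32-bit single-precision bit pattern."""
    exponent = (bits >> 23) & 0xFF
    mantissa = bits & 0x7FFFFF
    if exponent == 0xFF:
        if mantissa == 0:
            return "inf"
        # On x86 the top mantissa bit is set for quiet NaNs and clear for
        # signaling NaNs.
        return "qnan" if mantissa & 0x400000 else "snan"
    if exponent == 0 and mantissa == 0:
        return "zero"
    return "finite"


def multiply_is_invalid(a_bits, b_bits):
    """True if multiplying the two patterns would signal IEEE 'invalid'."""
    a, b = classify(a_bits), classify(b_bits)
    if "snan" in (a, b):
        return True
    return {a, b} == {"zero", "inf"}


if __name__ == "__main__":
    print(multiply_is_invalid(float_bits(0.0), float_bits(float("inf"))))
    print(multiply_is_invalid(0x7FA00000, float_bits(1.0)))  # signaling NaN
    print(multiply_is_invalid(float_bits(1.5), float_bits(2.0)))
```

A quiet NaN operand propagates through the multiply without signaling, which is why it is classified separately from a signaling NaN.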

Update the default gcc optimization configuration to use -march=native instead of -ffast-math. The latter causes various floating-point exceptions on newer cpus and compilers. If the programs are run on a different cpu, then one should use the corresponding -march option for that architecture instead of "native", or else use the less aggressive -mtune=native if the compiling and running cpus are in the same family.

Update the default gcc optimization configuration to -mtune=native instead of -ffast-math. The latter causes various floating-point exceptions on newer cpus and compilers. Note that if everything is compiled and run on identical cpu, then the more aggressive -march=native option should be considered during configuration.

crcrewso · 2018-06-20T17:16:41Z

Very happy to see the change from march to mtune!!!

As this code currently works on Arm32 as well, I'm hesitant to suggest we go so far as to include -march=nocona, as that's the oldest x86_64 architecture, but for reasonable people, should we consider it?

Update the default gcc optimization configuration to -mtune=native instead of -ffast-math. The latter causes various floating-point exceptions on newer cpus and compilers. Note that if everything is compiled and run on identical cpu, then the more aggressive -march=native option should be considered during configuration. Also add a test in the Fortran compiler version check to catch the gfortran version string.

Update the default gcc optimization configuration to -mtune=native instead of -ffast-math. The latter causes various floating-point exceptions on newer cpus and compilers. Note that if everything is compiled and run on identical cpu, then the more aggressive -march=native option should be considered during configuration. Also add a test in the Fortran compiler version check to catch the gfortran version string, and fix a duplicate echo for the default fortran debugger flag.

Update the default gcc optimization configuration to -mtune=native instead of -ffast-math. The latter causes various floating-point exceptions on newer cpus and compilers. Note that if everything is compiled and run on identical cpu, then the more aggressive -march=native option should be considered during configuration. Change the default optimization level to -O2 instead of -O3. There have been cases where upgrading to a newer compiler revealed bugs under -O3, and more aggressive optimization does not always lead to increased performance. The -O2 option is a better default, and another level can be selected at configuration time. Also add a test in the Fortran compiler version check to catch the gfortran version string, and fix a duplicate echo for the default fortran debugger flag.

ftessier mentioned this issue Apr 19, 2018

Fix #423: change -ffast-math to -mtune=native #432

Merged

blakewalters self-assigned this Apr 19, 2018

rtownson added the compilation label Jul 17, 2018

rtownson closed this as completed Oct 3, 2018
